GOSHROW/TutorialExtractor

Tutorials Extractor

This project was built as an afterthought to gather the various pages of an online tutorial blog into a single document for easy offline access.

To that end, I used general standards, semantics, and certain Weird Specifications to put together a web app that accepts a tutorial URL, follows the chain of pages to scour and scrape, and finally combines them, where possible, into a PDF using PDFKit on top of wkhtmltopdf.

It is specially optimized for TutorialsPoint (since it has the clearest markup to parse), works well with JavaTPoint, and can handle any generic tutorial site, albeit suboptimally.
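
As a rough illustration of the "chain of pages" idea, here is a minimal sketch, not the project's actual code. It assumes each tutorial page links to its successor through an anchor whose text contains "next"; that heuristic, along with the names `find_next_url`, `collect_chain`, and `chain_to_pdf`, is hypothetical. The final step hands the collected URLs to `pdfkit.from_url`, which needs the wkhtmltopdf binary installed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkParser(HTMLParser):
    """Remembers the href of the first <a> whose text contains 'next'.

    Assumption: the site marks its pagination link this way.
    """

    def __init__(self):
        super().__init__()
        self._current_href = None  # href of the <a> currently open
        self.next_href = None      # first matching link found so far

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._current_href and self.next_href is None and "next" in data.lower():
            self.next_href = self._current_href

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


def find_next_url(html, base_url):
    """Return the absolute URL of the next page, or None if there is none."""
    parser = NextLinkParser()
    parser.feed(html)
    return urljoin(base_url, parser.next_href) if parser.next_href else None


def collect_chain(start_url, fetch, max_pages=50):
    """Follow 'next' links from start_url; `fetch` maps a URL to its HTML."""
    urls, seen = [], set()
    url = start_url
    # Stop on a missing link, a loop back to a visited page, or the page cap.
    while url and url not in seen and len(urls) < max_pages:
        seen.add(url)
        urls.append(url)
        url = find_next_url(fetch(url), url)
    return urls


def chain_to_pdf(urls, output_path):
    """Render the pages, in order, into one PDF (requires wkhtmltopdf)."""
    import pdfkit  # imported lazily so the crawl logic works without it
    pdfkit.from_url(urls, output_path)
```

Passing `fetch` in as a parameter keeps the crawl logic independent of any particular HTTP client, so it can be tried against canned HTML before touching a real site.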


How to use?

For the easiest access, the project has been deployed to Heroku.

Alternatively, the repo can be cloned and run locally with:

pip3 install -r requirements.txt

python3 app.py

The repo is under the MIT License. Mail