The scraper provides an input box where you enter a URL, and it returns the list of HTML tags found in the response from that URL. The implementation relies on the Nokogiri library to parse the HTML content, so the order of the tags follows Nokogiri's parser behavior. That order is post-order (you can check its implementation): a parent tag is registered only when its closing tag is visited, so child tags are listed before their parents.
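To make the post-order idea concrete, here is a minimal sketch in plain Ruby. It is not the app's code and does not use Nokogiri: `tags_in_closing_order` is a hypothetical helper that scans a string with a regular expression and registers each tag when its closing tag is reached, which is the behavior described above. It assumes well-formed markup and ignores void elements such as `<br>`.

```ruby
# Toy illustration only: a regex-based scanner, NOT Nokogiri and NOT the app's code.
# Each tag name is registered when its closing tag is visited, giving post-order.
def tags_in_closing_order(html)
  open_tags = []   # stack of tag names that are currently open
  registered = []  # tag names in the order their closing tags are visited

  html.scan(%r{<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>}) do |slash, name|
    if slash.empty?
      # Opening tag: remember it, but do not register it yet.
      open_tags.push(name)
    elsif open_tags.last == name
      # Closing tag that matches the innermost open tag: register it now.
      registered << open_tags.pop
    end
  end

  registered
end

sample = "<html><body><h1>Hi</h1><p>Text <b>bold</b></p></body></html>"
p tags_in_closing_order(sample)
# => ["h1", "b", "p", "body", "html"]
```

Children (`h1`, `b`) appear before their parents (`p`, `body`, `html`), which is the ordering to expect in the scraper's output for nested markup.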
The files that contain the core of the app are (in order of relevance):
- app/controllers/url_tag_lists_controller.rb
- app/views/url_tag_lists/search.html.erb
- config/routes.rb
- db/migrate/20170111230240_create_url_tag_lists.rb
To run the app locally:
- git clone [email protected]:carlos-peru/scraper.git
- cd scraper
- bundle install --without production
- rails s
Deployed version: https://rails-html-tags-scraper.herokuapp.com/
This project is licensed under the MIT license, Copyright (c) 2017 Carlos Castro.