Topic | Description | Link |
---|---|---|
Lesson | Part 1: APIs | Here |
Lesson | Part 2: Webscraping | Here |
Please note: This lesson makes use of datasets from the `requests` library. The webscraping-ii module requires `selenium` and `geckodriver` and is best taught using Google Chrome.
After this lesson, students will be able to:
- Identify relevant HTTP Verbs & their uses.
- Describe Application Programming Interfaces (APIs) and know how to make calls and consume API data.
- Access public APIs and get information back.
- Read and write data in JSON format.
- Demonstrate how to use the `requests` library.
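
As a small illustration of the "read and write data in JSON format" objective, here is a minimal sketch using only Python's built-in `json` module. The payload is made up for illustration (it is not taken from any real API response), and the variable names are arbitrary.

```python
import json

# Hypothetical JSON text, shaped like something a web API might return.
raw = (
    '{"count": 2, "results": ['
    '{"name": "Luke Skywalker", "height": "172"}, '
    '{"name": "Leia Organa", "height": "150"}]}'
)

# Read: JSON text -> Python dict
data = json.loads(raw)
names = [person["name"] for person in data["results"]]

# Write: Python dict -> JSON text
encoded = json.dumps({"names": names}, indent=2)

# Reading the text back should give an equal dict
round_trip = json.loads(encoded)

print(names)
print(encoded)
```

The same `json.loads` / `json.dumps` pair applies to text received from an API; the lesson covers how to fetch that text.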
After this lesson, students will be able to:
- Revisit how to locate elements on a webpage
- Acquire unstructured data from the internet using `BeautifulSoup`.
- Discuss limitations associated with simple `requests` and `urllib` libraries
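
To give a feel for locating elements with `BeautifulSoup`, here is a minimal sketch that parses an inline HTML string rather than a live page. The markup, class names, and restaurant entries are hypothetical and chosen only for illustration; a real page will have its own structure that needs to be inspected first.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet; a real listing page will look different.
html = """
<div class="listing">
  <div class="restaurant">
    <span class="name">Cafe Uno</span><span class="price">$$</span>
  </div>
  <div class="restaurant">
    <span class="name">Bistro Dos</span><span class="price">$$$</span>
  </div>
</div>
"""

# "html.parser" is Python's built-in parser, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

# find_all returns every matching tag; get_text extracts the visible text.
names = [tag.get_text(strip=True) for tag in soup.find_all("span", class_="name")]
prices = [tag.get_text(strip=True) for tag in soup.find_all("span", class_="price")]

print(list(zip(names, prices)))
```

This only works when the data is present in the HTML that was downloaded; content rendered later by JavaScript will not appear in it.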
Before these lessons, students should already be able to:
- Interpret and use Python dictionaries
- Build Pandas DataFrames from dictionaries
- Perform simple data manipulation on Pandas objects
- Build `for` and `while` loops in Python
- Use `pip install` for package management
- Introduction to APIs
- What is an API?
- Web APIs
- Making API calls
- HTTP
- Independent practice: HTTP
- HTTP Request
- HTTP Request methods
- HTTP Request structure
- HTTP Response
- Response types overview
- JSON
- Guided practice: pulling data from APIs
- Example: Star Wars
- Closing questions
- Introduction to Web Scraping
- Building a web scraper
- Retrieving data from the HTML page
- Retrieving the restaurant names
- Challenge: Retrieving the restaurant locations
- Retrieving the restaurant prices
- Retrieving the restaurant number of bookings
- Summary
When running this lesson, please check the following environment requirements:
- Have Beautiful Soup installed: `pip install bs4`
For more information on this topic, check out the following resources:
- Using `selenium` to scrape rendered text: Selenium docs
- Using Selenium to enter website information: demo
- Python regex tester: here