
Lab material for Algorithmic Methods for Data Science course 2015-2020, Data Science master at Sapienza University


ichatz/adm


The repository contains support material for the lab classes of the Algorithmic Methods for Data Science course, taught as part of the Data Science master at Sapienza University. The class was taught during the following years:

The material for earlier years is available under the 2015-2020 branch, taught during:

Introduction to Python Basic & Compound data types and Pandas using Rome's Municipality Open Data Portal

The lab provides a quick introduction to basic operations on Python data types and on Pandas Series and DataFrames. It concludes with some hands-on data engineering based on a dataset (in CSV format) provided by the Open Data portal of the City of Rome. In particular, the lab uses the dataset listing all hotels, hostels and, more generally, all accommodation structures in Rome that were active in January 2019.
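As a rough illustration of the Series and DataFrame operations mentioned above, here is a minimal sketch. The CSV content and the column names (`name`, `type`, `municipality`, `rooms`) are invented for the example and are not taken from the actual Open Data file.

```python
import io

import pandas as pd

# Hypothetical sample data; the real dataset's columns and values may differ.
CSV_TEXT = """name,type,municipality,rooms
Hotel Aurora,hotel,I,40
Ostello Tevere,hostel,I,12
Casa Verde,guest house,II,5
Hotel Colosseo,hotel,I,80
"""

# Load the CSV text into a DataFrame (a file path works the same way).
structures = pd.read_csv(io.StringIO(CSV_TEXT))

# Selecting a single column yields a Series.
rooms = structures["rooms"]
total_rooms = int(rooms.sum())

# Boolean indexing keeps only the rows that satisfy a condition.
hotels = structures[structures["type"] == "hotel"]

# Grouping counts how many structures exist per type.
count_by_type = structures.groupby("type").size()

print(total_rooms)
print(count_by_type)
```

Reading from `io.StringIO` keeps the sketch self-contained; in the lab the same `pd.read_csv` call would presumably point at the downloaded CSV file.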

The lab material is available as an IPython notebook:

Data Visualization based on Matplotlib and Pandas using Rome's Municipality Open Data Portal

The lab provides a quick introduction to the basic operations of Matplotlib, a Python 2D plotting library. It uses the dataset listing all hotels, hostels and, more generally, all accommodation structures in Rome that were active in January 2019 to visualize certain aspects of the data.
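One possible visualization of this kind is a bar chart of the number of structures per type. The sketch below uses a tiny hand-made DataFrame with a hypothetical `type` column, not the real dataset, and renders off-screen so that it also runs outside a notebook.

```python
import io

import matplotlib

matplotlib.use("Agg")  # off-screen backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for the January 2019 dataset.
structures = pd.DataFrame(
    {"type": ["hotel", "hostel", "guest house", "hotel"]}
)

# Count the structures of each type, sorted alphabetically by type.
counts = structures["type"].value_counts().sort_index()

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(list(counts.index), counts.values)
ax.set_xlabel("Structure type")
ax.set_ylabel("Number of structures")
ax.set_title("Structures by type (sample data)")
fig.tight_layout()

# Write the figure to an in-memory PNG; use a file name to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```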

The lab material is available as an IPython notebook:

Data Engineering with Pandas and Document Databases using Rome's Municipality Open Data Portal

The lab carries out a series of data engineering tasks on the dataset listing all hotels, hostels and, more generally, all accommodation structures in Rome that were active in January 2019. This dataset is combined with the corresponding dataset for January 2018 to compare the growth between the two years.
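A year-over-year comparison of this kind could be sketched with an outer merge, as below. The `structure_id` key and the sample rows are assumptions made for illustration; the real datasets may need a different key or some cleaning before they can be matched.

```python
import pandas as pd

# Hypothetical snapshots; "structure_id" is an assumed common identifier.
structures_2018 = pd.DataFrame(
    {"structure_id": [1, 2, 3], "type": ["hotel", "hostel", "hotel"]}
)
structures_2019 = pd.DataFrame(
    {
        "structure_id": [2, 3, 4, 5],
        "type": ["hostel", "hotel", "hotel", "guest house"],
    }
)

# An outer merge keeps every structure from both years; the "_merge"
# column added by indicator=True records which year(s) each row came from.
merged = structures_2018.merge(
    structures_2019,
    on="structure_id",
    how="outer",
    suffixes=("_2018", "_2019"),
    indicator=True,
)

opened = merged[merged["_merge"] == "right_only"]  # present only in 2019
closed = merged[merged["_merge"] == "left_only"]   # present only in 2018

# Net growth in the number of active structures between the two years.
growth = len(structures_2019) - len(structures_2018)
growth_pct = 100 * growth / len(structures_2018)

print(len(opened), len(closed), growth, round(growth_pct, 1))
```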

Subsequently, a document database, such as MongoDB, is used to store and organize the data.
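The sketch below shows one way records could be turned into documents and stored with the PyMongo driver. The connection URI, the database name `adm_lab`, the collection name `structures` and the helper functions are all hypothetical; the notebook may organize the data differently. The `store` function needs a running MongoDB server, so it is defined but not called here.

```python
def to_documents(rows, year):
    """Return copies of the records tagged with their snapshot year,
    so that both years can live in the same collection."""
    return [{**row, "year": year} for row in rows]


def store(documents, uri="mongodb://localhost:27017",
          db_name="adm_lab", collection_name="structures"):
    """Insert the documents into MongoDB and return how many were stored.

    All default names are placeholders. PyMongo is imported here so the
    conversion step above can run without the driver or a server.
    """
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        result = client[db_name][collection_name].insert_many(documents)
        return len(result.inserted_ids)
    finally:
        client.close()


# Hypothetical sample records standing in for rows of the 2019 dataset.
rows_2019 = [
    {"name": "Hotel Aurora", "type": "hotel"},
    {"name": "Ostello Tevere", "type": "hostel"},
]
documents = to_documents(rows_2019, 2019)
print(documents)

# store(documents)  # uncomment when a MongoDB server is available
```

Tagging each document with a `year` field is only one option; keeping a separate collection per year would work as well.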

The lab material is available as an IPython notebook:

Web Scraping using the ParkRun Web site

The lab focuses on retrieving data from web pages by examining their HTML documents with the BeautifulSoup library. The process of fetching documents from the World Wide Web and extracting data from them is also known as web scraping; in this lab we use the ParkRun web site.
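To give an idea of how BeautifulSoup is used to examine an HTML document, the sketch below parses a small inline page. The markup, the table id `results` and the runner names are made up and do not reflect the structure of the actual ParkRun pages; fetching the page over the network is left out.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded results page.
HTML = """
<html><body>
<table id="results">
  <tr><th>Position</th><th>Runner</th><th>Time</th></tr>
  <tr><td>1</td><td>Runner A</td><td>17:42</td></tr>
  <tr><td>2</td><td>Runner B</td><td>18:05</td></tr>
</table>
</body></html>
"""

# Parse the document with Python's built-in HTML parser.
soup = BeautifulSoup(HTML, "html.parser")

# Locate the table and read the column names from its header cells.
table = soup.find("table", id="results")
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Turn every data row into a dictionary keyed by column name.
rows = []
for tr in table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # the header row has no <td> cells and is skipped
        rows.append(dict(zip(headers, cells)))

print(rows)
```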

The lab material is available as an IPython notebook:
