An introduction to the data science workflow in Python

DC PyLadies | Thursday, 22 March 2018

by @angelaambroz

O hai!

These are the tutorial materials for the 22 March 2018 DC PyLadies meetup, An introduction to the data science workflow in Python.

You can view and interact with these notebooks on Binder. (Note: this may take a while to load, as it installs all the libraries.)

You can also view these notebooks here in GitHub (without interaction).

You can also download these notebooks and install everything on your local machine. To do so, choose a directory for the materials (path/of/your/choosing) and go to it (cd path/of/your/choosing), clone the repository (git clone), move into the cloned directory (cd 2018_03_pyladies), install the Python packages with pip, and then launch the Jupyter notebooks (jupyter notebook).

cd path/of/your/choosing
git clone git@github.com:angelaambroz/2018_03_pyladies.git
cd 2018_03_pyladies
pip install -r requirements.txt
jupyter notebook
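
If the notebooks fail on an import after installing, a quick check like the one below can show which libraries Python cannot find. This is only a sketch: the list of import names is an assumption based on the libraries this README mentions, not a copy of requirements.txt, and `check_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumed import names, taken from the libraries named in this README.
# requirements.txt may list more or fewer packages than these.
ASSUMED_PACKAGES = ["pandas", "matplotlib", "statsmodels", "sklearn", "numpy"]


def check_packages(names):
    """Return {import_name: bool} saying whether each top-level package can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(ASSUMED_PACKAGES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Anything reported as MISSING can usually be fixed by re-running the pip step from inside the cloned directory.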

Table of Contents

0_Introduction - Welcome to the tutorial! How to get data from files, from databases, and from APIs. An introduction to pandas.
1_EDA - Exploratory data analysis, using pandas and matplotlib. An introduction to visualizations.
2_StatsML - Fitting a linear regression using three different libraries: statsmodels, scikit-learn, and numpy. Comparing the results. Discussion of other libraries for machine learning and statistics.
3_Etc - Some recommended resources for learning more. Other things to learn.

Data sources

The Duke University dataset found in the two files in data/ is from github/Chrissymbeck. I found it via the Data is Plural newsletter.

