Anna Sapienza edited this page Jun 15, 2022 · 44 revisions

Intro

Welcome to the wiki for the course Social data analysis and visualization (02806) offered by the Technical University of Denmark. This is the main page, where you can access the weekly exercises. If you take a look in the side-bar, you can read about the administrative details (including a very useful course overview), assignments, books, and more.

The class is taught flipped-classroom style, where the lecture and homework elements of a course are reversed. You'll be able to view short video lectures before (or during) the class session, so in-class time can be devoted to exercises, projects, or discussions. Check out the first lecture to learn more.

Assignments

You are going to use Peergrade to submit your assignments and review others' work. Join the class on Peergrade by going to www.peergrade.io/join and typing in the class code Z2CNEB. Note: The submission period is now open! Please make sure each member of your group has joined Peergrade.

N.B. Make sure you read the assignment guidelines.

Assignment 1: https://github.com/suneman/socialdata2022/blob/main/assignments/Assignment1.ipynb

Assignment 2: https://github.com/suneman/socialdata2022/blob/main/assignments/Assignment2.ipynb (I will send you a secret id for the contributions later today. See the Assignment for more info.)

Project Assignment B:

Lectures

  • Before week 1: Info. Take a look at this page before you do anything. This class most likely works a little bit differently from other classes you've taken. The notebook explains pretty much everything - the rest will be explained during the lectures.

  • Before week 1: Python BootCamp. Python is the key tool we use in this class. If you don't feel 100% ready, this notebook offers a quick refresher course. You will learn about installing Python and about Jupyter notebooks. By the end of it, you'll know enough to get going with the course.

  • Week 1: Introduction. This week is all about getting started. It's a light load, since we want everyone to get a good start, especially if you're not a Python Ninja just yet. Thus, there's room for prep, making sure you're all on top of Python, etc. You can also see the file here on GitHub, but the videos won't display properly.

    • Reading: We'll be looking into crime patterns. Take a look at this article from Science Magazine to get a deeper sense of the topic.
    • Note: To run the notebook you first need to download it.
  • Before week 2: Info on Assignments and Final Project. Here's a quick informational video on how we run the assignments and the final project.

  • Week 2: Let the data science begin. Ok. So now that everyone's up to speed with Python and Pandas, we'll start with a little intro on data visualization while continuing the analysis of the data that we downloaded last week. You'll learn that just calculating simple distributions (and conditional distributions) can teach you A LOT about a dataset. But that's not it. We'll also get creative with plotting GPS data. So LOTS to do today. No time for reading :)

    • Reading: No reading this week. Just fun with coding.
    • Note: We are having issues rendering notebooks with nbviewer. Please access and download the notebook here.
  • Week 3: Plotting single variable data. This week we go deeper with the dataviz lectures. We'll also start reading independently and learn about the many different ways you can visualize just a single variable.

    • Reading: Data Analysis with Open Source Tools Chapter 2. To find the text, you will need to go to DTU Learn. It's under "Course content" → "Content" → "Lecture 3 reading".
  • Week 4: Heatmaps and data errors. Geospatial data is a very important category, so this week we dig deeper into options for visualizing that data type, including strategies for making little movies. We also have a small exercise to talk about errors in the data, which draws on some of the work we've done in previous weeks. I hope you enjoy today's relatively light load.

    • Reading I. Read through the following tutorial How to: Folium for maps, heatmaps & time data. Get it here
    • Reading II (Optional): There are also some nice tricks in Spatial Visualizations and Analysis in Python with Folium. Read it here if you'd like; otherwise it should be safe to skip.
  • Before Week 5: Week 5 will be posted later than usual this time around. However, you can start preparing for the class by reading Chapter 3 from DAOST.

    • Reading: Data Analysis with Open Source Tools Chapter 3. To find the text, you will need to go to DTU Learn. It's under "Course content" → "Content" → "Lecture 5 reading".
  • Week 5: More plotting, linear regression. This lecture features more lecturing and a short intro to machine learning that we will need next week! We get into exploring data with two variables, something which we have read about (see below). Then we do logarithmic plots and have lots of fun with linear regression, its associated math, and sklearn.

    • Reading: DAOST Chapter 3. To find the text, you will need to go to DTU Learn. It's under "Course content" → "Content" → "Lecture 5 reading".
  • Before Week 6: Reading takes time, and I am a slow reader, as maybe some of you are! Next week we are going to continue our introduction to machine learning. To have all of you ready for the awesome work, I need you to read a couple of chapters from "Data Science from Scratch". Make sure to read them before class so you can spend the time during the lecture on exercises and additional material.

    • Reading: Data Science from Scratch Chapter 11 (Intro to machine learning) and 17 (Decision trees). To find the text, you will need to go to DTU Learn. It's under "Course content" → "Content" → "Lecture 6 reading".
  • Week 6: More Machine Learning. Today we continue working with machine learning. Alongside some more fundamentals, we will also have an introduction to Decision Trees. We will then put everything into practice and make predictions on crimes. In particular, we are going to explore a new dataset to predict criminal recidivism!

    • Reading: In addition to the Before Week 6 reading, I have added a lot of additional material for those of you who crave it. I highly recommend going through the fantastic visual explanation of decision trees.
  • Week 7: Interactive data Viz and ML Bias. Today we explore the new dataset on criminal recidivism in more depth by using interactive visualizations with Bokeh! We then study bias in Machine Learning: we find out if the model we built in week 6 is biased and try to correct it to improve our model's fairness.

    • Big thanks to Germans, our own fairness & bias expert 😊
  • Before Week 8: Next week we are going to talk about explanatory data visualization and narratives. Please read this paper before class so you will be up to speed and can use class time for exercises, questions, feedback, etc.

  • Week 8: Explanatory & Narrative Visualization. Today you will focus on explanatory visualizations and how to create a story out of the awesome exploratory data analysis you have done in the previous weeks! You will first have an introduction to explanatory visualizations and then focus on creating 3 main visualizations that explain the analysis and results of the criminal recidivism case study. These visualizations will be the backbone of a story that you'll write to explain this case study to the general public!
