Instructor: Colin Jemmott, [email protected]
Classes: Monday and Wednesday
- Section A: 6:30 PM - 7:50 PM, TM102 1
- Section B: 8:00 PM - 9:20 PM, EBU3B 2154
Office Hours: Wednesdays, 4:00 PM - 6:00 PM, CSE 3234
- Piazza
- Github
- Tableau Online
- JupyterHub
To sync, type in a terminal:
gitpuller https://github.com/jemmott/dsc-96 master dsc96
This class is about using data to answer questions. This is in contrast with most of your other classes this year, which are about fundamentals and theoretical underpinnings. The questions you get to answer are the big important ones, including “What happened?”, “Why did it happen?”, and “What will happen?”. The data used to answer the questions will range from real-world government data to tweets about UCSD to sound recordings.
By the time you finish this class, you will be able to:
- Identify problems that are good candidates for data science,
- Reframe the problem in a way that can be answered with the available data,
- Evaluate the limitations and quirks of the data,
- Manipulate the data to answer relevant questions, and
- Communicate the results clearly.
This class operates as a data science lab class, meaning that the bulk of time in class is spent doing data science. Formal lectures will be minimal, and meant to help you understand the tasks and the context.
Unless we get behind, there should not be mandatory coding or analytics work outside of class (though you are encouraged to expand on the projects and see how much you can do!). There is significant reading outside of class, and it is important to keep up with the reading because of the limited formal instruction during class time.
The required book for this class is “Confident Data Skills” by Kirill Eremenko. The book is meant for a wide audience, is very recently published, and is inexpensive. We will only be reading chapters 1-5, after which the book goes into techniques that will be covered in more detail in your other classes.
Each week, at least 24 hours before the start of the Wednesday class, you must email me a paragraph or two about the weekly reading. The goal of this is to have a more in-depth conversation than our class time allows. Think of these journals more as emails to discuss something you read with a colleague than as a formal essay.
The topic is up to you, but examples include:
- Relating a topic in the reading to something in your life or in the news,
- Asking a thoughtful question about the reading, or
- Picking a quote from the reading that you agree or disagree with and explaining why.
I will read what you wrote and give a quick response (possibly up to a week later). I may bring up what you write to me in class unless you explicitly ask me in that email not to.
Details about assignments and reading are inside the weekly folders on github.
Week | Section | Topic |
---|---|---|
1 | What happened? | Answers on Day 1 |
2 | What happened? | Data is Messy |
3 | What happened? | Questions to Metrics |
4 | What happened? | Communicating Results |
5 | Why did it happen? | Images |
6 | Why did it happen? | Audio |
7 | Why did it happen? | Unstructured Text |
8 | Why did it happen? | A/B Testing |
9 | What will happen? | Prediction |
10 | | |
Attendance is critical because the bulk of the coding happens collaboratively during class time. Of course, you may have to miss class due to illness, a family emergency, or a similar reason. If this happens, you should let me know by email as soon as possible (preferably before class). You will still be responsible for completing the in-class work. Without the collaboration and explanations that happen in class, this will be much more difficult, so I strongly recommend coming to office hours for help. In-class assignments from a missed class will not be accepted more than a week late unless you ask for and receive special permission.
Some coding will be by yourself, and some will be pair programming, meaning that you will work together with a partner to complete tasks. In both cases you are encouraged to ask for help from the instructor or from other students. This is a collaborative environment, which means that while in this classroom it is ok to show your work to other students and discuss it openly.
However, even in this collaborative environment, the work you do must be your own. Specifically, you must do the actual work of completing the assignment (i.e. typing out the code, moving the mouse) and understand what your code or analysis is doing.
Some examples:
Totally Fine | Unacceptable |
---|---|
While solo programming in class, you get an unexpected answer. You get the attention of the student next to you and say “did you get a negative number for average age?”. They laugh and say no. | While solo programming in class, you get an unexpected answer. You get the attention of the student next to you and say “did you get a negative number for average age?”. They email you a bit of code which you paste into your notebook. |
You keep getting a strange error during class, so you ask someone near you if they know what might be causing it. They look at your code and say “Oh, have you thought about what would happen if your input is too short?” | You keep getting a strange error during class, so you ask someone near you if they know what might be causing it. They look at your code and say “change line 12 to …” |
You get super excited about a class project and decide to keep working on it after class for fun with a group of students. You put the code up on your website while clearly stating that it started as a class project and was a collaboration with others. | You miss a class, and so have to complete an assignment outside of class time. You get an unexpected answer and ask someone who was in class “did you get a negative number for average age?”. They laugh and say no. |
If you are unsure whether what you are doing is ok, just ask! You will never be reprimanded in this class for asking for clarification. Also note that this is likely a different standard than your other classes.
Assignment | Percent of Grade |
---|---|
Journaling responses to the reading | 20% |
SDPD traffic stops Tableau project | 30% |
Best assignment of “Why did it happen?” section (you choose) | 20% |
SDPD traffic stops final project | 20% |
In-class engagement | 10% |
Total Received | Final Grade |
---|---|
70%-100% | Pass |
0%-69% | No Pass |
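As a rough illustration of how the two tables above combine, here is a minimal sketch of the arithmetic. The weights come from the grading table; the component names, the example scores, and the choice to treat any total below 70 as No Pass are assumptions made for this example, not part of the official policy.

```python
# Weights taken from the grading table (percent of final grade).
# The dictionary keys are hypothetical short names for each row.
WEIGHTS = {
    "journal": 20,
    "tableau_project": 30,
    "best_why_assignment": 20,
    "final_project": 20,
    "engagement": 10,
}

# Assumption: totals between 69 and 70 are treated as No Pass.
PASS_THRESHOLD = 70


def total_percent(scores):
    """Combine per-component scores (each 0-100) into a weighted total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def final_grade(scores):
    """Map the weighted total onto the Pass / No Pass scale."""
    return "Pass" if total_percent(scores) >= PASS_THRESHOLD else "No Pass"


# Hypothetical scores, for illustration only.
example_scores = {
    "journal": 90,
    "tableau_project": 80,
    "best_why_assignment": 70,
    "final_project": 60,
    "engagement": 100,
}

print(total_percent(example_scores), final_grade(example_scores))
```

With these made-up scores the weighted total is 18 + 24 + 14 + 12 + 10 = 78, which falls in the Pass range.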
For this class, the key to academic integrity is accurately representing the status and authorship of your work. I strongly encourage you to read the official UCSD policy on integrity of scholarship.
I am committed to an inclusive learning environment that respects our diversity of perspectives, experiences and identities. You, as a student in this course, are also responsible for maintaining an environment where your fellow students feel safe and respected.
In my opinion, the key to this is recognizing the inherent worth and dignity of every person. If there is a way you could feel more included, please let me know, either in person, via email/discussion board, or even in a note under the door.
If you have a disability for which you are or may be requesting accommodations, please contact the Office for Students with Disabilities. You must have documentation from the Office before accommodations can be granted.
Another version of this class is available at https://github.com/afraenkel/DSC96