sammaphey/data-engineer-tech-interview

Overview

This project is a data science exercise designed to showcase your technical abilities.

Essentially, we want to mimic, at small scale, a data science project similar to one you would encounter in the position. For this project you are given a CSV file containing data on teams from the major U.S. sports.

Your task is to ingest this CSV file, create a couple of interactive plots, and move the data into a "database".

The resulting code, plots, and "database" files should be uploaded to your GitHub account and linked in the response email.

You may start with either task; the outline below describes what we are looking for in each part.

Creating Plots

  • Create a bar chart that shows the total stadium capacity of each team. An example can be found here.
    • Now make this plot interactive, so that the plotted data can be filtered by League (MLB, NFL, etc.), City (Los Angeles, New York, etc.), or any other field you find relevant.
  • Plot the cumulative sum of Stadium Capacity over time. It should look something like this plot.
    • As an example, the stadium capacity for the New York Mets begins in the year 1962 (the year the team was founded).
  • Build a plot that shows the number of championships per field (State, City, etc.). Allow the user to choose which field to plot against.
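
The aggregations behind these three plots can be sketched as follows. This is only a minimal illustration using the Python standard library: the column names (`Team`, `League`, `City`, `Founded`, `Capacity`, `Championships`) and the sample rows are assumptions made up for the example, not the actual contents of the CSV file, and the plotting itself is left to whichever library you prefer.

```python
from collections import defaultdict
from itertools import accumulate

# Made-up sample rows; the real CSV's column names and values may differ.
ROWS = [
    {"Team": "Team A", "League": "MLB", "City": "New York",
     "Founded": 1962, "Capacity": 40000, "Championships": 2},
    {"Team": "Team B", "League": "NFL", "City": "New York",
     "Founded": 1925, "Capacity": 80000, "Championships": 4},
    {"Team": "Team C", "League": "MLB", "City": "Los Angeles",
     "Founded": 1962, "Capacity": 50000, "Championships": 1},
]


def filter_rows(rows, **criteria):
    """Keep rows whose fields equal every given criterion, e.g. League="MLB"."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def cumulative_capacity(rows):
    """Return (year, running total) pairs.

    Each team's capacity is added in the year the team was founded.
    """
    per_year = defaultdict(int)
    for r in rows:
        per_year[r["Founded"]] += r["Capacity"]
    years = sorted(per_year)
    totals = accumulate(per_year[y] for y in years)
    return list(zip(years, totals))


def championships_by(rows, field):
    """Sum championships grouped by a user-chosen field (City, League, ...)."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[field]] += r["Championships"]
    return dict(totals)
```

For instance, `filter_rows(ROWS, League="MLB")` would supply the bars for a league-filtered capacity chart, `cumulative_capacity(ROWS)` the points for the capacity-over-time line, and `championships_by(ROWS, "City")` the bars for the championships plot. An interactive front end would only need to pass the user's selection into these functions.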

Writing to a database

The "database" in our example is simply a directory that contains JSON files. Take each row of the CSV file and write it as a separate JSON document whose file name is the name of the team.

We expect around 125 JSON files in the database directory, all uploaded to the Git repo.
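
One possible way to produce these files is sketched below, again with the Python standard library only. The column name `Team`, the function names, and the file-name cleanup rule are assumptions for illustration; adjust them to match the actual CSV header.

```python
import csv
import json
import re
from pathlib import Path


def safe_filename(team_name):
    """Build a file name from a team name.

    Characters other than letters, digits, underscores, hyphens, dots,
    and spaces are replaced with underscores (an assumed cleanup rule).
    """
    return re.sub(r"[^\w\-. ]", "_", team_name).strip() + ".json"


def write_database(csv_path, out_dir, name_column="Team"):
    """Write one JSON file per CSV row into out_dir and return the paths.

    name_column is the (assumed) header of the column holding the team name.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            target = out / safe_filename(row[name_column])
            target.write_text(json.dumps(row, indent=2), encoding="utf-8")
            written.append(target)
    return written
```

Note that `csv.DictReader` yields every value as a string, so numeric fields would need converting if the JSON documents should store numbers. Two rows with the same team name would also overwrite each other, so comparing the number of files written against the number of rows is a cheap sanity check.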

Final Thoughts

The project does not require any particular language, but the files you return should be clearly organized and include comments and documentation explaining how to generate the resulting files.
