Data Engineer exercise

flvndh/awesome-inc

🦾 Exercise

Awesome Inc is proud to deliver and install high-quality products at customers' locations. As part of the Analytics team, your current job is to help us with the following topics.

📃 API

We would like to develop an API that abstracts away the internals of the Awesome Inc database. This API will be used downstream by Azure Data Factory, the integration service used by the company.

👓 Requirements

  • Implement a REST API on top of the Awesome Inc database (e.g. using FastAPI);
  • Test the API (e.g. with pytest);
  • Containerize the application (e.g. using Docker);
  • Version your code.
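One possible way to keep the API easy to test is to put data access behind a small class that a route handler (for instance a FastAPI endpoint) could call. The sketch below is only an illustration under loud assumptions: it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for Postgres, and the `installation` table, its columns, and the `Installation` and `InstallationRepository` names are hypothetical, since the real schema is not described here.

```python
import sqlite3
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Installation:
    """Shape exposed to API clients; hides the underlying table layout."""

    id: int
    product: str
    installed_on: str  # ISO 8601 date string


class InstallationRepository:
    """Data access layer that a route handler could depend on (hypothetical)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def list_installations(self, limit: int = 100, offset: int = 0) -> list[Installation]:
        # Parameterized query: values are bound, never interpolated into the SQL.
        rows = self._conn.execute(
            "SELECT id, product, installed_on FROM installation "
            "ORDER BY id LIMIT ? OFFSET ?",
            (limit, offset),
        ).fetchall()
        return [Installation(*row) for row in rows]


def make_test_connection() -> sqlite3.Connection:
    """Build an in-memory database with made-up rows, for tests only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE installation "
        "(id INTEGER PRIMARY KEY, product TEXT, installed_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO installation VALUES (?, ?, ?)",
        [
            (1, "Widget A", "2024-01-15"),
            (2, "Widget B", "2024-01-20"),
            (3, "Gadget C", "2024-02-03"),
        ],
    )
    return conn


def test_list_installations_paginates() -> None:
    """pytest-style test: the second page of size 2 starts at id 2."""
    repo = InstallationRepository(make_test_connection())
    page = repo.list_installations(limit=2, offset=1)
    assert [item.id for item in page] == [2, 3]
    # asdict() gives a JSON-ready dictionary for the response body.
    assert asdict(page[0])["product"] == "Widget B"
```

Because the repository only needs a connection object, the same class could be exercised with pytest against a throwaway database and wired into whichever web framework you choose.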

📈 Data Warehouse

We would like to give our business users the ability to answer questions such as:

  • What is the number of installations that the company is doing every month?
  • Which product category brings in the most revenue?
  • Which region of the world is our best market?

For that, we would like to create a data warehouse.

👓 Requirements

  • Design a dimensional model capable of answering those questions, and possibly more;
  • Implement this dimensional model (e.g. using dbt);
  • Version your code.
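To make the idea of a dimensional model concrete, here is a minimal star-schema sketch: one fact table for installations surrounded by date, product, and region dimensions. Everything in it is an assumption made for illustration: the table and column names (`fact_installation`, `dim_date`, `dim_product`, `dim_region`), the sample rows, and the use of in-memory `sqlite3` instead of Postgres and dbt. It is not the expected solution, only one shape such a model might take.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to three dimensions.
DDL = """
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, full_date TEXT NOT NULL,
    year INTEGER NOT NULL, month INTEGER NOT NULL
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT NOT NULL,
    category TEXT NOT NULL
);
CREATE TABLE dim_region (
    region_key INTEGER PRIMARY KEY, country TEXT NOT NULL,
    region TEXT NOT NULL
);
CREATE TABLE fact_installation (
    installation_id INTEGER PRIMARY KEY,
    date_key INTEGER NOT NULL REFERENCES dim_date (date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product (product_key),
    region_key INTEGER NOT NULL REFERENCES dim_region (region_key),
    revenue REAL NOT NULL
);
"""


def build_warehouse() -> sqlite3.Connection:
    """Create the schema in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
        [
            (20240115, "2024-01-15", 2024, 1),
            (20240120, "2024-01-20", 2024, 1),
            (20240203, "2024-02-03", 2024, 2),
        ],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget A", "Widgets"), (2, "Gadget B", "Gadgets")],
    )
    conn.executemany(
        "INSERT INTO dim_region VALUES (?, ?, ?)",
        [(1, "France", "Europe"), (2, "Japan", "Asia")],
    )
    conn.executemany(
        "INSERT INTO fact_installation VALUES (?, ?, ?, ?, ?)",
        [
            (1, 20240115, 1, 1, 100.0),
            (2, 20240120, 2, 2, 250.0),
            (3, 20240203, 1, 1, 100.0),
        ],
    )
    return conn


def installations_per_month(conn: sqlite3.Connection) -> list[tuple]:
    """Count fact rows grouped by the calendar month of the date dimension."""
    return conn.execute(
        "SELECT d.year, d.month, COUNT(*) FROM fact_installation f "
        "JOIN dim_date d ON d.date_key = f.date_key "
        "GROUP BY d.year, d.month ORDER BY d.year, d.month"
    ).fetchall()


def revenue_by(conn: sqlite3.Connection, dimension: str) -> list[tuple]:
    """Total revenue per product category or per region, highest first."""
    # Whitelist of supported groupings; user input never reaches the SQL text.
    joins = {
        "category": ("dim_product", "product_key", "category"),
        "region": ("dim_region", "region_key", "region"),
    }
    table, key, column = joins[dimension]
    return conn.execute(
        f"SELECT t.{column}, SUM(f.revenue) AS total FROM fact_installation f "
        f"JOIN {table} t ON t.{key} = f.{key} "
        f"GROUP BY t.{column} ORDER BY total DESC"
    ).fetchall()
```

With the sample rows above, `installations_per_month(build_warehouse())` returns `[(2024, 1, 2), (2024, 2, 1)]`. Each business question maps to a join between the fact table and a single dimension, which is the property a dimensional model is meant to provide.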

🐱‍🏍 Getting started

The repository contains a docker-compose file that starts a Postgres database containing Awesome Inc data. It also starts PgAdmin in order to manage the Postgres instance. By default, the interface is accessible at http://localhost:8080.

📏 Expectations

This exercise is designed to evaluate both your software development and data engineering skills.

We expect you to demonstrate your ability to write high-quality code. By that we mean code that is easy to maintain, test, configure and deploy. We'll not only assess the end result but also ask you about the design decisions you made.

We also expect you to demonstrate your ability to transform data for analytics needs. The data model you provide must be dimensional and data quality checks must exist.
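As an illustration of what a data quality check could look like, the sketch below expresses each check as a SQL query that selects the rows violating a rule, so a check passes when its query returns nothing. dbt data tests follow a similar convention, but this sketch uses plain `sqlite3` in memory, and the tables, columns, check names, and sample rows are all hypothetical.

```python
import sqlite3

# Each check is a query returning the rows that break a rule (hypothetical names).
CHECKS = {
    "product_key_exists_in_dim_product": (
        "SELECT f.installation_id FROM fact_installation f "
        "LEFT JOIN dim_product p ON p.product_key = f.product_key "
        "WHERE p.product_key IS NULL"
    ),
    "revenue_is_not_negative": (
        "SELECT installation_id FROM fact_installation WHERE revenue < 0"
    ),
    "product_name_is_unique": (
        "SELECT product_name FROM dim_product "
        "GROUP BY product_name HAVING COUNT(*) > 1"
    ),
}


def run_checks(conn: sqlite3.Connection) -> dict[str, int]:
    """Return the number of violating rows for every check."""
    return {name: len(conn.execute(sql).fetchall()) for name, sql in CHECKS.items()}


def build_sample() -> sqlite3.Connection:
    """Made-up data with two deliberate problems, to show failing checks."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);"
        "CREATE TABLE fact_installation "
        "(installation_id INTEGER PRIMARY KEY, product_key INTEGER, revenue REAL);"
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(1, "Widget A"), (2, "Gadget B")],
    )
    conn.executemany(
        "INSERT INTO fact_installation VALUES (?, ?, ?)",
        [
            (1, 1, 100.0),
            (2, 99, 50.0),  # product_key 99 has no matching dimension row
            (3, 2, -5.0),   # negative revenue
        ],
    )
    return conn


results = run_checks(build_sample())
failed = sorted(name for name, violations in results.items() if violations > 0)
```

Here `failed` ends up as `["product_key_exists_in_dim_product", "revenue_is_not_negative"]`, and a pipeline could refuse to publish the warehouse while that list is non-empty.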

The tools mentioned in the exercises are part of our stack. We don't expect you to know or use them. If you prefer to use Spring Boot and Spark, well, go for it. We value know-how over tool mastery.

👀 How to share your solution?

Create a private GitHub repository, publish your solution, and invite flvndh as a collaborator.
