This repository has been archived by the owner on Feb 13, 2023. It is now read-only.

datastackacademy/deb-archive

TuraLabs Data Engineering Bootcamp

Learn to deploy end-to-end Serverless Data Engineering Pipelines on GCP via the most comprehensive and FREE online course.

Description

This repo contains the code for the TuraLabs Data Engineering Bootcamp (DEB), including the code scaffolds and starting data mentioned in the DEB lessons.

QuickLinks

Course Pre-requisites

This course starts with mid- to high-level topics right away, so we strongly recommend that learners have some experience with Python and SQL. For a more in-depth explanation of the expected prerequisites and a list of resources to bring you up to speed, please visit our blog post on Helpful Resources to Prep for this Course.

Setting Up Your Dev Environment

The DEB course uses Python and Google Cloud Platform (GCP) tools. Please follow the instructions in our Getting Started Guide to make sure your dev environment is properly set up and compatible with our course. If you have any issues getting your dev environment up, please visit our Discord Channel to talk to us.
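
As a quick sanity check before starting, the following is a minimal sketch of a script that inspects a few of the things the patch notes below mention (Python 3.7, `.local/bin` on `PATH`, and an active virtualenv). It is not part of the course materials and does not replace the Getting Started Guide. The function name `check_environment` and its parameters are hypothetical, and treating 3.7 as the expected version is an assumption drawn from the patch notes.

```python
import os
import sys
from pathlib import Path


def check_environment(expected=(3, 7), path_env=None, home=None):
    """Return a dict of simple environment checks (illustrative helper)."""
    # Fall back to the real environment when no overrides are given.
    path_env = os.environ.get("PATH", "") if path_env is None else path_env
    home = Path.home() if home is None else Path(home)

    # pip installed with --user typically places scripts in ~/.local/bin on Linux.
    local_bin = str(home / ".local" / "bin")
    path_entries = [p.rstrip("/") for p in path_env.split(os.pathsep) if p]

    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    in_virtualenv = hasattr(sys, "real_prefix") or (
        sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    )

    return {
        # Assumption: the course targets Python 3.7 (per the patch notes).
        "python_version_matches": sys.version_info[:2] == tuple(expected),
        "local_bin_on_path": local_bin in path_entries,
        "in_virtualenv": in_virtualenv,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'check setup'}")
```

Run it with the interpreter you plan to use for the course. Any line reporting `check setup` points to a step in the Getting Started Guide worth revisiting.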

Need Help?

We're here to help! If you have any questions, please connect with us on our Discord Channel and one of us will be happy to help you.

Suggestions

If you have any suggestions for the course or website, please feel free to open a GitHub Issue within this repo. We also welcome suggestions in the suggestion channel on our Discord Server.

Patch Notes

20210121 New Chapter 3 Episode 4 Added

Chapter 3

20210108 2 New Chapter 3 Episodes Added + Small Fixes

Chapter 1

  • fixed broken links
  • fixed typos
  • reworded for clarity

Chapter 3

20201204 New Chapter 3 Episode + Small Fixes

Chapter 1

  • Fixed typos in chapter 1 overview (thank you Senad)

Chapter 2

  • Added GCS source file download instructions to chapter 2 episode 2 (thank you Jason)
  • Fixed API registration link in chapter 2 episode 4
  • Updated Portman API documentation links in chapter 2 episode 4
  • Enhanced chapter 2 episode 5 webapp

Chapter 3

Blog

  • Added Spark Explained blog post

20201117 New Blog Post + Small Fixes

General

  • Fixed broken link (this GitHub issue)
  • Fixed typos

Website

  • Add Curriculum Overview page (/deb-info)
  • Add Registration page (/register)
  • Add About Us page (/about)

Blog

  • Add “What is Data Engineering? How is it different from Data Science?” Blog post

20201021 Ch1 Update + Small Fixes

General

  • Switched from Slack to Discord
  • Fixed broken links to GCP Console and external documents

Ch1

New Ch1Ep5 lesson on advanced pandas usage, replacing the aircraft dataset with the latest FAA records

20200930 Ch2 Update + Small Fixes

Website

General

  • fix broken links

Ch1

Ch 2

  • add note in Episode 5 pointing to the Slack channel for questions about running a React app

Blog

  • add Pandas and Jupyter Notebook episode as a standalone blog post

DEB Repo

Ch 2

  • updated API and webapp for end of chapter to fix CORS issue
  • updated flights API request based on new query syntax
  • updated webapp READMEs to refer to Getting Started docs to acquire service account key

20200923 Small Fixes

General

  • fix typos
  • fix broken links
  • WSL User Setup
  • point to Windows WSL2 initial setup
  • WSL vs WSL2: https://docs.microsoft.com/en-us/windows/wsl/compare-versions
  • mention installing Ubuntu 20.04 from the MS Store
  • fix using python3.7
  • add instructions to install python3.7 from the deadsnakes repos
  • add pip to PATH: add /home/{username}/.local/bin to PATH in Ubuntu to get access to pip
  • fix installing and setting up pip3 and virtualenv
  • fix creating a new virtualenv
  • remove --no-site-packages from virtualenv instructions
  • add instructions for deactivate

Chapter 1

  • change paths in code examples to reflect location of data in provided repo