Brief Intro:
The aim of this project is to create a clean relational database that is ready for analysis. A star schema is chosen as the data model because of its simplicity. The model contains an Immigration table (fact) and Immigrants, City, Time and Monthly Temperatures tables (dimensions).
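As a rough illustration of the star schema described above, the sketch below builds the five tables in an in-memory SQLite database and runs one analytical query across them. Only the table names come from this project. Every column name, the sample rows, the `time_dim` table name (used in place of Time) and the way Monthly Temperatures is keyed by city and month are assumptions made for the example, and SQLite stands in for whatever database the project actually targets.

```python
import sqlite3

# In-memory database, used here only to illustrate the shape of the model.
conn = sqlite3.connect(":memory:")

# NOTE: all column names below are hypothetical; the project only names the tables.
conn.executescript("""
CREATE TABLE immigrants (immigrant_id INTEGER PRIMARY KEY, gender TEXT, birth_year INTEGER);
CREATE TABLE city (city_code TEXT PRIMARY KEY, city_name TEXT);
CREATE TABLE time_dim (arrival_date TEXT PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE monthly_temperatures (
    city_code TEXT, month INTEGER, avg_temperature REAL,
    PRIMARY KEY (city_code, month)
);
-- Fact table: one row per immigration event, holding keys into the dimensions.
CREATE TABLE immigration (
    immigration_id INTEGER PRIMARY KEY,
    immigrant_id INTEGER REFERENCES immigrants(immigrant_id),
    city_code TEXT REFERENCES city(city_code),
    arrival_date TEXT REFERENCES time_dim(arrival_date)
);
""")

# Made-up sample rows, just enough to run a query.
conn.executemany("INSERT INTO immigrants VALUES (?, ?, ?)",
                 [(1, "F", 1990), (2, "M", 1985)])
conn.executemany("INSERT INTO city VALUES (?, ?)",
                 [("NYC", "New York"), ("LAX", "Los Angeles")])
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                 [("2016-04-01", 2016, 4), ("2016-04-02", 2016, 4)])
conn.executemany("INSERT INTO monthly_temperatures VALUES (?, ?, ?)",
                 [("NYC", 4, 11.5), ("LAX", 4, 17.0)])
conn.executemany("INSERT INTO immigration VALUES (?, ?, ?, ?)",
                 [(1, 1, "NYC", "2016-04-01"),
                  (2, 2, "NYC", "2016-04-02"),
                  (3, 1, "LAX", "2016-04-01")])

# Arrivals per city together with that city's average temperature for the arrival month.
rows = conn.execute("""
    SELECT c.city_name, COUNT(*) AS arrivals, mt.avg_temperature
    FROM immigration AS f
    JOIN city AS c ON c.city_code = f.city_code
    JOIN time_dim AS t ON t.arrival_date = f.arrival_date
    JOIN monthly_temperatures AS mt
      ON mt.city_code = f.city_code AND mt.month = t.month
    GROUP BY c.city_name, mt.avg_temperature
    ORDER BY c.city_name
""").fetchall()

print(rows)
```

The point of the layout is that every analytical question starts from the fact table and reaches descriptive attributes through short joins to the dimension tables, which is what keeps queries against a star schema simple.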
Tools used:
Apache Spark helped in processing large amounts of data and dealing with different file types (SAS, CSV, Parquet)
Pandas DataFrames enhanced data readability