njdcornites/full_stack_de
Spin OceanRecords example database

  1. Set up a PostgreSQL database on your local machine
  2. Execute the SQL statements in manual_init.sql that refer to the OceanRecords operational system
  3. Execute: python spin_osdb_init.py OceanRecordsInit --local-scheduler

notes:

  • make sure you've installed all necessary dependencies (requirements.txt)
  • make sure the prod_raw file exists in the directory where the script is executed
  • the script writes some temporary Luigi-related files to the tmp_init_files folder
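The steps above can be sketched as a single shell session. The database name `oceanrecords` and the use of `createdb`/`psql` are assumptions; also note that manual_init.sql contains DWH statements as well, so if only the operational-system subset applies to this step, run those statements selectively instead of the whole file:

```shell
# Install the Python dependencies for the Luigi pipeline
pip install -r requirements.txt

# 1. Create a local Postgres database (name is illustrative)
createdb oceanrecords

# 2. Run the OceanRecords operational-system statements from manual_init.sql
#    (runs the whole file; execute a subset interactively if needed)
psql -d oceanrecords -f manual_init.sql

# 3. Kick off the Luigi init task; prod_raw must be in the current directory
python spin_osdb_init.py OceanRecordsInit --local-scheduler
```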

Update OceanRecords database

Usage: python update_osdb.py. Requires that the initial state of the database already exists.
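A minimal run, assuming the init step from the previous section has already completed against your local Postgres instance:

```shell
# Apply incremental updates to the already-initialized OceanRecords database
python update_osdb.py
```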

Airflow initialization

  1. Install airflow and any other prerequisite libraries
  2. Set up the Airflow home dir, e.g.: export AIRFLOW_HOME=~/airflow
  3. Create/edit the airflow.cfg file. The first time you run Airflow (e.g. by running any airflow command), it creates a config file in $AIRFLOW_HOME. Set load_examples = False so the example DAGs are not loaded.
  4. In airflow.cfg, change the executor type to: LocalExecutor
  5. In airflow.cfg, point the SQLAlchemy connection string at the Postgres airflow database
  6. Init the Airflow database: airflow initdb
  7. Copy dags and etl_sql_code folders into AIRFLOW_HOME
  8. Run airflow by: bash start_airflow.sh &> output.log
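The setup steps above can be sketched as follows. The Airflow install line, the connection-string credentials, and the port are assumptions to adapt to your environment (the `airflow initdb` command implies Airflow 1.x):

```shell
# Install Airflow (version/extras are assumptions; pin to what you use)
pip install apache-airflow

# Point Airflow at its home directory
export AIRFLOW_HOME=~/airflow

# The first airflow command generates $AIRFLOW_HOME/airflow.cfg; edit it to set:
#   load_examples = False
#   executor = LocalExecutor
#   sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow   # credentials illustrative
airflow initdb

# Deploy the DAGs and ETL SQL, then start Airflow in the background
cp -r dags etl_sql_code "$AIRFLOW_HOME"
bash start_airflow.sh &> output.log
```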

notes:

  • all DWH-related scripts in manual_init.sql have to be executed beforehand
  • make sure that all paths are correct; for example, I've hardcoded my local paths in start_airflow.sh
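The contents of start_airflow.sh are not shown here; a minimal version might look like the following sketch, where the hardcoded path and the webserver port are hypothetical and should be replaced with your own:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of start_airflow.sh -- adjust paths to your machine
export AIRFLOW_HOME=/home/me/airflow   # hardcoded local path, replace with yours

airflow webserver -p 8080 &   # UI on port 8080 (port is an assumption)
airflow scheduler             # scheduler runs in the foreground
```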
