Commit

Updated structure draft
timgdavies authored and Bjwebb committed Jul 2, 2015
1 parent 9f5d719 commit 2552bc0
Showing 7 changed files with 1,262 additions and 3 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@ data
*~
.ve
.ipynb_checkpoints
process/*/data
24 changes: 21 additions & 3 deletions README.md
@@ -1,14 +1,32 @@
# resource-projects-etl
Extract, Transform and Load processes for rp.org

# Requirements
This repository contains a library for Extract, Transform and Load processes for ResourceProjects.org.

You can use the GitHub issue tracker to report issues with current transformations or to suggest sources that should be added to this library.


## Processes
Each process, located in the **process** folder, consists of:

* A README.md file describing the transformation
* An extract.sh or extract.py file to fetch the file
* A data/ subfolder where the extracted data is stored
* A transform.py file which runs the transformations
* A meta.json file, containing the meta-data which transform.py will use
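
The layout listed above can be checked mechanically. The following is a minimal sketch, not part of this repository: the function name `missing_items` is hypothetical, and it assumes only the file and folder names given in the list.

```python
from pathlib import Path

# Names taken from the list above; every process folder is expected to have these.
REQUIRED_FILES = ["README.md", "transform.py", "meta.json"]
# Either extract script satisfies the "fetch the file" requirement.
EXTRACT_SCRIPTS = ["extract.sh", "extract.py"]


def missing_items(process_dir):
    """Return the expected items that are absent from process_dir (hypothetical helper)."""
    root = Path(process_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if not any((root / name).is_file() for name in EXTRACT_SCRIPTS):
        missing.append("extract.sh or extract.py")
    if not (root / "data").is_dir():
        missing.append("data/")
    return missing


if __name__ == "__main__":
    # Report gaps for each folder under process/, if that folder exists here.
    process_root = Path("process")
    if process_root.is_dir():
        for folder in sorted(p for p in process_root.iterdir() if p.is_dir()):
            print(folder.name, missing_items(folder))
```

An empty list means the folder has every item named above.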

The output of each process should be written to the /data/ folder, from where it can be loaded onto the ResourceProjects.org platform.
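
As an illustration of how a transform step might tie these pieces together, here is a rough sketch under stated assumptions. It assumes the extract step leaves CSV files in the process's data/ subfolder and that the output is a single JSON file named after the process folder; the function name `run_transform`, the CSV input format, and the output naming are all hypothetical and are not taken from this repository.

```python
import csv
import json
from pathlib import Path


def run_transform(process_dir, output_dir):
    """Combine extracted CSV rows with meta.json and write one JSON file (sketch)."""
    process_dir = Path(process_dir)
    output_dir = Path(output_dir)

    # meta.json holds the metadata that transform.py uses.
    with open(process_dir / "meta.json", encoding="utf-8") as f:
        meta = json.load(f)

    records = []
    # Assumption: extracted data is stored as CSV files in the data/ subfolder.
    for csv_path in sorted((process_dir / "data").glob("*.csv")):
        with open(csv_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                records.append(dict(row))

    # Assumption: one output file per process, named after the process folder.
    output_dir.mkdir(parents=True, exist_ok=True)
    out_path = output_dir / (process_dir.name + ".json")
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump({"meta": meta, "records": records}, f, indent=2)
    return out_path
```

Passing the output folder as an argument keeps the sketch independent of where the shared data folder actually lives.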



## Requirements

Python 3

# Getting started
### Getting started

```
virtualenv .ve --python=/usr/bin/python3
source .ve/bin/activate
pip install -r requirements.txt
```
