Commit

Update README with info about ETL Dashboard Docker
Bjwebb committed Oct 1, 2015
1 parent 2ed0cb4 commit 485ce5f
Showing 1 changed file with 20 additions and 45 deletions.
65 changes: 20 additions & 45 deletions README.md
@@ -4,10 +4,6 @@ This repository contains a library for Extract, Transform and Load processes for

You can report issues with current transformations, or suggest sources which should be added to this library using the GitHub issue tracker.

This GitHub repository builds two different docker images:
* The Dockerfile in the root dir builds https://registry.hub.docker.com/u/bjwebb/resource-projects-etl/
* The Dockerfile in the `load` dir builds https://registry.hub.docker.com/u/bjwebb/resource-projects-etl-load/

## Processes
Each process, located in the **process** folder, consists of a collection of files that either (a) document a manual transformation of the data; or (b) perform an automated transformation.
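
As a sketch, a hypothetical process folder might look like this (all names below are illustrative, not taken from the repository):

```
process/
└── example-source/
    ├── README.md       # documents a manual transformation, or
    └── transform.py    # performs an automated transformation
```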

@@ -22,67 +18,46 @@ Folders may contain:

The output of each process should be written to the root data/ folder, from where it can be loaded onto the ResourceProjects.org platform.

## Running ETL Dashboard with docker

## Running locally

### Requirements

* Python 3
* Bash
### Running from docker hub

### Run
You will need a [virtuoso container running](https://github.com/NRGI/resourceprojects.org-frontend/#pre-requisites).
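
If you don't have one yet, the sketch below shows one way to start one; the `tenforce/virtuoso` image and its `DBA_PASSWORD` variable are assumptions here, so treat the linked pre-requisites as authoritative.

```
docker run --name virtuoso -e DBA_PASSWORD=dba -p 8890:8890 -d tenforce/virtuoso
```

The container name `virtuoso` matters: the ETL container below is linked to it under that name.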

```
virtualenv .ve --python=/usr/bin/python3
source .ve/bin/activate
pip install -r requirements.txt
./transform_all.sh
docker rm -f rp-etl
docker run --name rp-etl --link virtuoso:virtuoso -p 127.0.0.1:8000:80 -e DBA_PASS=dba opendataservices/resource-projects-etl
```

You will then have some data in the data/ directory. Currently the load step can only be run with docker (see "Using a data directory on the host system" below).
Update DBA_PASS as appropriate.

## Running with docker
Then visit http://localhost:8000/
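
As an optional sanity check from a shell (assuming `curl` is installed):

```
curl -I http://localhost:8000/
```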

### Requirements

Docker **1.7** (the actual requirement may be >=1.6, but 1.7 is what has been tested; this is required because the docker library python image doesn't work otherwise).
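
To check which version you have installed:

```
docker --version
# prints something like: Docker version 1.7.1, build ...
```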

### Running from docker hub
### Building docker image

```
docker rm -f rp-etl rp-load
docker run --name rp-etl -v /usr/src/app/data -v /usr/src/app/ontology bjwebb/resource-projects-etl
docker run --name rp-load --link virtuoso:virtuoso --volumes-from virtuoso --volumes-from rp-etl --rm bjwebb/resource-projects-etl-load
docker build -t opendataservices/resource-projects-etl .
```

To run the last command you will need a [virtuoso container running](https://github.com/NRGI/resourceprojects.org-frontend/#pre-requisites).
Then run as described above. (You may want to use a different name for your own image, so as not to confuse it with the one actually from docker hub.)
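
For instance (the `myname/` prefix is purely illustrative):

```
docker build -t myname/resource-projects-etl .
docker rm -f rp-etl
docker run --name rp-etl --link virtuoso:virtuoso -p 127.0.0.1:8000:80 -e DBA_PASS=dba myname/resource-projects-etl
```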

## Running taglifter locally

### Using a data directory on the host system

To transform the data and put it in ./data on the host system, run:

```
docker rm -f rp-etl
docker run --name rp-etl -v `pwd`/data:/usr/src/app/data bjwebb/resource-projects-etl
```

To load the data, you can then run:

```
docker run --name rp-load --link virtuoso:virtuoso --volumes-from virtuoso -v `pwd`/data:/usr/src/app/data -v `pwd`/ontology:/usr/src/app/ontology --rm bjwebb/resource-projects-etl-load
```
### Requirements

This load step can also be used to load data that was not generated by a dockerized step.
* Python 3
* Bash

### Building docker images
### Run

```
docker build -t bjwebb/resource-projects-etl .
docker build -t bjwebb/resource-projects-etl-load load
virtualenv .ve --python=/usr/bin/python3
source .ve/bin/activate
pip install -r requirements.txt
./transform_all.sh
```

Then run as described above. (You may want to use a different name for your own images, so as not to confuse them with those actually from docker hub.)
You will then have some data as Turtle in the data/ directory.
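
A quick way to look over the generated files (output filenames vary by process):

```
ls data/
head -n 20 data/*.ttl
```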

# License
