Add more docker related info to README
Bjwebb committed Jul 2, 2015
1 parent 0e9f3eb commit e61cdca
Showing 2 changed files with 29 additions and 4 deletions.
31 changes: 27 additions & 4 deletions README.md
@@ -4,6 +4,9 @@ This repository contains a library for Extract, Transform and Load processes for

You can report issues with current transformations, or suggest sources which should be added to this library using the GitHub issue tracker.

This GitHub repository builds two different docker images:
* The Dockerfile in the root dir builds https://registry.hub.docker.com/u/bjwebb/resource-projects-etl/
* The Dockerfile in the `load` dir builds https://registry.hub.docker.com/u/bjwebb/resource-projects-etl-load/

## Processes
Each process, located in the **process** folder, consists of a collection of files that either (a) document a manual transformation of the data; or (b) perform an automated transformation.
@@ -17,24 +20,34 @@ Folders may contain:
* A meta.json file, containing the meta-data which transform.py will use
* A prov.ttl file containing provenance information (using [PROV-O](https://www.w3.org/TR/prov-o/)) to be merged into the final graph
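
As a rough illustration of the folder layout described above, the following sketch reports which of the named files each process folder contains. It assumes each process lives in its own subdirectory directly under `process/`; the function name `describe_processes` is hypothetical and is not part of this repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: for every subdirectory of the given root (default:
# "process"), report which of the files named in the README are present.
describe_processes() {
    local root=${1:-process}
    local dir name
    for dir in "$root"/*/; do
        # Skip the literal pattern when the root has no subdirectories.
        [ -d "$dir" ] || continue
        for name in transform.py meta.json prov.ttl; do
            if [ -f "$dir$name" ]; then
                echo "${dir%/}: has $name"
            else
                echo "${dir%/}: no $name"
            fi
        done
    done
}
```

Calling `describe_processes` from the repository root prints one line per folder and file name, which can help spot a process that is missing its metadata or provenance file.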

-The output of each process should be written to the root /data/ folder, from where it can be loaded onto the ResourceProjects.org platform.
+The output of each process should be written to the root data/ folder, from where it can be loaded onto the ResourceProjects.org platform.


## Running locally

-## Requirements
+### Requirements

* Python 3
* Bash

-### Getting started
+### Run

```
virtualenv .ve --python=/usr/bin/python3
source .ve/bin/activate
pip install -r requirements.txt
./transform_all.sh
```

-### Running with docker
+You will then have some data in the data/ directory. Currently the load step can only be run with docker.
+
+## Running with docker

### Requirements

Docker *1.7*. The actual requirement may be >=1.6, but 1.7 is the only version that has been tested. A recent version is required because the Python image from the Docker library doesn't work otherwise.
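
A minimal sketch of how the installed Docker version could be checked against the tested 1.7 before running anything. The helper `version_ge` is hypothetical (not part of this repository), and the parsing assumes `docker --version` prints a line beginning with `Docker version <number>`.

```shell
#!/usr/bin/env bash

# Succeed if dotted version $1 is at least dotted version $2.
# Suffixes such as "-rc1" are ignored; missing components count as 0.
version_ge() {
    local IFS=.
    local -a have=($1) want=($2)
    local i h w
    for i in 0 1 2; do
        h=${have[i]:-0}; w=${want[i]:-0}
        # Drop everything from the first non-digit onwards.
        h=${h%%[^0-9]*}; w=${w%%[^0-9]*}
        h=${h:-0}; w=${w:-0}
        if (( 10#$h > 10#$w )); then return 0; fi
        if (( 10#$h < 10#$w )); then return 1; fi
    done
    return 0
}

# Only query docker if it is actually installed.
if command -v docker >/dev/null 2>&1; then
    # Assumed output shape: "Docker version <number>, build <hash>".
    installed=$(docker --version | sed -E 's/^Docker version ([0-9.]+).*/\1/')
    if version_ge "$installed" 1.7; then
        echo "Docker $installed: OK"
    else
        echo "Docker $installed is older than the tested 1.7" >&2
    fi
fi
```

The comparison is done component by component in plain Bash, so it needs nothing beyond the Bash requirement already listed for running locally.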

### Running from docker hub

```
docker rm -f rp-etl rp-load
@@ -43,3 +56,13 @@ docker run --name rp-load --link virtuoso:virtuoso --volumes-from virtuoso --vol
```

To run the last command you will need a [virtuoso container running](https://github.com/NRGI/resourceprojects.org-frontend/#pre-requisites).


### Building docker images

```
docker build -t bjwebb/resource-projects-etl .
docker build -t bjwebb/resource-projects-etl-load load
```

Then run as described above. (You may want to use a different name for your own images, so that they are not confused with those from docker hub.)
2 changes: 2 additions & 0 deletions load/Dockerfile
@@ -1,3 +1,5 @@
# Build based on virtuoso container in order to have access to virtuoso's
# isql command.
FROM caprenter/automated-build-virtuoso
ADD load.sh /load.sh
ADD import.sql /import.sql