LinkedPipes ETL is an RDF-based, lightweight ETL tool.
- REST API-based set of components for easy integration
- Library of components to get you started faster
- Sharing of configuration among individual pipelines using templates
- RDF configuration of transformation pipelines
- Linux, Windows, macOS
- Docker, Docker Compose
You can run LP-ETL in Docker, or build it from the source.
To start the LP-ETL `master` branch on `http://localhost:8080`, you can use a one-liner:
curl https://raw.githubusercontent.com/linkedpipes/etl/master/docker-compose.yml | docker-compose -f - up
Alternatively, you can clone the entire repository:
git clone https://github.com/linkedpipes/etl.git
and run:
docker-compose up
Note that this uses only the `docker-compose.yml` file, so the rest of the cloned repository is not needed.
You may need to run the commands with `sudo` or be in the `docker` group.
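If you go the `docker` group route, a typical setup looks like this (standard Docker administration commands, not specific to LP-ETL):

```bash
# Add the current user to the docker group, then refresh group
# membership for the current shell (or log out and back in)
sudo usermod -aG docker $USER
newgrp docker
```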
Each component (executor, executor-monitor, storage, frontend) has a separate `Dockerfile`.
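This means you can build a single component image on its own; a sketch, assuming each `Dockerfile` sits in its component's directory (verify the paths in your checkout):

```bash
# Build and tag only the frontend image; adjust the path to wherever
# the component's Dockerfile actually lives in the repository
docker build -t lp-etl-frontend ./frontend
```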
Environment variables:
- `LP_ETL_BUILD_BRANCH` - The Dockerfiles are designed to build from the GitHub repository; the branch is set using this property, default is `master`.
- `LP_ETL_BUILD_JAVA_TEST` - Set to empty to run the Java tests; this will slow down the build.
- `LP_ETL_DOMAIN` - The URL of the instance; this is used instead of `domain.uri` from the configuration.
- `LP_ETL_FTP` - The URL of the FTP server; this is used instead of `executor-monitor.ftp.uri` from the configuration.
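As a sketch, these build-time variables can be set on the command line, assuming the compose file forwards them to the Dockerfiles as described above:

```bash
# Build images from the develop branch instead of the default master
git clone https://github.com/linkedpipes/etl.git
cd etl
LP_ETL_BUILD_BRANCH=develop docker-compose build
```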
For Docker Compose, there are additional environment variables:
- `LP_ETL_PORT` - The port mapping for the frontend; this is where you connect to your instance. It does NOT have to be the same as the port in `LP_ETL_DOMAIN` in case of reverse-proxying.
For example, to run LP-ETL from the `develop` branch on `http://localhost:9080`, you can use the following command:
curl https://raw.githubusercontent.com/linkedpipes/etl/develop/docker-compose.yml | LP_ETL_PORT=9080 LP_ETL_DOMAIN=http://localhost:9080 docker-compose -f - up
`docker-compose` utilizes several volumes that can be used to access/provide data. See the `docker-compose.yml` comments for examples and configuration.
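To locate those volumes on your machine, the standard Docker commands below work; the `etl` name filter is only a guess at the generated volume names, which depend on your project directory:

```bash
# List volumes created by Docker Compose, then inspect one to find
# its mount point on the host
docker volume ls --filter name=etl
docker volume inspect <volume-name>
```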
You may want to create your own `docker-compose.yml` for custom configuration.
To build from the source on Linux:
$ git clone https://github.com/linkedpipes/etl.git
$ cd etl
$ mvn install
The configuration file `deploy/configuration.properties` can be edited, mainly changing paths to the working, storage, log, and library directories.
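For example, a minimal scripted edit; only `domain.uri` is a key taken from this README, and the directory paths should be adjusted by hand in the same file:

```bash
# Point the instance URL at localhost; the working, storage, log and
# library paths are edited the same way in the file
sed -i 's|^domain\.uri *=.*|domain.uri = http://localhost:8080|' deploy/configuration.properties
```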
$ cd deploy
$ ./executor.sh >> executor.log &
$ ./executor-monitor.sh >> executor-monitor.log &
$ ./storage.sh >> storage.log &
$ ./frontend.sh >> frontend.log &
See the example service files in the `deploy/systemd` folder.
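A sketch of installing them; the unit file name below is an assumption, so substitute the actual file names found in `deploy/systemd`:

```bash
# Copy the example units, reload systemd, and start a service
sudo cp deploy/systemd/*.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now lp-etl-executor.service
```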
Note that it is also possible to use Bash on Ubuntu on Windows or Cygwin and proceed as with Linux.
To build from the source on Windows:
git clone https://github.com/linkedpipes/etl.git
cd etl
mvn install
The configuration file `deploy/configuration.properties` can be edited, mainly changing paths to the working, storage, log, and library directories.
In the `deploy` folder, run:
executor.bat
executor-monitor.bat
storage.bat
frontend.bat
The components live in the `jars` directory.
A detailed description of how to create your own component is coming soon; in the meantime, you can copy an existing component and change it, as sketched below.
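A rough sketch of that copy-and-rename approach; the `plugins` source layout and component names are assumptions, not a documented workflow:

```bash
# Copy an existing component as a starting point, rename its
# identifiers in pom.xml and the sources, then rebuild
cp -r plugins/e-textHolder plugins/e-myComponent
mvn install
```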
Update note 5: 2019-09-03 breaking changes in the configuration file. Remove `/api/v1` from `executor-monitor.webserver.uri`, so it looks like `executor-monitor.webserver.uri = http://localhost:8081`. You can also remove `executor.execution.uriPrefix`, as the value is derived from `domain.uri`.
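One way to apply that change in place, assuming the `deploy/configuration.properties` path from the build instructions above:

```bash
# Drop the /api/v1 suffix from executor-monitor.webserver.uri
sed -i 's|^\(executor-monitor\.webserver\.uri.*\)/api/v1|\1|' deploy/configuration.properties
```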
Update note 4: 2019-07-03 we changed the way the frontend is run. If you do not use our script to run it, you need to update yours.
Update note 3: When upgrading from develop prior to 2017-02-14, you need to delete `{deploy}/jars` and `{deploy}/osgi`.
Update note 2: When upgrading from master prior to 2016-11-04, you need to move your pipelines folder from, e.g., `/data/lp/etl/pipelines` to `/data/lp/etl/storage/pipelines`, update the configuration.properties file, and possibly the update/restart scripts, as there is a new component, `storage`.
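The move itself, using the example paths from the note:

```bash
# Relocate the pipelines folder under the new storage component
mkdir -p /data/lp/etl/storage
mv /data/lp/etl/pipelines /data/lp/etl/storage/pipelines
```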
Update note: When upgrading from master prior to 2016-04-07, you need to delete your old execution data (e.g., in `/data/lp/etl/working/data`).