# Build a pipeline flow
Think about the dependencies between the images and/or manual actions in your pilot case. Which containers depend on which other containers? Which (manual) actions must be executed before a container can start? Example dependencies could be:
- The Spark master needs to be started before the Spark worker, so that the worker can register itself with the master.
- The input data needs to be loaded into HDFS before the MapReduce job starts computing.
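In practice, such a dependency often boils down to waiting until the upstream container accepts connections. Below is a minimal sketch of that pattern; the hostname `spark-master` and port `7077` are assumptions matching a typical Spark setup, not values prescribed by the platform.

```python
import socket
import time

def wait_for(host: str, port: int, timeout: float = 120.0) -> None:
    """Block until a TCP connection to host:port succeeds, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=5):
                return  # dependency is reachable
        except OSError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"{host}:{port} not reachable after {timeout}s")
            time.sleep(2)  # retry until the dependency comes up

# A Spark worker entrypoint could wait for the master before registering
# (hostname and port are assumptions for illustration):
wait_for("spark-master", 7077)
```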
Based on these dependencies, construct a pipeline flow. The flow determines the order in which the services need to be started and the actions need to be executed. For example, in the demo application the flow is:
- Start HDFS
- Start Spark
- Put input file on HDFS
- Compute aggregations
- Get output from HDFS
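One way to derive such a linear flow from a set of dependencies is a topological sort. The sketch below illustrates this for the demo flow; the step identifiers are hypothetical names chosen here, not identifiers defined by the platform.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependency map for the demo flow: each step lists the
# steps that must be finished before it can start.
deps = {
    "start_hdfs": set(),
    "start_spark": set(),
    "put_input_on_hdfs": {"start_hdfs"},
    "compute_aggregations": {"start_spark", "put_input_on_hdfs"},
    "get_output_from_hdfs": {"compute_aggregations"},
}

# static_order() yields one valid linear ordering of the steps.
print(list(TopologicalSorter(deps).static_order()))
```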
You can configure your pipeline flow using the Pipeline Builder engine packaged with the integrator-ui. Browse to http://integrator-ui.big-data-europe.aksw.org (your integrator-ui instance) and click "Workflow builder". There you can create your workflow and the steps it needs. The init-daemon can then manage the workflow, as long as your services check and report their statuses.
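Checking and reporting typically happens over HTTP against the init-daemon. The sketch below shows the general shape of that loop; the service name `init-daemon`, the step identifier, and the endpoint paths (`/canStart/<step>`, `/finish/<step>`) are all assumptions for illustration, so consult your init-daemon's API documentation for the actual routes.

```python
import time
import urllib.request

DAEMON = "http://init-daemon"  # assumed service name on the Docker network
STEP = "compute_aggregations"  # hypothetical step id from the Workflow builder

def can_start(step: str) -> bool:
    # Assumed endpoint shape; check your init-daemon's API for the real path.
    with urllib.request.urlopen(f"{DAEMON}/canStart/{step}") as resp:
        return resp.status == 200 and resp.read().decode().strip() == "true"

def report_finished(step: str) -> None:
    # Assumed endpoint shape for reporting completion back to the daemon.
    req = urllib.request.Request(f"{DAEMON}/finish/{step}", method="PUT")
    urllib.request.urlopen(req)

# Poll until the init-daemon clears this step, do the work, report back.
while not can_start(STEP):
    time.sleep(5)
# ... run the actual service logic here ...
report_finished(STEP)
```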