A Kubeflow pipeline is a portable and scalable definition of a machine learning (ML) workflow. Each step in your ML workflow, such as preparing data or training a model, is an instance of a pipeline component.
A Kubeflow pipeline can be created with the Kubeflow Pipelines Python SDK using its own DSL (domain-specific language). Currently, the Kubeflow Pipelines SDK (kfp) is available only for the Python programming language.
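To give you a first impression of the DSL before setting up the workbench, here is a minimal sketch of a KFP v1 pipeline with a single add component (the function and pipeline names are illustrative, not taken from the workshop repository):

```python
# Minimal KFP v1 sketch: a lightweight Python component plus a pipeline definition.
# Names (add, add_op, add_pipeline) are illustrative, not from the workshop repo.
import kfp
from kfp import dsl
from kfp.components import create_component_from_func


def add(a: float, b: float) -> float:
    """Return the sum of two floats; runs as its own containerized pipeline step."""
    return a + b


# Wrap the plain Python function into a reusable pipeline component
add_op = create_component_from_func(add, base_image="python:3.9")


@dsl.pipeline(name="toy-add-pipeline", description="Adds two float inputs")
def add_pipeline(a: float = 1.0, b: float = 2.0):
    # Each component invocation becomes a step (a Pod) in the pipeline graph
    add_task = add_op(a, b)
```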
Let's create a Jupyter Notebook first to interact with the Kubeflow Pipeline API Server backend.
- Navigate to "Kubeflow UI Dashboard" -> "Notebooks" -> "+ New Notebook"
- Input the following values in the "New notebook" panel:
Basic Category | Input |
---|---|
Name: | <yourname>-pipeline-test |
Type: | JupyterLab |
Custom Notebook: | kubeflownotebookswg/jupyter-scipy:v1.7.0 |
CPU: | 0.2 |
RAM in GiB: | 0.5 |
Workspace Volume | Input |
---|---|
New volume | |
Type | Empty volume |
Size in Gi | 5 |
Storage class | homedir |
Access mode | ReadWriteOnce |
Mount path | /home/jovyan |
- Open the Advanced Options and select the configuration "Allow access to Kubeflow Pipelines"
- Click on the "LAUNCH" button to create a new JupyterLab workbench.
Notice:
- If you haven't created the "Allow access to Kubeflow Pipelines" PodDefault in your namespace, please follow the instructions in the section "Git Versioning and Env variable" to create the "Allow access to Kubeflow Pipelines" PodDefault first, and then come back to this tutorial section.
Let's CONNECT to the created JupyterLab workbench <yourname>-pipeline-test with the Kubeflow access token mounted.
You can create a Python Jupyter Notebook from the Launcher by clicking on "Notebook Python 3".
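Once the notebook is open, you can verify that the workbench really reaches the Kubeflow Pipelines backend (this only works if the "Allow access to Kubeflow Pipelines" configuration was selected). A small smoke-test sketch, not part of the provided notebook:

```python
import kfp

# Inside the workbench the client uses the mounted Kubeflow access token,
# so no host or credentials need to be passed explicitly.
client = kfp.Client()

# Pipelines are not namespaced in KFP v1, so listing them is a simple connectivity check.
print(client.list_pipelines())
```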
In this tutorial, you will use a provided notebook to start your first pipeline.
- Fetch the workshop git code repository by typing the following in a terminal in the <yourname>-pipeline-test JupyterLab workbench:
cd $HOME;
git clone https://github.com/yingding/kf-examples;
- Navigate to kf-examples/sdkV1/toy_v1_add.ipynb in the JupyterLab file browser and open it with a double click.
  Click on the ">>" button to "Restart Kernel and Run All Cells...", or run each cell separately and examine the Python notebook step by step.
- You can see "Experiment details" and "Run details" links as output at the end of toy_v1_add.ipynb; click on "Run details" to open the run view (see the sketch below of how the notebook produces these links).
- Click on the "Add op" component in the pipeline graph view to open the side panel.
  There you can see the Input/Output, Details, Logs, ... of the Pod in this first pipeline run.
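For orientation: the "Experiment details" and "Run details" links are what kfp.Client().create_run_from_pipeline_func() prints when called from a notebook. A sketch of such a submission, reusing the add_pipeline definition from the introduction (toy_v1_add.ipynb may differ in its details):

```python
import kfp

client = kfp.Client()

# Compiles the pipeline function on the fly, creates (or reuses) the experiment,
# and starts a run; in a notebook this prints the "Experiment details" and
# "Run details" links you see at the end of the output.
run_result = client.create_run_from_pipeline_func(
    add_pipeline,                    # pipeline function as sketched in the introduction
    arguments={"a": 1.0, "b": 2.0},  # pipeline parameters
    experiment_name="demo",          # experiment to group this run under
)
print(run_result.run_id)
```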
Note:
- You can also access the Run info of a Kubeflow pipeline from "Kubeflow Dashboard UI" -> "Runs"
Let's go back to the Jupyter workbench by switching back to the opened tab in your browser.
- Navigate to the kf-examples/sdkV1/compiled/ folder; there you can see the file sum.yaml, which is a YAML representation of your first KFP pipeline (see the compiler sketch after this list).
- You can download sum.yaml with a right mouse click on the file in the Jupyter file browser; click on "Download" and save it in a location of your choice.
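sum.yaml was produced by the KFP v1 compiler. If you want to regenerate such a file from a pipeline function yourself, a minimal sketch (assuming a pipeline function such as add_pipeline from the introduction):

```python
import kfp

# Compile the Python pipeline definition into an Argo workflow YAML file
# that can be uploaded through the Pipelines UI or the SDK.
kfp.compiler.Compiler().compile(add_pipeline, package_path="sum.yaml")
```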
- Navigate to the Kubeflow Dashboard UI, go to the "Pipelines" view, and click on "+ Upload pipeline" in the upper right corner
- Input the following values in the "Upload Pipeline or Pipeline Version" panel

Create a new pipeline | |
---|---|
Pipeline Name: | <yourname>-sum-numbers-pipeline |
Pipeline Description: | this pipeline sums two float inputs |
Upload a file: | < choose your sum.yaml file downloaded previously > |

Click on "Create".
Important:
- In the current Kubeflow version, the pipeline name is globally unique; it is shared among all namespaces.
- If you receive an error that the pipeline already exists, change your pipeline name or upload the file as a new version of the existing pipeline (see the SDK sketch below).
- Namespace-scoped (private) pipelines are planned for Kubeflow 1.8.0.
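The upload can also be done from the notebook with the SDK instead of the UI. A sketch, assuming the compiled sum.yaml sits in the current working directory; the name-already-exists case is handled by uploading a new pipeline version:

```python
import kfp

client = kfp.Client()
pipeline_name = "<yourname>-sum-numbers-pipeline"

if client.get_pipeline_id(pipeline_name) is None:
    # First upload: creates a new (globally named) pipeline entry
    client.upload_pipeline(
        pipeline_package_path="sum.yaml",
        pipeline_name=pipeline_name,
        description="this pipeline sums two float inputs",
    )
else:
    # Name already taken: upload the file as a new version of the existing pipeline
    client.upload_pipeline_version(
        pipeline_package_path="sum.yaml",
        pipeline_version_name="v2",  # illustrative version name
        pipeline_name=pipeline_name,
    )
```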
- On the opened <yourname>-sum-numbers-pipeline panel, click on "Create run"
- On the "Start a run" panel:
  - choose "demo" as the Experiment
  - set the Run parameters a to "3" and b to "4"
  - keep the rest of the inputs at their defaults
- Click on "Start" (the same run can also be started from the SDK, as sketched below)
- Examine the Run of your started pipeline <yourname>-sum-numbers-pipeline
- Click on the top run you just started. You can follow the run info in the pipeline UI "Run" view and see the logs of the components in this pipeline run.
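You can also follow the run status programmatically instead of (or in addition to) the UI. A sketch, assuming you know the id of the run you just started:

```python
import kfp

client = kfp.Client()

# Replace with the id of your run, e.g. run.id from the previous sketch
# or the id shown in the run's URL in the Pipelines UI.
run_id = "<your-run-id>"

# Block until the run finishes (timeout in seconds), then inspect its final state
result = client.wait_for_run_completion(run_id, timeout=600)
print(result.run.status)  # e.g. "Succeeded" or "Failed"
```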
If you want to learn more about the Kubeflow Python SDK for building an ML pipeline, please keep the current JupyterLab workbench open; you will use it right away.
If you decide to learn the Kubeflow Python SDK at a later time, please remember to clean up your resources:
- stop and delete your Kubeflow workbench instance
- delete the associated workspace volume of the Kubeflow workbench
In this tutorial you have learned how to:
- Create a Jupyter workbench with pipeline access
- Start your first Kubeflow pipeline using the KFP Python SDK from a Jupyter notebook
- Examine the Run details of a Kubeflow pipeline run
- Download the YAML representation of a Kubeflow pipeline created by the Python SDK
- Upload the YAML representation of a Kubeflow pipeline via the Pipelines UI
- Create a new run from the Kubeflow pipeline with input parameters
- Examine the Run details from Experiments (KFP) in the Kubeflow Dashboard UI menu