# Get Started

Embark on your journey with this custom storage initializer by exploring a simple hello-world example. Learn how to seamlessly integrate and leverage the power of this tool in just a few steps.

## Prerequisites

* Install [Kind](https://kind.sigs.k8s.io/docs/user/quick-start) (Kubernetes in Docker) to run a local Kubernetes cluster with Docker container nodes.
* Install the [Kubernetes CLI (kubectl)](https://kubernetes.io/docs/tasks/tools/), which allows you to run commands against Kubernetes clusters.
* Install [Kustomize](https://kustomize.io/), which allows you to customize app configuration.

## Environment Preparation

We assume all [prerequisites](#prerequisites) are satisfied at this point.

### Create the environment

1. Once Kind is installed, create a kind cluster with:
   ```bash
   kind create cluster
   ```

2. Configure `kubectl` to use the kind context:
   ```bash
   kubectl config use-context kind-kind
   ```

3. Set up a local deployment of *KServe* using the provided *KServe quick installation* script:
   ```bash
   curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.11/hack/quick_install.sh" | bash
   ```

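   You can optionally verify the installation once the pods settle (a minimal sanity check; the quick-install script deploys KServe into the `kserve` namespace):
   ```bash
   kubectl get pods -n kserve
   ```
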
4. Install the *model registry* in the local cluster:

   [Optional] Use the model registry with local changes:

   ```bash
   TAG=$(git rev-parse HEAD) && \
   MR_IMG=quay.io/$USER/model-registry:$TAG && \
   make -C ../ IMG_ORG=$USER IMG_VERSION=$TAG image/build && \
   kind load docker-image $MR_IMG
   ```

   Then:

   ```bash
   bash ./hack/install_modelregistry.sh -i $MR_IMG
   ```

   > _NOTE_: If you want to use a remote image, you can simply remove the `-i` option.
   >
   > _NOTE_: `./hack/install_modelregistry.sh` makes some changes to [base/kustomization.yaml](../manifests/kustomize/base/kustomization.yaml) that you DON'T need to commit!

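   You can check that the model registry deployment comes up (a minimal sanity check, assuming the default `kubeflow` installation namespace):
   ```bash
   kubectl get pods -n kubeflow
   ```
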
5. [Optional] Use a local container image for the CSI:
   ```bash
   IMG=quay.io/$USER/model-registry-storage-initializer:$(git rev-parse HEAD) && \
   make IMG=$IMG docker-build && \
   kind load docker-image $IMG
   ```

## First InferenceService with ModelRegistry URI

In this tutorial, you will deploy an InferenceService with a predictor that loads a model indexed in the model registry. The indexed model refers to a scikit-learn model trained with the [iris](https://archive.ics.uci.edu/ml/datasets/iris) dataset, which has three output classes: Iris Setosa, Iris Versicolour, and Iris Virginica.

You will then send an inference request to your deployed model in order to get a prediction for the class of iris plant your request corresponds to.

Since your model is being deployed as an InferenceService, not a raw Kubernetes Service, you just need to provide the storage location of the model using the `model-registry://<model>/<version>` URI format, and you get some superpowers out of the box.

### Register a Model into ModelRegistry

Port-forward the model registry service so that you can interact with it from outside the cluster:
```bash
kubectl port-forward --namespace kubeflow svc/model-registry-service 8080:8080
```

And then (in another terminal):
```bash
export MR_HOSTNAME=localhost:8080
```

Then, in the same terminal where you exported `MR_HOSTNAME`, perform the following actions:

1. Register an empty `RegisteredModel`:
   ```bash
   curl --silent -X 'POST' \
     "$MR_HOSTNAME/api/model_registry/v1alpha2/registered_models" \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{
     "description": "Iris scikit-learn model",
     "name": "iris"
   }'
   ```

   Expected output:
   ```bash
   {"createTimeSinceEpoch":"1709287882361","customProperties":{},"description":"Iris scikit-learn model","id":"1","lastUpdateTimeSinceEpoch":"1709287882361","name":"iris"}
   ```

2. Register the first `ModelVersion`:
   ```bash
   curl --silent -X 'POST' \
     "$MR_HOSTNAME/api/model_registry/v1alpha2/model_versions" \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{
     "description": "Iris model version v1",
     "name": "v1",
     "registeredModelID": "1"
   }'
   ```

   Expected output:
   ```bash
   {"createTimeSinceEpoch":"1709287890365","customProperties":{},"description":"Iris model version v1","id":"2","lastUpdateTimeSinceEpoch":"1709287890365","name":"v1"}
   ```

3. Register the raw `ModelArtifact`:

   This artifact defines where the actual trained model is stored, i.e., `gs://kfserving-examples/models/sklearn/1.0/model`.

   ```bash
   curl --silent -X 'POST' \
     "$MR_HOSTNAME/api/model_registry/v1alpha2/model_versions/2/artifacts" \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{
     "description": "Model artifact for Iris v1",
     "uri": "gs://kfserving-examples/models/sklearn/1.0/model",
     "state": "UNKNOWN",
     "name": "iris-model-v1",
     "modelFormatName": "sklearn",
     "modelFormatVersion": "1",
     "artifactType": "model-artifact"
   }'
   ```

   Expected output:
   ```bash
   {"artifactType":"model-artifact","createTimeSinceEpoch":"1709287972637","customProperties":{},"description":"Model artifact for Iris v1","id":"1","lastUpdateTimeSinceEpoch":"1709287972637","modelFormatName":"sklearn","modelFormatVersion":"1","name":"iris-model-v1","state":"UNKNOWN","uri":"gs://kfserving-examples/models/sklearn/1.0/model"}
   ```

> _NOTE_: Double-check that the provided IDs are the expected ones.
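
To double-check the IDs, you can list the registered models and verify that the returned IDs match the ones used above (a hedged example, assuming the corresponding `GET` list endpoint is available at the same API root):

```bash
curl --silent "$MR_HOSTNAME/api/model_registry/v1alpha2/registered_models" \
  -H 'accept: application/json'
```
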
### Apply the `ClusterStorageContainer` resource

Retrieve the model registry service name and its REST port:
```bash
MODEL_REGISTRY_SERVICE=model-registry-service
MODEL_REGISTRY_REST_PORT=$(kubectl get svc/$MODEL_REGISTRY_SERVICE -n kubeflow --output jsonpath='{.spec.ports[0].targetPort}')
```

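As a quick sanity check, you can print the endpoint value that will be injected into the `ClusterStorageContainer` below:

```bash
echo "$MODEL_REGISTRY_SERVICE.kubeflow.svc.cluster.local:$MODEL_REGISTRY_REST_PORT"
```
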
Apply the cluster-scoped `ClusterStorageContainer` CR to configure the model registry storage initializer for `model-registry://` URI formats:

```bash
kubectl apply -f - <<EOF
apiVersion: "serving.kserve.io/v1alpha1"
kind: ClusterStorageContainer
metadata:
  name: mr-initializer
spec:
  container:
    name: storage-initializer
    image: $IMG
    env:
      - name: MODEL_REGISTRY_BASE_URL
        value: "$MODEL_REGISTRY_SERVICE.kubeflow.svc.cluster.local:$MODEL_REGISTRY_REST_PORT"
      - name: MODEL_REGISTRY_SCHEME
        value: "http"
    resources:
      requests:
        memory: 100Mi
        cpu: 100m
      limits:
        memory: 1Gi
        cpu: "1"
  supportedUriFormats:
    - prefix: model-registry://
EOF
```

> _NOTE_: As `$IMG` you can use either the image created during [env preparation](#environment-preparation) or any other remote image in the container registry.
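
You can verify that the resource has been created with (the CR is cluster-scoped, so no namespace is needed):

```bash
kubectl get clusterstoragecontainer mr-initializer
```
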
### Create an `InferenceService`

1. Create a namespace:
   ```bash
   kubectl create namespace kserve-test
   ```

2. Create the `InferenceService`:
   ```bash
   kubectl apply -n kserve-test -f - <<EOF
   apiVersion: "serving.kserve.io/v1beta1"
   kind: "InferenceService"
   metadata:
     name: "iris-model"
   spec:
     predictor:
       model:
         modelFormat:
           name: sklearn
         storageUri: "model-registry://iris/v1"
   EOF
   ```

3. Check the `InferenceService` status:
   ```bash
   kubectl get inferenceservices iris-model -n kserve-test
   ```

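   Once ready, you should see output similar to the following (some columns omitted; exact values will differ):
   ```bash
   NAME         URL                                          READY   AGE
   iris-model   http://iris-model.kserve-test.example.com    True    2m
   ```
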
4. Determine the ingress IP and ports:
   ```bash
   kubectl get svc istio-ingressgateway -n istio-system
   ```

   Then port-forward the ingress gateway service:
   ```bash
   INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
   kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8081:80
   ```

   After that (in another terminal):
   ```bash
   export INGRESS_HOST=localhost
   export INGRESS_PORT=8081
   ```

5. Perform the inference request.

   Prepare the input data:
   ```bash
   cat <<EOF > "/tmp/iris-input.json"
   {
     "instances": [
       [6.8, 2.8, 4.8, 1.4],
       [6.0, 3.4, 4.5, 1.6]
     ]
   }
   EOF
   ```

   If you do not have DNS, you can still curl with the ingress gateway external IP using the HOST header (the model is served under the `InferenceService` name, `iris-model`):
   ```bash
   SERVICE_HOSTNAME=$(kubectl get inferenceservice iris-model -n kserve-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
   curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/iris-model:predict" -d @/tmp/iris-input.json
   ```

# Model Registry Custom Storage Initializer

This is a Model Registry specific implementation of a KServe Custom Storage Initializer (CSI).
More details on what a `Custom Storage Initializer` is can be found in the [KServe doc](https://kserve.github.io/website/0.11/modelserving/storage/storagecontainers/).

## Implementation

The Model Registry CSI is a simple Go executable that takes two positional arguments:
1. __Source URI__: the `storageUri` set in the `InferenceService`; this must be a model-registry custom URI, i.e., `model-registry://...`
2. __Destination Path__: the location where the model should be stored, e.g., `/mnt/models`

The core logic of this CSI is pretty simple and consists of three main steps:
1. Parse the custom URI in order to extract the `registered model name` and `model version`
2. Query the model registry in order to retrieve the original model location (e.g., `http`, `s3`, `gcs` and so on), as sketched after this list
3. Use the `github.com/kserve/kserve/pkg/agent/storage` package to actually download the model over these well-known protocols.

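For instance, step 2 conceptually boils down to REST lookups against the model registry API, along the lines of the following sketch (the version ID `2` is hypothetical, and it assumes the registry is reachable on `localhost:8080` as in the [Get Started](./GET_STARTED.md) guide):

```bash
# Fetch the artifacts of model version "2"; the response contains the artifact
# "uri" field pointing at the real model location (e.g., gs://... or s3://...)
curl --silent "localhost:8080/api/model_registry/v1alpha2/model_versions/2/artifacts" \
  -H 'accept: application/json'
```
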
### Workflow

The sequence diagram below highlights the workflow when this CSI is injected into the KServe pod deployment.

```mermaid
sequenceDiagram
    actor U as User
    participant MR as Model Registry
    participant KC as KServe Controller
    participant MD as Model Deployment (Pod)
    participant MRSI as Model Registry Storage Initializer
    U->>+MR: Register ML Model
    MR-->>-U: Indexed Model
    U->>U: Create InferenceService CR
    Note right of U: The InferenceService should<br/>point to the model registry<br/>indexed model, e.g.,:<br/> model-registry://<model>/<version>
    KC->>KC: React to InferenceService creation
    KC->>+MD: Create Model Deployment
    MD->>+MRSI: Initialization (Download Model)
    MRSI->>MRSI: Parse URI
    MRSI->>+MR: Fetch Model Metadata
    MR-->>-MRSI: Model Metadata
    Note over MR,MRSI: The main information that is fetched is the artifact URI which specifies the real model location, e.g.,: https://.. or s3://...
    MRSI->>MRSI: Download Model
    Note right of MRSI: The storage initializer will use<br/> the KServe default providers<br/> to download the model<br/> based on the artifact URI
    MRSI-->>-MD: Downloaded Model
    MD->>-MD: Deploy Model
```

## Get Started

Please look at the [Get Started](./GET_STARTED.md) guide for a very simple quickstart that showcases how this custom storage initializer can be used for ML model serving operations.

## Development

### Build the executable

You can just run:
```bash
make build
```

which will create the executable under `bin/mr-storage-initializer`.

### Run the executable

You can run the built executable by providing the source URI and destination path:
```bash
./bin/mr-storage-initializer "model-registry://model/version" "./"
```

or run `main.go` directly, skipping the build step:
```bash
make SOURCE_URI=model-registry://model/version DEST_PATH=./ run
```

> _NOTE_: A Model Registry service should be up and running at `localhost:8080`.

### Build container image

Run:
```bash
make docker-build
```

By default the container image name is `quay.io/${USER}/model-registry-storage-initializer:latest`, but it can be overridden by providing the `IMG` env variable, e.g., `make IMG=abc/ORG/NAME:TAG docker-build`.

### Push container image

Issue the following command:
```bash
make [IMG=..] docker-push
```