I. Installation of KServe & its dependencies
II. Setting up local MinIO S3 storage
III. Setting up your OpenShift AI workbench
IV. Convert model to Caikit format and save to S3 storage
V. Deploy model onto Caikit-TGIS Serving Runtime
Prerequisites
- To support training and inference, your cluster needs a node with sufficient CPUs, 4 GPUs, and sufficient memory. Instructions to add GPU support to RHOAI can be found here.
- You have cluster administrator permissions
- You have installed the OpenShift CLI (`oc`)
- You have installed the Red Hat OpenShift Service Mesh Operator
- You have installed the Red Hat OpenShift Serverless Operator
- You have installed the Red Hat OpenShift AI Operator and created a `DataScienceCluster` object
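Before proceeding, you can verify that the required operators are present. This is a minimal sanity check, assuming the operators were installed into the default `openshift-operators` namespace (the grep pattern is an illustrative guess at the operator CSV names):

```shell
# List installed operator CSVs and filter for the ones this guide requires
oc get csv -n openshift-operators | grep -iE "servicemesh|serverless|rhods|openshift-ai"
```

Each matching operator should report a `Succeeded` phase before you continue.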
Instructions adapted from Manually installing KServe
- Git clone this repository:

  ```
  git clone https://github.com/trustyai-explainability/trustyai-detoxify-sft.git
  ```
- Login to your OpenShift cluster as a cluster administrator:

  ```
  oc login --token=<token>
  ```
- Create the required namespace for Red Hat OpenShift Service Mesh:

  ```
  oc create ns istio-system
  ```
- Create a `ServiceMeshControlPlane` object:

  ```
  oc apply -f manifests/kserve/smcp.yaml -n istio-system
  ```
- Sanity check to verify creation of the service mesh instance:

  ```
  oc get pods -n istio-system
  ```

  Expected output:

  ```
  NAME                                       READY   STATUS    RESTARTS   AGE
  istio-egressgateway-7c46668687-fzsqj       1/1     Running   0          22h
  istio-ingressgateway-77f94d8f85-fhsp9      1/1     Running   0          22h
  istiod-data-science-smcp-cc8cfd9b8-2rkg4   1/1     Running   0          22h
  ```
- Create the required namespace for a `KnativeServing` instance:

  ```
  oc create ns knative-serving
  ```
- Create a `ServiceMeshMember` object:

  ```
  oc apply -f manifests/kserve/default-smm.yaml -n knative-serving
  ```
- Create and define a `KnativeServing` object:

  ```
  oc apply -f manifests/kserve/knativeserving-istio.yaml -n knative-serving
  ```
- Sanity check to validate creation of the Knative Serving instance:

  ```
  oc get pods -n knative-serving
  ```

  Expected output:

  ```
  NAME                                     READY   STATUS    RESTARTS   AGE
  activator-7586f6f744-nvdlb               2/2     Running   0          22h
  activator-7586f6f744-sd77w               2/2     Running   0          22h
  autoscaler-764fdf5d45-p2v98              2/2     Running   0          22h
  autoscaler-764fdf5d45-x7dc6              2/2     Running   0          22h
  autoscaler-hpa-7c7c4cd96d-2lkzg          1/1     Running   0          22h
  autoscaler-hpa-7c7c4cd96d-gks9j          1/1     Running   0          22h
  controller-5fdfc9567c-6cj9d              1/1     Running   0          22h
  controller-5fdfc9567c-bf5x7              1/1     Running   0          22h
  domain-mapping-56ccd85968-2hjvp          1/1     Running   0          22h
  domain-mapping-56ccd85968-lg6mw          1/1     Running   0          22h
  domainmapping-webhook-769b88695c-gp2hk   1/1     Running   0          22h
  domainmapping-webhook-769b88695c-npn8g   1/1     Running   0          22h
  net-istio-controller-7dfc6f668c-jb4xk    1/1     Running   0          22h
  net-istio-controller-7dfc6f668c-jxs5p    1/1     Running   0          22h
  net-istio-webhook-66d8f75d6f-bgd5r       1/1     Running   0          22h
  net-istio-webhook-66d8f75d6f-hld75       1/1     Running   0          22h
  webhook-7d49878bc4-8xjbr                 1/1     Running   0          22h
  webhook-7d49878bc4-s4xx4                 1/1     Running   0          22h
  ```
- From the web console, install KServe by going to Operators -> Installed Operators and clicking on the Red Hat OpenShift AI Operator
- Click on the DSC Initialization tab and click on the default-dsci object
- Click on the YAML tab and in the `spec` section, change the `serviceMesh.managementState` to `Unmanaged`:

  ```yaml
  spec:
    serviceMesh:
      managementState: Unmanaged
  ```
- Click Save
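If you prefer the CLI over the web console, the same change can be applied with a merge patch. This is a sketch, assuming the default object is named `default-dsci` as shown in the console step above:

```shell
# Set the Service Mesh management state to Unmanaged on the default DSCInitialization
oc patch dscinitialization default-dsci --type merge \
  -p '{"spec":{"serviceMesh":{"managementState":"Unmanaged"}}}'
```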
- Click on the Data Science Cluster tab and click on the default-dsc object
- Click on the YAML tab and in the `spec` section, change the `components.kserve.managementState` and the `components.kserve.serving.managementState` to `Managed`:

  ```yaml
  spec:
    components:
      kserve:
        managementState: Managed
        serving:
          managementState: Managed
  ```
- Click Save
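As with the previous step, this edit can also be made from the CLI. A sketch, assuming the default object is named `default-dsc` as in the console step above:

```shell
# Enable KServe and its serving component on the default DataScienceCluster
oc patch datasciencecluster default-dsc --type merge \
  -p '{"spec":{"components":{"kserve":{"managementState":"Managed","serving":{"managementState":"Managed"}}}}}'
```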
- Create a namespace for your project called "detoxify-sft":

  ```
  oc create namespace detoxify-sft
  ```
- Set up your local MinIO S3 storage in your newly created namespace:

  ```
  oc apply -f manifests/minio/setup-s3.yaml -n detoxify-sft
  ```
- Run the following sanity checks:

  ```
  oc get pods -n detoxify-sft | grep "minio"
  ```

  Expected output:

  ```
  NAME                    READY   STATUS    RESTARTS   AGE
  minio-7586f6f744-nvdl   1/1     Running   0          22h
  ```

  ```
  oc get route -n detoxify-sft | grep "minio"
  ```

  Expected output:

  ```
  NAME        STATUS     LOCATION              SERVICE
  minio-api   Accepted   https://minio-api...  minio-service
  minio-ui    Accepted   https://minio-ui...   minio-service
  ```
- Get the MinIO UI location URL and open it in a web browser:

  ```
  oc get route minio-ui -n detoxify-sft
  ```
- Login using the credentials in `manifests/minio/setup-s3.yaml`:
  - user: `minio`
  - password: `minio123`
- Click on Create a Bucket, choose a name for your bucket, and click on Create Bucket
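Alternatively, a bucket can be created from the command line with the MinIO client (`mc`). This is a sketch: the `minio-api` route URL placeholder must be filled in from the earlier `oc get route` step, and the bucket name `models` is only an example:

```shell
# Point the MinIO client at the local MinIO instance using the credentials above
mc alias set local https://<minio-api-route> minio minio123

# Create the bucket (the name "models" is an example)
mc mb local/models
```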
- Go to Red Hat OpenShift AI from the web console
- Click on Data Science Projects and then click on Create data science project
- Give your project a name and then click Create
- Click on the Workbenches tab and then create a workbench with a PyTorch notebook image, set the container size to Large, and select a single NVIDIA GPU. Click on Create Workbench
- Click on Add data connection to create a matching data connection for MinIO
- Fill out the required fields and then click on Add data connection
- Once your workbench status changes from Starting to Running, click on Open to open JupyterHub in a web browser
- In your JupyterHub environment, launch a terminal and clone this project:

  ```
  git clone https://github.com/trustyai-explainability/trustyai-detoxify-sft.git
  ```
- Go into the `notebooks` directory
- Open the `01-sft.ipynb` file
- Run each cell in the notebook
- Once the model is trained and uploaded to HuggingFace Hub, open the `02-eval.ipynb` file and run each cell to compare the model trained on raw input-output pairs vs. the one trained on detoxified prompts
- Open the `03-save_convert_model.ipynb` file and run each cell in the notebook to convert the model to Caikit format and save it to a MinIO bucket
- In the OpenShift AI dashboard, navigate to the project details page and click the Models tab
- In the Single-model serving platform tile, click on Deploy model. Provide the following values:
  - Model Name: `opt-350m-caikit`
  - Serving Runtime: `Caikit-TGIS Serving Runtime`
  - Model framework: `caikit`
  - Existing data connection: `My Storage`
  - Path: `models/opt-350m-caikit`
- Click Deploy
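While waiting, deployment progress can also be checked from the CLI (assuming the model was deployed into the `detoxify-sft` namespace created earlier):

```shell
# Show the InferenceService and whether it has reached a Ready state
oc get inferenceservice -n detoxify-sft
```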
- Increase the `initialDelaySeconds`:

  ```
  oc patch template caikit-tgis-serving-template --type=merge -p '{"spec":{"containers":[{"readinessProbe":{"initialDelaySeconds":300},"livenessProbe":{"initialDelaySeconds":300}}]}}'
  ```
- Wait for the model Status to show a green checkmark
- Return to the JupyterHub environment to test out the deployed model
- Click on `03-inference_request.ipynb` and run each cell to make an inference request to the detoxified model
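Outside the notebook, the deployed endpoint can also be exercised directly with `curl`. A sketch of a REST request against the Caikit-TGIS runtime, assuming the inference route URL placeholder is filled in from the deployed model's details and using the `opt-350m-caikit` model name from the deployment step:

```shell
# Send a text-generation request to the Caikit REST endpoint
curl -k https://<inference-route>/api/v1/task/text-generation \
  -H "Content-Type: application/json" \
  -d '{"model_id": "opt-350m-caikit", "inputs": "Why is the sky blue?"}'
```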