The scripts require kubectl, with the current context set to the namespace where ModelMesh Serving is installed, usually modelmesh-serving. Below are examples of creating and removing models from the ModelMesh Serving instance using the included shell scripts.
cd multi_model_tester
# Deploy 5000 Mnist Onnx models named mnist-onnx-1 to mnist-onnx-5000 at concurrency of 10
./deployNpredictors.sh 10 mnist-onnx 1 5000 deploy_1mnist_onnx_predictor.sh
# Deploy simple string models named simple-string-tf-1000 to simple-string-tf-2000 at concurrency of 30
./deployNpredictors.sh 30 simple-string-tf 1000 2000 deploy_1simple_string_tf_predictor.sh
# Remove Mnist Onnx models named mnist-onnx-1 to mnist-onnx-5000 at concurrency of 10
./rmNpredictors.sh 10 mnist-onnx 1 5000 rm_1mnist_onnx_predictor.sh
# Remove simple string models named simple-string-tf-1000 to simple-string-tf-2000 at concurrency of 30
./rmNpredictors.sh 30 simple-string-tf 1000 2000 rm_1simple_string_tf_predictor.sh
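The argument order used above (concurrency, name prefix, first index, last index, per-model script) suggests a bounded fan-out over an index range. The following is a minimal sketch of that pattern, not the repository's actual script: the function name run_for_range is hypothetical, and it assumes the per-model command takes the full model name as its only argument. The commented kubectl line shows one common way to set the namespace on the current context.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real deployNpredictors.sh may work differently.
#
# One way to point the current kubectl context at the install namespace:
#   kubectl config set-context --current --namespace=modelmesh-serving

# Usage: run_for_range CONCURRENCY PREFIX START END COMMAND
# Runs "COMMAND PREFIX-i" for every i in START..END (inclusive),
# keeping at most CONCURRENCY invocations in flight at once.
run_for_range() {
  local concurrency=$1 prefix=$2 start=$3 end=$4 cmd=$5
  seq "$start" "$end" | xargs -P "$concurrency" -I{} "$cmd" "${prefix}-{}"
}

# Harmless stand-in for a per-model script: echo just prints each name.
# Parallel runs finish in arbitrary order, so sort on the numeric suffix.
run_for_range 4 mnist-onnx 1 8 echo | sort -t- -k3,3n
```

Replacing echo with a per-model script would run that script once per generated name, under the assumption stated above.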
The multi_model_tester is written in Go and uses a modified Trunks library that applies the round_robin load-balancing policy and targets dns:///modelmesh-serving:8033.
The tester generates a mix of up to six heterogeneous model inference requests to stress the Inference Request gRPC API of the ModelMesh Serving instance.
cd multi_model_tester
go build .
./multi_model_test -h
Usage of ./multi_model_test:
  -cp uint
        Number of connections to create. Default: 1 (default 1)
  -debug
        Sends 1 request and view response. Default: false
  -dur int
        Test duration in seconds. Default: 1 (default 1)
  -ma string
        List of different types models separate by space. Available options: SimpleStringTF, MnistSklearn, MushroomXgboost, CifarPytorch, MushroomLightgbm, MnistOnnx
  -npm int
        Number of model name 1 to npm per model to generate (default 1)
  -qps int
        Constant Queries Per Second to hold. Default: 1 (default 1)
  -u string
        Inference Server URL. Default: localhost:8033 (default "localhost:8033")
  -wp uint
        Number of worker pool. Default: 1 (default 1)
The following examples use the tester from inside the same namespace to target the kube-dns address and port corresponding to the ModelMesh Serving service.
Target the Simple-String models named simple-string-tf-1 to simple-string-tf-2000 at 1000 queries per second for 60 seconds:
./multi_model_test -u dns:///modelmesh-serving:8033 -npm 2000 -ma "SimpleStringTF" -qps 1000 -dur 60
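As a rough sizing aid, the load implied by a run can be estimated from its flags. The sketch below assumes that -qps is the total rate across all targeted models and that requests are spread evenly over them; neither assumption is confirmed by the tester's help text, and the function name expected_load is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate request counts for a tester run.
# Assumes -qps is a total rate and requests are spread evenly.
# Usage: expected_load QPS DURATION_SECONDS NPM NUM_MODEL_TYPES
expected_load() {
  local qps=$1 dur=$2 npm=$3 types=$4
  local total=$((qps * dur))                  # requests over the whole run
  local per_model=$((total / (npm * types)))  # integer division, rounds down
  echo "total=${total} per_model=${per_model}"
}

# Values from the command above: -qps 1000 -dur 60 -npm 2000, one model type.
expected_load 1000 60 2000 1
```

Under these assumptions the run issues 60000 requests, or about 30 per model; with two model types at the same settings the estimate drops to 15 per model.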
Target the Simple-String and MnistOnnx models named simple-string-tf-1 to simple-string-tf-2000 and mnist-onnx-1 to mnist-onnx-2000 at 1000 queries per second for 60 seconds:
./multi_model_test -u dns:///modelmesh-serving:8033 -npm 2000 -ma "SimpleStringTF MnistOnnx" -qps 1000 -dur 60
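The two invocations above show how -ma and -npm combine into target model names. The sketch below reproduces that expansion in shell so the expected names can be listed before a run. It is an illustration only: the function names are hypothetical, and it covers just the two type-to-prefix mappings that appear in this README, since the prefixes for the other model types are not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the name expansion implied by -ma and -npm.

# Map a -ma model type to the name prefix used in the examples.
prefix_for() {
  case "$1" in
    SimpleStringTF) echo "simple-string-tf" ;;
    MnistOnnx)      echo "mnist-onnx" ;;
    *) echo "no known prefix for model type: $1" >&2; return 1 ;;
  esac
}

# Usage: expand_targets "TYPE [TYPE...]" NPM
# Prints PREFIX-1 .. PREFIX-NPM for each listed type, in order.
expand_targets() {
  local types=$1 npm=$2 t p i
  for t in $types; do   # unquoted on purpose: split the list on spaces
    p=$(prefix_for "$t") || return 1
    for i in $(seq 1 "$npm"); do
      echo "${p}-${i}"
    done
  done
}

# Small example: two types, two names each.
expand_targets "SimpleStringTF MnistOnnx" 2
```

Comparing this list against the models actually deployed is one way to catch a range mismatch before starting a load test.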
Sometimes it is useful to verify whether all the models can respond. In debug mode, the tester sends inference requests to the Simple-String and MnistOnnx models named simple-string-tf-1 to simple-string-tf-2000 and mnist-onnx-1 to mnist-onnx-2000 sequentially and prints each response:
./multi_model_test -u dns:///modelmesh-serving:8033 -npm 2000 -ma "SimpleStringTF MnistOnnx" -debug