Merge pull request #47 from spolti/sync
Sync
Showing 24 changed files with 705 additions and 99 deletions.
@@ -0,0 +1,87 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"

on:
  push:
    branches: ["main"]
  pull_request:
    # The branches below must be a subset of the branches above
    branches: ["main"]
  schedule:
    - cron: '45 8 * * *'

jobs:
  analyze:
    name: Analyze
    # Runner size impacts CodeQL analysis time. To learn more, please see:
    #   - https://gh.io/recommended-hardware-resources-for-running-codeql
    #   - https://gh.io/supported-runners-and-hardware-resources
    #   - https://gh.io/using-larger-runners
    # Consider using larger runners for possible analysis time improvements.
    runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
    timeout-minutes: ${{ (matrix.language == 'swift' && 120) || 360 }}
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: ["java-kotlin", "python"]
        # CodeQL supports [ 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift' ]
        # Use only 'java-kotlin' to analyze code written in Java, Kotlin or both
        # Use only 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both
        # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support

    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Java 17
        uses: actions/setup-java@v3
        with:
          java-version: '17'
          distribution: 'temurin'

      # Initializes the CodeQL tools for scanning.
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v2
        with:
          languages: ${{ matrix.language }}
          # If you wish to specify custom queries, you can do so here or in a config file.
          # By default, queries listed here will override any specified in a config file.
          # Prefix the list here with "+" to use these queries and those in the config file.

          # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
          # queries: security-extended,security-and-quality

      # Autobuild attempts to build any compiled languages (C/C++, C#, Go, Java, or Swift).
      # If this step fails, then you should remove it and run the build manually (see below)
      - name: Autobuild
        uses: github/codeql-action/autobuild@v2

      # ℹ️ Command-line programs to run using the OS shell.
      # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun

      # If the Autobuild fails above, remove it and uncomment the following three lines.
      # Modify them (or add more) to build your code if your project requires a custom build process; refer to the example below for guidance.

      # - run: |
      #     echo "Run, Build Application using script"
      #     ./location_of_script_within_repo/buildscript.sh

      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v2
        with:
          category: "/language:${{matrix.language}}"
@@ -1,50 +1,17 @@
[![Build](https://github.com/kserve/modelmesh/actions/workflows/build.yml/badge.svg?branch=main)](https://github.com/kserve/modelmesh/actions/workflows/build.yml)

# ModelMesh

The ModelMesh framework is a mature, general-purpose model serving management/routing layer designed for high-scale, high-density and frequently-changing model use cases. It works with existing or custom-built model servers and acts as a distributed LRU cache for serving runtime models.

-See [these charts](https://github.com/kserve/modelmesh/files/8854091/modelmesh-jun2022.pdf) for more information on supported features and design details.

For full Kubernetes-based deployment and management of ModelMesh clusters and models, see the [ModelMesh Serving](https://github.com/kserve/modelmesh-serving) repo. This includes a separate controller and provides K8s custom resource based management of ServingRuntimes and InferenceServices along with common, abstracted handling of model repository storage and ready-to-use integrations with some existing OSS model servers.

-### Quick-Start

-1. Wrap your model-loading and invocation logic in this [model-runtime.proto](./src/main/proto/current/model-runtime.proto) gRPC service interface
-   - `runtimeStatus()` - called only during startup to obtain some basic configuration parameters from the runtime, such as version, capacity, model-loading timeout
-   - `loadModel()` - load the specified model into memory from backing storage, returning when complete
-   - `modelSize()` - determine size (mem usage) of previously-loaded model. If very fast, can be omitted and provided instead in the response from `loadModel`
-   - `unloadModel()` - unload previously loaded model, returning when complete
-   - Use a separate, arbitrary gRPC service interface for model inferencing requests. It can have any number of methods and they are assumed to be idempotent. See [predictor.proto](src/test/proto/predictor.proto) for a very simple example.
-   - The methods of your custom applier interface will be called only for already fully-loaded models.
-2. Build a grpc server docker container which exposes these interfaces on localhost port 8085 or via a mounted unix domain socket
-3. Extend the [Kustomize-based Kubernetes manifests](config) to use your docker image, and with appropriate mem and cpu resource allocations for your container
-4. Deploy to a Kubernetes cluster as a regular Service, which will expose [this grpc service interface](./src/main/proto/current/model-mesh.proto) via kube-dns (you do not implement this yourself), consume using grpc client of your choice from your upstream service components
-   - `registerModel()` and `unregisterModel()` for registering/removing models managed by the cluster
-   - Any custom inferencing interface methods to make a runtime invocation of previously-registered model, making sure to set a `mm-model-id` or `mm-vmodel-id` metadata header (or `-bin` suffix equivalents for UTF-8 ids)

-### Deployment and Upgrades

-Prerequisites:

-- An etcd cluster (shared or otherwise)
-- A Kubernetes namespace with the etcd cluster connection details configured as a secret key in [this json format](https://github.com/IBM/etcd-java/blob/master/etcd-json-schema.md)
-  - Note that if provided, the `root_prefix` attribute _is_ used as a key prefix for all of the framework's use of etcd

-From an operational standpoint, ModelMesh behaves just like any other homogeneous clustered microservice. This means it can be deployed, scaled, migrated and upgraded as a regular Kubernetes deployment without any special coordination needed, and without any impact to live service usage.

-In particular, the procedure for live upgrading either the framework container or service runtime container is the same: change the image version in the deployment config yaml and then apply it with `kubectl apply -f model-mesh-deploy.yaml`.
+For more information on supported features and design details, see [these charts](https://github.com/kserve/modelmesh/files/8854091/modelmesh-jun2022.pdf).

-### Build
+## Get Started

-Sample build:
+To learn more about and get started with the ModelMesh framework, check out [the documentation](/docs).

-```bash
-GIT_COMMIT=$(git rev-parse HEAD)
-BUILD_ID=$(date '+%Y%m%d')-$(git rev-parse HEAD | cut -c -5)
-IMAGE_TAG_VERSION="dev"
-IMAGE_TAG=${IMAGE_TAG_VERSION}-$(git branch --show-current)_${BUILD_ID}
+## Developer guide

-docker build -t modelmesh:${IMAGE_TAG} \
-  --build-arg imageVersion=${IMAGE_TAG} \
-  --build-arg buildId=${BUILD_ID} \
-  --build-arg commitSha=${GIT_COMMIT} .
-```
+Use the [developer guide](developer-guide.md) to learn about development practices for the project.
@@ -0,0 +1,220 @@
# Developer Guide

## Prerequisites

You need [Java](https://openjdk.org/) and [Maven](https://maven.apache.org/guides/getting-started/maven-in-five-minutes.html#running-maven-tools)
to build ModelMesh from source and [`etcd`](https://etcd.io/) to run the unit tests.
To build your custom `modelmesh` container image and deploy it to a ModelMesh Serving installation on a Kubernetes cluster,
you need the [`docker`](https://docs.docker.com/engine/reference/commandline/cli/) and
[`kubectl`](https://kubectl.docs.kubernetes.io/references/kubectl/) CLIs.
On `macOS` you can install the required CLIs with [Homebrew](https://brew.sh/):

- Java: `brew install java`
- Maven: `brew install maven`
- Etcd: `brew install etcd`
- Docker: `brew install docker`
- Kubectl: `brew install kubectl`
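
After installing, a quick sanity check (a suggestion, not part of this guide) is to confirm that each tool is available on your `PATH`:

```shell
# Verify the prerequisite tools are installed and reachable.
java -version
mvn -version
etcd --version
docker --version
kubectl version --client
```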

## Generating sources

The gRPC stubs, such as the `ModelMeshGrpc` class, have to be generated by the gRPC proto compiler from
the `.proto` source files under `src/main/proto`.
The generated sources should be created in the target directory `target/generated-sources/protobuf/grpc-java`.

To generate the sources, run either of the following commands:

```shell
mvn package -DskipTests
mvn install -DskipTests
```
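
To confirm that stub generation worked, one option (an extra check, not prescribed by this guide) is to search for the generated `ModelMeshGrpc` source mentioned above:

```shell
# The exact package path may vary, so search for the file instead of hard-coding it.
find target/generated-sources -name 'ModelMeshGrpc.java'
```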

## Project setup using an IDE

If you are using an IDE like [IntelliJ IDEA](https://www.jetbrains.com/idea/) or [Eclipse](https://eclipseide.org/)
to help with your code development, you should set up the source and target folders so that the IDE's compiler can find all
of the source code, including the generated sources (after running `mvn install -DskipTests`).

For IntelliJ this can be done by going to **File > Project Structure ... > Modules**:

- **Source Folders**
  - src/main/java
  - src/main/proto
  - target/generated-sources/protobuf/grpc-java (generated)
  - target/generated-sources/protobuf/java (generated)
- **Test Source Folders**
  - src/test/java
  - target/generated-test-sources/protobuf/grpc-java (generated)
  - target/generated-test-sources/protobuf/java (generated)
- **Resource Folders**
  - src/main/resources
- **Test Resource Folders**
  - src/test/resources
- **Excluded Folders**
  - target

You may also want to increase your Java heap size to at least 1.5 GB.
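
For command-line builds and test runs, a comparable heap bump can be passed to Maven through the standard `MAVEN_OPTS` mechanism (a general Maven option, not something this guide mandates):

```shell
# Give the Maven JVM a larger heap for builds and test runs.
export MAVEN_OPTS="-Xmx2g"
```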

## Testing code changes

**Note:** before running the test cases, make sure you have `etcd` installed (see [Prerequisites](#prerequisites)):

```Bash
$ etcd --version

etcd Version: 3.5.5
Git SHA: 19002cfc6
Go Version: go1.19.1
Go OS/Arch: darwin/amd64
```

You can run all test suites at once; the `-q` flag reduces the output noise:

```Bash
mvn test -q
```

Or you can run individual test cases:

```Bash
mvn test -Dtest=ModelMeshErrorPropagationTest
mvn test -Dtest=SidecarModelMeshTest,ModelMeshFailureExpiryTest
```

It can be handy to use `grep` to reduce output noise:

```Bash
mvn test -Dtest=SidecarModelMeshTest,ModelMeshFailureExpiryTest | \
  grep -E " Running |\[ERROR\]|Failures|SUCCESS|Skipp|Total time|Finished"

[INFO] Running com.ibm.watson.modelmesh.ModelMeshFailureExpiryTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.257 s - in com.ibm.watson.modelmesh.ModelMeshFailureExpiryTest
[INFO] Running com.ibm.watson.modelmesh.SidecarModelMeshTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 17.302 s - in com.ibm.watson.modelmesh.SidecarModelMeshTest
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
[INFO] BUILD SUCCESS
[INFO] Total time: 39.916 s
[INFO] Finished at: 2022-11-01T14:33:33-07:00
```
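
While iterating on a single failing case, Surefire also lets you target one test method; the method name below is only a placeholder:

```shell
# Run a single method of a single test class (replace the placeholder name).
mvn test -q -Dtest='SidecarModelMeshTest#someTestMethod'
```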

## Building the container image

After testing your code changes locally, it's time to build a new `modelmesh` container image. Set the
`DOCKER_USER` environment variable to your Docker Hub user ID and change the `IMAGE_TAG` to something meaningful.

```bash
export DOCKER_USER="<your-docker-userid>"
export IMAGE_NAME="${DOCKER_USER}/modelmesh"
export IMAGE_TAG="dev"
export GIT_COMMIT=$(git rev-parse HEAD)
export BUILD_ID=$(date '+%Y%m%d')-$(git rev-parse HEAD | cut -c -5)

docker build -t ${IMAGE_NAME}:${IMAGE_TAG} \
  --build-arg imageVersion=${IMAGE_TAG} \
  --build-arg buildId=${BUILD_ID} \
  --build-arg commitSha=${GIT_COMMIT} .

docker push ${IMAGE_NAME}:${IMAGE_TAG}
```
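
To confirm the image was built and tagged as expected, you can list it locally (an optional check, not part of the guide):

```shell
# List local images for the repository that was just built.
docker images "${IMAGE_NAME}"
```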

## Updating the ModelMesh Serving deployment

In order to test the code changes in an existing [ModelMesh Serving](https://github.com/kserve/modelmesh-serving) deployment,
the newly built container image needs to be added to the `model-serving-config` ConfigMap.

First, check if your ModelMesh Serving deployment already has an existing `model-serving-config` ConfigMap:

```Shell
kubectl get configmap

NAME                            DATA   AGE
kube-root-ca.crt                1      4d2h
model-serving-config            1      4m14s
model-serving-config-defaults   1      4d2h
tc-config                       2      4d2h
```

If the ConfigMap list contains `model-serving-config`, save the contents of your existing configuration
in a local temp file:

```Bash
mkdir -p temp
kubectl get configmap model-serving-config -o yaml > temp/model-serving-config.yaml
```

Then add the `modelMeshImage` property to the `config.yaml` string:

```YAML
modelMeshImage:
  name: <your-docker-userid>/modelmesh
  tag: dev
```

Replace the `<your-docker-userid>` placeholder with your Docker username/login.

The complete ConfigMap YAML file might look like this:

```YAML
apiVersion: v1
kind: ConfigMap
metadata:
  name: model-serving-config
  namespace: modelmesh-serving
data:
  config.yaml: |
    podsPerRuntime: 1
    restProxy:
      enabled: true
    scaleToZero:
      enabled: false
      gracePeriodSeconds: 5
    modelMeshImage:
      name: <your-docker-userid>/modelmesh
      tag: dev
```

Apply the ConfigMap to your cluster:

```Bash
kubectl apply -f temp/model-serving-config.yaml
```

If you are comfortable using vi, you can forgo creating a temp file and edit the ConfigMap directly in the terminal:

```Shell
kubectl edit configmap model-serving-config
```

If you did not already have a `model-serving-config` ConfigMap on your cluster, you can create one like this:

```shell
# export DOCKER_USER="<your-docker-userid>"
# export IMAGE_NAME="${DOCKER_USER}/modelmesh"
# export IMAGE_TAG="dev"
kubectl apply -f - <<EOF
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: model-serving-config
data:
  config.yaml: |
    modelMeshImage:
      name: ${IMAGE_NAME}
      tag: ${IMAGE_TAG}
EOF
```
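
Because the heredoc relies on shell variable expansion, it is worth confirming that `IMAGE_NAME` and `IMAGE_TAG` were actually substituted into the stored config (a quick check, not part of the guide; unset variables would silently expand to empty strings):

```shell
# Inspect the stored config; the image name and tag should not be empty.
kubectl get configmap model-serving-config -o yaml
```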

The `modelmesh-controller` watches the ConfigMap and responds to updates by automatically restarting the serving runtime
pods using the newly built `modelmesh` container image.

You can check which container images are used by running the following command:

```Shell
kubectl get pods -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{"\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | sort | column -ts $'\t' | sed 's/, *$//g'

etcd-78ff7867d5-45svw                            quay.io/coreos/etcd:v3.5.4
minio-6ddbfc9665-gtf7x                           kserve/modelmesh-minio-examples:latest
modelmesh-controller-64f5c8d6d6-k6rzc            kserve/modelmesh-controller:latest
modelmesh-serving-mlserver-1.x-84884c6849-s8dw6  kserve/rest-proxy:latest, seldonio/mlserver:1.3.2, kserve/modelmesh-runtime-adapter:latest, kserve/modelmesh:dev
modelmesh-serving-mlserver-1.x-84884c6849-xpdw4  kserve/rest-proxy:latest, seldonio/mlserver:1.3.2, kserve/modelmesh-runtime-adapter:latest, kserve/modelmesh:dev
```