MindSpore Serving is a lightweight and high-performance service module that helps MindSpore developers efficiently deploy online inference services in the production environment. After completing model training on MindSpore, you can export the MindSpore model and use MindSpore Serving to create an inference service for the model.
MindSpore Serving architecture:
MindSpore Serving consists of two parts: the `Client` and the `Server`. On a `Client` node, you can deliver inference commands through the gRPC or RESTful API. The `Server` consists of a `Main` node and one or more `Worker` nodes. The `Main` node manages all `Worker` nodes and their model information, accepts user requests from `Client`s, and distributes the requests to `Worker` nodes. A `Servable` is deployed on a `Worker` node; it represents a single model or a combination of multiple models and can provide different services through various methods.

On the server side, when MindSpore is used as the inference backend, MindSpore Serving supports the Ascend 910 and Nvidia GPU environments. When MindSpore Lite is used as the inference backend, MindSpore Serving supports the Ascend 310/310P, Nvidia GPU, and CPU environments. The `Client` does not depend on a specific hardware platform.
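As a sketch of the RESTful access path, the snippet below builds the JSON body a `Client` could POST to the `Server`; the URL pattern and the `instances` payload shape follow MindSpore Serving's RESTful convention, but the servable name (`add`), method name (`add_common`), port, and input names are placeholder assumptions.

```python
import json

# Placeholder endpoint: servable "add", method "add_common", and port 1500
# are illustrative assumptions, not fixed values.
url = "http://127.0.0.1:1500/model/add/version/1:add_common"

# A RESTful request carries one or more instances; each instance maps the
# model's input names to values.
payload = {"instances": [{"x1": [[1.0, 2.0]], "x2": [[3.0, 4.0]]}]}
body = json.dumps(payload)

# The body would then be sent with an HTTP POST, e.g. via urllib.request:
#   req = urllib.request.Request(url, data=body.encode(),
#                                headers={"Content-Type": "application/json"})
print(body)
```

The gRPC path is analogous: a Python `Client` object targets the same servable and method, and passes the same per-instance input mapping.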
MindSpore Serving provides the following functions:
- gRPC and RESTful APIs on clients
- Simple Python APIs on clients
- Pre-processing and post-processing of assembled models
- Batching: multiple instance requests are split and combined to meet the batch size requirement of the model
- Multi-model combination: combined and single-model scenarios use the same set of interfaces
- Distributed model inference
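To illustrate the batching behavior listed above, here is a minimal stand-alone sketch (plain Python, not MindSpore Serving code) of how instances queued from separate client requests can be regrouped into model-sized batches:

```python
def split_into_batches(instances, batch_size):
    """Combine queued instances from many requests, then split them into
    chunks matching the model's required batch size; a final partial chunk
    would be padded before inference (padding omitted here)."""
    return [instances[i:i + batch_size]
            for i in range(0, len(instances), batch_size)]

# Three client requests contribute 1, 2, and 2 instances respectively;
# the server regroups them independently of request boundaries.
queued = ["r1-a", "r2-a", "r2-b", "r3-a", "r3-b"]
print(split_into_batches(queued, batch_size=2))
# → [['r1-a', 'r2-a'], ['r2-b', 'r3-a'], ['r3-b']]
```

After inference, the results are split back out and returned to the original requests, which is why clients never need to know the model's batch size.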
For details about how to install and configure MindSpore Serving, see the MindSpore Serving installation page.
The following tutorials, starting with MindSpore-based Inference Service Deployment, demonstrate how to use MindSpore Serving:
- gRPC-based MindSpore Serving Access
- RESTful-based MindSpore Serving Access
- Services Provided Through Model Configuration
- Services Composed of Multiple Models
- MindSpore Serving-based Distributed Inference Service Deployment
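Providing a service through model configuration centers on a `servable_config.py` file placed in the servable's directory. The fragment below sketches its typical shape; the `register` API names and the model file name are assumptions that vary across MindSpore Serving versions, so check the API reference for your release.

```python
# servable_config.py — illustrative sketch only; exact register API names
# (declare_model, register_method, add_stage) vary across Serving versions.
from mindspore_serving.server import register

# Declare the exported model; "add_model.mindir" is a placeholder file name.
model = register.declare_model(model_file="add_model.mindir",
                               model_format="MindIR")

@register.register_method(output_names=["y"])
def add_common(x1, x2):
    """One service method: feed two inputs to the model, return its output."""
    y = register.add_stage(model, x1, x2, outputs_count=1)
    return y
```

A multi-model combination follows the same pattern: declare several models and chain their stages inside one registered method, which is why combined and single-model scenarios share the same set of interfaces.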
For the installation guide, tutorials, and API details, see the MindSpore Python API.
- MindSpore Slack: developer communication platform

Contributions to MindSpore are welcome.