MindSpore Serving is a lightweight and high-performance service module that helps MindSpore developers efficiently deploy online inference services in the production environment. After completing model training on MindSpore, you can export the MindSpore model and use MindSpore Serving to create an inference service for the model.
MindSpore Serving architecture:
MindSpore Serving consists of two parts: the `Client` and the `Server`. On a `Client` node, you can deliver inference commands through the gRPC or RESTful API. The `Server` consists of a `Main` node and one or more `Worker` nodes. The `Main` node manages all `Worker` nodes and their model information, accepts user requests from `Client`s, and distributes the requests to `Worker` nodes. A `Servable` is deployed on a `Worker` node; it represents a single model or a combination of multiple models and can provide different services through various methods.

On the server side, when MindSpore is used as the inference backend, MindSpore Serving supports the Ascend 910 and Nvidia GPU environments. When MindSpore Lite is used as the inference backend, MindSpore Serving supports the Ascend 310/310P, Nvidia GPU, and CPU environments. The `Client` does not depend on a specific hardware platform.
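As a sketch of the RESTful access path, the snippet below builds the JSON body a `Client` could POST to the `Server`; the URL pattern and the `instances` payload shape follow MindSpore Serving's RESTful convention, but the servable name (`add`), method name (`add_common`), port, and input names are placeholder assumptions.

```python
import json

# Placeholder endpoint: servable "add", method "add_common", and port 1500
# are illustrative assumptions, not fixed values.
url = "http://127.0.0.1:1500/model/add/version/1:add_common"

# A RESTful request carries one or more instances; each instance maps the
# model's input names to values.
payload = {"instances": [{"x1": [[1.0, 2.0]], "x2": [[3.0, 4.0]]}]}
body = json.dumps(payload)

# The body would then be sent with an HTTP POST, e.g. via urllib.request:
#   req = urllib.request.Request(url, data=body.encode(),
#                                headers={"Content-Type": "application/json"})
print(body)
```

The gRPC path is analogous: a Python `Client` object targets the same servable and method, and passes the same per-instance input mapping.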
MindSpore Serving provides the following functions:
- gRPC and RESTful APIs on clients
- Simple Python APIs on clients
- Pre-processing and post-processing of assembled models
- Batching: multiple instance requests are split and combined to meet the batch size requirement of the model
- Multi-model combination: combined and single-model scenarios use the same set of interfaces
- Distributed model inference
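To illustrate the batching behavior listed above, here is a minimal stand-alone sketch (plain Python, not MindSpore Serving code) of how instances queued from separate client requests can be regrouped into model-sized batches:

```python
def split_into_batches(instances, batch_size):
    """Combine queued instances from many requests, then split them into
    chunks matching the model's required batch size; a final partial chunk
    would be padded before inference (padding omitted here)."""
    return [instances[i:i + batch_size]
            for i in range(0, len(instances), batch_size)]

# Three client requests contribute 1, 2, and 2 instances respectively;
# the server regroups them independently of request boundaries.
queued = ["r1-a", "r2-a", "r2-b", "r3-a", "r3-b"]
print(split_into_batches(queued, batch_size=2))
# → [['r1-a', 'r2-a'], ['r2-b', 'r3-a'], ['r3-b']]
```

After inference, the results are split back out and returned to the original requests, which is why clients never need to know the model's batch size.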
For details about how to install and configure MindSpore Serving, see the MindSpore Serving installation page.
The following tutorials, starting with MindSpore-based Inference Service Deployment, demonstrate how to use MindSpore Serving:
- gRPC-based MindSpore Serving Access
- RESTful-based MindSpore Serving Access
- Services Provided Through Model Configuration
- Services Composed of Multiple Models
- MindSpore Serving-based Distributed Inference Service Deployment
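Providing a service through model configuration centers on a `servable_config.py` file placed in the servable's directory. The fragment below sketches its typical shape; the `register` API names and the model file name are assumptions that vary across MindSpore Serving versions, so check the API reference for your release.

```python
# servable_config.py — illustrative sketch only; exact register API names
# (declare_model, register_method, add_stage) vary across Serving versions.
from mindspore_serving.server import register

# Declare the exported model; "add_model.mindir" is a placeholder file name.
model = register.declare_model(model_file="add_model.mindir",
                               model_format="MindIR")

@register.register_method(output_names=["y"])
def add_common(x1, x2):
    """One service method: feed two inputs to the model, return its output."""
    y = register.add_stage(model, x1, x2, outputs_count=1)
    return y
```

A multi-model combination follows the same pattern: declare several models and chain their stages inside one registered method, which is why combined and single-model scenarios share the same set of interfaces.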
For the installation guide, tutorials, and API details, see the MindSpore Python API.
- MindSpore Slack: developer communication platform

Contributions to MindSpore are welcome.