Commit
* parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716219052 -0400 parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716218313 -0400 parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716217744 -0400 Add TorchServe Huggingface accelerate example (#304) * Add LLM example for huggingface accelerate Signed-off-by: Dan Sun <[email protected]> * Add inputs Signed-off-by: Dan Sun <[email protected]> * Update storage uri Signed-off-by: Dan Sun <[email protected]> * Add to LLM runtime to index Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> 0.11 release blog (#310) * Add 0.11 release blog Signed-off-by: Dan Sun <[email protected]> * Update blog Signed-off-by: Dan Sun <[email protected]> * Add vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Add vllm example doc Signed-off-by: Dan Sun <[email protected]> * Update blog link Signed-off-by: Dan Sun <[email protected]> * Add vLLM intro Signed-off-by: Dan Sun <[email protected]> * add python runtime open inference protocol tutorials Signed-off-by: Dan Sun <[email protected]> * Fix warning Signed-off-by: Dan Sun <[email protected]> * Add warning Signed-off-by: Dan Sun <[email protected]> * Address comments Signed-off-by: Dan Sun <[email protected]> * Fix newline Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Fix torchserve llm example link Signed-off-by: Dan Sun <[email protected]> Fixed formatting in get_started (#319) Signed-off-by: Helber Belmiro <[email protected]> clarify prometheus annotation (#316) Signed-off-by: JuHyung-Son <[email protected]> Document servingruntime constraint introduced by kserve/kserve#3181 (#320) * Document serving runtime constraint introduced by kserve/kserve#3181 Signed-off-by: Sivanantham Chinnaiyan <[email 
protected]> * Set content type for predict/explainer curl requests Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Update docs/modelserving/servingruntimes.md Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Add kubeflow summit 2023 Jooho's presentation link (#325) add kubeflow summit 2023 Jooho's presentation link Signed-off-by: jooho <[email protected]> docs: Add one related presentations from Kubeflow Summit 2023 (#327) * docs: Add two new related presentations from Kubeflow Summit 2023Update presentations.md Signed-off-by: Yuan Tang <[email protected]> * Update presentations.md Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]> Added example for torchserve grpc v1 and v2. (#307) * Added example for torchserve grpc v1 and v2. Signed-off-by: Andrews Arokiam <[email protected]> * Schema order changed. Signed-off-by: Andrews Arokiam <[email protected]> * corrected v2 REST input. Signed-off-by: Andrews Arokiam <[email protected]> * Updated grpc-v2 protocolVersion. 
Signed-off-by: Andrews Arokiam <[email protected]> * Update README.md * Update README.md * Update README.md --------- Signed-off-by: Andrews Arokiam <[email protected]> Co-authored-by: Dan Sun <[email protected]> Add link to release process doc in developer.md (#330) Signed-off-by: Yuan Tang <[email protected]> Update tranformer collocation docs for specifying storage uri (#323) Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Fix incorrect edit URL to docs (#329) Signed-off-by: Yuan Tang <[email protected]> Set resources for inferencegraph example (#322) Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Fixes #331 - broken link to AMD Inference Server (#332) Tested locally with mkdocs serve Render KServe Python Runtime API doc with mkdoc (#333) * Update KServe python sdk docs Signed-off-by: Dan Sun <[email protected]> * Update serving runtime doc Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Fix build: Install kserve for rendering the docstring (#334) * Update KServe python sdk docs Signed-off-by: Dan Sun <[email protected]> * Install kserve sdk for mkdocstring Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Onnx docs update (#275) * Updated Onnx example. Signed-off-by: Andrews Arokiam <[email protected]> * Reverting sklearn doc update as there is a separate PR Signed-off-by: andyi2it <[email protected]> * Added new schema in onnx example. Signed-off-by: Andrews Arokiam <[email protected]> * protocolVersion and old schema updated with onnx example. Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]> Signed-off-by: andyi2it <[email protected]> Standardized schema order (#318) * Standardized schema's order. 
Signed-off-by: Andrews Arokiam <[email protected]> * Fix v2 spec for torch serve --------- Signed-off-by: Andrews Arokiam <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Update link to Slack instructions Signed-off-by: Yuan (Terry) Tang <[email protected]> Update README.md (#344) Fix incorrect storage uri prefix Signed-off-by: zoramt <[email protected]> Added steps to delete model-store-pod (#343) Signed-off-by: murata.yu <[email protected]> Update README.md Signed-off-by: Dan Sun <[email protected]> Add documentation for modelcars (#337) * Add documentation for modelcars, introduced in 0.12 as experimental feature Signed-off-by: Roland Huß <[email protected]> * added some references to this feature Signed-off-by: Roland Huß <[email protected]> --------- Signed-off-by: Roland Huß <[email protected]> add certificate doc (#326) * add certificate doc Signed-off-by: jooho <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: jooho <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> docs: fix the emoji deprecation message and invalid file name (#348) Signed-off-by: Peter Jausovec <[email protected]> Add documentation for GCS (#351) * Add documentation for GCS Signed-off-by: tjandy98 <[email protected]> * Update mkdocs to include GCS Signed-off-by: tjandy98 <[email protected]> * Fix formatting Signed-off-by: tjandy98 <[email protected]> --------- Signed-off-by: tjandy98 <[email protected]> Add ModelRegistry custom storage intializer example (#346) * Add ModelRegistry custom storage intializer example Signed-off-by: Andrea Lamparelli <[email protected]> * Update docs/modelserving/storage/storagecontainers.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Andrea Lamparelli <[email protected]> --------- Signed-off-by: Andrea Lamparelli <[email protected]> Co-authored-by: Dan Sun <[email 
protected]> Updated docs for autoscaling on gpu. (#328) Signed-off-by: Andrews Arokiam <[email protected]> Update version matrix for 0.12 (#353) * Update version matrix for 0.12 Signed-off-by: Dan Sun <[email protected]> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <[email protected]> * Update notes for gRPC issues Signed-off-by: Dan Sun <[email protected]> * Update kserve install Signed-off-by: Dan Sun <[email protected]> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> docs: update kserve resource yaml file (#356) fix docs Signed-off-by: Niels ten Boom <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update serving runtime version for 0.12 release and add some notes (#354) * Fix few bugs, add quick install failure note and update docs for release 0.12.0 Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add warning about control plane namespaces Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Resolve comments Signed-off-by: Sivanantham Chinnaiyan <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide (#358) Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update README.md (#359) Fix broken link to Ray doc on fractional GPU allocation. 
Signed-off-by: zoramt <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update adopters.md (#361) Signed-off-by: agriffith50 <[email protected]> Point users to vLLM production server (#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. 
Signed-off-by: Pierre Dulac <[email protected]> Signed-off-by: agriffith50 <[email protected]> initial draft of kserve release blog Signed-off-by: agriffith50 <[email protected]> change title Signed-off-by: agriffith50 <[email protected]> resolving comments Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> update comment Signed-off-by: agriffith50 <[email protected]> update for vllm comment Signed-off-by: agriffith50 <[email protected]> add more info about completions endpoints Signed-off-by: agriffith50 <[email protected]> add hf img Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: agriffith50 <[email protected]> add new kserve img Signed-off-by: agriffith50 <[email protected]> Update future plan and other changes Signed-off-by: agriffith50 <[email protected]> Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix 
review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update huggingface triton yaml Signed-off-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update blog link Signed-off-by: agriffith50 <[email protected]> Add triton huggingface reference Signed-off-by: agriffith50 <[email protected]> resolve merge Signed-off-by: agriffith50 <[email protected]> docs: update kserve resource yaml file (#356) fix docs Signed-off-by: Niels ten Boom <[email protected]> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. 
Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide (#358) Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update README.md (#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update adopters.md (#361) Signed-off-by: agriffith50 <[email protected]> Point users to vLLM production server (#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. 
So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <[email protected]> Signed-off-by: agriffith50 <[email protected]> initial draft of kserve release blog Signed-off-by: agriffith50 <[email protected]> change title Signed-off-by: agriffith50 <[email protected]> resolving comments Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> update comment Signed-off-by: agriffith50 <[email protected]> update for vllm comment Signed-off-by: agriffith50 <[email protected]> add hf img Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: agriffith50 <[email protected]> add new kserve img Signed-off-by: agriffith50 <[email protected]> Update future plan and other changes Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm 
runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Update huggingface triton yaml Signed-off-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update blog link Signed-off-by: agriffith50 <[email protected]> Add triton huggingface reference Signed-off-by: agriffith50 <[email protected]> resolve merge Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <[email protected]> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Add Helm installation commands in get started guide (#358) Signed-off-by: Yuan Tang <[email protected]> Update README.md (#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <[email protected]> Update adopters.md (#361) Point users to vLLM production server (#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. 
So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Update huggingface triton yaml Signed-off-by: Dan Sun <[email protected]> * fix merge Signed-off-by: agriffith50 <[email protected]> * fix more merge issue Signed-off-by: agriffith50 <[email protected]> * Move up the diagram Signed-off-by: agriffith50 <[email protected]> * fix flag naming Signed-off-by: agriffith50 <[email protected]> * update slack Signed-off-by: agriffith50 <[email protected]> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> * fix Hugging Face Signed-off-by: agriffith50 <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Co-authored-by: Dan Sun <[email protected]> Co-authored-by: Yuan Tang <[email protected]>