DOC: update builtin models (#2587)
qinxuye authored Nov 26, 2024
1 parent e8c480b commit 760f31f
Showing 22 changed files with 576 additions and 39 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -142,6 +142,9 @@ static/
# doc
doc/source/savefig/

# local env
local_env

asv/results

.DS_Store
2 changes: 1 addition & 1 deletion doc/source/getting_started/installation.rst
@@ -39,7 +39,7 @@ Currently, supported models include:

.. vllm_start
- ``llama-2``, ``llama-3``, ``llama-3.1``, ``llama-2-chat``, ``llama-3-instruct``, ``llama-3.1-instruct``
- ``llama-2``, ``llama-3``, ``llama-3.1``, ``llama-3.2-vision``, ``llama-2-chat``, ``llama-3-instruct``, ``llama-3.1-instruct``
- ``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``, ``mistral-instruct-v0.3``, ``mistral-nemo-instruct``, ``mistral-large-instruct``
- ``codestral-v0.1``
- ``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``
20 changes: 20 additions & 0 deletions doc/source/models/builtin/audio/index.rst
@@ -31,21 +31,41 @@ The following is a list of built-in audio models in Xinference:

whisper-base

whisper-base-mlx

whisper-base.en

whisper-base.en-mlx

whisper-large-v3

whisper-large-v3-mlx

whisper-large-v3-turbo

whisper-large-v3-turbo-mlx

whisper-medium

whisper-medium-mlx

whisper-medium.en

whisper-medium.en-mlx

whisper-small

whisper-small-mlx

whisper-small.en

whisper-small.en-mlx

whisper-tiny

whisper-tiny-mlx

whisper-tiny.en

whisper-tiny.en-mlx

19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-base-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-base-mlx:

================
whisper-base-mlx
================

- **Model Name:** whisper-base-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-base-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-base-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-base.en-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-base.en-mlx:

===================
whisper-base.en-mlx
===================

- **Model Name:** whisper-base.en-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** False

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-base.en-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-base.en-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-large-v3-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-large-v3-mlx:

====================
whisper-large-v3-mlx
====================

- **Model Name:** whisper-large-v3-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-large-v3-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-large-v3-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-large-v3-turbo-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-large-v3-turbo-mlx:

==========================
whisper-large-v3-turbo-mlx
==========================

- **Model Name:** whisper-large-v3-turbo-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-large-v3-turbo

Execute the following command to launch the model::

xinference launch --model-name whisper-large-v3-turbo-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-medium-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-medium-mlx:

==================
whisper-medium-mlx
==================

- **Model Name:** whisper-medium-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-medium-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-medium-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-medium.en-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-medium.en-mlx:

=====================
whisper-medium.en-mlx
=====================

- **Model Name:** whisper-medium.en-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** False

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-medium.en-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-medium.en-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-small-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-small-mlx:

=================
whisper-small-mlx
=================

- **Model Name:** whisper-small-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-small-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-small-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-small.en-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-small.en-mlx:

====================
whisper-small.en-mlx
====================

- **Model Name:** whisper-small.en-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** False

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-small.en-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-small.en-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-tiny-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-tiny-mlx:

================
whisper-tiny-mlx
================

- **Model Name:** whisper-tiny-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-tiny

Execute the following command to launch the model::

xinference launch --model-name whisper-tiny-mlx --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/whisper-tiny.en-mlx.rst
@@ -0,0 +1,19 @@
.. _models_builtin_whisper-tiny.en-mlx:

===================
whisper-tiny.en-mlx
===================

- **Model Name:** whisper-tiny.en-mlx
- **Model Family:** whisper
- **Abilities:** audio-to-text
- **Multilingual:** False

Specifications
^^^^^^^^^^^^^^

- **Model ID:** mlx-community/whisper-tiny.en-mlx

Execute the following command to launch the model::

xinference launch --model-name whisper-tiny.en-mlx --model-type audio
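
The ten MLX Whisper pages added above all use the same launch command and differ only in the model name. As a rough sketch, the commands can be generated in a dry run as shown here. ``print_audio_launch`` is a hypothetical helper, not part of the ``xinference`` CLI, and nothing is executed; the script only prints the commands.

```shell
# Dry-run sketch: print the launch command for each MLX Whisper model
# documented in this commit. No command is actually run.

# Model names copied from the new .rst files above.
mlx_whisper_models="whisper-base-mlx whisper-base.en-mlx whisper-large-v3-mlx whisper-large-v3-turbo-mlx whisper-medium-mlx whisper-medium.en-mlx whisper-small-mlx whisper-small.en-mlx whisper-tiny-mlx whisper-tiny.en-mlx"

# Hypothetical helper (an assumption, not an xinference feature):
# formats the documented command for one model name.
print_audio_launch() {
    printf 'xinference launch --model-name %s --model-type audio\n' "$1"
}

for model in $mlx_whisper_models; do
    print_audio_launch "$model"
done
```

Review the printed lines and run the one you need by hand. MLX models target Apple silicon, so launching them elsewhere is likely to fail.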
2 changes: 1 addition & 1 deletion doc/source/models/builtin/embedding/gte-qwen2.rst
@@ -18,4 +18,4 @@ Specifications

Execute the following command to launch the model::

xinference launch --model-name gte-Qwen2 --model-type embedding
xinference launch --model-name gte-Qwen2 --model-type embedding
12 changes: 8 additions & 4 deletions doc/source/models/builtin/llm/index.rst
@@ -240,16 +240,16 @@ The following is a list of built-in LLM in Xinference:
- chat, tools
- 131072
- The Llama 3.1 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks.

* - :ref:`llama-3.2-vision <models_llm_llama-3.2-vision>`
- generate, vision
- 131072
- The Llama 3.2-Vision collection of multimodal large language models (LLMs) is a collection of pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out)...
- The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image...

* - :ref:`llama-3.2-vision-instruct <models_llm_llama-3.2-vision-instruct>`
- chat, vision
- 131072
- The Llama 3.2-Vision-instruct instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks...
- Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image...

* - :ref:`minicpm-2b-dpo-bf16 <models_llm_minicpm-2b-dpo-bf16>`
- chat
@@ -641,6 +641,10 @@ The following is a list of built-in LLM in Xinference:

llama-3.1-instruct

llama-3.2-vision

llama-3.2-vision-instruct

minicpm-2b-dpo-bf16

minicpm-2b-dpo-fp16
2 changes: 1 addition & 1 deletion doc/source/models/builtin/llm/llama-3.1-instruct.rst
@@ -7,7 +7,7 @@ llama-3.1-instruct
- **Context Length:** 131072
- **Model Name:** llama-3.1-instruct
- **Languages:** en, de, fr, it, pt, hi, es, th
- **Abilities:** chat
- **Abilities:** chat, tools
- **Description:** The Llama 3.1 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks.

Specifications
10 changes: 5 additions & 5 deletions doc/source/models/builtin/llm/llama-3.2-vision-instruct.rst
@@ -8,11 +8,12 @@ llama-3.2-vision-instruct
- **Model Name:** llama-3.2-vision-instruct
- **Languages:** en, de, fr, it, pt, hi, es, th
- **Abilities:** chat, vision
- **Description:** The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks...
- **Description:** Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image...

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 11 Billion)
++++++++++++++++++++++++++++++++++++++++

@@ -26,8 +27,8 @@ Model Spec 1 (pytorch, 11 Billion)
Execute the following command to launch the model, remember to replace ``${quantization}`` with your
chosen quantization method from the options listed above::

xinference launch --model-engine transformers --model-name llama-3.2-vision-instruct --size-in-billions 11 --model-format pytorch --quantization ${quantization}
xinference launch --model-engine vllm --enforce_eager --max_num_seqs 16 --model-name llama-3.2-vision-instruct --size-in-billions 11 --model-format pytorch
xinference launch --model-engine ${engine} --model-name llama-3.2-vision-instruct --size-in-billions 11 --model-format pytorch --quantization ${quantization}


Model Spec 2 (pytorch, 90 Billion)
++++++++++++++++++++++++++++++++++++++++
@@ -42,6 +43,5 @@ Model Spec 2 (pytorch, 90 Billion)
Execute the following command to launch the model, remember to replace ``${quantization}`` with your
chosen quantization method from the options listed above::

xinference launch --model-engine transformers --model-name llama-3.2-vision-instruct --size-in-billions 90 --model-format pytorch --quantization ${quantization}
xinference launch --model-engine vllm --enforce_eager --max_num_seqs 16 --model-name llama-3.2-vision-instruct --size-in-billions 90 --model-format pytorch
xinference launch --model-engine ${engine} --model-name llama-3.2-vision-instruct --size-in-billions 90 --model-format pytorch --quantization ${quantization}
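
The revised command leaves both ``${engine}`` and ``${quantization}`` as placeholders. The sketch below shows one way to fill them in as a dry run. ``print_llm_launch`` is a hypothetical helper; the engine value ``transformers`` is taken from the earlier revision of this page, and the quantization value ``none`` is an assumption, because the list of supported quantizations is not visible in this diff.

```shell
# Dry-run sketch: substitute the ${engine} and ${quantization} placeholders
# and print the resulting command. No command is actually run.

engine="transformers"   # "vllm" also appears in the earlier revision of this page
quantization="none"     # assumption: check the options listed for the model spec

# Hypothetical helper (not part of the xinference CLI).
# Arguments: engine, model name, size in billions, quantization.
print_llm_launch() {
    printf 'xinference launch --model-engine %s --model-name %s --size-in-billions %s --model-format pytorch --quantization %s\n' "$1" "$2" "$3" "$4"
}

print_llm_launch "$engine" llama-3.2-vision-instruct 11 "$quantization"
print_llm_launch "$engine" llama-3.2-vision-instruct 90 "$quantization"
```

Note that the earlier vLLM command also passed ``--enforce_eager --max_num_seqs 16`` and no ``--quantization`` flag; whether those options are still needed with the generic form is not stated in this commit.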
