
feat: Support nexa model with nexa-sdk #1053

Open: wants to merge 4 commits into master
Conversation

@LuciusMos (Collaborator) commented Oct 14, 2024

Description

Support v1/chat/completions
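For context, the endpoint this PR targets takes an OpenAI-style JSON payload. A minimal sketch of what such a request looks like (the localhost address and model name below are assumptions for illustration, not nexa-sdk's documented defaults):

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, content: str) -> urllib.request.Request:
    """Prepare a POST to /v1/chat/completions on an OpenAI-compatible
    server. The request is built here but not sent."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example (assumed local server address and model name):
req = build_chat_request("http://localhost:8000/v1", "llama3.2", "Hello")
```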

Motivation and Context

close #925

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of example)

Implemented Tasks

  • Subtask 1
  • Subtask 2
  • Subtask 3

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • I have read the CONTRIBUTION guide. (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly. (required for a bug fix or a new feature)
  • I have updated the documentation accordingly.

… does not support start_server; does not support other calls in nexa-sdk
@LuciusMos LuciusMos self-assigned this Oct 14, 2024
@LuciusMos LuciusMos requested a review from Wendong-Fan October 14, 2024 22:16
@Wendong-Fan Wendong-Fan added the Model Related to backend models label Oct 15, 2024
@Wendong-Fan Wendong-Fan added this to the Sprint 14 milestone Oct 15, 2024
@Wendong-Fan Wendong-Fan linked an issue Oct 15, 2024 that may be closed by this pull request
@Wendong-Fan (Member) left a comment


Thanks @LuciusMos, I left some comments below. I think there are currently some issues that may prevent the model from being called as expected. Could you do some further testing and add an example? Thanks!


    def __init__(
        self,
        model_type: ModelType,
Member

there's no defined ModelType for Nexa

Collaborator (Author)

Should I define a ModelType for Nexa, or should I just use the str type for model platforms like Nexa, as with vLLM and Ollama?

@Wendong-Fan (Member) commented Oct 18, 2024

We now support both ModelType and str as the model_type input. Please update against our latest master branch. I think for Nexa we can just use str, since there are many hosted models.
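A minimal sketch of what accepting either type could look like (the enum member below is an illustrative stand-in, not CAMEL's actual definition):

```python
from enum import Enum
from typing import Union

class ModelType(Enum):
    # Illustrative stand-in; the real enum lives in camel.types.
    GPT_4O_MINI = "gpt-4o-mini"

def to_model_name(model_type: Union[ModelType, str]) -> str:
    """Return the string identifier to send to the backend, whether the
    caller passed a ModelType member or a raw string (as for Nexa)."""
    return model_type.value if isinstance(model_type, ModelType) else model_type
```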

    response = self._client.chat.completions.create(
        messages=messages,
        # model=self.model_type,
        model="model-type",
Member

I think this shouldn't be hard-coded here.

Collaborator (Author)

Same question as above; this would be solved if I simply use the str type.
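Either way, the fix the reviewer is pointing at is to thread the configured identifier through rather than a literal. A rough sketch with a stub client (all class names here are assumptions for illustration, not the PR's actual code):

```python
class _StubCompletions:
    def create(self, messages, model, **kwargs):
        # Echo the requested model so we can see what was sent.
        return {"model": model, "choices": []}

class _StubChat:
    def __init__(self):
        self.completions = _StubCompletions()

class _StubClient:
    def __init__(self):
        self.chat = _StubChat()

class NexaModelSketch:
    def __init__(self, model_type: str, client=None):
        self.model_type = model_type
        self._client = client or _StubClient()

    def run(self, messages):
        # Pass the configured model through; no hard-coded "model-type".
        return self._client.chat.completions.create(
            messages=messages,
            model=self.model_type,
        )
```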

        the tokens with top_p probability mass. So :obj:`0.1` means only
        the tokens comprising the top 10% probability mass are considered.
        (default: :obj:`1.0`)
    top_k (int, optional): The number of highest probability vocabulary
Member

If we are going to use OpenAI compatibility, I think this parameter would not be accepted.

Collaborator (Author)

Since there is no evidence that nexa-sdk officially supports OpenAI-style serving (though it works well in my testing), do you think it is OK for us to delete top_k and nctx for now, and support them in the future if nexa-sdk (and OpenAI) add them?

@Wendong-Fan (Member) commented Oct 18, 2024

From https://github.com/NexaAI/nexa-sdk: they mention that they offer an OpenAI-compatible API server, so we can delete top_k and nctx for now.
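If the OpenAI-compatible path is taken, one way to drop the unsupported keys before sending a request is an allow-list filter. A sketch (the parameter set below is an illustrative subset of the OpenAI chat API, not nexa-sdk's actual list):

```python
# Illustrative subset of parameters the OpenAI chat completions API accepts.
OPENAI_CHAT_PARAMS = {
    "temperature", "top_p", "n", "stream", "stop",
    "max_tokens", "presence_penalty", "frequency_penalty",
}

def filter_openai_params(config: dict) -> dict:
    """Drop keys such as top_k or nctx that an OpenAI-compatible
    endpoint would reject, keeping only allow-listed parameters."""
    return {k: v for k, v in config.items() if k in OPENAI_CHAT_PARAMS}
```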

        (default: :obj:`False`)
    stop (str or list, optional): List of stop words for early stopping
        generating further tokens. (default: :obj:`None`)
    nctx (int, optional): Length of context window. (default: :obj:`2048`)
Member

If we are going to use OpenAI compatibility, I think this parameter would not be accepted.

@Wendong-Fan Wendong-Fan changed the title Support nexa model with nexa-sdk feat: Support nexa model with nexa-sdk Oct 15, 2024
Labels
Model (Related to backend models)
Development

Successfully merging this pull request may close these issues.

[Feature Request] Integrate Nexa On-Device AI models
2 participants