
Running in instruct mode and model file in a different directory #35

Open
regstuff opened this issue Apr 30, 2023 · 5 comments

Comments

@regstuff

I was wondering how I could pass the arguments --instruct and --model to the npm start command.
PORT=14003 npm start mlock ctx_size 1500 threads 12 instruct model ~/llama_models/wizardLM-7B-GGML/wizardLM-7B.ggml.q5_1.bin
I get an Args error: "instruct is not a valid argument. model is not a valid argument."
These are valid arguments for llama.cpp when running Alpaca-style models from a directory other than the default model folder.

@keldenl
Owner

keldenl commented May 1, 2023

instruct isn't a valid flag because it's covered by the API itself: ChatCompletion will simulate a chat response and Completion will simulate a plain completion. So the flag isn't necessary (the app using the OpenAI API should already be requesting the right "instruct" behavior when needed).
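In other words, the behavior is selected by which endpoint you call, not by a CLI flag. A minimal sketch, assuming the server started with PORT=14003 above and the usual OpenAI-style route (the path is an assumption, not taken from this project's docs; model selection is covered in the next paragraph):

# chat/"instruct"-style behavior: call the chat completions route
curl http://localhost:14003/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Summarize: llamas are domesticated camelids."}]}'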

For the model, you want to pass that to the GPT app instead (like chatbot-ui or auto-gpt), typically in its .env file, so it'd look something like OPENAI_API_KEY=../llama.cpp/models/wizardLM-7B-GGML/wizardLM-7B.ggml.q5_1.bin
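A minimal sketch of what the GPT app's .env might look like in that setup. OPENAI_API_KEY carrying the model path is as described above; the host variable below is an assumption standing in for however the app points at the local server instead of api.openai.com:

# .env of the GPT app (e.g. chatbot-ui): the "API key" slot carries the model path
OPENAI_API_KEY=../llama.cpp/models/wizardLM-7B-GGML/wizardLM-7B.ggml.q5_1.bin
# assumed variable name; use whatever setting the app provides for the API base URL
OPENAI_API_HOST=http://localhost:14003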

@lee-b

lee-b commented May 1, 2023

That would be a weird abuse of a variable. It would be much better to have a LOCAL_MODEL_PATH variable, and if no local model path is set, then use OpenAI's API, for example. I would favor trying to use a de facto standard local API such as text-generation-webui's API, rather than trying to reinvent the wheel by running local models directly, though. For one thing, sharing one local API means that multiple tools can use it. For another, there's a LOT of complexity in supporting local acceleration hardware and different model types and so on. Just using a standard local API makes it a lot simpler.
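A rough shell sketch of the fallback being described here; LOCAL_MODEL_PATH is a hypothetical variable, not something this project implements:

# hypothetical config resolution: prefer a local model when a path is set,
# otherwise fall back to OpenAI's hosted API (and require a real key)
if [ -n "$LOCAL_MODEL_PATH" ]; then
  API_BASE="http://localhost:14003"   # a local OpenAI-compatible server
  MODEL="$LOCAL_MODEL_PATH"
else
  API_BASE="https://api.openai.com"
  : "${OPENAI_API_KEY:?OPENAI_API_KEY is required when no local model is set}"
fi
echo "Using $API_BASE (model: ${MODEL:-hosted})"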

@regstuff
Author

regstuff commented May 1, 2023

@keldenl
Sorry, I think I'm missing something. How do I get it to follow the ### Instruction / ### Response template for Alpaca and similar models? When I use ChatCompletion, it seems to use a User:/Assistant: template, which isn't working for WizardLM; the LLM doesn't follow my instructions.
When I use the Completions endpoint and add the Instruction/Response template to the prompt, the server seems to hang and no response is generated.
It processes the prompt, the ===== RESPONSE ===== line appears, and that's it.
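For reference, a request of the kind described here would look roughly like this. The route and the model-path-in-the-Authorization-header convention are assumptions extrapolated from the OPENAI_API_KEY usage above, and the prompt follows the standard Alpaca ### Instruction: / ### Response: layout:

curl http://localhost:14003/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ../llama.cpp/models/wizardLM-7B-GGML/wizardLM-7B.ggml.q5_1.bin" \
  -d '{"prompt":"### Instruction:\nList three uses for llama wool.\n\n### Response:\n","max_tokens":256,"stop":["### Instruction:"]}'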

@keldenl
Owner

keldenl commented May 6, 2023

> That would be a weird abuse of a variable. It would be much better to have a LOCAL_MODEL_PATH variable, and if no local model path is set, then use OpenAI's API, for example. I would favor trying to use a de facto standard local API such as text-generation-webui's API, rather than trying to reinvent the wheel by running local models directly, though. For one thing, sharing one local API means that multiple tools can use it. For another, there's a LOT of complexity in supporting local acceleration hardware and different model types and so on. Just using a standard local API makes it a lot simpler.

The thing about this is that the end goal for this project is plug 'n' play with any GPT-powered project: the fewer changes to the code the better (even zero changes, as with chatbot-ui). LOCAL_MODEL_PATH is something people would need to account for (e.g. langchain supporting local models), but this project aims to answer, for all the other GPT apps that already exist out there: how can we leverage the work folks have done but run a local model against it? That's the goal.

@keldenl
Owner

keldenl commented May 6, 2023

> @keldenl
> Sorry, I think I'm missing something. How do I get it to follow the ### Instruction / ### Response template for Alpaca and similar models? When I use ChatCompletion, it seems to use a User:/Assistant: template, which isn't working for WizardLM; the LLM doesn't follow my instructions.
> When I use the Completions endpoint and add the Instruction/Response template to the prompt, the server seems to hang and no response is generated.
> It processes the prompt, the ===== RESPONSE ===== line appears, and that's it.

@regstuff it sounds like you might be running into a different issue. Any chance you could post what's showing up in your terminal and what the request is? (Where are you using the server? chatbot-ui?)

Also, I just merged some changes that should give you better error logging, so maybe pull and then post here?
