Slow speed Vicuna - 7B Help plz #45

Open
C0deXG opened this issue May 13, 2023 · 3 comments

Comments

@C0deXG

C0deXG commented May 13, 2023

When I ask a question, it is so slow that it takes forever to write one sentence. How can I make it faster? By the way, I am using Vicuna 7B to keep it lightweight, and I am on macOS with an M2 chip, and that doesn't even help :( So can I host gpt-llama.cpp on Render? If so, when I run `sh ./scripts/test-installation.sh`, what should I put for the port and the location of the file, since I would be using Render to run the model to make it faster?

@C0deXG
Author

C0deXG commented May 13, 2023

Follow-up: if I use Render, for example, and I run `sh ./scripts/test-installation.sh` on my PC or somewhere else and it asks me for the port I am running on, how am I going to get this to work, since Render is URL-based? Should it be web-based, or should I host the backend/model separately, and where should I host it?

@keldenl
Owner

keldenl commented May 13, 2023

Try using mlock; that has historically helped me when I've had memory issues.
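For anyone unsure what that looks like in practice, here is a rough sketch of assembling a llama.cpp command line with `--mlock` turned on. It only builds the argument list and does not run anything. The helper name `build_llama_args`, the binary path `./main`, and the model path are hypothetical placeholders, not values from this thread, and the exact way gpt-llama.cpp forwards the option may differ from calling llama.cpp directly.

```python
from typing import List


def build_llama_args(binary: str, model_path: str, prompt: str,
                     mlock: bool = True) -> List[str]:
    """Assemble an argument list for the llama.cpp CLI.

    `binary` and `model_path` are placeholders supplied by the caller;
    nothing here checks that they exist.
    """
    args = [binary, "-m", model_path, "-p", prompt]
    if mlock:
        # --mlock asks llama.cpp to keep the model resident in RAM
        # instead of letting the OS swap or page it out.
        args.append("--mlock")
    return args


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_llama_args("./main", "models/vicuna-7b.bin", "Hello")
    print(" ".join(cmd))
```

Note that locking the model in memory only helps if the machine has enough free RAM to hold the whole model; otherwise the lock can fail or make things worse.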

@msj121

msj121 commented Jun 11, 2023

Also, lowering the thread count sometimes helps, because too many threads can oversaturate the CPU, or perhaps some of the work ends up on a slower worker thread.
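One way to pick a lower starting value is sketched below. It is a heuristic, not a rule from llama.cpp or gpt-llama.cpp: on Apple Silicon it tries to read the number of performance cores through the `hw.perflevel0.physicalcpu` sysctl key (assumed to be available on recent macOS), and otherwise falls back to half the logical CPU count. The function names are made up for this example, and the result is only a starting point to benchmark against.

```python
import os
import platform
import subprocess
from typing import Optional


def performance_core_count() -> Optional[int]:
    """Return the performance-core count on macOS, or None if unknown."""
    if platform.system() != "Darwin":
        return None
    try:
        out = subprocess.run(
            ["sysctl", "-n", "hw.perflevel0.physicalcpu"],
            capture_output=True, text=True, check=True,
        )
        return int(out.stdout.strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        # Key missing (e.g. Intel Mac) or sysctl unavailable.
        return None


def suggest_threads(logical_cpus: int, perf_cores: Optional[int] = None) -> int:
    """Heuristic: prefer the performance-core count, else half the logical CPUs."""
    if perf_cores:
        return max(1, perf_cores)
    return max(1, logical_cpus // 2)


if __name__ == "__main__":
    logical = os.cpu_count() or 1
    threads = suggest_threads(logical, performance_core_count())
    print(f"Try starting with {threads} threads (e.g. llama.cpp's -t option)")
```

From there, try one or two values above and below the suggestion and keep whichever generates tokens fastest on your machine.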
