Replies: 1 comment
I had a similar experience.
Hi there.
I'm using LLaMA 13B Q4_0 with the Python bindings in CPU mode.
I can't get good responses like I'm used to when working with text-generation-webui, so I think I'm doing something wrong.
This is how I instantiate the model:
llm = Llama(model_path="./llama.cpp/models/13B/ggml-model-q4_0.bin", seed = 0, n_ctx = 1200)
And this is how I try to get a response:
output = llm(prompt, max_tokens=64, stop=[Human_Name + ":", "\n"], echo=True)
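For reference, this is roughly the complete script I'm running, as a minimal self-contained version (Human_Name and the prompt are just placeholders here; my real prompt is below). As far as I can tell, the call returns an OpenAI-style completion dict, so I read the generated text out of output["choices"][0]["text"]:

from llama_cpp import Llama

# Placeholder values for this example; my real prompt is much longer.
Human_Name = "Human"
prompt = f"{Human_Name}: Hello, how are you?\nAssistant:"

# Same instantiation as above: fixed seed, 1200-token context window.
llm = Llama(
    model_path="./llama.cpp/models/13B/ggml-model-q4_0.bin",
    seed=0,
    n_ctx=1200,
)

# Stop generation at the next human turn or at a newline; echo=True returns
# the prompt together with the completion.
output = llm(prompt, max_tokens=64, stop=[Human_Name + ":", "\n"], echo=True)

# The generated text is in the "choices" list of the completion dict.
print(output["choices"][0]["text"])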
Technically everything works, but the quality of the responses is quite off and nowhere near where I want it to be.
It should behave like chat mode. Mostly I get out-of-context responses, sometimes empty responses, and sometimes gibberish.
Here's the full prompt I use in output = llm():
Can you help me?
Some more info about the model:
Response Meta: