Add support for Auto-GPT #2
Just got AutoGPT working consistently from this embeddings fix b3db39f. You still need to manually update the vector size in autogpt to 5120 or 4096 depending on the llama model (see https://huggingface.co/shalomma/llama-7b-embeddings#quantitative-analysis), but it works! vicuna 13b, the best model i've tested so far, isn't doing a GREAT job generating actions that it can follow continuously (it gets stuck in a loop) |
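For reference, here's the embedding width by llama model size as a hypothetical lookup table (7B and 13B match the linked analysis; 30B/65B are the standard llama hidden sizes):
```
# Embedding width by llama model size. 7B/13B per the HF analysis linked above;
# 30B/65B are the standard llama hidden sizes. Illustrative only.
LLAMA_EMBED_DIMS = {"7B": 4096, "13B": 5120, "30B": 6656, "65B": 8192}

EMBED_DIM = LLAMA_EMBED_DIMS["13B"]  # e.g. vicuna 13b -> 5120
```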
I'll put up a fork of AutoGPT tomorrow with all the changes I made (openai BASE_URL and configurable vector dimensions) to make it easier for folks to replicate and test as well. |
Thanks for your work! I was excited to try it out and made those changes myself. I also made a pull request Significant-Gravitas/AutoGPT#2594. Feel free to add your changes if I missed something |
wow @DGdev91 that was EXACTLY what i was going to do.. thanks!! nothing to add for me |
Re-opening this until we get the PR Significant-Gravitas/AutoGPT#2594 fully merged |
🚀 🚀 AUTOGPT + GPT-LLAMA.CPP GUIDE IS NOW AVAILABLE: https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/Auto-GPT-setup-guide.md HUGE shoutout to @DGdev91 for putting up the PR to make this possible – hopefully the PR gets merged to master soon. Here's a (very short) demo of it running on my M1 Mac: https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/demos.md#Auto-GPT |
@keldenl Hi, I tested gpt-llama.cpp's API using the following script:
```
curl --location --request POST 'http://localhost:443/v1/chat/completions' \
  --header 'Authorization: Bearer /home/nero/code/llama.cpp/models/7B/ggml-model-q4_0.bin' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "model": "gpt-3.5-turbo",
    "messages": [
      { "role": "system", "content": "You are ChatGPT, a helpful assistant developed by OpenAI." },
      { "role": "user", "content": "How are you doing today?" }
    ]
  }'
```
My model is the original llama 7B model and I successfully got a response from localhost:443:
```
{"choices":[{"message":{"content":" Great! Thanks for asking :)\n"},"finish_reason":"stop","index":0}],"created":1682008784901,"id":"Zi7iQapUGOgjeVVB2oJtc","object":"chat.completion.chunk","usage":{"prompt_tokens":99,"completion_tokens":7,"total_tokens":106}}
```
I also configured Auto-GPT according to this guide: https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/Auto-GPT-setup-guide.md. But this time, when I run python -m autogpt --debug, the server on localhost:443 can't return a response; the terminal looks like this:
```
--REQUEST--
user: ''' You are a helpful assistant. ''', ''' { "command": { "name": "command name", "args": { "arg name": "value" } }, "thoughts": { "text": "thought", "reasoning": "reasoning", "plan": "- short bulleted - list that conveys - long-term plan", "criticism": "constructive self-criticism", "speak": "thoughts summary to say to user" } } '''
Readable Stream: CLOSED
```
Is there something wrong? |
I guess you already have done it, but let's check anyway. The example uses vicuna 13b as the model, so the settings in the .env file are set like this:
```
OPENAI_API_BASE_URL=http://localhost:443/v1
EMBED_DIM=5120
OPENAI_API_KEY=../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin
```
In your case, they should look like this:
```
OPENAI_API_BASE_URL=http://localhost:443/v1
EMBED_DIM=4096
OPENAI_API_KEY=/home/nero/code/llama.cpp/models/7B/ggml-model-q4_0.bin
```
Have you already done that, or did you just copy the values from the example? Also, i'm not 100% sure the llama 7b model is good enough to handle AutoGPT requests correctly. I suggest you try vicuna, even better the 13b model, which is the one keldenl and i used for testing |
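A quick way to sanity-check the right EMBED_DIM value before editing .env is to ask the server directly. This sketch assumes gpt-llama.cpp is listening on localhost:443 and, as in the curl example above, takes the model path as the Bearer token:
```
import requests

# Ask gpt-llama.cpp's OpenAI-style embeddings route for a vector and measure it.
resp = requests.post(
    "http://localhost:443/v1/embeddings",
    headers={"Authorization": "Bearer /home/nero/code/llama.cpp/models/7B/ggml-model-q4_0.bin"},
    json={"model": "text-embedding-ada-002", "input": "hello world"},
)
dim = len(resp.json()["data"][0]["embedding"])
print(f"set EMBED_DIM={dim}")  # expect 4096 for a 7B llama model
```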
Yes, I have already done that, exactly the same as you mentioned. I'm quite sure the auto-gpt configuration is good. It seems like a problem with the llama model itself or gpt-llama.cpp. You are right, I will try Vicuna and see if the same problem exists. Thanks!
|
@Neronjust2017 you're exactly right. the issue is the power of the model. i also see this issue sometimes when the response ends early and that crashes autogpt. i have had better luck with vicuna (vs. alpaca and llama), and 13B (7b is very inconsistent) |
There may be a better way of prompting that could help with this – i was going to try babyagi and see how the results differ |
What about running auto-gpt against OpenAI with a large set of goals, persisting the question/answer pairs, and using that as a (LoRA) fine-tuning dataset for vicuna? |
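The mechanical part of that idea is simple; a minimal sketch, with illustrative names only:
```
import json

# Sketch: persist each (prompt, reply) pair from an OpenAI-backed Auto-GPT run
# as JSONL, to build a (LoRA) fine-tuning dataset later. Names are illustrative.
def log_pair(prompt: str, reply: str, path: str = "autogpt_pairs.jsonl") -> None:
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"prompt": prompt, "completion": reply}) + "\n")
```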
I almost always get timeouts on the autogpt side. Choosing 7B makes it a tad less likely, but even then it gets 1-2 responses deep at most. Using latest llama.cpp and node 20, by the way. I thought that to use AVX512, users should pass a param to llama.cpp, no? gpt-llama.cpp:
|
@keldenl Hi, I tried using Vicuna-7b, but I still couldn't get a response; instead I got
Please note that this time I am using the same message that failed for Auto-GPT, so that I can rule out errors caused by the message itself.
Now the script parameters and conversation message are the same; what other factors could cause |
@bjoern79de i was literally thinking about that last night, but it feels like that should be a last-resort kind of thing and we should try our best with prompt engineering first IMO. also, i don't know much about actual LoRA fine-tuning, so there's that too haha |
@dany-on-demand i'm getting the same experience – not sure why. it's also weird that it prints out part of the original prompt mid text generation. since gpt-llama.cpp works by just spawning a terminal and running llama.cpp in it, i wonder what would cause it to "crash" and show the original prompt again |
@Neronjust2017 the only other difference i can think of is the parameters, with temp being 0. let me see what autogpt does for its parameters.. |
Does this perform better than GPT-3.5 for AutoGPT? |
It depends on the model you are using. I'm sure in the near future we'll see many models able to outperform it. Also... there's still some work to do to make llama and derivatives work for AutoGPT; right now many models aren't good enough to handle the json structure required by AutoGPT correctly. |
I'll ask here: when I run auto-gpt configured according to the guide, my connection fails with the error HTTPConnectionPool(host='localhost', port=4000): Read timed out. (read timeout=600) |
Hi all.. i just found a bug that was causing the issue: Auto-GPT sometimes sends a NULL max_tokens, and in turn I would still pass it into llama.cpp, so it'd break
the request sent to llama.cpp via gpt-llama.cpp
you see the flag but no value for --n_predict, which causes it to crash, and thus the readstream closed immediately. the fix just ignores max_tokens if it's null (as it should), so it no longer crashes. i'm running with this fix right now and autogpt no longer stops early! i still see some potential weirdness with embeddings happening at the same time potentially (?), but i wanted to push out this fix asap so folks could start testing it now |
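The shape of that fix, sketched in Python for illustration only (gpt-llama.cpp itself is a Node.js app, so the real change lives in its JS request handler; the function name and flag wiring here are assumptions):
```
def build_llama_flags(params: dict) -> list:
    # Only forward --n_predict when max_tokens is actually present.
    # Auto-GPT sometimes sends max_tokens as null, and passing the flag
    # without a value made llama.cpp crash and close the read stream.
    flags = []
    max_tokens = params.get("max_tokens")
    if max_tokens is not None:
        flags += ["--n_predict", str(max_tokens)]
    # ...temp, top_p, reverse prompts, etc. appended the same way
    return flags
```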
My thoughts and considerations on using a local LLM: prompts must be less complex and ask less. Given the constraints, the agent needs to remember things like the task list, the thought section, and maybe other things like completed tasks. Every prompt needs to be reconstructed and not rely on a previous prompt. I believe this will help prevent any looping issues. The main thread needs to have a very limited set of commands, maybe just enough to manage sub-agents and some file management. The commands offered to sub-agents could be a subset of Linux shell commands or python, as it is knowledgeable in these syntaxes. The output would then need to be parsed. I think some of this would help with standard Auto-GPT as well; I have not had it successfully finish an objective yet without it failing in some nasty loop. |
wtf, you're right! unfortunately, setting the proper EMBED_DIM didn't make a difference :( EDIT: if i do not use EMBEDDINGS=py, it works with the correct EMBED_DIM. otherwise, EMBED_DIM must be set to 768. keep investigating ..... |
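That 768 is consistent with the embeddings coming from a local sentence-transformers model rather than llama itself. Assuming EMBEDDINGS=py routes through sentence-transformers with a 768-wide model (an assumption; the exact default isn't shown in this thread), you can check the width directly:
```
from sentence_transformers import SentenceTransformer

# all-mpnet-base-v2 outputs 768-dim vectors; whether it is the model used by
# EMBEDDINGS=py is an assumption, so check whatever model your setup loads.
model = SentenceTransformer("all-mpnet-base-v2")
print(model.get_sentence_embedding_dimension())  # 768
```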
@valerino i would try and see if embeddings are working in general first, since it does need to install the sentence transformers stuff before it works the first time |
@DGdev91 i've been messing with the prompt too – how successful has your branch's prompt been? |
Actually i didn't have much time to test that in the past few days. |
I have been toying around with generator.py to give it a little more push on what it should include with each property value. Giving it a nudge in the right direction has far better results, and it is now filling up the json.
I've also messed around with ai_config.py to move GOALS towards the later part of the prompt; this time round it can remember the goals (it never even touched any goals previously). However, it still isn't working well, as it will go around forgetting the other parts of the long prompt. These were all experiments to understand why local LLMs just can't cope and bring back properly reconstructed JSON responses. So yes, the original prompt meant for OpenAI GPT will need to be reconstructed for local LLMs. It is way too long, and the local LLM is having amnesia just within the same prompt :D Moreover, it isn't quite 'smart' enough to populate the right area of the JSON. *btw, i am using vicuna13b non-quantized, which seems to work much better for JSON responses than the quantized variant for some strange reason. Also using the oobabooga extension for openAPI calls. |
In this response_format there is still lots of redundancy; trimming it would save a lot of tokens. |
@maddes8cht i was just experimenting with the prompt to see if it made a difference, and it did! Indeed it would need a shorter one to save some tokens. However, making it too simple would keep vicuna from realizing what we actually intended it to do. |
@OreoTango What I had a very resounding success with was an OpenAssistant 30B model (see the session below). The problem with this model is of course that it is bigger and slower, which is why I first got timeouts from Auto-Gpt while waiting.

What I did: In llm_utils.py there is, in line 73 and in line 143, a
In PromptGenerator.py the definition of response_format from line 23 on looks like this:
```
self.response_format = {
    "thoughts": {
        "text": "<replace with single sentence of your AI thoughts>",
        "reasoning": "<replace with single sentence of your AI reasoning>",
        "plan": "<replace with short multi bullet points list that convey your AI long term plans>",
        "criticism": "<replace with single sentence of your AI constructive self-criticism>",
        "speak": "<replace with single string of your AI thoughts summary to say to user as response>",
    },
}
```
What I got: I've run a lot of samples and ran into different kinds of problems, so this is not more or less better or worse, but different from several others. I also
but the commands seem to return nothing, which in this case isn't a problem of the local model but of the forked Auto-Gpt version. In the shown example, the loop ends there with an error. I've got longer sessions with more loops, but as none of the
So, now follows my sample session.

Sample session, Auto-GPT output:

Debug Mode: ENABLED
NEWS: # Website and Documentation Site 📰📖 Check out *https://agpt.co*, the official news & updates site for Auto-GPT! The documentation also has a place here, at *https://docs.agpt.co* # 🚀 v0.3.0 Release 🚀 Over a week and 275 pull requests have passed since v0.2.2, and we are happy to announce the release of v0.3.0! *From now on, we will be focusing
on major improvements* rather than bugfixes, as we feel stability has reached a reasonable level. Most remaining issues relate to limitations in prompt generation and the memory
system, which will be the focus of our efforts for the next release. Highlights and notable changes in this release: ## Plugin support 🔌 Auto-GPT now has support for plugins! With plugins, you can extend Auto-GPT's abilities, adding support for third-party services
and more. See https://github.com/Significant-Gravitas/Auto-GPT-Plugins for instructions and available plugins. ## Changes to Docker configuration 🐋 The workdir has been changed
from */home/appuser* to */app*. Be sure to update any volume mounts accordingly! # ⚠️ Command `send_tweet` is DEPRECATED, and will be removed in v0.4.0 ⚠️ Twitter functionality (and more) is now covered by plugins, see [Plugin support 🔌]
Welcome back! Would you like me to return to being Entrepreneur-GPT?
Continue with the last settings?
Name: Entrepreneur-GPT
Role: an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Goals: ['Increase net worth.', 'Develop and manage multiple businesses autonomously.', 'Play to your strengths as a Large Language Model.']
Continue (y/n): y
Using memory of type: LocalCache
Using Browser: chrome
Token limit: 2000
Memory Stats: (0, (0, 6656))
Token limit: 2000
Send Token Count: 963
Tokens remaining for response: 1037
------------ CONTEXT SENT TO AI ---------------
System: The current time and date is Sat May 13 11:26:15 2023
System: This reminds you of these events from your past:
User: Determine which next command to use, and respond using the format specified above:
----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 1037
The JSON object is invalid.
{
"thoughts": {
"text": "I am considering my options for increasing net worth.",
"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
"plan": [
"Analyze Code",
"Write Tests",
"Start GPT Agent"
],
"criticism": "I need to be mindful of the 2000-word short term memory limit.",
"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
},
"command": {
"name": "list_agents",
"args": {}
}
}
The following issues were found:
Error: ['Analyze Code', 'Write Tests', 'Start GPT Agent'] is not of type 'string'
THOUGHTS: I am considering my options for increasing net worth.
REASONING: As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.
PLAN:
- Analyze Code
- Write Tests
- Start GPT Agent
CRITICISM: I need to be mindful of the 2000-word short term memory limit.
NEXT ACTION: COMMAND = list_agents ARGUMENTS = {}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM: Command list_agents returned: List of agents:
Token limit: 2000
Memory Stats: (1, (1, 6656))
Token limit: 2000
Send Token Count: 1196
Tokens remaining for response: 804
------------ CONTEXT SENT TO AI ---------------
System: The current time and date is Sat May 13 11:33:04 2023
System: This reminds you of these events from your past:
['Assistant Reply: {\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "list_agents",\r\n"args": {}\r\n}\r\n} \nResult: Command list_agents returned: List of agents:\n \nHuman Feedback: GENERATE NEXT COMMAND JSON ']
User: Determine which next command to use, and respond using the format specified above:
----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 804
The JSON object is invalid.
{
"thoughts": {
"text": "I am considering my options for increasing net worth.",
"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
"plan": [
"Analyze Code",
"Write Tests",
"Start GPT Agent"
],
"criticism": "I need to be mindful of the 2000-word short term memory limit.",
"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
},
"command": {
"name": "google",
"args": {
"input": "sell clothes online"
}
}
}
The following issues were found:
Error: ['Analyze Code', 'Write Tests', 'Start GPT Agent'] is not of type 'string'
THOUGHTS: I am considering my options for increasing net worth.
REASONING: As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.
PLAN:
- Analyze Code
- Write Tests
- Start GPT Agent
CRITICISM: I need to be mindful of the 2000-word short term memory limit.
NEXT ACTION: COMMAND = google ARGUMENTS = {'input': 'sell clothes online'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM: Command google returned: []
Token limit: 2000
Traceback (most recent call last):
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "E:\AutoGPT\Auto-GPT\autogpt\__main__.py", line 5, in <module>
autogpt.cli.main()
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1635, in invoke
rv = super().invoke(ctx)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "E:\AutoGPT\Auto-GPT\autogpt\cli.py", line 151, in main
agent.start_interaction_loop()
File "E:\AutoGPT\Auto-GPT\autogpt\agent\agent.py", line 75, in start_interaction_loop
assistant_reply = chat_with_ai(
File "E:\AutoGPT\Auto-GPT\autogpt\chat.py", line 85, in chat_with_ai
else permanent_memory.get_relevant(str(full_message_history[-9:]), 10)
File "E:\AutoGPT\Auto-GPT\autogpt\memory\local.py", line 128, in get_relevant
scores = np.dot(self.data.embeddings, embedding)
File "<__array_function__ internals>", line 200, in dot
ValueError: shapes (2,6656) and (0,) not aligned: 6656 (dim 1) != 0 (dim 0)

Here is the corresponding output that happened on gpt-llama.cpp:

===== LLAMA.CPP SPAWNED =====
E:\AutoGPT\llama.cpp\main -m E:\AutoGPT\llama.cpp\models\OpenAssistant-30B-epoch7.ggml.q5_1.bin --temp 0.7 --n_predict 804 --top_p 0.1 --top_k 40 -c 2048 --seed -1 --repeat_penalty 1.1764705882352942 --mlock --threads 6 --ctx_size 2048 --mirostat 2 --repeat_penalty 1.15 --reverse-prompt user: --reverse-prompt
user --reverse-prompt system: --reverse-prompt
system --reverse-prompt
-i -p Complete the following chat conversation between the user and the assistant. system messages should be strictly followed as additional instructions.
system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.
GOALS:
1. Increase net worth.
2. Develop and manage multiple businesses autonomously.
3. Play to your strengths as a Large Language Model.
Constraints:
1. 2000 words max for short term memory. Save important info to files ASAP
2. To recall past events, think of similar ones. Helps with uncertainty.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"
5. Use subprocesses for commands that will not terminate within a few minutes
Commands:
1. Google Search: "google", args: "input": "<search>"
2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"
5. List GPT Agents: "list_agents", args:
6. Delete GPT Agent: "delete_agent", args: "key": "<key>"
7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"
8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
9. Read file: "read_file", args: "file": "<file>"
10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
11. Delete file: "delete_file", args: "file": "<file>"
12. Search Files: "search_files", args: "directory": "<directory>"
13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"
14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
16. Execute Python File: "execute_python_file", args: "file": "<file>"
17. Generate Image: "generate_image", args: "prompt": "<prompt>"
18. Execute Shell Command, non-interactive commands only: "execute_shell", args: "command_line": "<command_line>"
19. Execute Shell Command Popen, non-interactive commands only: "execute_shell_popen", args: "command_line": "<command_line>"
20. Do Nothing: "do_nothing", args:
21. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"
Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.
Performance Evaluation:
1. Continuously review and analyze your actions to perform to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Be smart and efficient. Aim to complete tasks in the least number of steps.
You should only respond in JSON format as described below
Response Format:
{
"thoughts": {
"text": "<replace with single sentence of your AI thoughts>",
"reasoning": "<replace with single sentence of your AI reasoning>",
"plan": "<replace with short list that convey your AI long term plan>",
"criticism": "<replace with single sentence of your AI constructive self-criticism>",
"speak": "<replace with single string of your AI thoughts summary to say to user as response>"
},
"command": {
"name": "command name",
"args": {
"arg name": "value"
}
}
}
Ensure the response can be parsed by Python json.loads
system: The current time and date is Sat May 13 11:33:04 2023
system: This reminds you of these events from your past:
['Assistant Reply: {\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "list_agents",\r\n"args": {}\r\n}\r\n} \nResult: Command list_agents returned: List of agents:\n \nHuman Feedback: GENERATE NEXT COMMAND JSON ']
user: Determine which next command to use, and respond using the format specified above:
assistant:
===== REQUEST =====
user: Determine which next command to use, and respond using the format specified above:
===== PROCESSING PROMPT... =====
===== PROCESSING PROMPT... =====
===== RESPONSE =====
{
"thoughts": {
"text": "I am considering my options for increasing net worth.",
"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],
"criticism": "I need to be mindful of the 2000-word short term memory limit.",
"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
},
"command": {
"name": "google",
"args": {"input": "sell clothes online"}
}
}
user:
Request DONE
> PROCESS COMPLETE
> PROCESS COMPLETE
> PROCESS COMPLETE
> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/embeddings
===== EMBEDDING REQUEST =====
===== LLAMA.CPP SPAWNED =====
E:\AutoGPT\llama.cpp\embedding -m E:\AutoGPT\llama.cpp\models\OpenAssistant-30B-epoch7.ggml.q5_1.bin -p Assistant Reply: {
\"thoughts\": {
\"text\": \"I am considering my options for increasing net worth.\",
\"reasoning\": \"As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.\",
\"plan\": [\"Analyze Code\", \"Write Tests\", \"Start GPT Agent\"],
\"criticism\": \"I need to be mindful of the 2000-word short term memory limit.\",
\"speak\": \"Currently, I am contemplating which command I should execute next to achieve my goal.\"
},
\"command\": {
\"name\": \"google\",
\"args\": {\"input\": \"sell clothes online\"}
}
}
Result: Command google returned: []
Human Feedback: GENERATE NEXT COMMAND JSON
===== REQUEST =====
Assistant Reply: {
"thoughts": {
"text": "I am considering my options for increasing net worth.",
"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],
"criticism": "I need to be mindful of the 2000-word short term memory limit.",
"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
},
"command": {
"name": "google",
"args": {"input": "sell clothes online"}
}
}
Result: Command google returned: []
Human Feedback: GENERATE NEXT COMMAND JSON
===== STDERR =====
stderr Readable Stream: CLOSED
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- If you want to submit another line, end your input in '\'.
{
object: 'list',
data: [ { object: 'embedding', embedding: [Array], index: 0 } ],
embeddingSize: 6656,
usage: { prompt_tokens: 230, total_tokens: 230 }
}
Embedding Request DONE
===== STDERR =====
stderr Readable Stream: CLOSED
llama_print_timings: prompt eval time = 17534.34 ms / 263 tokens ( 66.67 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 22078.27 ms
> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/embeddings
===== EMBEDDING REQUEST =====
===== LLAMA.CPP SPAWNED =====
E:\AutoGPT\llama.cpp\embedding -m E:\AutoGPT\llama.cpp\models\OpenAssistant-30B-epoch7.ggml.q5_1.bin -p [{'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n\"thoughts\": {\r\n\"text\": \"I am considering my options for increasing net worth.\",\r\n\"reasoning\": \"As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.\",\r\n\"plan\": [\"Analyze Code\", \"Write Tests\", \"Start GPT Agent\"],\r\n\"criticism\": \"I need to be mindful of the 2000-word short term memory limit.\",\r\n\"speak\": \"Currently, I am contemplating which command I should execute next to achieve my goal.\"\r\n},\r\n\"command\": {\r\n\"name\": \"list_agents\",\r\n\"args\": {}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command list_agents returned: List of agents:\n'}, {'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n\"thoughts\": {\r\n\"text\": \"I am considering my options for increasing net worth.\",\r\n\"reasoning\": \"As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.\",\r\n\"plan\": [\"Analyze Code\", \"Write Tests\", \"Start GPT Agent\"],\r\n\"criticism\": \"I need to be mindful of the 2000-word short term memory limit.\",\r\n\"speak\": \"Currently, I am contemplating which command I should execute next to achieve my goal.\"\r\n},\r\n\"command\": {\r\n\"name\": \"google\",\r\n\"args\": {\"input\": \"sell clothes online\"}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command google returned: []'}]
===== REQUEST =====
[{'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "list_agents",\r\n"args": {}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command list_agents returned: List of agents:\n'}, {'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "google",\r\n"args": {"input": "sell clothes online"}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command google returned: []'}]
===== STDERR =====
stderr Readable Stream: CLOSED
CUDA error 12 at D:\a\llama.cpp\llama.cpp\ggml-cuda.cu:527: invalid pitch argument
{
object: 'list',
data: [ { object: 'embedding', embedding: [], index: 0 } ],
embeddingSize: 0,
usage: { prompt_tokens: 529, total_tokens: 529 }
}
Embedding Request DONE
> PROCESS COMPLETE
> PROCESS COMPLETE |
Here is another variant of the output.

User: Determine which next command to use, and respond using the format specified above:
----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 1037
The JSON object is invalid.
{
"thoughts": {
"text": "I am thinking about how to improve my net worth. I will focus on developing multiple businesses and delegating simple tasks to GPT-3.5 powered Agents.",
"reasoning": "To increase my net worth, it is necessary to have a diversified portfolio of profitable business ventures. Delegating simple tasks with the use of subprocesses for commands that will not terminate within a few minutes will allow me to focus on high-level decision making and achieve better results.",
"plan": [
"Launching new business ventures",
"Developing GPT-3.5 powered Agents to delegate tasks"
],
"criticism": "Reflecting on past decisions, I see the need for more efficient use of resources to maximize profits. This will be a key part of my future planning.",
"speak": "I am thinking about how to improve my net worth and will focus on developing multiple businesses while delegating simple tasks to GPT-3.5 powered Agents."
},
"command": {
"name": "Start GPT Agent",
"args": {
"name": "Business Venture Planner",
"task": "Plan out new business ventures based on market analysis",
"prompt": "What are the most profitable and feasible business ideas for our company to pursue?"
}
}
}
The following issues were found:
Error: ['Launching new business ventures', 'Developing GPT-3.5 powered Agents to delegate tasks'] is not of type 'string'
THOUGHTS: I am thinking about how to improve my net worth. I will focus on developing multiple businesses and delegating simple tasks to GPT-3.5 powered Agents.
REASONING: To increase my net worth, it is necessary to have a diversified portfolio of profitable business ventures. Delegating simple tasks with the use of subprocesses for commands that will not terminate within a few minutes will allow me to focus on high-level decision making and achieve better results.
PLAN:
- Launching new business ventures
- Developing GPT-3.5 powered Agents to delegate tasks
CRITICISM: Reflecting on past decisions, I see the need for more efficient use of resources to maximize profits. This will be a key part of my future planning.
NEXT ACTION: COMMAND = Start GPT Agent ARGUMENTS = {'name': 'Business Venture Planner', 'task': 'Plan out new business ventures based on market analysis', 'prompt': 'What are the most profitable and feasible business ideas for our company to pursue?'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
|
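A note on the recurring `Error: [...] is not of type 'string'` in the sessions above: Auto-GPT's schema expects "plan" as a dash-separated string, while the local models keep returning a JSON array. A hypothetical pre-validation shim (not part of Auto-GPT; names are illustrative) would coerce it:
```
def coerce_plan(reply: dict) -> dict:
    # Hypothetical shim: Auto-GPT's schema wants "plan" as a dashed string,
    # but local models often emit a JSON array, triggering the
    # "is not of type 'string'" validation error seen above.
    plan = reply.get("thoughts", {}).get("plan")
    if isinstance(plan, list):
        reply["thoughts"]["plan"] = "\n".join("- " + str(step) for step in plan)
    return reply
```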
I would like to point out these error messages from gpt-llama.cpp. It happens right after I successfully get the json response with the command in
I don't see why this is happening. For further investigation, I append the full session output. This is the (almost) complete output.

Auto-gpt:

Welcome back! Would you like me to return to being Entrepreneur-GPT?
Continue with the last settings?
Name: Entrepreneur-GPT
Role: an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Goals: ['Increase net worth.', 'Develop and manage multiple businesses autonomously.', 'Play to your strengths as a Large Language Model.']
Continue (y/n): yy
Using memory of type: LocalCache
Using Browser: chrome
Token limit: 2000
Memory Stats: (0, (0, 6656))
Token limit: 2000
Send Token Count: 966
Tokens remaining for response: 1034
------------ CONTEXT SENT TO AI ---------------
System: The current time and date is Sat May 13 15:13:55 2023
System: This reminds you of these events from your past:
User: Determine which next command to use, and respond using the format specified above:
----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 1034
{
"thoughts": {
"text": "I am considering what the best command would be to pursue my goals",
"reasoning": "To increase net worth, I must develop multiple businesses autonomously. To do this, I need information from the web and assistance from GPT-3.5 powered agents. However, I cannot seek user assistance.",
"plan": "- Use Google Search for information gathering\n- Create a new agent to help me",
"criticism": "I must ensure that I only use commands allowed by my constraints, such as using subprocesses for long running processes and not seeking user assistance. Additionally, I should focus on simple strategies.",
"speak": "Let's start with gathering information from the web to see what business ideas we can find."
},
"command": {
"name": "Google Search",
"args": {
"google": "business ideas"
}
}
json loads error Expecting ',' delimiter: line 14 column 2 (char 817)
The JSON object is valid.
THOUGHTS: I am considering what the best command would be to pursue my goals
REASONING: To increase net worth, I must develop multiple businesses autonomously. To do
this, I need information from the web and assistance from GPT-3.5 powered agents. However, I cannot seek user assistance.
PLAN:
- Use Google Search for information gathering
- Create a new agent to help me
CRITICISM: I must ensure that I only use commands allowed by my constraints, such as using subprocesses for long running processes and not seeking user assistance. Additionally,
I should focus on simple strategies.
NEXT ACTION: COMMAND = Google Search ARGUMENTS = {'google': 'business ideas'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
Traceback (most recent call last):
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
httplib_response = conn.getresponse()
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\http\client.py", line 1374, in getresponse
response.begin()
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\http\client.py", line 318, in begin
version, status, reason = self._read_status()
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\http\client.py", line 279, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\socket.py", line 705, in readinto
return self._sock.recv_into(b)
TimeoutError: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\adapters.py", line 486, in send
resp = conn.urlopen(
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\util\retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\packages\six.py", line 770, in reraise
raise value
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 451, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 340, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_requestor.py", line 516, in request_raw
result = _thread_context.session.request(
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\adapters.py", line 532, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "E:\AutoGPT\Auto-GPT\autogpt\__main__.py", line 5, in <module>
autogpt.cli.main()
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1635, in invoke
rv = super().invoke(ctx)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "E:\AutoGPT\Auto-GPT\autogpt\cli.py", line 151, in main
agent.start_interaction_loop()
File "E:\AutoGPT\Auto-GPT\autogpt\agent\agent.py", line 184, in start_interaction_loop
self.memory.add(memory_to_add)
File "E:\AutoGPT\Auto-GPT\autogpt\memory\local.py", line 78, in add
embedding = create_embedding_with_ada(text)
File "E:\AutoGPT\Auto-GPT\autogpt\llm_utils.py", line 155, in create_embedding_with_ada
return openai.Embedding.create(
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_resources\embedding.py", line 33, in create
response = super().create(*args, **kwargs)
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_requestor.py", line 216, in request
result = self.request_raw(
File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_requestor.py", line 526, in request_raw
raise error.Timeout("Request timed out: {}".format(e)) from e
openai.error.Timeout: Request timed out: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)

and this on the gpt-llama.cpp side:
|
As a sidenote, that's how it is now:
```
self.response_format = {
    "thoughts": {
        "text": "<replace with single sentence of your AI thoughts>",
        "reasoning": "<replace with single sentence of your AI reasoning>",
        "plan": "<replace with short dashed (-) list that convey your AI long term plan>",
        "criticism": "<replace with single sentence of your AI constructive self-criticism>",
        "speak": "<replace with single string of your AI thoughts summary to say to user as response>",
    },
    "command": {"name": "command name", "args": {"arg name": "value"}},
}
``` |
With my current set of prompts, i'm getting pretty consistent and reliable json responses from OpenAssistant 30b. I'm seeing improvements, but still inconsistent and unreliable results, from any vicuna or koala 13b models. OpenAssistant 30b runs (slooow) on my machine with gpt-llama.cpp, but got a significant speedup with the recent GPU update in llama.cpp. This update will enable a lot more people to run 30b models that they couldn't use before, as they gain additional memory from their graphics cards.

I'm eager to try out IBM's new dromedary 65b model with gpt-llama and auto-gpt, as it seems to be very smart, but maybe i need to wait until gpt-llama.cpp catches up with llama.cpp's GPU support. It's not because of the speedup, which may be negligible, but because of the additional 12 GB from my rtx3060 that will enable these model sizes for me.

As has been said, the Auto-GPT capabilities seem to boil down to the "smartness" of the model. Current 30b llama derivatives may manage the task "out of the box", at least with the customized prompts, while the smaller ones may never be able to, even with more specific prompts. BUT: remember that llama itself wasn't capable of being used in chat conversations until alpaca finetuning was released. With alpaca, even small models can now be used in interactive chat. It should be worth having a specific training for this use case. As "chat conversation" is a rather broad scenario, the alpaca dataset, with about 52,000 sets, was pretty small. So maybe we should make an attempt to create such a dataset for finetuning small models to work together with AutoGPT. This should happen in cooperation with AutoGPT development, as it should also involve the best prompts on which the dataset is to be trained. This way, I am confident that even small models will become capable of working with AutoGPT. |
For me, auto-gpt is also returning responses to the gpt-llama.cpp window .... I have no idea why ... |
For me it looks like I now get quite reliable, usable first responses that consist of valid json and are also accepted by auto-gpt. And this does not seem to be a problem of the model. How can we support that from here? My possibilities for the moment are (still) limited to testing models and prompts / prompt combinations, and with appropriate hints I can also dig into code a bit, but I don't have deep experience in this... |
In other words: I could be wrong, but then, please prove it. |
Most of the issues with the commands are caused by the LLM not understanding well which string represents the actual command.
If you are referring to my fork, i'm trying to get it merged, and i often merge in the changes from the main repository. Finally, i'm also trying to make the prompts configurable; this can be useful to find a prompt which works better for llama-derived LLMs Significant-Gravitas/AutoGPT#3375 You can try both of my changes in the branch custom_base_url_and_prompts |
@keldenl there's some work going on in Auto-GPT to better handle prompts; the current version seems to pick the right commands now, even with the default prompt. |
Hi @DGdev91 and @keldenl, thanks for your awesome update on the package. I've run into some errors while trying to run Auto-GPT with the llama 7B model. I followed the README carefully and here is my modified
However, when I test Auto-GPT, it keeps throwing the following error (no matter what prompt I input):
On the gpt-llama server side, it shows:
May I ask for your opinion on how to solve the error here? Thanks a lot! |
Not every LLM is good enough to handle the json reply requested by AutoGPT. |
@DGdev91 thank you for your amazing work. I have followed the instructions but can't get it to work. Here are my errors. auto-gpt:
gpt-llama.cpp:
Models I have tested with: WizardLM-7B-uncensored.ggmlv3.q4_0.bin
Please note:
Any suggestions? Thanks in advance. |
Figured out what my issue was, thanks to an email from OpenAI that my API key was leaked LMFAO. Turns out I have/had OPENAI_API_KEY set in my .bash_profile, which was getting picked up.
|
Heads up that api key needs to be revoked if it hasn't already been by secret scanning |
That's one of the great things about OpenAI: it scans GitHub for API keys and immediately disables them, hence the email I received. Thanks for the heads up. |
the common thing is that most of us are facing the same ===== STDERR ===== error as of 25-June-2023, using latest llama.cpp and gpt-llama.cpp with the vicuna 13B model. if i change some code, it reveals that llama.cpp ignores the --reverse-prompts and keeps generating tokens, while gpt-llama checks the --reverse-prompts and returns the result; and when we try to interact with the model again while it has not finished responding to the previous prompt, something crashes in gpt-llama.cpp |
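That diagnosis, sketched in Python for illustration (the real logic is in gpt-llama.cpp's JS; names here are illustrative): the wrapper scans the child process's stdout for the reverse prompt and returns early, while the underlying llama.cpp process may keep generating.
```
def watch_for_reverse_prompt(stream, reverse_prompt="user:"):
    # Accumulate stdout chunks and cut the response at the reverse prompt,
    # as the diagnosis above describes; llama.cpp itself may keep generating
    # past this point, which is where the crash on the next request comes in.
    buf = ""
    for chunk in stream:
        buf += chunk
        if reverse_prompt in buf:
            return buf.split(reverse_prompt)[0]
    return buf
```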
Does gpt-llama.cpp work now with the main branch of auto-gpt? |
The issue is more related to the model than the code. This repo is great, but we are dealing with a model that generates random text 🤣 |
It's been some time since the last time i tested that, but it should. My PR has been merged in ver. 0.4.0. @keldenl sorry, i forgot to tell you about that. The docs in https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/Auto-GPT-setup-guide.md need an update |
Hello. Have you found a solution? I have the same error.
autogpt.cpp
Goals:
|
Get Auto-GPT working. It's blocked on the following 2 items:
- Embeddings aren't working quite right / very inconsistent between models, so having a built-in local embedding option is good.
- Auto-GPT has no option to point its OpenAI calls at a custom BASE_URL (i.e. at gpt-llama.cpp).
For the second bullet point, we can technically work around it and set the BASE_URL by modifying the code, but it would be nice to have it as an option / env variable so it's easier to get started (a sketch of that workaround follows this comment).
I'll likely open a new PR against Auto-GPT for the second point and push to get the PR I linked above in.
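A minimal sketch of that code-level workaround, assuming the pre-1.0 `openai` Python client that Auto-GPT used at the time (module-level configuration):
```
import openai

# Point the OpenAI client at gpt-llama.cpp instead of api.openai.com.
openai.api_base = "http://localhost:443/v1"
# With gpt-llama.cpp, the "API key" doubles as the path to the local model.
openai.api_key = "../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin"
```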