Stateful chat #2
base: master
@@ -28,11 +28,15 @@ const {

const { prefixInput } = require('arthasgpt/src/utils/prefix');

const __statefulChatFunction = {};
We can cache the stateful `chat` method per worker, but that is something of an antipattern in a stateless API: once instances scale out, each worker holds its own history, so the conversation a request sees depends on which worker picks it up.
With `CACHE=true` in the `.env`, we already cache the query, prompt, and model responses. At scale, it may be worth the trade-off to keep the `prompt` endpoint stateless, relying on library-level caching for speed and an application-level solution for chat history/context.
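One possible shape for the stateless alternative, as a rough sketch only: the caller sends the prior turns with every request, so any worker can serve it. `handlePrompt`, `stubAnswer`, and the `{ role, content }` message shape are hypothetical names chosen for illustration, not part of arthas-api or ArthasGPT.

```javascript
// Hypothetical stateless handler (an assumption, not the arthas-api implementation).
// `answer` stands in for whatever produces the model reply.
function handlePrompt({ persona, history = [], input }, answer) {
  // Build the full conversation from what the caller sent; nothing is kept on the server.
  const messages = [...history, { role: 'user', content: input }];
  const output = answer(persona, messages);
  return {
    output,
    // The updated history goes back to the caller, who sends it with the next request.
    history: [...messages, { role: 'assistant', content: output }],
  };
}

// Stub model for illustration: reports how many messages it was given.
const stubAnswer = (persona, messages) =>
  `${persona} saw ${messages.length} message(s)`;

const first = handlePrompt({ persona: 'persona1', input: 'hello' }, stubAnswer);
const second = handlePrompt(
  { persona: 'persona1', history: first.history, input: 'what did you just say?' },
  stubAnswer
);
// second.output is "persona1 saw 3 message(s)"
```

Because the handler holds no state between calls, it does not matter which worker or instance receives the second request.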
Note: `__statefulChatFunction` is instantiated as an object (hash) because it maintains a `chat` method for each active persona, not just one, like this:
__statefulChatFunction = {
[persona1Key]: persona1ChatFunction,
[persona2Key]: persona2ChatFunction,
...
}
to be called like this:
__statefulChatFunction[persona1Key]("what did you just say?")
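A minimal sketch of how such a hash could be populated lazily, one entry per persona. `createChatFunction` and `getChatFunction` are hypothetical helpers introduced here for illustration; the real `chat` method comes from the library and its signature may differ.

```javascript
// Hypothetical stand-in for the library's stateful chat factory (assumption).
// Each returned function closes over its own history array.
function createChatFunction(personaKey) {
  const history = [];
  return function chat(message) {
    history.push(message);
    return { persona: personaKey, turn: history.length, history: [...history] };
  };
}

// One chat function per active persona, created on first use.
const __statefulChatFunction = {};

function getChatFunction(personaKey) {
  if (!Object.prototype.hasOwnProperty.call(__statefulChatFunction, personaKey)) {
    __statefulChatFunction[personaKey] = createChatFunction(personaKey);
  }
  return __statefulChatFunction[personaKey];
}

getChatFunction('persona1')('hello');
const reply = getChatFunction('persona1')('what did you just say?');
const other = getChatFunction('persona2')('hi');
// reply.turn is 2 (same persona, same history); other.turn is 1 (separate history)
```

Note that this state lives in the memory of a single process, which is exactly the limitation raised in this review.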
arthas-api/index.js
Outdated
@@ -30,7 +30,8 @@ const PORT = 8000;
  * Config
  */

-const numCPUs = availableParallelism();
+// const numCPUs = availableParallelism();
+const numCPUs = 1;
This approach works on a single instance, but we would still need another solution to share the stateful `chat` method across multiple instances.
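To illustrate the difference, here is a small simulation, offered as a sketch under stated assumptions: each "worker" is just a closure, and a plain `Map` stands in for an external shared store. In a real deployment the workers are separate processes or instances, so an in-process `Map` would not be shared; `makeWorker` is a hypothetical helper.

```javascript
// Simulated worker: reads and writes chat history through whatever store it is given.
function makeWorker(store) {
  return function chat(personaKey, message) {
    const history = store.get(personaKey) || [];
    const next = [...history, message];
    store.set(personaKey, next);
    return next.length; // number of turns this worker can see
  };
}

// Per-worker state: each worker owns its own store, so histories diverge.
const localA = makeWorker(new Map());
const localB = makeWorker(new Map());
localA('persona1', 'hello');
const seenByLocalB = localB('persona1', 'what did you just say?'); // 1

// Shared state: both workers go through the same store.
const sharedStore = new Map(); // stand-in for an external store (assumption)
const sharedA = makeWorker(sharedStore);
const sharedB = makeWorker(sharedStore);
sharedA('persona1', 'hello');
const seenBySharedB = sharedB('persona1', 'what did you just say?'); // 2
```

With local state, the second worker has no record of the first message; with a shared store, it sees both turns regardless of which worker handled the earlier request.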
Closes #1