
Stateful chat #2

Draft: wants to merge 2 commits into master
Changes from 1 commit
arthas-api/api/post/prompt/index.js (35 changes: 25 additions & 10 deletions)
```diff
@@ -28,11 +28,15 @@ const {
 
 const { prefixInput } = require('arthasgpt/src/utils/prefix');
 
+const __statefulChatFunction = {};
+
```
Comment from @bennyschmidt (Owner, Author), Mar 20, 2024:

We can cache the stateful chat method per worker, but it's a bit of an antipattern in a stateless API, and when scaling instances there will be multiple (inconsistent) histories depending on which worker picks up the request.

With CACHE=true in the .env we already cache the query, prompt, and model responses. For scale, it might be worth the trade-off to keep the prompt endpoint stateless, relying on library-level caching for speed, and an application-level solution for chat history/context.

Note: __statefulChatFunction is instantiated as an object (hash) because it maintains a chat method for each active persona, not just one, like this:

```js
__statefulChatFunction = {
  [persona1Key]: persona1ChatFunction,
  [persona2Key]: persona2ChatFunction,
  ...
};
```

to be called like this:

```js
__statefulChatFunction[persona1Key]("what did you just say?");
```
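The application-level chat-history alternative mentioned above could be sketched like this (not part of this PR; `ChatHistory`, `append`, and `recall` are hypothetical names): the transcript lives in plain data keyed by persona, so any worker can rebuild context without relying on a cached closure.

```javascript
// Hypothetical application-level chat history, keyed by persona.
// Backed by a Map here for illustration; a shared store (e.g. Redis)
// would make it consistent across workers and instances.
class ChatHistory {
  constructor() {
    this.store = new Map(); // personaKey -> [{ role, text }]
  }

  append(personaKey, role, text) {
    const log = this.store.get(personaKey) || [];
    log.push({ role, text });
    this.store.set(personaKey, log);
  }

  // Return the last `limit` turns to use as prompt context.
  recall(personaKey, limit = 10) {
    return (this.store.get(personaKey) || []).slice(-limit);
  }
}

const history = new ChatHistory();
history.append('arthas', 'user', 'what did you just say?');
history.append('arthas', 'assistant', 'I said: Frostmourne hungers.');
console.log(history.recall('arthas').length); // 2
```

Because the state is a plain array rather than a live chat function, it can be serialized and shared, which a per-worker closure cannot.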


```diff
 /**
  * Prompt
  */
 
 module.exports = asyncCache => async (req, res) => {
+  const config = await asyncCache.getItem('config');
+
   const timeout = await asyncCache.getItem('timeout');
 
   let answer = await asyncCache.getItem('answer');
```
```diff
@@ -44,8 +48,6 @@ module.exports = asyncCache => async (req, res) => {
       body.push(chunk);
     })
     .on('end', async () => {
-      const config = await asyncCache.getItem('config');
-
       body = Buffer.concat(body).toString();
 
       const { key, input } = JSON.parse(body || '{}');
```
```diff
@@ -112,16 +114,29 @@ module.exports = asyncCache => async (req, res) => {
         await delay(timeout);
       }
 
-      if (isVerbose) {
-        log(CREATING_AGENT);
-      }
+      const currentChat = __statefulChatFunction?.[key];
 
-      answer = await ArthasGPT({
-        ...currentConfig,
+      if (currentChat) {
+        await currentChat(messageResponse);
+      } else {
+        if (isVerbose) {
+          log(CREATING_AGENT);
+        }
 
-        query: messageResponse,
-        cache: false
-      });
+        answer = await ArthasGPT({
+          ...currentConfig,
+
+          query: messageResponse,
+          cache: true
+        });
+
+        await asyncCache.setItem('answer', answer);
+
+        // Cache the stateful chat method in the API
+        // (antipattern)
+
+        __statefulChatFunction[key] = answer?.chat;
+      }
 
       res.end(JSON.stringify({
         success: true,
```
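Condensed, the flow this diff adds to the prompt endpoint is a get-or-create pattern. A minimal sketch of that pattern follows; `makeAgent` is a hypothetical stand-in for `ArthasGPT`, and the agent is stubbed rather than calling any model.

```javascript
// Sketch of the get-or-create flow from the diff above, with the agent stubbed.
const chatFunctions = {};

// Stand-in for ArthasGPT: resolves to an agent exposing a stateful chat().
async function makeAgent(config) {
  const transcript = [];
  return {
    chat: async (message) => {
      transcript.push(message);
      return `(${config.key}) turn ${transcript.length}: ${message}`;
    }
  };
}

async function prompt(key, message) {
  let chat = chatFunctions[key];
  if (!chat) {
    const agent = await makeAgent({ key }); // first request: create the agent
    chat = agent.chat;
    chatFunctions[key] = chat;              // cache the per-persona chat method
  }
  return chat(message);                     // later requests: reuse its state
}
```

The state lives in the closure captured by `chat`, which is why it stays local to one worker process: closures cannot be serialized across process boundaries.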
arthas-api/index.js (3 changes: 2 additions & 1 deletion)
```diff
@@ -30,7 +30,8 @@ const PORT = 8000;
  * Config
  */
 
-const numCPUs = availableParallelism();
+// const numCPUs = availableParallelism();
+const numCPUs = 1;
```
Comment from @bennyschmidt (Owner, Author):
This approach works in a single instance, but would still need another solution to share the stateful chat method across multiple instances.


```diff
 
 const onCluster = require('./events/cluster');
 const onWorker = require('./events/worker');
```
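One way to share chat state across workers or instances, sketched below under stated assumptions (`loadHistory`/`saveHistory` are hypothetical stand-ins for a shared store such as Redis, and the agent is a stub, not `ArthasGPT`): persist the transcript rather than the chat closure, and replay it into a fresh agent on whichever worker handles the request.

```javascript
// Closures cannot cross process boundaries, but plain data can.
// Persist the transcript; rebuild the agent from it on any worker.

const sharedStore = new Map(); // stand-in for Redis or a database

async function loadHistory(key) {
  return sharedStore.get(key) || [];
}

async function saveHistory(key, history) {
  sharedStore.set(key, history);
}

// Stand-in for constructing an agent primed with prior turns.
function makeAgentFromHistory(history) {
  const turns = [...history];
  return {
    chat: async (message) => {
      turns.push(message);
      return `turn ${turns.length}: ${message}`;
    },
    history: () => turns
  };
}

async function handlePrompt(key, message) {
  const history = await loadHistory(key);    // works on any worker
  const agent = makeAgentFromHistory(history);
  const answer = await agent.chat(message);
  await saveHistory(key, agent.history());   // write back for the next worker
  return answer;
}
```

The trade-off is rebuilding the agent per request, but it keeps the endpoint stateless in the way the first review comment describes.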