NOTE: You will see a lot of types declared as any; this is due to the time constraints of the workshop. You should always use proper types.
This repository contains a prebuilt simple chat interface and Hello World GET and POST API examples.
We will build a RAG application that can ingest and search through given documentation and answer questions. The app will be a chat UI that uses the Marvel Universe wiki as its data source.
Node.js (>= v18): Installation
Next.js: Installation
If you're using Windows, a WSL setup may also be needed.
Link: supabase.com
- We need a PostgreSQL database with the pgvector extension installed.
- Creating an account on supabase.com gives us a free PostgreSQL instance with pgvector.
- Alternatively, install the database locally with the extension.
Using .env.example as a template, create a .env file with the following values:
- CHAT_API_KEY - API key from the model provider. For Tune AI, you can obtain it from the Profile > Access Keys section (check the docs for detailed steps).
- CHAT_API_HOST - If using Tune Studio, set this to https://proxy.tune.app (the first value in the example below).
- EMBEDDING_ENDPOINT - If using Tune Studio, set this to https://proxy.tune.app/v1/embeddings (the first value in the example below).
- MODEL - If using Tune Studio, set this to rohan/tune-gpt-4o (the first value in the example below).
- EMBEDDING_MODEL - If using Tune Studio, set this to openai/text-embedding-ada-002 (the first value in the example below).
The remaining values (the Supabase URL, service role key, and database URL) can be obtained from the Supabase dashboard.
CHAT_API_KEY=sk-tune-tese
CHAT_API_HOST=https://proxy.tune.app # or https://api.openai.com/v1
EMBEDDING_ENDPOINT=https://proxy.tune.app/v1/embeddings # or https://api.openai.com/v1/embeddings
MODEL=rohan/tune-gpt-4o # or gpt-4o
EMBEDDING_MODEL=openai/text-embedding-ada-002 # or text-embedding-ada-002
NEXT_PUBLIC_SUPABASE_URL=https://---.supabase.co
SUPABASE_SERVICE_ROLE_KEY=ey....
DATABASE_URL=postgresql://postgres.---:[email protected]:5432/postgres?pgbouncer=true
- Install the required dependencies using npm install.
- Run the Next.js server using npm run dev.
- Go to http://localhost:3000 to see the chat interface. There should not be any errors in the console or on the page.
Run the following SQL commands in the Supabase SQL editor to set up the database: enable pgvector, then create the documents table along with the match_documents SQL function.
-- Enable the pgvector extension, which makes the VECTOR data type available
CREATE EXTENSION vector;
-- Create the documents table
CREATE TABLE documents (
"id" BIGSERIAL PRIMARY KEY,
"content" TEXT,
"embedding" VECTOR(1536),
"url" TEXT,
"title" TEXT
);
-- Create the match_documents function
-- Starts here
CREATE OR REPLACE FUNCTION match_documents (
query_embedding VECTOR(1536),
match_threshold FLOAT,
match_count INT
)
RETURNS TABLE (
id BIGINT,
content TEXT,
url TEXT,
title TEXT,
similarity FLOAT
)
LANGUAGE SQL STABLE
AS $$
SELECT
documents.id,
documents.content,
documents.url,
documents.title,
1 - (documents.embedding <=> query_embedding) AS similarity
FROM documents
WHERE documents.embedding <=> query_embedding < 1 - match_threshold
ORDER BY documents.embedding <=> query_embedding
LIMIT match_count;
$$;
-- Ends here
Run npx prisma init followed by npx prisma db pull to generate the schema files.
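Note that npx prisma db pull only introspects the database; you may also need to run npx prisma generate to produce the client. The snippets below use a shared prisma instance; a minimal sketch of how it might be exported (the exact file in the repo may differ):

// utils/prisma setup (sketch): a single shared Prisma client instance
import { PrismaClient } from "@prisma/client";

export const prisma = new PrismaClient();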
We define a function, llmStream, to interact with the LLM API. It lives in the utils/index.ts file and is used for responding to user queries in the chat interface. Here is the code for the function:
export const llmStream = async ({
messages,
stream,
temperature,
max_tokens,
is_recursive = false,
}: {
stream: boolean,
messages: {
role: string,
content: string,
}[],
max_tokens?: number,
temperature?: number,
is_recursive?: boolean,
}): Promise<any> => {
  // Call the OpenAI-compatible chat completions endpoint
  const resp = await fetch(process.env.CHAT_API_HOST + "/chat/completions", {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: "Bearer " + process.env.CHAT_API_KEY,
},
body: JSON.stringify({
temperature: temperature,
messages: messages,
model: process.env.MODEL,
stream: stream,
max_tokens: max_tokens,
}),
});
  console.log(resp);
  // resp.body is a ReadableStream when stream is true
  return resp.body;
};
Use this function in the api/prompt endpoint (defined inside the app/api/prompt/route.ts file) to interact with the LLM API. The complete code for the file is as follows:
import { llmStream } from "@/utils";
import { StreamingTextResponse } from "ai";
export async function POST(req: Request) {
const { messages, temperature, max_tokens, stream } = await req.json();
const response = await llmStream({
messages,
temperature,
max_tokens,
stream,
});
  console.log(response);
  // Wrap the LLM stream in a streaming HTTP response for the chat UI
  return new StreamingTextResponse(response);
}
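To try the endpoint without the UI, you can call it from any client and read the streamed text; a minimal sketch (the body fields mirror what the route destructures, and the sample values are placeholders):

// Call the prompt endpoint and log the streamed completion chunk by chunk
const res = await fetch("/api/prompt", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [{ role: "user", content: "Hello!" }],
    stream: true,
    temperature: 0.7,
    max_tokens: 512,
  }),
});
const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value));
}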
We define two functions, generateEmbeddings and saveEmbedding, to generate and save embeddings respectively.
- generateEmbeddings: takes the page content as input, uses the embedding model API to generate embeddings, and returns them.
- saveEmbedding: takes the page content and the embeddings as input, and saves the embeddings to the database. Note that the embedding column's data type is VECTOR(1536).
const generateEmbeddings = async (pageContent: {
title: string,
url: string,
content: string,
}) => {
const response = await fetch(`${process.env.EMBEDDING_ENDPOINT}`, {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: "Bearer " + process.env.CHAT_API_KEY || "",
},
body: JSON.stringify({
input: pageContent.content,
model: process.env.EMBEDDING_MODEL,
}),
}).then((res) => res.json());
return response?.data?.[0]?.embedding;
};
const saveEmbedding = async (
pageContent: {
title: string,
url: string,
content: string,
},
embedding: any
) => {
  // Depending on your Prisma/driver versions, the embedding array may need to be
  // serialized and cast explicitly, e.g. ${JSON.stringify(embedding)}::vector
  await prisma.$queryRaw`INSERT INTO documents (title, url, content, embedding) VALUES (${pageContent.title}, ${pageContent.url}, ${pageContent.content}, ${embedding})`;
};
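To sanity-check that rows are being written during the workshop, you can count them with the same client (a quick sketch):

// Count ingested documents; COUNT(*) comes back as a BigInt in Postgres
const rows = await prisma.$queryRaw<{ count: bigint }[]>`SELECT COUNT(*) AS count FROM documents`;
console.log(rows[0].count);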
For every web page we want to ingest, we will:
- Fetch the content of the page
- Extract the text from relevant sections (paragraphs, headings, etc.)
- Generate embeddings for the extracted text
- Save the embeddings to the database
We define a function, injestWebPage, in the utils/index.ts file:
import { load } from "cheerio";

export const injestWebPage = async (url: string) => {
  // Fetch the raw HTML of the page
  const urlResp = await fetch(url).then((resp) => resp.text());
  const $ = load(urlResp);
  // Extract the title and the text of relevant sections
  const title = $("title").text();
  const content = $("h1, h2, h3, table, p, span").text();
console.log(title, content);
let pageContent = {
title: title,
url: url,
content: content,
};
const embedding = await generateEmbeddings(pageContent);
// const embedding = "";
  await saveEmbedding(pageContent, embedding);
};
We will now use this function in the /api/injest endpoint, defined in the app/api/injest/route.ts file. The full contents of the file are as follows:
import { injestWebPage } from "@/utils";
import { getSitemapUrls } from "@/utils/bonus";
export async function GET() {
const urls = (await getSitemapUrls(
"https://studio.tune.app/docs/sitemap.xml"
)) as any[];
  // Ingest every URL from the sitemap sequentially
  for (let i = 0; i < urls.length; i++) {
await injestWebPage(urls[i]);
}
// const url = "https://studio.tune.app/docs/cookbook/supabase-rag";
// const resp = await injestWebPage(url);
return new Response(
JSON.stringify({
urls,
}),
{
headers: {
"content-type": "application/json",
},
}
);
}
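Once the dev server is running, ingestion can be triggered by simply opening http://localhost:3000/api/injest in the browser, or from a script:

// Hit the ingest endpoint and print the list of ingested URLs
const res = await fetch("http://localhost:3000/api/injest");
console.log(await res.json());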
Define a function, getSimilarDocuments (in utils/index.ts), to fetch similar documents based on the query.
import { createClient } from "@supabase/supabase-js";

export const getSimilarDocuments = async (query: string) => {
  const supabaseClient = createClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.SUPABASE_SERVICE_ROLE_KEY!
  );
  // Embed the query with the same model used for the documents
  const embedding = await generateEmbeddings({
title: "",
url: "",
content: query,
});
  // Call the match_documents SQL function via Supabase RPC
  const data = await supabaseClient.rpc("match_documents", {
query_embedding: embedding,
match_threshold: 0.7,
match_count: 3,
});
return data?.data?.map(
(doc: { title: string, url: string, content: string, embedding: any }) => {
return doc.content;
}
);
};
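With documents ingested, the function returns up to match_count content strings whose similarity clears the threshold; for example (the query text is a placeholder):

// Returns up to 3 document bodies with cosine similarity >= 0.7
const docs = await getSimilarDocuments("Who founded the Avengers?");
console.log(docs);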
Now, we modify the llmStream function to handle function calls and recursive calls. The structure of the search_docs function call is defined in the app/constants/index.ts file.
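The repository ships these values; a minimal sketch of what app/constants/index.ts might contain, with the system prompt wording assumed and the tool schema following the OpenAI tools format the code expects:

// app/constants/index.ts (sketch; the actual repo defines the real values)
const constants = {
  // System prompt nudging the model to search the docs before answering (wording assumed)
  system:
    "You are a helpful assistant. Use the search_docs tool to find relevant documentation before answering.",
  // OpenAI-style tool definition; the model fills in the query argument
  tools: [
    {
      type: "function",
      function: {
        name: "search_docs",
        description: "Search the ingested documentation for relevant passages",
        parameters: {
          type: "object",
          properties: {
            query: {
              type: "string",
              description: "Search query derived from the user's question",
            },
          },
          required: ["query"],
        },
      },
    },
  ],
};

export default constants;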
The LLM will indicate a function call in the response, and we will handle it by calling the getSimilarDocuments function and recursively calling the llmStream function with the added information from the relevant documents.
import constants from "@/app/constants"; // the system prompt and search_docs tool live in app/constants/index.ts

export const llmStream = async ({
messages,
stream,
temperature,
max_tokens,
is_recursive = false,
}: {
stream: boolean;
messages: {
role: string;
content: string;
}[];
max_tokens?: number;
temperature?: number;
is_recursive?: boolean;
}): Promise<any> => {
const resp = (await fetch(process.env.CHAT_API_HOST + "/chat/completions", {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: "Bearer " + process.env.CHAT_API_KEY,
},
body: JSON.stringify({
temperature: temperature,
messages: [
{
role: "system",
content: constants.system,
},
...messages,
],
      model: process.env.MODEL,
      // The first pass is non-streaming so the JSON can be inspected for tool calls;
      // the recursive pass streams the final answer back to the client
      stream: Boolean(is_recursive),
      max_tokens: max_tokens,
      // Only expose the search_docs tool on the first pass, to avoid loops
      tools: is_recursive ? undefined : constants.tools,
}),
})) as any;
// New code to handle function calls
// If the response is a function call, get similar documents and recursively call llmStream with content from top similar documents
if (!is_recursive) {
const temp = await resp.json();
if (
temp?.choices?.[0]?.message?.tool_calls?.[0]?.function?.name ===
"search_docs"
) {
const functionVals = JSON.parse(
temp?.choices?.[0]?.message?.tool_calls?.[0]?.function?.arguments
);
const docs = await getSimilarDocuments(functionVals.query);
return llmStream({
messages: [
...messages,
{
role: "user",
content: `Use below sources to answer above question ${docs?.join(
"\n\n"
)}`,
},
],
stream: true,
is_recursive: true,
max_tokens: max_tokens,
temperature: temperature,
});
} else {
console.log({ temp });
return temp;
}
} else return resp.body;
};
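To see the whole loop, call llmStream with a question that requires the docs; on the first pass the model emits a search_docs tool call, and the recursive pass streams the grounded answer. A sketch, assuming the env vars are set and pages have been ingested:

// First pass detects the tool call; the returned body is the final streamed answer
const body = await llmStream({
  stream: true,
  messages: [{ role: "user", content: "How do I create an access key?" }],
});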