Pathway's LLM App is a chatbot application that provides real-time responses to user queries, based on the freshest knowledge available in a document store. It does not require a separate vector database and helps to avoid fragmented LLM stacks (such as Pinecone/Weaviate + Langchain + Redis + FastAPI + ...). Document data lives where it was already stored, and on top of it, LLM App provides a light but integrated data processing layer which is highly performant and can be easily customized and extended. It is particularly recommended for privacy-preserving LLM applications.
LLM App reads a corpus of documents stored in S3 or locally, preprocesses them, and builds a vector index by calling a routine from the Pathway package. It then listens to user queries arriving as HTTP REST requests. Each query uses the index to retrieve relevant documentation snippets and calls the OpenAI API or a Hugging Face model to produce a response in natural language. The bot is reactive to changes in the corpus of documents: once new snippets are provided, it reindexes them and starts using the new knowledge to answer subsequent queries.
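To make the request/response cycle concrete, here is a minimal client sketch that posts a query to a running instance. It assumes the app is already up and listening on `localhost:8080` (see the configuration and usage sections below); the exact shape of the response body is an assumption here, so the sketch simply prints whatever the server returns.

```python
# query_llm_app.py - minimal client sketch (illustrative; assumes the app
# is already running and listening on http://localhost:8080/).
import json

import requests

payload = {
    "user": "user",
    "query": "How to connect to Kafka in Pathway?",
}

# The endpoint and payload mirror the curl examples in the usage steps below;
# the response format is an assumption, so we print it verbatim if it is not JSON.
response = requests.post("http://localhost:8080/", json=payload, timeout=30)
response.raise_for_status()

try:
    print(json.dumps(response.json(), indent=2))
except ValueError:
    print(response.text)
```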
- HTTP REST queries: The system responds to queries in real time over an HTTP REST interface.
- Real-time document indexing pipeline: This pipeline reads data directly from S3-compatible storage, without the need to query a vector document database.
- User session and beta testing handling: The query building process can be extended to handle user sessions and beta testing for new models.
- Code reusability for offline evaluation: The same code can be used for static evaluation of the system.
This section provides a general introduction to getting started with the app. You can run it in different settings, summarized in the table below:
| Pipeline Mode | Description |
| --- | --- |
| `contextful` | In this mode, the app will index the documents located in the `data/pathway-docs` directory. These indexed documents are then taken into account when processing queries. The Pathway pipeline run in this mode is located at `llm_app/pathway_pipelines/contextful/app.py`. |
| `contextful_s3` | This mode operates similarly to the `contextful` mode. The main difference is that the documents are stored and indexed from an S3 bucket, allowing the handling of a larger volume of documents. This can be more suitable for production environments. |
| `contextless` | This pipeline calls the OpenAI ChatGPT API but does not use an index when processing queries. It relies solely on the given user query. |
| `local` | This mode runs the application using Hugging Face Transformers, which eliminates the need for the data to leave the machine. It provides a convenient way to use state-of-the-art NLP models locally. |
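For intuition, here is a minimal sketch of how a launcher such as `main.py` could select the pipeline to run from the `PIPELINE_MODE` setting, following the `llm_app/pathway_pipelines/<mode>/app.py` layout described above. This is illustrative only: the `run()` entry point and the dispatch logic are assumptions, and the actual `main.py` in the repository may be organized differently.

```python
# Illustrative dispatcher sketch (not the actual main.py): selects a pipeline
# module from PIPELINE_MODE, based on the llm_app/pathway_pipelines/<mode>/app.py
# layout described in the table above.
import importlib
import os

VALID_MODES = {"contextful", "contextful_s3", "contextless", "local"}


def run_pipeline() -> None:
    mode = os.environ.get("PIPELINE_MODE", "contextful")
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown PIPELINE_MODE: {mode!r}, expected one of {sorted(VALID_MODES)}")

    # Import llm_app.pathway_pipelines.<mode>.app and hand control to it.
    # The entry-point name `run` is an assumption made for this sketch.
    pipeline = importlib.import_module(f"llm_app.pathway_pipelines.{mode}.app")
    pipeline.run()


if __name__ == "__main__":
    run_pipeline()
```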
- **Clone the repository:** This is done with the `git clone` command followed by the URL of the repository:

  ```bash
  git clone https://github.com/pathwaycom/llm-app.git
  ```

  Next, navigate to the repository:

  ```bash
  cd llm-app
  ```
- **Environment Variables:** Create an `.env` file in the `llm_app/` directory and add the following environment variables, adjusting their values according to your specific requirements and setup (an optional script to sanity-check these values is sketched after this list).

  | Environment Variable | Description |
  | --- | --- |
  | PIPELINE_MODE | Determines which pipeline to run in your application. Available modes are [`contextful`, `contextful_s3`, `contextless`, `local`]. By default, the mode is set to `contextful`. |
  | PATHWAY_REST_CONNECTOR_HOST | Specifies the host IP for the REST connector in Pathway. For the dockerized version, set it to `0.0.0.0`. Natively, you can use `127.0.0.1`. |
  | PATHWAY_REST_CONNECTOR_PORT | Specifies the port number on which the REST connector service of Pathway should listen. Here, it is set to 8080. |
  | OPENAI_API_TOKEN | The API token for accessing OpenAI services. If you are not running the local version, remember to replace it with your personal API token, which you can generate from your account on openai.com. |
  | PATHWAY_CACHE_DIR | Specifies the directory where the cache is stored. You could use `/tmp/cache`. |

  An example `.env` file:

  ```bash
  PIPELINE_MODE=contextful
  PATHWAY_REST_CONNECTOR_HOST=0.0.0.0
  PATHWAY_REST_CONNECTOR_PORT=8080
  OPENAI_API_TOKEN=<Your Token>
  PATHWAY_CACHE_DIR=/tmp/cache
  ```
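If you want to sanity-check your `.env` before starting the app, a small script along the following lines can help. It is a sketch, not part of the app: it assumes the `python-dotenv` package is installed, and the variable names simply mirror the table above.

```python
# check_env.py - optional sanity check for the .env file described above.
# Assumes the python-dotenv package is installed (pip install python-dotenv).
import os

from dotenv import load_dotenv

# Load variables from llm_app/.env into the process environment.
load_dotenv("llm_app/.env")

mode = os.environ.get("PIPELINE_MODE", "contextful")
host = os.environ.get("PATHWAY_REST_CONNECTOR_HOST", "127.0.0.1")
port = int(os.environ.get("PATHWAY_REST_CONNECTOR_PORT", "8080"))
cache_dir = os.environ.get("PATHWAY_CACHE_DIR", "/tmp/cache")

assert mode in {"contextful", "contextful_s3", "contextless", "local"}, f"Unexpected PIPELINE_MODE: {mode}"
if mode != "local" and not os.environ.get("OPENAI_API_TOKEN"):
    raise SystemExit("OPENAI_API_TOKEN is required unless PIPELINE_MODE=local")

print(f"Mode={mode}, REST connector on {host}:{port}, cache in {cache_dir}")
```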
Docker is a tool designed to make it easier to create, deploy, and run applications by using containers. Here is how to use Docker to build and run the LLM App:
- **Build and run with Docker:** The first step is to build the Docker image for the LLM App; you do this with the `docker build` command:

  ```bash
  docker build -t llm-app .
  ```

  After your image is built, you can run it as a container using the `docker run` command:

  ```bash
  docker run -it -p 8080:8080 llm-app
  ```

  When the process is complete, the App will be up and running inside a Docker container and accessible at `0.0.0.0:8080`. From there, you can proceed to the "Usage" section of the documentation for information on how to interact with the application.
- **Virtual Python Environment:** Create a new environment and install the required packages to isolate the dependencies of this project from your system's Python:

  ```bash
  # Creates an env called pw-env and activates this environment.
  python -m venv pw-env && source pw-env/bin/activate
  pip install --upgrade --extra-index-url https://packages.pathway.com/966431ef6ba -r requirements.txt
  ```
- **Run the App:** You can start the application with the command:

  ```bash
  cd llm_app/
  python main.py
  ```
- **Send REST queries** (in a separate terminal window): These are examples of how to interact with the application once it is running. `curl` is a command-line tool used to send data using various network protocols; here, it is used to send HTTP requests to the application.

  ```bash
  curl --data '{"user": "user", "query": "How to connect to Kafka in Pathway?"}' http://localhost:8080/ | jq
  curl --data '{"user": "user", "query": "How to use LLMs in Pathway?"}' http://localhost:8080/ | jq
  ```

  Please change `localhost` to `0.0.0.0` if you are running the app in Docker.
- **Test reactivity by adding a new file:** This shows how to test the application's ability to react to changes in data by adding a new file and sending a query (a scripted version of this check is sketched after this list).

  ```bash
  cp ./data/documents_extra.jsonl ./data/pathway-docs/
  curl --data '{"user": "user", "query": "How to use LLMs in Pathway?"}' http://localhost:8080/ | jq
  ```
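For convenience, the reactivity check above can be scripted. The sketch below copies the extra documents into the indexed directory and sends the same query before and after; the file paths and endpoint come from the steps above, while the wait time and the response handling are assumptions.

```python
# reactivity_check.py - optional sketch automating the reactivity test above.
# Paths and the endpoint mirror the cp/curl commands; the sleep duration and
# response handling are assumptions made for this sketch.
import shutil
import time

import requests

URL = "http://localhost:8080/"  # use 0.0.0.0 if running in Docker
QUERY = {"user": "user", "query": "How to use LLMs in Pathway?"}


def ask() -> str:
    response = requests.post(URL, json=QUERY, timeout=60)
    response.raise_for_status()
    return response.text


before = ask()

# Add new knowledge to the indexed directory, as in the cp command above.
shutil.copy("./data/documents_extra.jsonl", "./data/pathway-docs/")

# Give the app a moment to pick up and reindex the new file (duration is a guess).
time.sleep(10)

after = ask()
print("Answer before new file:\n", before)
print("Answer after new file:\n", after)
```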
LLM App can be configured to run with local Machine Learning models, without making API calls outside of the user's organization.
It can also be extended to handle live data sources (news feeds, APIs, data streams in Kafka), and to include user permissions, a data security layer, and an LLMOps monitoring layer.
See: Features for Organizations.
Read more about the implementation details and how to extend this application in our blog series.