
Real-time LLM App - a responsive AI application leveraging OpenAI/Hugging Face APIs to provide natural language responses to user queries. No vector database required.

mdmalhou/llm-app

LLM App

Pathway's LLM App is a chatbot application that answers user queries in real time, based on the freshest knowledge available in a document store. It does not require a separate vector database and helps avoid fragmented LLM stacks (such as Pinecone/Weaviate + Langchain + Redis + FastAPI + ...). Document data stays where it is already stored; on top of it, LLM App provides a lightweight, integrated data processing layer that is highly performant and can be easily customized and extended. It is particularly recommended for privacy-preserving LLM applications.

Project Overview

LLM App reads a corpus of documents stored in S3 or locally, preprocesses them, and builds a vector index by calling a routine from the Pathway package. It then listens for user queries arriving as HTTP REST requests. For each query, it uses the index to retrieve relevant documentation snippets and calls the OpenAI API or Hugging Face to produce a response in natural language. The bot reacts to changes in the corpus of documents: once new snippets are provided, it reindexes them and uses the new knowledge to answer subsequent queries.
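
The retrieve-then-answer flow described above can be illustrated with a minimal, self-contained sketch. It is not the app's implementation: the bag-of-words "embedding", the function names (embed, cosine, retrieve, build_prompt), and the sample snippets are all assumptions made for illustration. The real app builds its index with Pathway and sends the prompt to OpenAI or a Hugging Face model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, snippets: list[str], k: int = 2) -> list[str]:
    """Return the k snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine retrieved snippets and the user query into a single prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Made-up snippets standing in for an indexed document corpus.
snippets = [
    "Use the Kafka connector to read a topic into a table.",
    "The REST connector exposes a pipeline over HTTP.",
    "Windows group rows by time intervals.",
]
top = retrieve("How do I connect to Kafka?", snippets, k=1)
prompt = build_prompt("How do I connect to Kafka?", top)
```

In this sketch, the prompt is what would be passed to the language model; adding a snippet to the list and calling retrieve again mimics, very roughly, how newly indexed documents become available to later queries.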

Watch a Demo Here

(Available soon)

Key Features

  • HTTP REST queries: The system is capable of responding in real-time to HTTP REST queries.
  • Real-time document indexing pipeline: This pipeline reads data directly from S3-compatible storage, without the need to query a vector document database.
  • User session and beta testing handling: The query building process can be extended to handle user sessions and beta testing for new models.
  • Code reusability for offline evaluation: The same code can be used for static evaluation of the system.

Getting Started

This section provides a general introduction on how to start using the app. You can run it in different settings:

  • contextful: In this mode, the app indexes the documents located in the data/pathway-docs directory and takes them into account when processing queries. The Pathway pipeline run in this mode is located at llm_app/pathway_pipelines/contextful/app.py.
  • contextful_s3: This mode works like the contextful mode, except that the documents are stored in and indexed from an S3 bucket. This allows a larger volume of documents to be handled and can be more suitable for production environments.
  • contextless: This pipeline calls the OpenAI ChatGPT API but does not use an index when processing queries. It relies solely on the given user query.
  • local: This mode runs the application using Hugging Face Transformers, so the data does not need to leave the machine. It provides a convenient way to use state-of-the-art NLP models locally.
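
As a rough illustration of how a mode name could be mapped to a pipeline, the sketch below validates the PIPELINE_MODE environment variable (described under Installation) and returns a file path. Only the contextful path is stated in this README; the paths for the other modes follow the same pattern by assumption, and resolve_pipeline_path is a hypothetical helper, not a function from the repository.

```python
import os

# The four modes documented in this README.
VALID_MODES = ("contextful", "contextful_s3", "contextless", "local")


def resolve_pipeline_path(mode=None):
    """Return the assumed source file for a pipeline mode.

    Falls back to the PIPELINE_MODE environment variable, then to "contextful".
    The path pattern is extrapolated from the documented contextful pipeline.
    """
    mode = mode or os.environ.get("PIPELINE_MODE", "contextful")
    if mode not in VALID_MODES:
        raise ValueError(
            f"Unknown PIPELINE_MODE {mode!r}; expected one of {VALID_MODES}"
        )
    return f"llm_app/pathway_pipelines/{mode}/app.py"
```

Failing early on an unknown mode keeps a typo in the configuration from silently falling back to a different pipeline.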

Installation

  • Clone the repository: This is done with the git clone command followed by the URL of the repository:

    git clone https://github.com/pathwaycom/llm-app.git

    Next, navigate to the repository:

    cd llm-app
  • Environment Variables: Create a .env file in the llm_app/ directory and add the following environment variables, adjusting their values to your setup.

    • PIPELINE_MODE: Determines which pipeline to run. Available modes are contextful, contextful_s3, contextless, and local. The default is contextful.
    • PATHWAY_REST_CONNECTOR_HOST: Specifies the host IP for the REST connector in Pathway. For the dockerized version, set it to 0.0.0.0. When running natively, you can use 127.0.0.1.
    • PATHWAY_REST_CONNECTOR_PORT: Specifies the port on which Pathway's REST connector listens. In the sample file below, it is set to 8080.
    • OPENAI_API_TOKEN: The API token for accessing OpenAI services. Unless you are running the local version, replace it with your personal API token, which you can generate from your account on openai.com.
    • PATHWAY_CACHE_DIR: Specifies the directory where the cache is stored. You could use /tmp/cache.

    A sample .env file:

    PIPELINE_MODE=contextful
    PATHWAY_REST_CONNECTOR_HOST=0.0.0.0
    PATHWAY_REST_CONNECTOR_PORT=8080
    OPENAI_API_TOKEN=<Your Token>
    PATHWAY_CACHE_DIR=/tmp/cache
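
Before starting the app, it can help to sanity-check the .env file. The sketch below parses KEY=VALUE lines with the standard library only and reports obvious problems. parse_env and check_settings are hypothetical helpers, not part of the repository, and the checks encode only what this README states: the token is needed unless the local mode is used, and the port should be a number.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_settings(values):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    mode = values.get("PIPELINE_MODE", "contextful")
    if mode != "local" and not values.get("OPENAI_API_TOKEN"):
        problems.append("OPENAI_API_TOKEN is required unless PIPELINE_MODE=local")
    port = values.get("PATHWAY_REST_CONNECTOR_PORT", "")
    if not port.isdigit():
        problems.append("PATHWAY_REST_CONNECTOR_PORT must be a number")
    return problems


sample = """\
PIPELINE_MODE=contextful
PATHWAY_REST_CONNECTOR_HOST=0.0.0.0
PATHWAY_REST_CONNECTOR_PORT=8080
OPENAI_API_TOKEN=
PATHWAY_CACHE_DIR=/tmp/cache
"""
problems = check_settings(parse_env(sample))
```

For a real file, the text could be read with open("llm_app/.env").read() and passed to parse_env in the same way.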

Using Docker:

Docker is a tool designed to make it easier to create, deploy, and run applications by using containers. Here is how to use Docker to build and run the LLM App:

  • Build and Run with Docker: First, build the Docker image for the LLM App with the docker build command:

    docker build -t llm-app .

    Once the image is built, run it as a container with the docker run command:

    docker run -it -p 8080:8080 llm-app

    When the process is complete, the app will be up and running inside a Docker container and accessible at 0.0.0.0:8080. From there, proceed to the "Usage" section for information on how to interact with the application.

Natively:

  • Virtual Python Environment: Create a new environment and install the required packages to isolate the dependencies of this project from your system's Python:

    # Creates an env called pw-env and activates this environment.
    python -m venv pw-env && source pw-env/bin/activate
    
    pip install --upgrade --extra-index-url https://packages.pathway.com/966431ef6ba -r requirements.txt
  • Run the App: You can start the application with the command:

    cd llm_app/
    python main.py
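
Whether the app was started with Docker or natively, a script may need to wait until the REST connector accepts connections before sending queries. The helper below is a small sketch using only the standard library; wait_for_port is a hypothetical name, and the host and port come from the .env values shown earlier.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, wait_for_port("127.0.0.1", 8080) would block for up to 30 seconds. Note that an open port only shows that a listener exists, not that the index has finished building.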

Usage

  1. Send REST queries (in a separate terminal window): These are examples of how to interact with the application once it's running. curl is a command-line tool used to send data using various network protocols. Here, it's being used to send HTTP requests to the application.

    curl --data '{"user": "user", "query": "How to connect to Kafka in Pathway?"}' http://localhost:8080/ | jq
    
    curl --data '{"user": "user", "query": "How to use LLMs in Pathway?"}' http://localhost:8080/ | jq

    Please change localhost to 0.0.0.0 if you are running the app in Docker.
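
The same queries can be sent from Python. The sketch below mirrors the JSON body of the curl examples using only the standard library. build_query_request and send_query are hypothetical helper names, and the Content-Type header is an assumption: curl --data sends a form-encoded content type by default, and this README does not say which one the server expects.

```python
import json
import urllib.request


def build_query_request(query, user="user", url="http://localhost:8080/"):
    """Build a POST request carrying the same JSON body as the curl examples."""
    body = json.dumps({"user": user, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, see note above
        method="POST",
    )


def send_query(query, **kwargs):
    """Send the query and return the decoded JSON response."""
    request = build_query_request(query, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the app running, send_query("How to connect to Kafka in Pathway?") plays the role of the first curl command; pass url="http://0.0.0.0:8080/" when following the Docker note above.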

  2. Test reactivity by adding a new file: This shows how to test the application's ability to react to changes in data by adding a new file and sending a query.

    cp ./data/documents_extra.jsonl ./data/pathway-docs/
    curl --data '{"user": "user", "query": "How to use LLMs in Pathway?"}' http://localhost:8080/ | jq
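
To make the idea of reacting to a new file concrete, here is a simplified polling sketch. It is not how Pathway detects changes; it only illustrates the observable behaviour: a file that appears in the watched directory is picked up once and its records become available. scan_new_files and load_jsonl are hypothetical helpers, and no assumption is made about the fields inside each JSON line.

```python
import json
from pathlib import Path


def scan_new_files(directory, seen):
    """Return files in `directory` not yet recorded in `seen`, and record them."""
    new_files = sorted(
        p for p in Path(directory).iterdir() if p.is_file() and p.name not in seen
    )
    seen.update(p.name for p in new_files)
    return new_files


def load_jsonl(path):
    """Parse a JSON Lines file into a list of records, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Calling scan_new_files("./data/pathway-docs", seen) before and after the cp command above would return the copied file only on the second call, which is the point at which a reindexing step would run.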

Data Privacy and Use in Organizations

LLM App can be configured to run with local machine learning models, without making API calls outside of the user's organization.

It can also be extended to handle live data sources (news feeds, APIs, data streams in Kafka) and to include user permissions, a data security layer, and an LLMOps monitoring layer.

See: Features for Organizations.

Further Reading

Read more about the implementation details and how to extend this application in our blog series.

