The NVIDIA Developer LLM Operator enables developers to build RAG-LLM pipelines on Kubernetes.
The Operator manages the lifecycle of the following components of a sample pipeline:
- Jupyter Notebook server: The container includes notebooks that demonstrate a sample pipeline.
- Chatbot web application: The sample web application enables you to ask the chatbot questions and to upload PDF documents to form a knowledge base.
- Vector database: The sample pipeline uses Milvus to manage the embeddings generated by the LLM.
- NVIDIA Triton Inference Server: The server is configured with the NVIDIA NeMo Framework for working with LLMs.
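To illustrate the role a vector database plays in a RAG pipeline, the following is a minimal sketch of embedding similarity search in plain Python. It does not use the Milvus API or any Operator component. The `TinyVectorStore` class, the three-dimensional toy vectors, and the document titles are all hypothetical and exist only for illustration. A real pipeline would obtain embeddings from a model and delegate storage and search to the vector database.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (text, embedding) pairs

    def add(self, text, embedding):
        self._items.append((text, list(embedding)))

    def search(self, query_embedding, top_k=1):
        # Score every stored item against the query, best match first.
        scored = [
            (cosine_similarity(query_embedding, emb), text)
            for text, emb in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


# Toy embeddings (assumed values, not produced by any model).
store = TinyVectorStore()
store.add("GPU driver installation notes", [0.9, 0.1, 0.0])
store.add("Kubernetes scheduling overview", [0.1, 0.9, 0.2])
store.add("PDF upload instructions", [0.0, 0.2, 0.9])

results = store.search([0.2, 0.8, 0.1], top_k=2)
print(results)
```

In a RAG flow, the text returned by the search would be added to the prompt sent to the LLM so that the answer can draw on the uploaded knowledge base.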
Refer to Installing the Operator to get started.