Chat with your PDF. Generate summaries. Ask relevant questions.
This is a short project I did to explore the LLM and RAG domain.
Tested on Mistral 7B models.
- Create environment

  ```bash
  conda create -n chatPDF python=3.10
  ```

- Install poetry

  ```bash
  pip install poetry
  ```

- Then run

  ```bash
  poetry install
  ```
- Poetry will not install `llama-cpp-python`. Use `pip install llama-cpp-python` to install the CPU version. If you want to use CUDA, follow the steps in the `llama-cpp-python` documentation.
- Download the model and save it in the `model` directory.
- You can use various models and embeddings. Just go to `model.py` and change the model and embeddings as you like.
- Make sure to keep your PDF in `Scanned_Documents`.
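  The sketch below pulls these pieces together: a local GGUF Mistral model, local embeddings, and the PDFs in `Scanned_Documents`. The model file name, embedding model, and parameters are assumptions, not the repo's exact config, and imports vary across llama-index versions:

  ```python
  from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
  from llama_index.embeddings.huggingface import HuggingFaceEmbedding
  from llama_index.llms.llama_cpp import LlamaCPP

  # Local Mistral GGUF model; the file name and parameters are illustrative.
  Settings.llm = LlamaCPP(
      model_path="model/mistral-7b-instruct-v0.1.Q4_K_M.gguf",
      temperature=0.1,
      max_new_tokens=512,
      context_window=4096,
  )
  # Local embeddings, so indexing never calls OpenAI.
  Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

  # Read every PDF in Scanned_Documents and build an in-memory vector index.
  documents = SimpleDirectoryReader("Scanned_Documents").load_data()
  index = VectorStoreIndex.from_documents(documents)

  print(index.as_query_engine().query("Summarize this document."))
  ```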
- Go to `chat_PDF_Mistral7B.py` or `chat_PDF_OpenAI.py` and update the prompt.
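  As an illustration, a custom QA prompt can be swapped in at query time. This template is hypothetical, not the repo's actual prompt, and it reuses the `index` built in the sketch above:

  ```python
  from llama_index.core import PromptTemplate

  qa_prompt = PromptTemplate(
      "Context information is below.\n"
      "---------------------\n"
      "{context_str}\n"
      "---------------------\n"
      "Using only this context, answer the question: {query_str}\n"
  )
  # text_qa_template replaces the default question-answering prompt.
  query_engine = index.as_query_engine(text_qa_template=qa_prompt)
  print(query_engine.query("What is this document about?"))
  ```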
- Fix the vector database calling (see the persistence sketch at the end of this list).
- Inefficient architecture: the database area needs fixing. Persisting the files in the database might be a better option and could increase speed. For one-time use this architecture is fine, but if a user wants to query the same data again later, the cold-start approach becomes an issue.
- Add metrics to check the text-generation quality of various RAG techniques: `hit_rate` and `mrr`.
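  A plain-Python sketch of these two metrics (LlamaIndex's `RetrieverEvaluator` can also compute both). The input shapes, one expected chunk id per query plus a ranked retrieval list, are assumptions:

  ```python
  # hit_rate: fraction of queries whose expected chunk appears in the results.
  def hit_rate(expected_ids, retrieved_ids_per_query):
      hits = sum(exp in retrieved for exp, retrieved
                 in zip(expected_ids, retrieved_ids_per_query))
      return hits / len(expected_ids)

  # mrr: mean reciprocal rank of the expected chunk (0 if it was not retrieved).
  def mrr(expected_ids, retrieved_ids_per_query):
      total = 0.0
      for exp, retrieved in zip(expected_ids, retrieved_ids_per_query):
          if exp in retrieved:
              total += 1.0 / (retrieved.index(exp) + 1)  # ranks are 1-based
      return total / len(expected_ids)

  # Example: two queries, top-3 retrieved chunks each.
  print(hit_rate(["a", "b"], [["a", "c", "d"], ["x", "y", "z"]]))  # 0.5
  print(mrr(["a", "b"], [["c", "a", "d"], ["x", "y", "z"]]))       # 0.25
  ```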
- All the code runs on the CPU, but it is still fast. Too lazy to reinstall `llama_cpp_python` with cuBLAS.
- Run the code using arguments.
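  A minimal sketch of what an argument-driven entry point could look like; the flag names are hypothetical:

  ```python
  import argparse

  parser = argparse.ArgumentParser(description="Chat with your PDF")
  parser.add_argument("--pdf-dir", default="Scanned_Documents",
                      help="directory containing the PDFs to index")
  parser.add_argument("--question", required=True,
                      help="question to ask about the documents")
  args = parser.parse_args()
  print(f"Indexing {args.pdf_dir!r} and asking: {args.question}")
  ```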
- Run the code asynchronously
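  A sketch of the async variant, assuming the `query_engine` from the earlier sketches; llama-index query engines expose `aquery`, which lets several questions run concurrently:

  ```python
  import asyncio

  async def ask_all(query_engine, questions):
      # Fire all queries concurrently and gather the responses.
      responses = await asyncio.gather(*(query_engine.aquery(q) for q in questions))
      for question, response in zip(questions, responses):
          print(question, "->", response)

  asyncio.run(ask_all(query_engine, ["Summarize the PDF.", "List the key dates."]))
  ```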
- Sometimes the generated sentence is not complete, which suggests the output is hitting a token limit with Mistral 7B.
- Used Mistral 7B, but LlamaIndex still needs an OpenAI API key to run some functions, such as `VectorStoreIndex`.
- So reloading the database is a bit of a challenge with this code because of the OpenAI requests, even though I am not using OpenAI!
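  A sketch of one way around both problems: point LlamaIndex's global `Settings` at local models so `VectorStoreIndex` stops falling back to OpenAI, and persist the index so it can be reloaded without re-embedding. The paths and model names are assumptions:

  ```python
  from llama_index.core import (Settings, SimpleDirectoryReader, StorageContext,
                                VectorStoreIndex, load_index_from_storage)
  from llama_index.embeddings.huggingface import HuggingFaceEmbedding

  # A local embed_model (plus a local Settings.llm, e.g. the LlamaCPP instance
  # from the earlier sketch) stops llama-index from defaulting to OpenAI.
  Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

  PERSIST_DIR = "storage"  # hypothetical directory for the saved index

  # First run: build the index once and persist it to disk.
  documents = SimpleDirectoryReader("Scanned_Documents").load_data()
  index = VectorStoreIndex.from_documents(documents)
  index.storage_context.persist(persist_dir=PERSIST_DIR)

  # Later runs: reload from disk instead of re-embedding (no cold start).
  index = load_index_from_storage(StorageContext.from_defaults(persist_dir=PERSIST_DIR))
  ```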