Description: Implement functionality to load external data into the vector database. This involves developing scripts or tools to import data from various sources such as DOCX or PDF files and store them in the vector database.
Tasks:
- Develop a script/tool to parse data from DOCX/PDF files.
- Design a mechanism to transform the parsed data into vector representations.
- Implement logic to store the vectorized data in the database.
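The three tasks above can be sketched end to end with only the standard library. This is a rough illustration, not the project's design: `chunk_text`, `toy_embed`, `InMemoryVectorStore`, and `ingest` are hypothetical names, the hashing-based `toy_embed` is a stand-in for a real embedding model, and the in-memory store is a stand-in for a real vector database.

```python
import hashlib
import math


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows (sizes are assumptions)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    starts = range(0, max(len(text) - overlap, 1), step)
    return [text[i:i + size] for i in starts if text[i:i + size].strip()]


def toy_embed(text, dim=64):
    """Stand-in embedding: hashed bag of words, L2-normalized.

    A real pipeline would call an embedding model here instead.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryVectorStore:
    """Stand-in for a vector database: keeps rows in a list."""

    def __init__(self):
        self._rows = []

    def add(self, chunk_id, vector, text):
        self._rows.append((chunk_id, vector, text))

    def query(self, vector, k=3):
        # Vectors are unit length, so the dot product is cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(vector, v)), chunk_id, text)
            for chunk_id, v, text in self._rows
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:k]


def ingest(text, store, source="doc"):
    """Chunk the parsed text, embed each chunk, and store it."""
    chunks = chunk_text(text)
    for i, chunk in enumerate(chunks):
        store.add(f"{source}-{i}", toy_embed(chunk), chunk)
    return len(chunks)
```

Keeping the embedding function and the store behind small interfaces like these would let a real model and database be swapped in later without touching the ingestion loop.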
Step 1) Parse the PDF/DOCX with PyMuPDF (for text), OCR (for images), or a similar Python library.
Step 2) Choose an embedding model to convert the extracted text into embeddings.
Step 3) Connect to ChromaDB or FAISS through their APIs, following their documentation.
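One possible way to wire these three steps together is sketched below. It assumes PyMuPDF (`fitz`) for PDFs, `python-docx` for DOCX files, `sentence-transformers` for embeddings, and ChromaDB's in-memory client; none of these choices beyond PyMuPDF and ChromaDB come from the issue itself. The function names, the `all-MiniLM-L6-v2` model, and the `documents` collection name are assumptions made for illustration, and OCR for scanned images is not covered.

```python
from pathlib import Path


def extract_text(path):
    """Step 1: return plain text from a .pdf or .docx file."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import fitz  # PyMuPDF; imported lazily so the module loads without it
        with fitz.open(str(path)) as doc:
            return "\n".join(page.get_text() for page in doc)
    if suffix == ".docx":
        from docx import Document  # python-docx (an assumed choice)
        return "\n".join(p.text for p in Document(str(path)).paragraphs)
    raise ValueError(f"unsupported file type: {suffix!r}")


def split_paragraphs(text):
    """Split on blank lines; a simple chunking choice, not a requirement."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


def embed_chunks(chunks, model_name="all-MiniLM-L6-v2"):
    """Step 2: embed chunks with an assumed sentence-transformers model."""
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer(model_name)
    return model.encode(chunks).tolist()


def build_records(source, chunks):
    """Create one stable id and one metadata dict per chunk."""
    ids = [f"{source}-{i}" for i in range(len(chunks))]
    metadatas = [{"source": source, "chunk": i} for i in range(len(chunks))]
    return ids, metadatas


def store_chunks(chunks, embeddings, source, collection_name="documents"):
    """Step 3: add chunks and their embeddings to a ChromaDB collection."""
    import chromadb
    client = chromadb.Client()  # in-memory; data is lost when the process exits
    collection = client.get_or_create_collection(collection_name)
    ids, metadatas = build_records(source, chunks)
    collection.add(
        ids=ids, documents=chunks, embeddings=embeddings, metadatas=metadatas
    )
    return collection


def load_file(path):
    """Run all three steps for one file and return the collection."""
    chunks = split_paragraphs(extract_text(path))
    return store_chunks(chunks, embed_chunks(chunks), source=Path(path).name)
```

Calling `load_file` on a PDF or DOCX path would then return a collection that can be searched with `collection.query(query_embeddings=..., n_results=...)`. For data that should survive restarts, `chromadb.PersistentClient(path=...)` could replace the in-memory client.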