Documentation on vector stores + vector benchmark (#1245)

Added documentation for vector stores including usage examples, dependencies and other requirements.
georgia-tech-db · Oct 2, 2023 · e59092d
1 parent 277161e

Showing 7 changed files with 120 additions and 0 deletions.
diff --git a/docs/_toc.yml b/docs/_toc.yml
@@ -69,6 +69,15 @@ parts:
           - file: source/reference/databases/mariadb
           - file: source/reference/databases/github
 
+      - file: source/reference/vector_stores/index
+        title: Vector Stores
+        sections: 
+          - file: source/reference/vector_stores/faiss
+          - file: source/reference/vector_stores/chromadb
+          - file: source/reference/vector_stores/qdrant
+          - file: source/reference/vector_stores/pgvector
+          - file: source/reference/vector_stores/pinecone
+
       - file: source/reference/ai/index
         title: AI Engines
         sections:

diff --git a/docs/source/reference/vector_stores/chromadb.rst b/docs/source/reference/vector_stores/chromadb.rst
@@ -0,0 +1,17 @@
+ChromaDB
+==========
+
+ChromaDB is an open-source embedding database which makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs.
+The connection to ChromaDB is based on the `chromadb <https://pypi.org/project/chromadb/>`_ library.
+
+Dependency
+----------
+
+* chromadb
+
+Create Index
+-----------------
+
+.. code-block:: text
+
+   CREATE INDEX index_name ON table_name (data) USING CHROMADB;
diff --git a/docs/source/reference/vector_stores/faiss.rst b/docs/source/reference/vector_stores/faiss.rst
@@ -0,0 +1,18 @@
+Faiss
+==========
+
+Faiss is a library for efficient similarity search and clustering of dense vectors.
+It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM.
+The connection to Faiss is based on the `faiss-cpu <https://faiss.ai/index.html>`_ or `faiss-gpu <https://faiss.ai/index.html>`_ library.
+
+Dependency
+----------
+
+* faiss-cpu or faiss-gpu
+
+Create Index
+-----------------
+
+.. code-block:: text
+
+   CREATE INDEX index_name ON table_name (data) USING FAISS;
diff --git a/docs/source/reference/vector_stores/index.rst b/docs/source/reference/vector_stores/index.rst
@@ -0,0 +1,13 @@
+.. _vector_stores:
+
+Vector Stores
+=============
+
+A common way to work with unstructured data is to embed it and store the resulting vectors. At query time, the
+unstructured query is embedded in the same way, and the vectors most similar to the embedded query are retrieved. A vector store
+handles both the storage of the embedded data and the vector search on your behalf.
+
+EvaDB supports the vector stores listed below. A quantitative benchmark of these vector stores is available in `this blog post <https://medium.com/evadb-blog/how-to-pick-a-vector-database-quantitative-analysis-afae5ea9e5b1>`_.
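+
+As a rough sketch of how a vector index is typically used, the queries below build an index over embeddings and then retrieve the rows closest to a query input.
+The table ``my_table``, the embedding function ``MyFeatureExtractor``, and the file ``query.jpg`` are hypothetical names chosen for illustration.
+The ``Similarity`` ordering follows the pattern used in EvaDB's similarity search examples and may differ across versions.
+
+.. code-block:: text
+
+   -- Hypothetical names: my_index, my_table, MyFeatureExtractor, query.jpg
+   CREATE INDEX my_index ON my_table (MyFeatureExtractor(data)) USING FAISS;
+
+   -- Return the 5 rows whose embeddings are closest to the embedded query input
+   SELECT * FROM my_table
+   ORDER BY Similarity(MyFeatureExtractor(Open('query.jpg')), MyFeatureExtractor(data))
+   LIMIT 5;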
+
+
+.. tableofcontents::
diff --git a/docs/source/reference/vector_stores/pgvector.rst b/docs/source/reference/vector_stores/pgvector.rst
@@ -0,0 +1,17 @@
+pgvector
+==========
+
+pgvector is an open-source vector similarity search extension for Postgres. EvaDB uses its native Postgres support when creating pgvector indexes.
+The connection to pgvector is based on the `pgvector <https://github.com/pgvector/pgvector>`_ library.
+
+Dependency
+----------
+
+* pgvector
+
+Create Index
+-----------------
+
+.. code-block:: text
+
+   CREATE INDEX index_name ON table_name (data) USING PGVECTOR;
diff --git a/docs/source/reference/vector_stores/pinecone.rst b/docs/source/reference/vector_stores/pinecone.rst
@@ -0,0 +1,29 @@
+Pinecone
+==========
+
+Pinecone is a managed, cloud-native vector database with a simple API and no infrastructure hassles.
+The connection to Pinecone is based on the `pinecone-client <https://docs.pinecone.io/docs/python-client>`_ library.
+
+Dependency
+----------
+
+* pinecone-client
+
+Parameters
+----------
+
+To use Pinecone, you need an API key; see the `quickstart guide <https://docs.pinecone.io/docs/quickstart>`_ for how to obtain one.
+The page that shows your API key also lists the corresponding environment. Both values
+are needed to establish a connection to the server.
+
+* ``API_KEY`` is the Pinecone API key.
+* ``ENVIRONMENT`` is the environment associated with the API key.
+
+These values can be set either in the ``evadb.yml`` config file or in the environment variables ``PINECONE_API_KEY`` and ``PINECONE_ENV``.
+
+Create Index
+-----------------
+
+.. code-block:: text
+
+   CREATE INDEX index_name ON table_name (data) USING PINECONE;
diff --git a/docs/source/reference/vector_stores/qdrant.rst b/docs/source/reference/vector_stores/qdrant.rst
@@ -0,0 +1,17 @@
+Qdrant
+==========
+
+Qdrant is a vector similarity search engine. Qdrant’s expanding features allow for all sorts of neural network or semantic-based matching, faceted search, and other applications.
+The connection to Qdrant is based on the `qdrant-client <https://qdrant.tech/documentation/>`_ library.
+
+Dependency
+----------
+
+* qdrant-client
+
+Create Index
+-----------------
+
+.. code-block:: text
+
+   CREATE INDEX index_name ON table_name (data) USING QDRANT;