vers

A lightweight, simple, single-instance, local, in-memory vector database written in Rust.

Currently supports the following indexing strategies:

  1. IVFFlat (k-means for partitioning)
  2. Locality-sensitive hashing (LSH), heavily inspired by fennel.ai's blog post.
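To make the IVFFlat idea concrete, here is a minimal sketch of partitioning and single-bucket search. It is not the code vers uses internally: the function names are hypothetical, the centroids are assumed to have already been produced by k-means (and to be non-empty), and only the single nearest bucket is probed.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Position of the centroid closest to `v` (assumes `centroids` is non-empty).
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    for i in 1..centroids.len() {
        if squared_distance(v, &centroids[i]) < squared_distance(v, &centroids[best]) {
            best = i;
        }
    }
    best
}

/// Partition step: file each vector's position under its nearest centroid.
fn partition(vectors: &[Vec<f32>], centroids: &[Vec<f32>]) -> Vec<Vec<usize>> {
    let mut buckets = vec![Vec::new(); centroids.len()];
    for (id, v) in vectors.iter().enumerate() {
        buckets[nearest_centroid(v, centroids)].push(id);
    }
    buckets
}

/// Query step: scan only the bucket that belongs to the query's nearest
/// centroid, and return up to `top_k` vector positions, closest first.
fn search_one_bucket(
    query: &[f32],
    vectors: &[Vec<f32>],
    centroids: &[Vec<f32>],
    buckets: &[Vec<usize>],
    top_k: usize,
) -> Vec<usize> {
    let mut candidates = buckets[nearest_centroid(query, centroids)].clone();
    candidates.sort_by(|&a, &b| {
        squared_distance(query, &vectors[a]).total_cmp(&squared_distance(query, &vectors[b]))
    });
    candidates.truncate(top_k);
    candidates
}
```

Scanning one bucket instead of every vector is where the speed-up comes from. It is also why results are approximate: a true neighbour that landed in an adjacent bucket is never examined.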

Getting Started

Like any sensible package, vers aims to keep its API dead simple.

  1. Import, obviously:
    use vers::indexes::base::{Index, Vector};
    use vers::indexes::ivfflat::IVFFlatIndex;
  2. Build an index:
    let mut index = IVFFlatIndex::build_index(
        num_clusters,
        num_attempts,
        max_iterations,
        &vectors
    );
  3. Add an embedding vector into the index:
    index.add(Vector(*emb), emb_unique_id);
  4. Persist the index to disk:
    let _ = index.save_index("wiki.index");
  5. Load the index from disk:
    let index = match IVFFlatIndex::load_index("wiki.index") {
        Ok(index) => index,
        Err(e) => panic!("Failed to load index! {}", e),
    };
  6. And of course, actually search the index:
    let results = index.search_approximate(
        embs.get("king"),   // query vector
        10                  // top_k
    ); // kings, queen, monarch, ...

That said, the API is unstable and subject to change. In particular, I really dislike having to pass the unique vector ID into search_approximate.

Coming soon

  1. Python bindings
  2. Performance improvements (building the IVFFlat index is slow; vectorization)
  3. Benchmarks (comparisons with popular ANN search indexes, e.g. faiss, and with exhaustive search)
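For reference, the exhaustive-search baseline mentioned in the benchmarks item could look roughly like the sketch below. It is not part of vers: the function names are hypothetical, distance is assumed to be squared Euclidean, and recall is just one common way to score an approximate result against the exact one.

```rust
use std::collections::HashSet;

/// Squared Euclidean distance between two equal-length vectors.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exhaustive (brute-force) search: score every vector and return the
/// positions of the `top_k` closest ones, closest first.
fn search_exhaustive(query: &[f32], vectors: &[Vec<f32>], top_k: usize) -> Vec<usize> {
    let mut ids: Vec<usize> = (0..vectors.len()).collect();
    ids.sort_by(|&a, &b| {
        squared_distance(query, &vectors[a]).total_cmp(&squared_distance(query, &vectors[b]))
    });
    ids.truncate(top_k);
    ids
}

/// Recall: the fraction of exact neighbours that the approximate result
/// also contains. Order within the result lists is ignored.
fn recall(exact: &[usize], approximate: &[usize]) -> f32 {
    if exact.is_empty() {
        return 1.0;
    }
    let found: HashSet<usize> = approximate.iter().copied().collect();
    let hits = exact.iter().filter(|&id| found.contains(id)).count();
    hits as f32 / exact.len() as f32
}
```

A benchmark would then run the same queries through both the exhaustive search and an approximate index, and report recall alongside query time.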

Contributions are welcome.