Optimize Request for Embedding in Vector Store #719

ricken07 · 2024-05-13T15:34:01Z

Currently, the vector store always calls the embedding client to generate the document embedding, without checking whether the document already has one.

In this PR, I first check whether the document already has an embedding before calling the client to generate one. This avoids redundant embedding requests.

Tests are green for the impacted vector stores.
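The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `Doc`, `CountingEmbedder`, and `add` are hypothetical stand-ins for the document class, the embedding client, and a vector store's add method, and the in-memory list stands in for the store.

```java
import java.util.ArrayList;
import java.util.List;

public class EmbeddingSkipSketch {

    /** Hypothetical stand-in for a document that may carry a precomputed embedding. */
    static final class Doc {
        final String content;
        float[] embedding; // null or empty means "not embedded yet"

        Doc(String content, float[] embedding) {
            this.content = content;
            this.embedding = embedding;
        }
    }

    /** Hypothetical embedding client that counts how many requests it receives. */
    static final class CountingEmbedder {
        int calls = 0;

        float[] embed(String text) {
            calls++;
            // Dummy vector; a real client would call an embedding model here.
            return new float[] { text.length(), 1.0f };
        }
    }

    /** Adds documents to the store, embedding only those without an embedding. */
    static void add(List<Doc> docs, CountingEmbedder embedder, List<Doc> store) {
        for (Doc doc : docs) {
            boolean missing = doc.embedding == null || doc.embedding.length == 0;
            if (missing) {
                doc.embedding = embedder.embed(doc.content);
            }
            store.add(doc);
        }
    }

    public static void main(String[] args) {
        CountingEmbedder embedder = new CountingEmbedder();
        List<Doc> store = new ArrayList<>();

        Doc precomputed = new Doc("already embedded", new float[] { 0.5f, 0.5f });
        Doc fresh = new Doc("needs embedding", null);

        add(List.of(precomputed, fresh), embedder, store);

        // Only the document without an embedding triggers a request.
        System.out.println("embedding calls: " + embedder.calls); // prints "embedding calls: 1"
    }
}
```

Note that the sketch trusts whatever embedding is already attached to the document; it does not verify that the vector came from the same model the store uses.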

tzolov · 2024-08-21T05:00:36Z

If I'm not mistaken, this is the same as or related to #413?

But this change comes with some risks. For example, it is not clear when one would have to invalidate the pre-computed embedding (e.g. the index). Likely when
Also, I'm not sure how useful this feature would be. What is the use case where you would repeatedly use the same Documents (with pre-computed embeddings) for searching? Or what are the reasons you might want to re-add a document that has a pre-computed embedding?

Maybe I'm missing some interesting use cases?

Right now we do not allow the Vector Store to use embeddings other than those computed by the embedding model registered with the VectorStore. Using the embedding field would allow one to pre-compute the embeddings externally with a different embedding model, and the VectorStore would then store the document with the externally computed embedding.
But I'm not sure whether this is a real or needed use case, nor whether this is the right approach to support it.

tzolov · 2024-08-21T05:08:26Z

If the pre-computed embeddings are not applicable/useful for real use cases, IMO, we should remove the embedding field from the Document class.

markpollack · 2024-11-25T21:46:17Z

See #1781

I think this was a design mistake to begin with; we shouldn't be caching/storing the embedding in the document in the first place.

optimize request for embedding in vector store

1caef95

markpollack added this to the 1.0.0-M2 milestone May 24, 2024

markpollack removed this from the 1.0.0-M2 milestone Aug 22, 2024

asaikali added the vector store label Nov 11, 2024

markpollack added this to the 1.0.0-M5 milestone Nov 25, 2024
