Refactor VectorStore's usage of Document.getEmbedding #1826

Open
ilayaperumalg opened this issue Nov 25, 2024 · 0 comments

The embedding field on the Document object has been deprecated (see #1781), but some vector store operations, specifically the doSimilaritySearch method, rely on Document#getEmbedding.

This issue addresses the refactoring of this dependency.

ilayaperumalg added this to the 1.0.0-M5 milestone Nov 25, 2024
ilayaperumalg self-assigned this Nov 25, 2024
ilayaperumalg added a commit to ilayaperumalg/spring-ai that referenced this issue Nov 26, 2024
…ch and EmbeddingModel's embed

  - Since the Document object's reference to the `embedding` is deprecated and will be removed, the VectorStore's similaritySearch operation needs another way to associate each vector with its document. To achieve this, changed the return type of the `EmbeddingModel#embed(List<Document> documents, EmbeddingOptions options, BatchingStrategy batchingStrategy)` method from `List<float[]>` to `Map<String, float[]>`, with the Map's key representing the document id.

    - Refactored the vector store implementations to accommodate this change
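
The id-keyed return type described in this commit can be illustrated with a minimal sketch. This is not the Spring AI code: `EmbeddingByIdSketch`, its `Doc` class, `embedById`, and the toy embedder are all hypothetical names introduced only to show how a store could look up each vector by document id once the document no longer carries its embedding.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not the Spring AI API: Doc and embedById are stand-ins.
final class EmbeddingByIdSketch {

    // Minimal stand-in for a document that does not hold its own embedding.
    static final class Doc {
        final String id;
        final String text;

        Doc(String id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    // Embeds each document and keys the resulting vector by the document id.
    static Map<String, float[]> embedById(List<Doc> docs, Function<String, float[]> embedder) {
        Map<String, float[]> embeddings = new LinkedHashMap<>();
        for (Doc doc : docs) {
            embeddings.put(doc.id, embedder.apply(doc.text));
        }
        return embeddings;
    }

    public static void main(String[] args) {
        // Toy embedder (assumption): a one-dimensional vector holding the text length.
        Function<String, float[]> toyEmbedder = text -> new float[] { text.length() };

        List<Doc> docs = List.of(new Doc("a", "hello"), new Doc("b", "hi"));
        Map<String, float[]> embeddings = embedById(docs, toyEmbedder);

        // A vector store would fetch each vector by id while writing the document.
        for (Doc doc : docs) {
            System.out.println(doc.id + " -> " + embeddings.get(doc.id)[0]);
        }
    }
}
```

The trade-off in this shape is that every caller depends on document ids being unique within one call; two documents sharing an id would silently overwrite each other's vector.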
ilayaperumalg added a commit to ilayaperumalg/spring-ai that referenced this issue Nov 27, 2024
  - Since the Document object's reference to the `embedding` is deprecated and will be removed, the VectorStore implementations need another way to obtain the embedding of each corresponding Document object.
     - One way to fix this is to have EmbeddingModel#embed return the embeddings in the same order as the Documents passed to it.
       - Since both the Document and embedding collections are Lists, iterating over them in parallel keeps each embedding aligned with its Document.
       - A fix is required to preserve the order when a batching strategy is applied.
          - Updated the Javadoc for BatchingStrategy
          - Fixed the Document List order in TokenCountBatchingStrategy

    - Refactored the vector store implementations to accommodate this change

Resolves #spring-projectsGH-1826
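
The order-preserving approach from this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual TokenCountBatchingStrategy: `OrderPreservingBatchSketch`, `countTokens` (a crude whitespace word count), `batch`, and `embedAll` are hypothetical, and plain strings stand in for Documents. The point is only that contiguous batches, embedded and concatenated in batch order, keep the embedding at index i aligned with the input at index i.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not the Spring AI implementation.
final class OrderPreservingBatchSketch {

    // Crude token estimate (assumption): whitespace-separated words.
    static int countTokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Splits texts into contiguous batches without reordering them.
    // A single text that exceeds maxTokens gets a batch to itself.
    static List<List<String>> batch(List<String> texts, int maxTokens) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentTokens = 0;
        for (String text : texts) {
            int tokens = countTokens(text);
            if (!current.isEmpty() && currentTokens + tokens > maxTokens) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(text);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Embeds batch by batch and appends the results in batch order, so
    // result.get(i) is the embedding of texts.get(i).
    static List<float[]> embedAll(List<String> texts, int maxTokens,
            Function<List<String>, List<float[]>> batchEmbedder) {
        List<float[]> embeddings = new ArrayList<>(texts.size());
        for (List<String> b : batch(texts, maxTokens)) {
            embeddings.addAll(batchEmbedder.apply(b));
        }
        return embeddings;
    }
}
```

A caller can then walk both lists with one index, pairing `texts.get(i)` with `embeddings.get(i)`. This only holds if the batch embedder itself returns one vector per input in input order, which the sketch assumes rather than checks.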