Add support for integrated vectorization

jelledruyts · Nov 28, 2023 · 6b1f29f
1 parent e6ede18

Showing 14 changed files with 312 additions and 165 deletions.
diff --git a/README.md b/README.md
@@ -38,22 +38,24 @@ graph TD
   acs[Azure AI Search]
   aoai[Azure OpenAI]
   webapp[Web App]
-  functionapp[Function App]
+  functionapp[Function Apps]
   storage[Storage Account]
 
-  webapp -->|Generate query embeddings for vector search| aoai
+  webapp -->|Generate query embeddings for vector search (for external vectorization)| aoai
   webapp -->|Send chat requests| aoai
   webapp -->|Send search requests| acs
   webapp -->|Upload new documents| storage
-  functionapp -->|Generate embeddings for chunks| aoai
+  functionapp -->|Generate embeddings for chunks (for external vectorization)| aoai
+  functionapp -->|Push chunks into search index (for push model)| acs
+  acs -->|Generate embeddings for chunks and search queries (for integrated vectorization)| aoai
   acs -->|Populate search index from documents| storage
-  acs -->|Generate chunks and embeddings to index| functionapp
+  acs -->|Generate chunks and embeddings to index (for external vectorization)| functionapp
   aoai -->|Find relevant context to build prompt for Azure OpenAI on your data| acs
 ```
 
 When you deploy the solution, it creates an [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search) service which indexes document content from a blob storage container. (Note that documents are assumed to be in English.)
 
-The documents in the index are also chunked into smaller pieces, and vector embeddings are created for these chunks using a Function App based on the [Azure OpenAI Embeddings Generator power skill](https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator). This allows you to easily try out [vector and hybrid search](https://learn.microsoft.com/azure/search/vector-search-overview). With Azure AI Search on its own, the responses *always* come directly from the source data, rather than being generated by an AI model. You can optionally use [semantic ranking](https://learn.microsoft.com/azure/search/semantic-search-overview) which *does* use AI, not to generate content but to increase the relevancy of the results and provide semantic answers and captions.
+The documents in the index are also chunked into smaller pieces, and vector embeddings are created for these chunks using either [integrated vectorization](https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization), or external vectorization using a Function App. This allows you to easily try out [vector and hybrid search](https://learn.microsoft.com/azure/search/vector-search-overview). With Azure AI Search on its own, the responses *always* come directly from the source data, rather than being generated by an AI model. You can optionally use [semantic ranking](https://learn.microsoft.com/azure/search/semantic-search-overview) which *does* use AI, not to generate content but to increase the relevancy of the results and provide semantic answers and captions.
 
 The solution also deploys an [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/overview) service. It provides an embeddings model to generate the vector representations of the document chunks and search queries, and a GPT model to generate answers to your search queries. If you choose the option to use [Azure OpenAI "on your data"](https://learn.microsoft.com/azure/ai-services/openai/concepts/use-your-data), these AI-generated responses can be grounded in (and even limited to) the information in your Azure AI Search indexes. This option allows you to let Azure OpenAI orchestrate the [Retrieval Augmented Generation (RAG)](https://aka.ms/what-is-rag) pattern. This means your search query will first be used to retrieve the most relevant documents (or preferably *smaller chunks of those documents*) from your private data source. Those search results are then used as context in the prompt that gets sent to the AI model, along with the original search query. This allows the AI model to generate a response based on the most relevant source data, rather than the public data that was used to train the model. Next to letting Azure OpenAI orchestrate the RAG pattern, the web application can also use [Semantic Kernel](https://learn.microsoft.com/semantic-kernel/overview/) to perform that orchestration, using a prompt and other parameters you can control yourself.
 
@@ -109,9 +111,9 @@ This can easily be done by setting up the built-in [authentication and authoriza
 
 ## Configuration
 
-The ARM template deploys the services and sets the configuration settings for the Web App and Function App. Most of these shouldn't be changed as they contain connection settings between the various services, but you can change the settings below for the App Service Web App.
+The ARM template deploys the services and sets the configuration settings for the Web App and Function Apps. Most of these shouldn't be changed as they contain connection settings between the various services, but you can change the settings below for the App Service Web App.
 
-> Note that the settings of the Function App shouldn't be changed, as the [power skill](https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator) was tweaked for this project to take any relevant settings from the request sent by the Azure AI Search skillset instead of from configuration (for example, the embedding model and chunk size to use).
+> Note that the settings of the Function Apps shouldn't be changed, as the [power skill](https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator) was tweaked for this project to take any relevant settings from the request sent by the Azure AI Search skillset instead of from configuration (for example, the embedding model and chunk size to use).
 
 | Setting | Purpose | Default value |
 | ------- | ------- | ------------- |
@@ -121,12 +123,14 @@ The ARM template deploys the services and sets the configuration settings for th
 | `OpenAIGptDeployment` | The deployment name of the [Azure OpenAI GPT model](https://learn.microsoft.com/azure/ai-services/openai/concepts/models) to use | `gpt-35-turbo` |
 | `StorageContainerNameBlobDocuments`* | The name of the storage container that contains the documents | `blob-documents` |
 | `StorageContainerNameBlobChunks`* | The name of the storage container that contains the document chunks | `blob-chunks` |
-| `TextEmbedderNumTokens` | The number of tokens per chunk when splitting documents into smaller pieces | `2048` |
-| `TextEmbedderTokenOverlap` | The number of tokens to overlap between consecutive chunks | `0` |
-| `TextEmbedderMinChunkSize` | The minimum number of tokens of a chunk (smaller chunks are excluded) | `10` |
+| `TextChunkerPageLength` | In case of integrated vectorization, the number of characters per page (chunk) when splitting documents into smaller pieces | `2000` |
+| `TextChunkerPageOverlap` | In case of integrated vectorization, the number of characters to overlap between consecutive pages (chunks) | `500` |
+| `TextEmbedderNumTokens` | In case of external vectorization, the number of tokens per chunk when splitting documents into smaller pieces | `2048` |
+| `TextEmbedderTokenOverlap` | In case of external vectorization, the number of tokens to overlap between consecutive chunks | `0` |
+| `TextEmbedderMinChunkSize` | In case of external vectorization, the minimum number of tokens of a chunk (smaller chunks are excluded) | `10` |
 | `SearchIndexNameBlobDocuments`* | The name of the search index that contains the documents | `blob-documents` |
 | `SearchIndexNameBlobChunks`* | The name of the search index that contains the document chunks | `blob-chunks` |
-| `SearchIndexerSkillType`* | The type of chunking and embedding skill to use as part of the documents indexer: `pull` uses a [knowledge store](https://learn.microsoft.com/azure/search/knowledge-store-concept-intro) to store the chunk data in blobs and a separate indexer to pull these into the document chunks index; `push` directly uploads the data from the custom skill into the document chunks index | `pull` |
+| `SearchIndexerSkillType`* | The type of chunking and embedding skill to use as part of the documents indexer: `integrated` uses [integrated vectorization](https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization); `pull` uses a custom skill with a [knowledge store](https://learn.microsoft.com/azure/search/knowledge-store-concept-intro) to store the chunk data in blobs and a separate indexer to pull these into the document chunks index; `push` directly uploads the data from a custom skill into the document chunks index | `integrated` |
 | `SearchIndexerScheduleMinutes`* | The number of minutes between indexer executions in Azure AI Search | `5` |
 | `InitialDocumentUrls` | A space-separated list of URLs for the documents to include by default | A [resiliency](https://azure.microsoft.com/mediahandler/files/resourcefiles/resilience-in-azure-whitepaper/Resiliency-whitepaper.pdf) and [compliance](https://azure.microsoft.com/mediahandler/files/resourcefiles/data-residency-data-sovereignty-and-compliance-in-the-microsoft-cloud/Data_Residency_Data_Sovereignty_Compliance_Microsoft_Cloud.pdf) document |
 | `DefaultSystemRoleInformation` | The default instructions for the AI model | "You are an AI assistant that helps people find information." |

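The `TextChunkerPageLength` and `TextChunkerPageOverlap` settings in the README table describe character-based chunking with overlap between consecutive pages. The following is a minimal sketch of that idea, assuming a plain sliding window. With integrated vectorization the actual splitting is presumably done by the search service and may behave differently (for example, by respecting sentence boundaries). `TextChunkerSketch` and `Split` are hypothetical names that do not appear in this commit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of the repository.
public static class TextChunkerSketch
{
    // Splits text into character windows of at most pageLength characters,
    // where each window starts (pageLength - pageOverlap) characters after
    // the previous one, so consecutive pages share pageOverlap characters.
    public static List<string> Split(string text, int pageLength = 2000, int pageOverlap = 500)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));
        if (pageLength <= 0) throw new ArgumentOutOfRangeException(nameof(pageLength));
        if (pageOverlap < 0 || pageOverlap >= pageLength) throw new ArgumentOutOfRangeException(nameof(pageOverlap));

        var pages = new List<string>();
        var step = pageLength - pageOverlap;
        for (var start = 0; start < text.Length; start += step)
        {
            var length = Math.Min(pageLength, text.Length - start);
            pages.Add(text.Substring(start, length));

            // Stop once a page has reached the end of the text, to avoid
            // emitting a trailing page that only repeats overlap content.
            if (start + length >= text.Length) break;
        }
        return pages;
    }
}
```

With the defaults from the table (2000 characters per page, 500 characters overlap), a 4500-character document yields three pages starting at offsets 0, 1500 and 3000 in this sketch.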
diff --git a/azuredeploy.json b/azuredeploy.json
@@ -61,6 +61,8 @@
         "openaiApiVersion": "2023-06-01-preview",
         "storageContainerNameBlobDocuments": "blob-documents",
         "storageContainerNameBlobChunks": "blob-chunks",
+        "textChunkerPageLength": 2000,
+        "textChunkerPageOverlap": 500,
         "textEmbedderNumTokens": 2048,
         "textEmbedderTokenOverlap": 0,
         "textEmbedderMinChunkSize": 10,
@@ -534,6 +536,14 @@
             "type": "string",
             "value": "[variables('functionApiKey')]"
         },
+        "textChunkerPageLength": {
+            "type": "string",
+            "value": "[variables('textChunkerPageLength')]"
+        },
+        "textChunkerPageOverlap": {
+            "type": "string",
+            "value": "[variables('textChunkerPageOverlap')]"
+        },
         "textEmbedderNumTokens": {
             "type": "int",
             "value": "[variables('textEmbedderNumTokens')]"

diff --git a/src/Azure.AISearch.WebApp/AppSettings.cs b/src/Azure.AISearch.WebApp/AppSettings.cs
@@ -14,6 +14,8 @@ public class AppSettings
     public string? TextEmbedderFunctionEndpointPython { get; set; }
     public string? TextEmbedderFunctionEndpointDotNet { get; set; }
     public string? TextEmbedderFunctionApiKey { get; set; }
+    public int? TextChunkerPageLength { get; set; } // If unspecified, will use 2000 characters per page.
+    public int? TextChunkerPageOverlap { get; set; } // If unspecified, will use 500 characters overlap.
     public int? TextEmbedderNumTokens { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
     public int? TextEmbedderTokenOverlap { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
     public int? TextEmbedderMinChunkSize { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
@@ -22,7 +24,7 @@ public class AppSettings
     public string? SearchServiceSku { get; set; }
     public string? SearchIndexNameBlobDocuments { get; set; }
     public string? SearchIndexNameBlobChunks { get; set; }
-    public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "pull" model.
+    public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "integrated" model.
     public int? SearchIndexerScheduleMinutes { get; set; } // If unspecified, will be set to 5 minutes.
     public string? InitialDocumentUrls { get; set; }
     public string? DefaultSystemRoleInformation { get; set; }

diff --git a/src/Azure.AISearch.WebApp/Constants.cs b/src/Azure.AISearch.WebApp/Constants.cs
@@ -7,11 +7,14 @@ public static class Constants
     public static class ConfigurationNames
     {
         public const string SemanticConfigurationNameDefault = "default";
-        public const string VectorSearchConfigurationNameDefault = "default";
+        public const string VectorSearchProfileNameDefault = "default-profile";
+        public const string VectorSearchAlgorithNameDefault = "default-algorithm";
+        public const string VectorSearchVectorizerNameDefault = "default-vectorizer";
     }
 
     public static class SearchIndexerSkillTypes
     {
+        public const string Integrated = "integrated";
         public const string Pull = "pull";
         public const string Push = "push";
     }

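The comments in `AppSettings.cs` state that an unspecified `SearchIndexerSkillType` now falls back to the `integrated` model. A minimal sketch of how such a fallback could be resolved against the three constants above is shown here. The `SkillTypeSketch` class, the `Resolve` helper, the case-insensitive matching and the exception on unknown values are all assumptions made for illustration; the commit does not show how the web app actually handles this.

```csharp
#nullable enable
using System;

// Hypothetical illustration only; not part of the repository.
public static class SkillTypeSketch
{
    // Same values as Constants.SearchIndexerSkillTypes in this commit.
    public const string Integrated = "integrated";
    public const string Pull = "pull";
    public const string Push = "push";

    // Returns the configured skill type, or "integrated" when none is configured.
    // Assumption: matching is case-insensitive and unknown values are rejected.
    public static string Resolve(string? configuredValue)
    {
        if (string.IsNullOrWhiteSpace(configuredValue))
        {
            return Integrated;
        }

        var value = configuredValue.Trim();
        foreach (var known in new[] { Integrated, Pull, Push })
        {
            if (string.Equals(value, known, StringComparison.OrdinalIgnoreCase))
            {
                return known;
            }
        }

        throw new ArgumentException($"Unknown search indexer skill type '{value}'.", nameof(configuredValue));
    }
}
```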
diff --git a/src/Azure.AISearch.WebApp/Models/AppSettingsOverride.cs b/src/Azure.AISearch.WebApp/Models/AppSettingsOverride.cs
@@ -4,9 +4,11 @@ namespace Azure.AISearch.WebApp.Models;
 // as they are not used anywhere else and don't depend on other settings.
 public class AppSettingsOverride
 {
+    public int? TextChunkerPageLength { get; set; } // If unspecified, will use 2000 characters per page.
+    public int? TextChunkerPageOverlap { get; set; } // If unspecified, will use 500 characters overlap.
     public int? TextEmbedderNumTokens { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
     public int? TextEmbedderTokenOverlap { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
     public int? TextEmbedderMinChunkSize { get; set; } // If unspecified, will use the default as configured in the text embedder Function App.
-    public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "pull" model.
+    public string? SearchIndexerSkillType { get; set; } // If unspecified, will use the "integrated" model.
     public int? SearchIndexerScheduleMinutes { get; set; } // If unspecified, will be set to 5 minutes.
 }
diff --git a/src/Azure.AISearch.WebApp/Models/DocumentChunk.cs b/src/Azure.AISearch.WebApp/Models/DocumentChunk.cs
@@ -3,9 +3,6 @@ namespace Azure.AISearch.WebApp.Models;
 public class DocumentChunk
 {
     public string? Id { get; set; }
-    public long ChunkIndex { get; set; }
-    public long ChunkOffset { get; set; }
-    public long ChunkLength { get; set; }
     public string? Content { get; set; }
     public IReadOnlyList<float>? ContentVector { get; set; }
     public string? SourceDocumentId { get; set; }

diff --git a/src/Azure.AISearch.WebApp/Models/SearchRequest.cs b/src/Azure.AISearch.WebApp/Models/SearchRequest.cs
@@ -10,6 +10,7 @@ public class SearchRequest
     public QuerySyntax QuerySyntax { get; set; } = QuerySyntax.Simple;
     public DataSourceType DataSource { get; set; } = DataSourceType.None;
     public string? OpenAIGptDeployment { get; set; }
+    public bool UseIntegratedVectorization { get; set; }
     public int? VectorNearestNeighborsCount { get; set; } = Constants.Defaults.VectorNearestNeighborsCount;
     public bool LimitToDataSource { get; set; } = true; // "Limit responses to your data content"
     public string? SystemRoleInformation { get; set; } // Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant’s personality, tell it what it should and shouldn’t answer, and tell it how to format responses. There’s no token limit for this section, but it will be included with every API call, so it counts against the overall token limit.

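The new `UseIntegratedVectorization` flag suggests that a search request can either let the search service vectorize the query text, or supply an embedding that the web app generated itself. The sketch below illustrates that branching by building a single vector query object as JSON. It assumes the `kind: "text"` and `kind: "vector"` shapes of the Azure AI Search `2023-10-01-Preview` REST API. `VectorQuerySketch`, `BuildVectorQueryJson` and the `contentVector` field name are hypothetical, and the commit may well use the .NET SDK types instead.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical illustration only; not part of the repository.
public static class VectorQuerySketch
{
    // Builds one vector query object, assuming the REST payload shape of the
    // 2023-10-01-Preview API version. The default field name is an assumption.
    public static string BuildVectorQueryJson(
        bool useIntegratedVectorization,
        string queryText,
        IReadOnlyList<float>? queryEmbedding,
        int k,
        string vectorField = "contentVector")
    {
        var query = new Dictionary<string, object>
        {
            ["fields"] = vectorField,
            ["k"] = k
        };

        if (useIntegratedVectorization)
        {
            // Integrated vectorization: send the raw text and let the search
            // service generate the query embedding.
            query["kind"] = "text";
            query["text"] = queryText;
        }
        else
        {
            // External vectorization: the caller has already generated the embedding.
            if (queryEmbedding is null) throw new ArgumentNullException(nameof(queryEmbedding));
            query["kind"] = "vector";
            query["vector"] = queryEmbedding;
        }

        return JsonSerializer.Serialize(query);
    }
}
```

In the integrated case the caller can pass `null` for `queryEmbedding`, which avoids a separate round trip to the embeddings model from the web app.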
diff --git a/src/Azure.AISearch.WebApp/Models/SearchResult.cs b/src/Azure.AISearch.WebApp/Models/SearchResult.cs
@@ -6,7 +6,6 @@ public class SearchResult
     public string? SearchIndexKey { get; set; }
     public string? DocumentId { get; set; }
     public string? DocumentTitle { get; set; }
-    public int? ChunkIndex { get; set; }
     public double? Score { get; set; }
     public IDictionary<string, IList<string>> Highlights { get; set; } = new Dictionary<string, IList<string>>();
     public IList<string> Captions { get; set; } = new List<string>();