This comprehensive guide provides both users and administrators with detailed instructions on deploying, configuring, and managing the GPT-RAG solution within a Zero Trust Architecture.
- Overview
- Concepts
- How-to: User
- How-to: Administration
- Reference
- Troubleshooting
The GPT-RAG Solution Accelerator enables organizations to enhance customer support, decision-making, and data-driven processes with Generative AI, empowering systems to handle complex inquiries across extensive datasets. It offers secure, efficient deployment for easy integration with existing operations, adaptable for both simple and advanced information retrieval.
Beyond classical Retrieval-Augmented Generation (RAG) capabilities, the accelerator incorporates agents that support sophisticated scenarios such as NL2SQL query generation and other context-aware data interactions. This flexibility enables advanced use cases where AI can seamlessly retrieve and interpret information, meeting diverse technical requirements.
The GPT-RAG Solution Accelerator follows a modular approach, consisting of three components: Data Ingestion, Orchestrator, and App Front-End, which utilizes the Backend for Front-End pattern to provide a scalable and efficient web interface.
Adopting a Zero Trust approach in Azure, as implemented by the GPT-RAG Solution Accelerator, provides a strong security foundation to safeguard your organization’s data and resources. Instead of using public endpoints, which expose services to the internet and increase susceptibility to cyber threats, this architecture ensures all access occurs within a secure, isolated network environment, reducing the attack surface and mitigating the risk of unauthorized access.
GPT-RAG's Zero Trust architecture with private endpoints ensures network isolation for sensitive data, enabling efficient Azure service integration without public IP exposure. This approach mitigates risks like data breaches and unauthorized access, creating a controlled environment that strengthens data integrity and confidentiality.
The GPT-RAG Solution Accelerator's agentic orchestration allows organizations to design tailored orchestration flows that coordinate multiple specialized agents. This customization ensures that complex queries are handled with precision and efficiency, leading to more accurate and contextually relevant AI responses.
Additionally, the solution’s custom chunking strategy tailors content segmentation to fit the unique characteristics of different data types and document structures. Aligning chunking methods to data specifics enhances retrieval speed, accuracy, and AI responsiveness, ensuring information is precise and contextually relevant.
The solution leverages a Zero Trust Architecture to ensure maximum security and compliance. All components are securely integrated within a virtual network, and communication between services is strictly controlled.
The diagram above illustrates the Zero Trust architecture. The GPT-RAG Solution Accelerator scope encompasses components within the Enterprise RAG resource group, providing essential Zero Trust functionalities.
- Virtual Network (VNet): Isolates resources and controls inbound and outbound traffic.
- Azure App Service: Hosts the front-end application.
- Azure Functions: Executes serverless functions for data ingestion and orchestration.
- Azure Storage Account: Stores data blobs for retrieval.
- Azure AI Search: Indexes and searches data efficiently.
- Azure OpenAI: Generates responses and vector embeddings.
- Azure AI Services: Reads documents for data ingestion.
- Azure Cosmos DB: Stores conversation history and metadata to improve quality.
- Azure Key Vault: Manages secrets used by the solution.
- Azure Private Endpoints: Secures network communication between services.
- Data Science VM: Provides a secure Bastion environment for admins and developers to configure the solution.
For more information about Zero Trust architecture, see the Enterprise RAG (GPT-RAG) Architecture page.
Tip
Need the Visio diagrams used in this documentation? You can easily download them here: Enterprise RAG.
Data ingestion is a crucial part of the solution, enabling the system to retrieve accurate and up-to-date information.
- Data Collection: Documents are ingested from the documents blob container in the GPT-RAG storage account.
- Data Preprocessing: The documents are prepared for indexing in the AI Search index, including breaking them into smaller chunks and optimizing them for efficient searchability.
- Indexing for Search: The chunks are then indexed within Azure AI Search, allowing for efficient retrieval during query processing.
Note
The ingestion process uses a pull approach: an Azure AI Search indexer checks blob storage hourly, triggering a Function App to preprocess and chunk new documents for indexing. Execution frequency is configurable.
For more information about the data ingestion process, see the GPT-RAG ingestion function app repo.
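To make the preprocessing step more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only and not the accelerator's actual chunking strategy, which lives in the ingestion repository; the function name `chunk_text` and the default size and overlap values are hypothetical.

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Each chunk after the first repeats the last `overlap` characters of the
    previous chunk, so a sentence cut at a boundary still appears whole in
    at least one chunk.
    """
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("require max_chars > 0 and 0 <= overlap < max_chars")
    chunks = []
    step = max_chars - overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", max_chars=4, overlap=1):
        print(chunk)
```

A production strategy would typically split on document structure (headings, paragraphs, tables) rather than raw character counts.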
The solution uses an Agentic Orchestration approach, enabling agents to operate autonomously for efficient handling of user requests. The orchestration flow described below provides a typical structure but can be tailored to meet specific requirements.
- User Interaction: The user submits a query through the front-end application.
- Orchestration Process: The Orchestrator initiates a group chat with specialized agents to address the query.
- Agents retrieve relevant data from an AI Search index or a SQL database.
- The GPT model generates a response based on the collected information.
- Response Delivery: The front-end application returns a grounded answer to the user.
Note
Running as a Function App, the orchestrator offers scalable and efficient management of agent operations.
For more information about the agentic orchestration, see the GPT-RAG orchestration function app repo.
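The flow above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: ordinary functions stand in for the specialized agents and for the GPT model, and none of the names (`orchestrate`, `Answer`, the retriever keys) come from the accelerator or from AutoGen.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins: a retriever plays the role of a specialized agent,
# and the generator plays the role of the GPT model.
Retriever = Callable[[str], list[str]]
Generator = Callable[[str, str], str]


@dataclass
class Answer:
    text: str
    sources: list[str]


def orchestrate(question: str, retrievers: dict[str, Retriever], generate: Generator) -> Answer:
    """Collect passages from every retriever, then ask the model for a grounded answer."""
    passages: list[tuple[str, str]] = []
    for name, retrieve in retrievers.items():
        passages.extend((name, passage) for passage in retrieve(question))
    context = "\n".join(passage for _, passage in passages)
    text = generate(question, context)
    # Report which retrievers actually contributed context.
    return Answer(text=text, sources=sorted({name for name, _ in passages}))


if __name__ == "__main__":
    stub_retrievers = {
        "search": lambda q: ["Passage from the AI Search index."],
        "sql": lambda q: [],
    }
    answer = orchestrate("What is GPT-RAG?", stub_retrievers, lambda q, ctx: f"Based on: {ctx}")
    print(answer)
```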
The networking architecture for the GPT-RAG solution leverages Azure’s advanced features to ensure a secure, flexible, and isolated environment, adhering to Zero Trust principles through the use of private endpoints and stringent access controls.
- Azure Virtual Network (VNet): Provides a logically isolated network environment, segmented into subnets for different application tiers, ensuring organized and secure deployment of resources.
- Azure Private Link and Private Endpoints: Establish secure, private connections to Azure services, keeping traffic within the Microsoft backbone network and minimizing exposure to the public internet.
- Private DNS Zones: Enable internal name resolution within the Virtual Network, ensuring secure communication between services without public exposure.
- Network Security Groups (NSGs): Control and restrict inbound and outbound traffic to Azure resources with granular security rules, enhancing the protection of your network.
- Azure Bastion: Provides secure access for the development team to the VM used for GPT-RAG deployment via the Azure portal, without exposing it to the internet, ensuring secure and controlled deployment activities.
Note
The Infrastructure as Code (IaC) Bicep templates included in this solution accelerator allow you to automatically provision networking resources. The templates support customization so you can follow your organization's naming conventions and address range standards. Alternatively, if you prefer, you can choose to set up these resources manually.
External users can connect securely via Azure Front Door with WAF. Internal users can access through VPN or ExpressRoute. VNet Peering or Private Endpoints can be configured to connect GPT-RAG resources to your VNet. Further configuration details will follow.
The solution uses Microsoft Entra ID (formerly Azure Active Directory) to authenticate users accessing the front-end application. This ensures secure access control and integration with organizational identity management.
Authorization is managed by defining specific Entra ID users and groups that are permitted to use the application. These allowed users and groups are configured directly in the App Service settings, ensuring that only authorized individuals have access to the application.
This section guides users through the essential tasks for interacting with the GPT-RAG Solution Accelerator. It is divided into three main parts:
- Accessing the Application
- Managing Document Uploads
- Reindexing Documents in AI Search
To connect to the web frontend of the GPT-RAG Solution Accelerator:
- Navigate to the Web Application:
  - Open your preferred web browser.
  - Enter the web application endpoint URL provided by your administrator. The endpoint will follow a format similar to webgpt0-[random_suffix].azurewebsites.net, where [random_suffix] is a unique identifier assigned during deployment.
  - Example: https://webgpt0-abc123.azurewebsites.net
- Log in using your authorized credentials to access the application's interface.
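If you want to sanity-check a URL you were given against the format described above, a small helper like the following can do it. The exact character set of the random suffix is an assumption (lowercase letters and digits), and the function name is hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumption: the suffix consists of lowercase letters and digits. The guide only
# states that the host looks like webgpt0-[random_suffix].azurewebsites.net.
_HOST_PATTERN = re.compile(r"^webgpt0-[a-z0-9]+\.azurewebsites\.net$")


def looks_like_frontend_url(url: str) -> bool:
    """Return True if the URL uses HTTPS and matches the expected host format."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(_HOST_PATTERN.match(parsed.hostname or ""))


if __name__ == "__main__":
    print(looks_like_frontend_url("https://webgpt0-abc123.azurewebsites.net"))
```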
This task updates documents to be indexed and is intended for users responsible for performing these updates. Users who are not involved in updating documents do not need to handle this task.
Before uploading documents, ensure that you have the necessary permissions in Azure:
- Azure Role Required: You must have the Storage Blob Data Contributor role assigned to your Microsoft Entra ID account on the storage account you will be accessing. This role allows you to upload and manage blobs within the storage containers.
Note
If you do not have the required role, contact your Azure administrator to obtain the necessary permissions.
- Log in to the Azure Portal:
  - Navigate to Azure Portal and sign in with your Azure credentials.
- Locate the Storage Account:
  - In the Azure Portal, go to the Storage Accounts section.
  - Select the storage account name provided by your administrator.
Tip
It is the storage account without the suffixes "ing" or "orc".
- Navigate to the Documents Container:
  - Within the selected storage account, click on Containers in the left-hand menu.
  - Locate and select the documents container from the list.
- Upload Your Files:
  - Click the Upload button at the top of the container view.
  - In the upload pane, click Browse to select the files you wish to upload from your local machine.
  - After selecting the files, click Upload to begin the process.
  - Wait for the upload to complete. Once finished, your documents will be available in the documents container for ingestion.
Sample document upload screen - Azure Portal
Automatic Indexing: The AI Search indexer automatically checks for new documents every hour, so uploaded documents are indexed without manual intervention. To index documents immediately, refer to the Reindexing Documents in AI Search section.
If you need to perform updates frequently, consider using Azure Storage Explorer for a more streamlined experience. Visit the Azure Storage Explorer page to download and learn more about this convenient tool.
This task updates the retrieval index to ensure that search results remain accurate and efficient. It is designated for users responsible for indexing operations. Users who are not handling the retrieval index do not need to perform this task.
Before reindexing, ensure that you have the necessary permissions:
- Azure Role Required: You must have the Cognitive Search Contributor or Cognitive Search Index Administrator role assigned to your Microsoft Entra ID account on the AI Search resource.
Note
If you do not have the required role, contact your Azure administrator to obtain the necessary permissions.
- Log in to the Azure Portal:
  - Navigate to Azure Portal and sign in with your Azure credentials.
- Navigate to AI Search:
  - In the Azure Portal, go to the Resource Groups section.
  - Select the resource group associated with your application.
  - Within the resource group, locate and select the AI Search resource.
- Open Search Management:
  - In the AI Search resource overview, click on Search Management in the left-hand menu.
- Access GPT-RAG Indexer:
  - Click on Indexers to view the list of available indexers.
  - Locate and select the ragindex-indexer-chunk-documents indexer.
- Run the Indexer:
  - Click the Run button to initiate the reindexing process for the ragindex-indexer-chunk-documents indexer.
Tip
If you wish to reindex all content, click Reset before running the indexer.
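The same Run and Reset actions are also exposed by the Azure AI Search REST API. The sketch below only builds the request, using the Python standard library. Treat the details as assumptions to verify for your environment: it authenticates with an admin API key (your deployment may require role-based tokens instead), the service name is a placeholder, and the API version is one published version that you may need to change.

```python
import urllib.request

API_VERSION = "2023-11-01"  # assumption: adjust to the REST API version you use


def build_indexer_request(service_name: str, indexer_name: str, action: str, api_key: str) -> urllib.request.Request:
    """Build a POST request that runs or resets an Azure AI Search indexer."""
    if action not in ("run", "reset"):
        raise ValueError("action must be 'run' or 'reset'")
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexers/{indexer_name}/{action}?api-version={API_VERSION}"
    )
    # Both operations take an empty body.
    return urllib.request.Request(url, data=b"", method="POST", headers={"api-key": api_key})


if __name__ == "__main__":
    # "my-search-service" and the key are placeholders, not values from this guide.
    request = build_indexer_request(
        "my-search-service", "ragindex-indexer-chunk-documents", "run", "<admin-api-key>"
    )
    print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` only works from a machine that can reach the search service's private endpoint.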
This section provides step-by-step guides for common administrative tasks.
This setup guide provides step-by-step instructions for provisioning a resource group with all the necessary components to ensure the solution operates efficiently. The diagram below highlights the resource group and its corresponding components, outlined in red, that will be provisioned during this process.
GPT-RAG Zero Trust Architecture
- Azure subscription with access to Azure OpenAI.
- You need either the Owner role or both the Contributor and User Access Administrator roles at the subscription level. Alternatively, you can create a custom role. Learn how to create a Custom Role here.
- Confirm you have the required quota to provision resources in the chosen Azure region for deployment. For details on resources and SKUs, refer to Azure Resources.
- Agree to the Responsible AI terms by initiating the creation of an Azure AI service resource in the portal.
Note
The last step is unnecessary if an Azure AI service resource already exists in the subscription.
This guide will walk you through each step required to deploy the GPT-RAG solution in a Zero Trust architecture. Follow these steps to ensure a smooth and successful deployment. You can check the Installation and Post-Installation Checklist to verify that everything is as expected.
- Plan Your Deployment
- Download the Repository
- Select Zero Trust Installation
- Define Your Network Creation Scenario
- Set Network Address Range (Optional)
- Customize Resource Names (Optional)
- Reuse Existing Resources (Optional)
- Log In to Azure
- Provision Infrastructure Components
- Manually Configure Network Resources (Optional)
- Deploy Application Components
Ensure you have the following details before starting:
- Subscription Name
- Resource Group Name
- Azure Region
- Azure Environment Name (e.g., gpt-rag-dev, gpt-rag-poc)
Note
Choose a region with sufficient service quotas. Commonly tested regions include northcentralus, eastus2, eastus, and westus.
Select your preferred network setup:
- Automatic Setup with Default Address Range
Automatically creates network resources with default address ranges in the GPT-RAG resource group.
Network Item | Address Range |
---|---|
ai-vnet | 10.0.0.0/23 |
ai-subnet | 10.0.0.0/26 |
app-services-subnet | 10.0.0.192/26 |
database-subnet | 10.0.1.0/26 |
app-int-subnet | 10.0.0.128/26 |
AzureBastionSubnet | 10.0.0.64/26 |
Note
Each /26 subnet provides 59 usable IP addresses, with Azure reserving 5 addresses per subnet.
- Automatic Setup with Custom Address Ranges
If custom addressing is required, you can adjust address ranges in the configuration files to prevent overlap with existing networks.
Tip
Choose this option if you want custom addressing to avoid overlap with existing networks. This helps prevent issues with direct connections using VNet peering, VPN gateways, or ExpressRoute.
- Manual Network Setup
If you prefer to manually create the VNet, subnets, and other network resources, you can configure these outside of the Bicep templates, which will then deploy only the non-network resources.
Tip
Choose this option if deploying across subscriptions (e.g., Connectivity subscription) or if you want to adopt a network topology different from the provided architecture.
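When deciding between the default and a custom address range, it helps to check the candidate range against the networks you already use. The following sketch relies on Python's standard `ipaddress` module; the existing ranges in the usage example are made up for illustration.

```python
import ipaddress


def overlapping(candidate: str, existing: list[str]) -> list[str]:
    """Return the existing CIDR ranges that overlap the candidate VNet range."""
    candidate_net = ipaddress.ip_network(candidate)
    return [cidr for cidr in existing if candidate_net.overlaps(ipaddress.ip_network(cidr))]


if __name__ == "__main__":
    # Hypothetical ranges already in use in an organization.
    in_use = ["10.0.1.0/24", "10.2.0.0/16", "192.168.0.0/24"]
    print(overlapping("10.0.0.0/23", in_use))  # default GPT-RAG VNet range
    print(overlapping("10.1.0.0/23", in_use))  # a custom range
```

An empty result means the candidate range can be peered or routed to those networks without an address conflict.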
To optimize your setup, you can reuse non-networking resources already deployed within the same subscription, such as Azure OpenAI, Cosmos DB, and Key Vault. If you choose to reuse any of these resources, ensure you have their names and resource group details ready.
Important
If you’re reusing an existing Virtual Network (VNet), you must manually create all related network resources. This includes configuring subnets, private endpoints, and network interfaces as outlined in the Manual Network Setup scenario. In this case, the Bicep templates will not deploy network resources automatically when an existing VNet is reused.
By default, azd generates a unique, random name for each resource based on the environment name, subscription, and location. You can customize resource names during solution provisioning if desired, so this is a good time to note down your preferred names.
If you use tags to manage resources (e.g., business-unit, cost-center), define them in advance to apply during installation.
Initialize the repository:
azd init -t azure/gpt-rag
Note: Add -b agentic if using the Agentic AutoGen-based orchestrator.
azd init -t azure/gpt-rag -b agentic
Enable network isolation:
azd env set AZURE_NETWORK_ISOLATION true
Select a network setup option:
- For Automatic Setup with Default Address Ranges, skip to Step 6.
- For Automatic Setup with Custom Address Ranges, proceed to Step 5.
- For Manual Network Setup, run the following command before continuing to Step 6 (more details on manual setup are provided later):
azd env set VNET_REUSE true
To set custom address ranges, use the variables below with azd env set:
Environment Variable | Network Item |
---|---|
AZURE_VNET_ADDRESS | AI VNet |
AZURE_AI_SUBNET_PREFIX | AI Subnet |
AZURE_APP_INT_SUBNET_PREFIX | App Internal Subnet |
AZURE_APP_SERVICES_SUBNET_PREFIX | App Services Subnet |
AZURE_BASTION_SUBNET_PREFIX | Bastion Subnet |
AZURE_DATABASE_SUBNET_PREFIX | Database Subnet |
Example:
azd env set AZURE_VNET_ADDRESS 10.1.0.0/23
azd env set AZURE_AI_SUBNET_PREFIX 10.1.0.0/26
azd env set AZURE_APP_SERVICES_SUBNET_PREFIX 10.1.0.64/26
azd env set AZURE_DATABASE_SUBNET_PREFIX 10.1.1.0/26
azd env set AZURE_APP_INT_SUBNET_PREFIX 10.1.0.128/26
azd env set AZURE_BASTION_SUBNET_PREFIX 10.1.0.192/26
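The example above assigns consecutive /26 blocks inside the custom VNet range. If you prefer to derive such a layout rather than type it by hand, a sketch like the following can generate the commands. The environment variable names come from the table in this guide; the assignment order and the function name are assumptions made for illustration.

```python
import ipaddress

# Variable names are from this guide; the order in which blocks are handed out is a choice.
SUBNET_VARS = [
    "AZURE_AI_SUBNET_PREFIX",
    "AZURE_APP_SERVICES_SUBNET_PREFIX",
    "AZURE_APP_INT_SUBNET_PREFIX",
    "AZURE_BASTION_SUBNET_PREFIX",
    "AZURE_DATABASE_SUBNET_PREFIX",
]


def azd_commands(vnet_cidr: str) -> list[str]:
    """Carve consecutive /26 subnets out of the VNet and emit azd env set commands."""
    vnet = ipaddress.ip_network(vnet_cidr)  # raises ValueError if host bits are set
    subnets = list(vnet.subnets(new_prefix=26))
    if len(subnets) < len(SUBNET_VARS):
        raise ValueError(f"{vnet} is too small for {len(SUBNET_VARS)} /26 subnets")
    commands = [f"azd env set AZURE_VNET_ADDRESS {vnet}"]
    commands += [f"azd env set {var} {net}" for var, net in zip(SUBNET_VARS, subnets)]
    return commands


if __name__ == "__main__":
    print("\n".join(azd_commands("10.1.0.0/23")))
```

For 10.1.0.0/23 this produces the same assignments as the example above.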
To customize names, set environment variables for each resource. For example, the following command sets the name of the Storage Account:
azd env set AZURE_STORAGE_ACCOUNT_NAME <yourResourceName>
See Customizing resource names to find out which variables correspond to each resource.
You can set environment variables if you want to reuse existing resources in the same subscription.
Example for AI Services:
azd env set AI_SERVICES_REUSE true
azd env set AI_SERVICES_RESOURCE_GROUP_NAME rg-gptrag-common
azd env set AI_SERVICES_NAME my-shared-ai-service
More details at Bring Your Own Resources.
Azure Developer CLI:
azd auth login
Azure CLI:
az login
Run:
azd provision
This section is intended for those who have chosen to manually create their network resources. If this is not the case, skip to Step 11.
- Create your VNet and subnets based on your organization's network architecture.
- Ensure that address ranges do not overlap with existing VNets to maintain connectivity.
- Check this reference to learn how to create VNets and Subnets in the Azure Portal:
Quickstart: Create a virtual network using the Azure portal
We recommend following this network topology to align with the Zero Trust architecture. However, you may use your organization’s VNet and subnet standards if preferred. For reference, the default addressing used in the Bicep template is shown below:
Name | Address Range |
---|---|
ai-vnet | 10.0.0.0/23 |
ai-subnet | 10.0.0.0/26 |
app-services-subnet | 10.0.0.192/26 |
database-subnet | 10.0.1.0/26 |
app-int-subnet | 10.0.0.128/26 |
AzureBastionSubnet | 10.0.0.64/26 |
Important
Use network addressing that avoids overlaps with your existing VNets. Overlapping address ranges prevent direct connections via VNet peering, VPN gateways, or ExpressRoute.
- Manually create private endpoints for the following Azure services:
- Data Ingestion Function App
- Azure Storage Account
- Azure Cosmos DB
- Azure Key Vault
- Orchestrator Function App
- Frontend Web App
- Azure AI Services
- Azure OpenAI
- Azure Search
- Ensure they are correctly associated with the appropriate subnets.
- Check this reference to learn how to create Private Endpoints in Azure Portal:
Create a private endpoint using the Azure portal
- When creating the private endpoint in the portal, a Private DNS Zone for name resolution will also be set up. Ensure that all Private DNS Zones are created correctly.
- Define NSGs with rules that align with your security policies.
- Apply NSGs to the five subnets to control traffic flow.
- Check this reference to learn how to create Network Security Groups in Azure Portal:
Filter network traffic with a network security group using the Azure portal
Azure AI Search will access the Blob Storage Account and the Function App where the chunking function is located.
- Configure Shared Private Link with the Storage Account.
- Configure Shared Private Link with the Function App that performs the chunking.
- Check this reference to learn how to Configure shared private link for Azure AI Search
The App Service Plan supporting GPT-RAG Function Apps needs integration with the ai-vnet.
- Configure VNet integration for the App Service Plan.
- Check this reference to learn how to Integrate your app with an Azure virtual network
- Create a Data Science Virtual Machine and configure a bastion for VM Access.
- Check this reference to learn how to Provision a Data Science Virtual Machine
- Check this reference to learn how to Connect to a Windows VM using Azure Bastion
Use this configuration for the VM:
- Operating System: Windows (Windows Server 2019 Datacenter)
- SKU: Standard_D4s_v3 (4 vCPUs, 16 GiB memory)
- Image Publisher: microsoft-dsvm (Data Science VM)
- Image Offer: dsvm-win-2019
- Ensure all network resources are deployed successfully.
- Verify that access controls and network configurations align with Zero Trust principles.
Deploy the application components by connecting through the Data Science VM with Bastion (Step 9 or manual setup in Step 10) or by directly accessing the VNet via a secure connection like ExpressRoute or VPN.
Note
If you have direct VNet access, you can deploy from your own machine, eliminating the need for a Bastion VM. Instructions for both options are below.
Access VNet from Your Machine:
azd package
azd deploy
Access VNet from the Data Science VM (Bastion):
- Log in to the VM using the password stored in Key Vault.
- Update azd:
choco upgrade azd
- Create a new directory and initialize deployment:
Important
Use the same environment name, subscription, and region as initial provisioning.
mkdir deploy
cd deploy
azd init -t azure/gpt-rag
azd auth login
azd env refresh
azd package
azd deploy
🎉 Congratulations! Your Zero Trust deployment is now complete.
Note
After the initial deployment, you may choose to customize or update specific features, such as adjusting prompts, adding a logo to the frontend, testing different chunking strategies, or configuring a custom orchestration strategy like NL2SQL. For detailed guidance on these optional customizations, refer to the deployment section in each component's repository: Orchestrator, Front-end, and Data Ingestion.
This section outlines the various network configuration scenarios for deploying the GPT-RAG Solution Accelerator. Depending on your requirements and existing infrastructure, you can choose one of the following approaches to manage network resources:
- Automatic Network Creation
- Automatic Creation with Custom Addressing
- Manual Network Setup
For a straightforward deployment, GPT-RAG can automatically create all essential network resources. To enable this option, run azd env set AZURE_NETWORK_ISOLATION true before running azd provision.
The setup includes a VNet, five subnets, a Network Security Group (NSG) for each subnet, a private endpoint for each service, a private DNS Zone, and a Network Interface for each private endpoint.
Default Address Ranges:
Network Item | Address Range |
---|---|
ai-vnet | 10.0.0.0/23 |
ai-subnet | 10.0.0.0/26 |
app-services-subnet | 10.0.0.192/26 |
database-subnet | 10.0.1.0/26 |
app-int-subnet | 10.0.0.128/26 |
AzureBastionSubnet | 10.0.0.64/26 |
This option is ideal for users who want a hassle-free setup with optimal security and connectivity configurations predefined by GPT-RAG.
Note
DNS Configuration: When allowing GPT-RAG to create Private DNS Zones automatically, they will be created within the GPT-RAG resource group. If you prefer to configure them in your Connectivity subscription, choose the manual network configuration option (Scenario 3).
For deployments integrating with existing infrastructure, GPT-RAG allows you to adjust network addressing in the configuration files, preventing address overlaps while automating resource creation.
Adjust network addressing to avoid overlaps with existing VNets, as overlapping address ranges prevent direct connections via VNet peering, VPN gateways, or ExpressRoute. The default address ranges are:
Network Item | Address Range |
---|---|
AI VNet | 10.0.0.0/23 |
ai-subnet | 10.0.0.0/26 |
app-services-subnet | 10.0.0.192/26 |
database-subnet | 10.0.1.0/26 |
app-int-subnet | 10.0.0.128/26 |
AzureBastionSubnet | 10.0.0.64/26 |
Each /26 subnet offers 59 usable IP addresses, as Azure reserves 5 IP addresses in each subnet. The /23 VNet spans 512 addresses, of which the five default subnets occupy 320. To customize address ranges, set the following environment variables:
Environment Variable | Network Item |
---|---|
AZURE_VNET_ADDRESS | AI VNet |
AZURE_AI_SUBNET_PREFIX | AI Subnet |
AZURE_APP_INT_SUBNET_PREFIX | App Internal Subnet |
AZURE_APP_SERVICES_SUBNET_PREFIX | App Services Subnet |
AZURE_BASTION_SUBNET_PREFIX | Bastion Subnet |
AZURE_DATABASE_SUBNET_PREFIX | Database Subnet |
Set the desired address range with the azd env set command after azd init and before azd provision.
Example: azd env set AZURE_AI_SUBNET_PREFIX 10.0.1.64/26
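The 59-address figure follows directly from the prefix length, and a custom prefix must also start on a block boundary for its size. The sketch below checks both with Python's standard `ipaddress` module; the reserved-address count is the one stated in this guide, and the function names are hypothetical.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # as stated in this guide


def usable_addresses(cidr: str) -> int:
    """Addresses left in a subnet after Azure's reserved addresses are subtracted."""
    return ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_PER_SUBNET


def is_aligned(cidr: str) -> bool:
    """True if the address is a valid network address for its prefix length."""
    try:
        ipaddress.ip_network(cidr, strict=True)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    print(usable_addresses("10.0.0.0/26"))
    print(is_aligned("10.0.0.64/26"), is_aligned("10.0.0.32/26"))
```

A /26 block contains 64 addresses, so valid /26 prefixes end in .0, .64, .128, or .192; a value such as 10.0.0.32/26 has host bits set and fails the alignment check.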
If you need full control over network resources or are integrating GPT-RAG into a complex environment, you can manually create the necessary network resources. This approach is ideal for organizations with specific networking requirements or strict security policies. Refer to Manually Configure Network Resources in the deployment procedure for more details on setting up your network.
Note
User Connectivity: Regardless of the network configuration approach you select, you may need to configure additional network settings to enable connectivity for external or internal users. Refer to the Internal User Access and External User Access sections in this guide for detailed instructions tailored to your specific access requirements.
After deploying the Solution Accelerator, administrators and dev teams may need to access a Test Virtual Machine (VM) for configuration, customization, or deployment tasks. This section outlines the procedure for connecting to the Data Science VM using Azure Bastion.
Note
If these users already have secure access to the VNet through ExpressRoute or VPN, they can perform the required tasks directly from their own machines, removing the need for a Bastion VM and making its creation optional.
- Azure Permissions:
  - Virtual Machine Contributor role or higher on the resource group containing the Bastion and VM.
- Access Credentials:
  - Access to the Azure Key Vault containing the Bastion credentials.
Follow these steps to securely connect to the Data Science VM through Azure Bastion:
- Go to the Bastion blade:
  - In the Azure Bastion overview page, log into the created VM with the user gptrag and authenticate with the password stored in the key vault, similar to the figure below:
Note
The Data Science VM accessed through Bastion is intended solely for administrators and configuration personnel and is not meant for end-users. It is designed for individuals responsible for configuring, customizing, or updating the solution.
After deploying GPT-RAG, you may want to configure additional network settings to allow secure access for internal users. You can achieve this by setting up one of two network configurations designed for internal connectivity.
- VNet Peering: Connects your internal network to the GPT-RAG VNet, allowing users to access services through existing Private Endpoints.
- Private Endpoints: Create Private Endpoints within a VNet that your internal users already use, such as a Hub VNet, enabling secure access without the need for VNet peering.
Choose the option that best fits your network setup and security requirements.
Establish VNet Peering to enable secure and efficient communication between virtual networks for users connected through ExpressRoute or VPN. This setup ensures that internal users can securely access your App Service, Storage Accounts, and Search Service.
The following diagram illustrates a scenario using VNet Peering to allow internal users to access the application, along with a DNS configuration based on Azure DNS Private Resolver. This setup ensures that devices on the private network can resolve the Private Endpoints associated with the services.
To simplify, the diagram only includes the App Service frontend's Private Endpoint and the DNS configuration for azurewebsites.net. Uploading documents requires DNS setup for the Storage Account domain blob.core.windows.net, and reindexing the AI Search index needs DNS configuration for the search service domain search.windows.net.
- Azure Permissions:
  - Network Contributor role or higher on both virtual networks involved in the peering.
- Network Configuration:
  - Ensure that the virtual networks do not have overlapping IP address spaces.
  - Both virtual networks must reside within the same Azure region or in regions that support peering.
For step-by-step configuration instructions, refer to the Create a VNet Peering Procedure.
VNet Peering will enable connectivity with your Private Endpoint. However, to configure name resolution for the Private Endpoint address, DNS settings must be adjusted according to the Private Endpoint DNS Integration Scenarios you intend to use. For more information on scenarios and how to configure them, please see Private Endpoint DNS Integration Scenarios.
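One way to confirm that name resolution works as intended is to check, from a machine on the internal network, that a service hostname resolves to a private address rather than a public one. The sketch below uses only the Python standard library; the function names are hypothetical, and the hostname in the usage example is the sample endpoint from this guide.

```python
import ipaddress
import socket


def resolved_addresses(hostname: str) -> list[str]:
    """Resolve a hostname to its IPv4 addresses using the machine's DNS settings."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def all_private(addresses: list[str]) -> bool:
    """True if at least one address was returned and none is publicly routable.

    Note: is_private also covers loopback and other non-global ranges,
    not only the RFC 1918 blocks.
    """
    return bool(addresses) and all(ipaddress.ip_address(a).is_private for a in addresses)


if __name__ == "__main__":
    host = "webgpt0-abc123.azurewebsites.net"  # sample hostname; replace with your endpoint
    try:
        addresses = resolved_addresses(host)
        print(host, addresses, "private" if all_private(addresses) else "NOT private")
    except socket.gaierror as error:
        print(f"could not resolve {host}: {error}")
```

The same check applies to the storage and search hostnames if users upload documents or reindex content.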
Implement Private Endpoints for your App Service, Storage Accounts, and Search Service to ensure secure, direct access for users connected through ExpressRoute or VPN. This setup offers an alternative to VNet Peering by allowing these Private Endpoints to reside in your Connectivity subscription. As a result, DNS will resolve directly to these Private Endpoints, eliminating the need for VNet Peering.
The following diagram illustrates a scenario using Private Endpoints within your Connectivity subscription and a DNS configuration based on Azure DNS Private Resolver. This ensures that devices on the private network can resolve the Private Endpoints associated with the services.
Private Endpoints in Your Connectivity Subscription
To keep things simple, the diagram only includes the Private Endpoint and DNS setup for the App Service that runs the application's frontend using azurewebsites.net. If users need to upload documents, you'll also need to configure DNS for the Storage Account service at blob.core.windows.net. Additionally, if there's a need to reindex the AI Search index, you'll have to set up DNS for the search service domain at search.windows.net.
- Azure Permissions:
  - Network Contributor or Private Endpoint Contributor role on the virtual network where the Private Endpoint will be deployed.
  - Contributor role or higher on the App Service and Storage Account resources.
- Network Configuration:
  - Ensure DNS settings are configured to resolve the Private Endpoints correctly.
  - Verify that the virtual network has sufficient IP address space to accommodate the Private Endpoints.
For detailed configuration guidance for the App Service, see Connect privately to an App Service app using a Private Endpoint. The steps for creating a Private Endpoint for a Storage Account and Search Service are similar to those for the App Service.
The previous steps explain how to create the Private Endpoint and provide guidance on DNS configuration. If you need more information about DNS integration scenarios with Private Endpoints and how to configure them, please refer to Private Endpoint DNS Integration Scenarios.
Provide access for external users through secure network configurations.
Configure Azure Front Door in conjunction with a Web Application Firewall (WAF) to manage external user access. This setup provides global load balancing, ensures high availability, and protects the application from common web threats and vulnerabilities.
- Azure Permissions:
  - Contributor role or higher on the Azure subscription or the specific resource group where Front Door and WAF will be deployed, typically within a Connectivity subscription.
- Configuration Requirements:
  - Custom domain ownership if you plan to use custom domains with Front Door.
  - SSL certificates for securing HTTPS traffic, if applicable.
To set up Front Door and WAF, follow the instructions in the Create an Azure Front Door using Azure portal page.
Note
Alternatively, Front Door and WAF can be deployed within the same Subscription and resource group as GPT-RAG to streamline the configuration process.
With the Private Endpoint already set up for App Service, you can still configure an IP allowlist for specific cases, such as temporary access to the frontend for quick testing. This setup ensures that only trusted sources with pre-approved IPs can access the service publicly when necessary.
Important
Use this approach for short-term access, such as quick testing or setup for a small group of users. It’s a simple control but relies on a public endpoint, so apply it only when necessary for specific cases.
- Azure Permissions:
  - Contributor role or higher on the Azure subscription or the specific resource group containing the App Service.
- Network Configuration:
  - A list of trusted IP addresses or ranges that will be allowed access to the public endpoints.
- Access the Azure Portal:
  - Sign in to the Azure Portal.
- Restrict Access to App Services:
  - Navigate to App Services and select the target App Service.
  - Go to Networking and configure Access Restrictions by adding rules to allow specific IP addresses with assigned priorities.
  - Save the changes to enforce the restrictions.
Configuring IP Allowlist for Public Endpoints
- Test access to the App Service and Storage Account from both permitted and non-permitted IP addresses.
- Monitor access logs regularly to ensure only authorized IPs have access.
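To reason about which clients a set of rules will admit before testing from real machines, the following sketch models rule evaluation in a simplified way. It assumes rules are checked in ascending priority order, the first matching rule decides, and unmatched traffic is denied once at least one rule exists. The `AccessRule` type, the rule names, and the sample IP ranges are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    name: str
    cidr: str      # e.g. "203.0.113.0/24"
    priority: int  # lower numbers are evaluated first
    action: str    # "Allow" or "Deny"


def is_allowed(client_ip: str, rules: list[AccessRule]) -> bool:
    """Simplified first-match evaluation with an implicit deny at the end."""
    if not rules:
        return True  # no restrictions configured: the endpoint is open
    ip = ipaddress.ip_address(client_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if ip in ipaddress.ip_network(rule.cidr):
            return rule.action == "Allow"
    return False  # anything unmatched is denied


rules = [
    AccessRule("office", "203.0.113.0/24", 100, "Allow"),
    AccessRule("tester", "198.51.100.7/32", 200, "Allow"),
]
```

With these sample rules, an address inside `203.0.113.0/24` is admitted, while `198.51.100.8` is rejected because it matches no rule.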
This section outlines the steps to configure Microsoft Entra ID authentication for the front-end App Service.
- The front-end app deployed in App Service.
- Permission to register your application in Entra ID.*
* Use one of these Entra roles: Application Administrator, Cloud Application Administrator, or Global Administrator.
If you have the necessary permissions to register a new application in Entra ID, follow step 3 of the procedure outlined on this page: Add app authentication, or watch this brief tutorial for step-by-step instructions.
If you do not have permission to register a new application in Entra ID, you can still set up authentication by collaborating with an Entra ID administrator. Follow the procedure described on this page: How to Apply Easy Auth on Web App under a High-security policy environment.
- Access your web app URL.
- You should be redirected to the Microsoft Entra ID sign-in page.
- Upon successful login, you should be redirected back to your app.
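Once sign-in works, App Service authentication passes the signed-in identity to the application in request headers (`X-MS-CLIENT-PRINCIPAL-ID`, `X-MS-CLIENT-PRINCIPAL-NAME`, and the Base64-encoded JSON header `X-MS-CLIENT-PRINCIPAL`). The sketch below shows one way an application might read them. The function name and the returned dictionary shape are hypothetical, and whether the GPT-RAG front end reads the headers this way is an assumption.

```python
import base64
import json


def read_signed_in_user(headers: dict) -> dict:
    """Extract the identity that App Service authentication adds to a request."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    user = {
        "principal_id": h.get("x-ms-client-principal-id"),
        "principal_name": h.get("x-ms-client-principal-name"),
        "claims": [],
    }
    encoded = h.get("x-ms-client-principal")
    if encoded:
        # The header carries Base64-encoded JSON that includes a "claims" list.
        payload = json.loads(base64.b64decode(encoded))
        user["claims"] = payload.get("claims", [])
    return user
```

If the function returns `None` for both identifiers on a request to the deployed app, the request most likely did not pass through App Service authentication.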
Control user access within the front-end application using user principal IDs, usernames, or groups.
- Configured Authentication: Ensure that Entra ID authentication is properly set up in your app service.
- List of Authorized Entities: Compile lists of authorized user principal IDs, usernames and/or group names.
- Delegated Microsoft Graph Permissions (to define allowed groups): Permission to grant your application the Microsoft Graph `Group.Read.All` permission in Entra ID.*
* Use one of these Entra roles: Application Administrator, Cloud Application Administrator, or Global Administrator.
- Identify Authorized Users and Groups
  - User Principal IDs: Unique identifiers for users (e.g., `user-principal-id-1`).
  - Usernames: Typically the user's email address (e.g., `[email protected]`).
  - Group Names (Optional): Entra ID group names.
- Configure Microsoft Graph Permissions (if you are allowing access to groups)
  If you plan to use group-based authorization:
  - Navigate to API Permissions:
    - In the registered application, go to API permissions.
  - Add `Group.Read.All` Permission:
    - Click Add a permission > Microsoft Graph > Application permissions.
    - Search for and select `Group.Read.All`.
    - Click Add permissions.
  - Grant Admin Consent:
    - Click on Grant admin consent for [Your Tenant Name].
    - Confirm the action.
- Set Environment Variables
  In your application's settings, populate the environment variables needed to define which users or groups can access the application. You don't need to create all three; set only the ones relevant to your authorization setup:
  - Use `AUTHORIZED_USER_PRINCIPALS` if you want to specify user principal IDs (e.g., `user-principal-id-1`).
  - Use `AUTHORIZED_USER_NAMES` if you want to specify usernames (e.g., `[email protected],[email protected]`).
  - Use `AUTHORIZED_GROUP_NAMES` if you want to specify group names.
- Restart the Application
  After making changes to environment variables and permissions, restart your application to apply the updates.
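To make the effect of the three variables concrete, here is a minimal sketch of how an application could evaluate them. It assumes each variable holds a comma-separated list, that matching is case-insensitive, and that every authenticated user is allowed when no list is configured. These are assumptions for illustration, not confirmed behavior of the GPT-RAG front end, and the function names are hypothetical.

```python
import os


def _csv(name: str) -> set[str]:
    """Read a comma-separated environment variable into a set of lowercase values."""
    raw = os.environ.get(name, "")
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def is_authorized(principal_id: str, username: str, groups: list[str]) -> bool:
    allowed_ids = _csv("AUTHORIZED_USER_PRINCIPALS")
    allowed_names = _csv("AUTHORIZED_USER_NAMES")
    allowed_groups = _csv("AUTHORIZED_GROUP_NAMES")

    # Assumption: with no list configured, any authenticated user is allowed.
    if not (allowed_ids or allowed_names or allowed_groups):
        return True

    # A match in any one configured list is enough to grant access.
    return (
        principal_id.lower() in allowed_ids
        or username.lower() in allowed_names
        or any(g.lower() in allowed_groups for g in groups)
    )
```

Because the values are read on each call in this sketch, a change takes effect as soon as the process sees the new environment; in App Service, that normally means after the restart described above.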
Based on your authorization setup, validate access:
- User Principal ID: Log in as a user in `AUTHORIZED_USER_PRINCIPALS` to check access.
- Username: Log in as a username in `AUTHORIZED_USER_NAMES` and confirm access.
- Group Membership: Log in as a group member from `AUTHORIZED_GROUP_NAMES` to ensure access.
Note
Use and test only the methods you have configured to ensure access controls are functioning correctly.
The SharePoint connector indexes and purges files using scheduled Azure Functions to maintain an up-to-date Azure AI Search index. For more information on how this works, see the SharePoint section on the Data Ingestion page. For detailed instructions on setting up SharePoint for data ingestion, please refer to the SharePoint Setup Guide.
This customization is particularly valuable when sensitive documents must be accessible only to specific groups or individuals within an organization. With this feature, AI Search returns results tailored to each user's access (no RBAC permissions). For details, see the Filter Files with AI Search Using Security Trimming page.
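In Azure AI Search, security trimming is commonly expressed as an OData filter over a collection field that stores the identities allowed to see each document. The sketch below builds such a filter. The field name `security_ids` and the function name are hypothetical, and whether GPT-RAG builds its filter exactly this way is an assumption; the linked page describes the actual setup.

```python
def build_security_filter(identity_ids: list[str], field: str = "security_ids") -> str:
    """Build an OData filter that keeps documents shared with any of the given IDs."""
    if not identity_ids:
        raise ValueError("at least one user or group ID is required")
    # Single quotes inside OData string literals are escaped by doubling them.
    joined = ",".join(i.replace("'", "''") for i in identity_ids)
    # search.in(variable, valueList, delimiters) tests membership in the list.
    return f"{field}/any(g:search.in(g, '{joined}', ','))"


print(build_security_filter(["group-id-1", "group-id-2"]))
```

The resulting string would be passed as the `filter` parameter of a search request, so that only documents tagged with one of the caller's identities are returned.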
The GPT-RAG Solution Accelerator comprises four Git repositories, each housing the code for specific application components. Whether you're using GitHub, Azure Repos in Azure DevOps, or another Git service, this section outlines the organization of the codebase and provides instructions for integrating it into your own Git repositories. You can incorporate the Solution Accelerator's code either by using the repositories as templates or, if you want to contribute to the GPT-RAG repo, by forking them and creating pull requests.
The Solution Accelerator is structured across four primary Git repositories:
- gpt-rag: The main repository containing Infrastructure as Code (IaC) templates and comprehensive documentation for the Solution Accelerator.
- gpt-rag-ingestion: Manages the Data Ingestion component, optimizing data chunking and indexing for the Retrieval-Augmented Generation (RAG) retrieval step.
- gpt-rag-agentic: Serves as the orchestrator, coordinating the flow to retrieve information and generate user responses using agents.
- gpt-rag-frontend: Provides the front-end application, delivering a scalable and efficient web interface for the Solution Accelerator.
If you'd like to use the repositories as a starting point without making updates to the original, you can use GitHub's template feature. This will create an independent copy of the repository, which you can fully customize. However, keep in mind that this option won’t automatically sync with future updates from the original repository.
Note
The following steps should be performed for each of the four Solution Accelerator repositories: gpt-rag, gpt-rag-ingestion, gpt-rag-agentic, and gpt-rag-frontend.
In this case we will use GitHub's template feature to create a copy of the repository.
Prerequisites
- Read Access to the template repositories.
- Create Repository permission in your account or organization.
Procedure
- Navigate to the Repository:
  - Visit the GitHub page of the repository you wish to use as a template (e.g., gpt-rag-agentic).
- Use as Template:
  - Click the Use this template button located above the repository files.
  - In the dialog that appears, enter your new repository name, select the owner (your account or organization), and choose the visibility (public or private).
- Create Repository:
  - Click Create repository from template. GitHub will generate a new repository in your account with the contents of the template repository.
Prerequisites
- Access to Azure DevOps Organization and Project.
- Repository Creation Rights within the target project.
Procedure
- Prepare Azure DevOps Project:
  - Ensure you have an Azure DevOps organization with the necessary permissions, typically as a Project Administrator or with explicit repository creation rights.
- Access Azure Repos:
  - Navigate to your Azure DevOps project.
  - Go to Repos > Files.
- Import Repository:
  - Click the Import a repository button.
  - In the import dialog, enter the Clone URL of the GitHub repository you wish to import (e.g., `https://github.com/Azure/gpt-rag-agentic.git`).
- Authentication for Private Repositories:
  - If importing a private repository, provide the necessary credentials, such as a Personal Access Token (PAT), to authorize the import.
- Start Import:
  - Click Import to begin the process. Azure DevOps will clone the repository into your Azure Repos.
- Verify Import:
  - Once the import is complete, verify that the repository and its branches have been correctly imported by browsing the files in Azure Repos.
Reference: For detailed instructions and advanced import scenarios, refer to the Importing a GitHub repository into Azure DevOps documentation.
Procedure
- Create a New Repository:
  - Set up a new repository on your preferred Git service (e.g., GitLab, Bitbucket).
- Download and Extract:
  - Download the repository as a ZIP file from GitHub and extract the contents to your local machine.
- Add to Your Git Repository:
  - Initialize your local repository, add the extracted files, commit, and push them to your Git service.
- Customize:
  - Modify the code as per your requirements and push updates to your repository.
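The initialize, add, commit, and push steps above can be sketched as a small helper that prints the corresponding `git` commands for review before you run them. The local directory, remote URL, branch name, and commit message are placeholders, and the helper itself is hypothetical.

```python
import shlex


def build_push_commands(local_dir: str, remote_url: str, branch: str = "main") -> list[list[str]]:
    """Return the git commands for publishing extracted files to a new remote."""
    git = ["git", "-C", local_dir]  # -C runs each command inside local_dir
    return [
        git + ["init"],
        git + ["add", "."],
        git + ["commit", "-m", "Import GPT-RAG Solution Accelerator code"],
        git + ["branch", "-M", branch],
        git + ["remote", "add", "origin", remote_url],
        git + ["push", "-u", "origin", branch],
    ]


# Dry run: print the commands instead of executing them.
for cmd in build_push_commands(
    "./gpt-rag-agentic", "https://git.example.com/team/gpt-rag-agentic.git"
):
    print(shlex.join(cmd))
```

To execute the commands rather than print them, each list could be passed to `subprocess.run(cmd, check=True)`, once per repository.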
If you intend to contribute to the ongoing development of the Solution Accelerator by submitting pull requests, please refer to our Contribution Guidelines for detailed instructions on how to fork repositories and create pull requests.
Here is the complete list of resources for a standard Zero Trust deployment, including descriptions and SKUs. These defaults have been extensively tested in the automated installation. You can review them to adjust to your needs, considering usage factors like user volume and data.
Tip
Review this list before deploying to ensure you have the necessary quota for deployment in the desired subscription and region.
- App Service Plan
  Hosts the frontend and function apps.
  - SKU: P0v3
  - Operating System: Linux
  - Zone Redundant: Disabled
- Function App (Orchestrator)
  Orchestrates the RAG flow.
  - Operating System: Linux
  - LinuxFxVersion: python|3.11
- Function App (Data Ingestion)
  Supports the Data Ingestion Pipeline.
  - Operating System: Linux
  - LinuxFxVersion: python|3.11
- App Service (Frontend)
  Provides the Web User Interface.
  - Operating System: Linux
  - LinuxFxVersion: python|3.12
- Application Insights
  Provides real-time monitoring for apps.
  - Type: Classic
- Key Vault (Application)
  Stores API keys when needed.
  - SKU: Standard
  - Soft Delete: Enabled
  - Purge Protection: Enabled
- Key Vault (Test VM Bastion)
  Used by Bastion to store the Test VM password.
  - SKU: Standard
  - Soft Delete: Enabled
  - Purge Protection: Enabled
- Azure AI Services Multi-Service Account
  Reads documents (Data Ingestion) and interacts with users (Web UI).
  - SKU: Standard
- Azure OpenAI
  Generates responses and vector embeddings.
  - SKU: Standard
  - Deployments:
    - Regional gpt-4o, 40 TPM.
    - text-embedding-ada-002, 40 TPM.
- Search Service
  Provides vector indexes for the retrieval step.
  - SKU: Standard2
  - Replicas: 1
  - Partitions: 1
- Virtual Machine (Test VM)
  Provides access to configure and test the solution after disabling public endpoints.
  - Operating System: Windows (Windows Server 2019 Datacenter)
  - SKU: Standard_D4s_v3 (4 vCPUs, 16 GiB memory)
  - Image Publisher: microsoft-dsvm (Data Science VM)
  - Image Offer: dsvm-win-2019
- Storage Account (Documents)
  Stores content used for grounding responses.
  - Performance: Standard
  - Replication: Locally-redundant storage (LRS)
  - Account Type: StorageV2 (general purpose v2)
- Storage Account (Orchestrator Function App)
  Stores logs, code, and execution state for the Orchestrator Function App.
  - Performance: Standard
  - Replication: Locally-redundant storage (LRS)
  - Account Type: Storage (general purpose v1)
- Storage Account (Data Ingestion Function App)
  Stores logs, code, and execution state for the Data Ingestion Function App.
  - Performance: Standard
  - Replication: Locally-redundant storage (LRS)
  - Account Type: Storage (general purpose v1)
- Test VM Disk
  Disk for the Test VM.
  - Disk Size: 128 GiB
  - Storage Type: Premium SSD LRS
  - Operating System: Windows
- Azure Cosmos DB
  Stores conversation history and metadata to improve quality.
  - Kind: GlobalDocumentDB
  - Database Account Offer Type: Standard
  - Capacity Mode: Provisioned throughput
- Virtual Network
  AI Services VNet.
  - Address Space: 10.0.0.0/23
  The address range is a suggestion; use whatever works for your network.
- Subnets
  Designate network segments in the AI Services VNet to organize and secure traffic.
  - Subnets:
    - ai-subnet: 10.0.0.0/26
    - app-services-subnet: 10.0.0.192/26
    - database-subnet: 10.0.1.0/26
    - app-int-subnet: 10.0.0.128/26
    - AzureBastionSubnet: 10.0.0.64/26
  The address ranges are suggestions; please adjust them to fit your specific network requirements.
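If you adjust the suggested ranges, a quick check like the following can confirm that every subnet still fits inside the VNet address space and that no two subnets overlap. This is a minimal sketch using Python's standard `ipaddress` module with the suggested values from the list above. The helper names are hypothetical, and the usable-address figure relies on Azure reserving five addresses in each subnet.

```python
import ipaddress
from itertools import combinations

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves five addresses in every subnet

VNET = ipaddress.ip_network("10.0.0.0/23")
SUBNETS = {
    "ai-subnet": "10.0.0.0/26",
    "AzureBastionSubnet": "10.0.0.64/26",
    "app-int-subnet": "10.0.0.128/26",
    "app-services-subnet": "10.0.0.192/26",
    "database-subnet": "10.0.1.0/26",
}


def validate_layout(vnet, subnets):
    """Return a list of problems; an empty list means the layout is consistent."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = [
        f"{name} is outside {vnet}"
        for name, net in nets.items()
        if not net.subnet_of(vnet)
    ]
    for (a, net_a), (b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} overlaps {b}")
    return problems


def usable_addresses(cidr):
    """Addresses left for resources after Azure's reserved ones are subtracted."""
    return ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_PER_SUBNET


print(validate_layout(VNET, SUBNETS))
print(usable_addresses("10.0.0.0/26"))
```

Each /26 subnet holds 64 addresses, which leaves 59 for Private Endpoints and other resources under the reservation assumption stated above.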
- Private Endpoints
  Enable private, secure access to Azure services via a virtual network.
  - Private Endpoints (PEs):
    - AI Search Private Endpoint
    - AI Services Private Endpoint
    - Azure OpenAI Private Endpoint
    - CosmosDB Private Endpoint
    - Data Ingestion Function App Private Endpoint
    - Frontend App Service Private Endpoint
    - Key Vault Private Endpoint
    - Orchestrator Function App Private Endpoint
    - Storage Account (Documents) Private Endpoint
- Private DNS Zones
  Resolve private endpoints to private IPs within a virtual network.
  - Private DNS Zones:
    - App Service and Function Apps Private DNS: privatelink.azurewebsites.net
    - AI Services Private DNS: privatelink.cognitiveservices.azure.com
    - Azure OpenAI Private DNS: privatelink.openai.azure.com
    - Storage Account (Documents) Private DNS: privatelink.blob.core.windows.net
    - CosmosDB Private DNS: privatelink.documents.azure.com
    - AI Search Private DNS: privatelink.search.windows.net
    - Key Vault Private DNS: privatelink.vaultcore.azure.net
- Network Interfaces
  Provide connectivity to private endpoints and virtual machines within the AI Services VNet.
  - Interfaces:
    - AI Search PE's Network Interface
    - AI Services PE's Network Interface
    - Azure OpenAI PE's Network Interface
    - CosmosDB PE's Network Interface
    - Data Ingestion Function App PE's Network Interface
    - Frontend App Service PE's Network Interface
    - Key Vault PE's Network Interface
    - Orchestrator Function App PE's Network Interface
    - Storage Account (Documents) PE's Network Interface
    - Test Virtual Machine Network Interface
- Bastion
  Enables private and secure access to the Test VM without exposing the VM directly to the internet.
  - Tier: Standard
- Public IP
  Used by Bastion to enable secure access to the Test VM.
  - SKU: Standard
  - Tier: Regional
Refer to the Troubleshooting Guide for common issues and resolutions related to the GPT-RAG Solution Accelerator.