Update docs

sourcenetwork · Dec 16, 2024 · 137c634 · 137c634
1 parent 7a49015
commit 137c634

Showing 2 changed files with 16 additions and 16 deletions.
diff --git a/client/README.md b/client/README.md
@@ -2,4 +2,4 @@ The `client` package is the primary access point for interacting with an embedde
 
 [Data definition overview](./data_definition.md) - How the shape of documents is defined and grouped.
 
-[Secondary indexes](./secondary_indexes.md) - How secondary indexes work in DefraDB and how to use them.
+[Secondary indexes](./secondary_indexes.md) - Using secondary indexes in DefraDB.
diff --git a/client/secondary_indexes.md b/client/secondary_indexes.md
@@ -1,14 +1,14 @@
-# Secondary Indexing in DefraDB
+# Secondary indexing in DefraDB
 
 DefraDB provides a powerful and flexible secondary indexing system that enables efficient document lookups and queries. This document explains the architecture, implementation details, and usage patterns of the indexing system.
 
 ## Overview
 
 The indexing system consists of two main components. The first is index storage, which handles storing and maintaining index information. The second is index-based document fetching, which manages retrieving documents using these indexes. Together, these components provide a robust foundation for efficient data access patterns.
 
-## Index Storage
+## Index storage
 
-### Core Types
+### Core types
 
 The indexing system is built around several key types that define how indexes are structured and managed. At its heart is the IndexedFieldDescription, which describes a single field being indexed, including its name and whether it should be ordered in descending order. These field descriptions are combined into an IndexDescription, which provides a complete picture of an index including its name, ID, fields, and whether it enforces uniqueness.
 
@@ -38,7 +38,7 @@ type CollectionIndex interface {
 }
 ```
 
-### Key Structure
+### Key structure
 
 Index keys in DefraDB follow a carefully designed format that enables efficient lookups and range scans. For regular indexes, the key format is:
 ```
@@ -49,15 +49,15 @@ Unique indexes follow a similar pattern but store the document ID as the value i
 <collection_id>/<index_id>(/<field_value>)+ -> <doc_id>
 ```
 
-### Value Encoding
+### Value encoding
 
 While DefraDB primarily uses CBOR for encoding, the indexing system employs a custom encoding/decoding solution inspired by CockroachDB. This decision was made because CBOR doesn't guarantee ordering preservation, which is crucial for index functionality. Our custom encoding ensures that numeric values maintain their natural ordering, strings are properly collated, and complex types like arrays and objects have deterministic ordering.
 
-### Index Maintenance
+### Index maintenance
 
 Index maintenance happens through three primary operations: document creation, updates, and deletion. When a new document is saved, the system indexes all configured fields, generating entries according to the key format and validating any unique constraints. During updates, the system carefully manages both the removal of old index entries and the creation of new ones, ensuring consistency through atomic transactions. For deletions, all associated index entries are cleaned up along with related metadata.
 
-## Index-Based Document Fetching
+## Index-based document fetching
 
 The IndexFetcher is the cornerstone of document retrieval, orchestrating the process of fetching documents using indexes. It operates in two phases: first retrieving indexed fields (including document IDs), then using a standard fetcher to get any additional requested fields.
 
@@ -67,13 +67,13 @@ The performance characteristics of these operations vary. Direct match operation
 
 Note: the index fetcher cannot currently take advantage of ordered indexes, because the underlying storage does not yet support such range queries.
 
-## Performance Considerations
+## Performance considerations
 
 When working with indexes, it's important to understand their impact on system performance. Each index increases write amplification as every document modification must update all relevant indexes. However, this cost is often outweighed by the dramatic improvement in read performance for indexed queries.
 
 Index selection should be driven by your query patterns and data distribution. Indexing fields that are frequently used in query filters can significantly improve performance, but indexing rarely-queried fields only adds overhead. For unique indexes, the additional validation requirements make this trade-off even more important to consider.
 
-## Indexing Related Objects
+## Indexing related objects
 
 DefraDB's indexing system provides powerful capabilities for handling relationships between documents. Let's explore how this works with a practical example.
 
@@ -108,7 +108,7 @@ query {
 For requests on non-indexed relations, fetching proceeds top-down: all `User` documents are fetched first, and then the corresponding `Address` document is fetched for each `User`. This can be very inefficient for large collections.
 With indexing, we use so-called inverted fetching: we first fetch the `Address` documents with the matching `city` value, and then fetch the corresponding `User` document for each `Address`. This is much more efficient because the index leads directly to the matching `Address` documents, so only their related `User` documents need to be fetched.
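
The difference between the two strategies can be sketched with in-memory maps standing in for collections and indexes. The types, the sample data, and the two lookup maps (`cityIndex` and `userByAddress`) are hypothetical and only model the access pattern; they are not DefraDB's fetcher API.

```go
package main

import "fmt"

type User struct {
	Name      string
	AddressID string
}

type Address struct {
	ID   string
	City string
}

// topDown scans every user and loads its address to test the filter.
// It returns the matching user names and the number of documents read.
func topDown(users []User, addresses map[string]Address, city string) ([]string, int) {
	var names []string
	reads := 0
	for _, u := range users {
		reads++ // read the User document
		addr, ok := addresses[u.AddressID]
		reads++ // look up the related Address document
		if ok && addr.City == city {
			names = append(names, u.Name)
		}
	}
	return names, reads
}

// inverted starts from a city index (city -> address IDs) and follows a
// second lookup (address ID -> user) back to the owning users.
func inverted(cityIndex map[string][]string, userByAddress map[string]User, city string) ([]string, int) {
	var names []string
	reads := 0
	for _, addrID := range cityIndex[city] {
		reads++ // read the matching Address entry
		if u, ok := userByAddress[addrID]; ok {
			reads++ // read the related User document
			names = append(names, u.Name)
		}
	}
	return names, reads
}

func main() {
	users := []User{{"Alice", "a1"}, {"Bob", "a2"}, {"Carol", "a3"}, {"Dave", "a4"}}
	addresses := map[string]Address{
		"a1": {"a1", "Paris"}, "a2": {"a2", "Berlin"},
		"a3": {"a3", "Paris"}, "a4": {"a4", "Oslo"},
	}
	// Build the two hypothetical lookups from the data above.
	cityIndex := map[string][]string{}
	userByAddress := map[string]User{}
	for _, u := range users {
		a := addresses[u.AddressID]
		cityIndex[a.City] = append(cityIndex[a.City], a.ID)
		userByAddress[a.ID] = u
	}
	names, reads := topDown(users, addresses, "Paris")
	fmt.Println("top-down:", names, "reads:", reads)
	names, reads = inverted(cityIndex, userByAddress, "Paris")
	fmt.Println("inverted:", names, "reads:", reads)
}
```

In this toy data set both strategies find the same users, but the top-down scan touches every `User` and `Address`, while the inverted path touches only the documents that match the filter.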
 
-### Relationship Cardinality Through Indexes
+### Relationship cardinality using indexes
 
 The indexing system also plays a crucial role in enforcing relationship cardinality. By marking an index as unique, you can enforce one-to-one relationships between documents. Here's how you would modify the schema to ensure each User has exactly one Address:
 
@@ -128,11 +128,11 @@ type Address {
 
 The unique index constraint ensures that no two Users can reference the same Address document. Without the unique constraint, the relationship would be one-to-many by default, allowing multiple Users to reference the same Address.
 
-## JSON Field Indexing
+## JSON field indexing
 
 DefraDB implements a specialized indexing system for JSON fields that differs from how other field types are handled. While a document in DefraDB can contain various field types (Int, String, Bool, JSON, etc.), JSON fields require special treatment due to their hierarchical nature.
 
-#### The JSON Interface
+#### JSON interface
 
 The indexing system relies on the `JSON` interface defined in `client/json.go`. This interface is crucial for handling JSON fields as it enables traversal of all leaf nodes within a JSON document. A `JSON` value in DefraDB can represent either an entire JSON document or a single node within it. Each `JSON` value maintains its path information, which is essential for indexing.
 
@@ -149,7 +149,7 @@ For example, given this JSON document:
 
 The system can represent the "iPhone" value as a `JSON` type with its complete path `[]string{"user", "device", "model"}`. This path-aware representation is fundamental to how the indexing system works.
 
-#### Inverted Indexes for JSON
+#### Inverted indexes for JSON
 
 For JSON fields, DefraDB uses inverted indexes with the following key format:
 ```
@@ -160,13 +160,13 @@ The term "inverted" comes from how these indexes reverse the typical document-to
 
 This approach differs from traditional secondary indexes in DefraDB. While regular fields map to single index entries, a JSON field generates multiple index entries - one for each leaf node in its structure. The system traverses the entire JSON structure during indexing, creating entries that combine the path and value information.
 
-#### Value Normalization and JSON
+#### Value normalization and JSON
 
 The indexing system integrates with DefraDB's value normalization through `client.NormalValue`. While the encoding/decoding package handles scalar types directly, JSON values maintain additional path information. Each JSON node is encoded with both its normalized value and its path information, allowing the system to reconstruct the exact location of any value within the JSON structure.
 
 Similar to how other field types are normalized (e.g., integers to int64), JSON leaf values are normalized based on their type before being included in the index. This ensures consistent ordering and comparison operations.
 
-#### Integration with Index Infrastructure
+#### Integration with index infrastructure
 
 When a document with a JSON field is indexed, the system:
 1. Uses the JSON interface to traverse the document structure