From 956145cc0d7a64297ae990151b7e82544a0b81bb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Fri, 25 Oct 2024 14:43:07 +0200 Subject: [PATCH 01/29] glossary: Add more abbreviations --- glossary.md | 140 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 139 insertions(+), 1 deletion(-) diff --git a/glossary.md b/glossary.md index 4dd8c9f95a045..85e7f478df7c1 100644 --- a/glossary.md +++ b/glossary.md @@ -30,6 +30,10 @@ Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabl Baseline Capturing captures queries that meet capturing conditions and create bindings for them. It is used for [preventing regression of execution plans during an upgrade](/sql-plan-management.md#prevent-regression-of-execution-plans-during-an-upgrade). +### BR + +BR is the Backup and Restore tool for TiDB. See [BR Overview](/br/backup-and-restore-overview.md) for more information. + ### Bucket A [Region](#regionpeerraft-group) is logically divided into several small ranges called bucket. TiKV collects query statistics by buckets and reports the bucket status to PD. For details, see the [Bucket design doc](https://github.com/tikv/rfcs/blob/master/text/0082-dynamic-size-region.md#bucket). @@ -40,6 +44,10 @@ A [Region](#regionpeerraft-group) is logically divided into several small ranges With the cached table feature, TiDB loads the data of an entire table into the memory of the TiDB server, and TiDB directly gets the table data from the memory without accessing TiKV, which improves the read performance. +### CF + +CF is short for Column Family as used by RocksDB / TiKV. + ### Coalesce Partition Coalesce Partition is a way of decreasing the number of partitions in a Hash or Key partitioned table. For more information, see [Manage Hash and Key partitions](/partitioned-table.md#manage-hash-and-key-partitions). 
@@ -48,14 +56,72 @@ Coalesce Partition is a way of decreasing the number of partitions in a Hash or Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For details, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md). +### CTE + +A Common Table Expression (CTE) is part of the SQL standard and uses [`WITH`](/sql-statements/sql-statement-with.md) statements. + ## D +### DDL + +Data Definition Language (DDL) is the part of the SQL standard that deals with creating, modifying and deleting tables and other objects. + +### DM + +Data Migration is the tool that allows MySQL to TiDB migration by reading data from a source instance and applying it to a target MySQL instance. See [DM Overview](/dm/dm-overview.md) for more information. + +### DML + +Data Modification Language (DML) is the part of the SQL standard that deals with inserting, updating and deleting rows in tables. + +### DMR + +Development Milestone Release (DMR) is a version of TiDB that provides users with the latest features but doesn't provide long term support. See [TiDB Versioning](/releases/versioning.md) for more information. + +### DR + +Disaster Recovery (DR) describes solutions that can be used to recover from a disaster in the future. This includes things like backups and standby clusters. + +### DXF + +Distributed eXecution Framework (DXF) is the framework used by TiDB to speedup index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details + ### Dynamic Pruning Dynamic pruning mode is one of the modes that TiDB accesses partitioned tables. 
In dynamic pruning mode, each operator supports direct access to multiple partitions. Therefore, TiDB no longer uses Union. Omitting the Union operation can improve the execution efficiency and avoid the problem of Union concurrent execution. +## E + +### EC2 + +Elastic Compute Cloud (EC2) is an AWS service that provides compute resources. This can be used with TiUP to run a TiDB Cluster. + +## G + +### GA + +General Available (GA) is the first non-beta version of a software product. + +### GC + +Garbage Collection (GC) is the process to cleanup unused resources. See [GC](/garbage-collection-overview.md) for the GC process of TiKV. + +### GTID + +Global Transactions ID's (GTIDs) are used by recent MySQL versions to indicate what transactions have been replicated and which have not. This information can be used by DM. + +## H + +### HTAP + +Hybrid Transactional Analytical Process (HTAP) is a database feature that allows both OLTP and OLAP workloads on the same database. For TiDB the HTAP feature is provided by using both TiKV for row storage and TiFlash for columnar storeage. See [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies) for more information. + ## I +### IMDS + +Instance Metadata Service (IMDS) is a AWS service that can be used to manage EC2 instances. See [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) for more information. + ### Index Merge Index Merge is a method introduced in TiDB v4.0 to access tables. Using this method, the TiDB optimizer can use multiple indexes per table and merge the results returned by each index. In some scenarios, this method makes the query more efficient by avoiding full table scans. Since v5.4, Index Merge has become a GA feature. @@ -64,8 +130,26 @@ Index Merge is a method introduced in TiDB v4.0 to access tables. 
Using this met The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. When this feature is enabled, pessimistic locks are usually stored in the memory of the Region leader only, and are not persisted to disk or replicated through Raft to other replicas. This feature can greatly reduce the overhead of acquiring pessimistic locks and improve the throughput of pessimistic transactions. +## K + +### KMS + +Key Management Service (KMS) allows the storage and retrieval of secret keys in a secure way. Examples of this are the AWS KMS, GCP KMS and HashiCorp Vault. Various TiDB components can use this to manage the keys that are used for storage encryption and related services. + +### KV + +Key-Value (KV) is a way storing information that allows easy retrieval by specifiying the key. Multiple values can be stored under a single key by encoding them. TiKV is implementing this. + ## L +### LDAP + +Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing a directory with information. This is often used to store information on accounts. This is used in TiDB by [LDAP authentication plugins](/security-compatibility-with-mysql.md#authentication-plugin-status). + +### LTS + +Long Term Support (LTS) are software versions that are considered stable and are supported for a long term. See [TiDB Versioning](/releases/versioning.md) for more details. + ### leader/follower/learner Leader/Follower/Learner each corresponds to a role in a Raft group of [peers](#regionpeerraft-group). The leader services all client requests and replicates data to the followers. If the group leader fails, one of the followers will be elected as the new leader. Learners are non-voting followers that only serves in the process of replica addition. 
@@ -82,10 +166,22 @@ Starting from v5.0, TiDB introduces Massively Parallel Processing (MPP) architec ## O +### OLAP + +OnLine Analytical Processing (OLAP) are describing database workloads that mostly deal with analytical workloads like reporting. The characteristics of this is read heavy queries that process many rows. + ### Old value The "original value" in the incremental change log output by TiCDC. You can specify whether the incremental change log output by TiCDC contains the "original value". +### OLTP + +OnLine Transaction Processing (OLTP) are describing database workloads that mostly deal with transactioonal workloads like inserting, updating and deleting small sets of records. + +## OOM + +Out of Memory (OOM) is a situation where a system fails due to a a lack of available memory. See [Troubleshoot TiDB OOM Issues](/troubleshoot-tidb-oom.md) for more details. + ### Operator An operator is a collection of actions that applies to a Region for scheduling purposes. Operators perform scheduling tasks such as "migrate the leader of Region 2 to Store 5" and "migrate replicas of Region 2 to Store 1, 4, 5". @@ -111,10 +207,18 @@ Currently, available steps generated by PD include: [Partitioning](/partitioned-table.md) refers to physically dividing a table into smaller table partitions, which can be done by partition methods such as RANGE, LIST, HASH, and KEY partitioning. +### PD + +Placement Driver (PD) is an important component of the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) that is responsible to store metadata and run the [TSO](/tso.md) that hands out timestamps that are used for transactions. It also orchestrates the data placement on TiKV and runs the [TiDB Dashboard](/dashboard/dashboard-overview.md). + ### pending/down "Pending" and "down" are two special states of a peer. Pending indicates that the Raft log of followers or learners is vastly different from that of leader. Followers in pending cannot be elected as leader. 
"Down" refers to a state that a peer ceases to respond to leader for a long time, which usually means the corresponding node is down or isolated from the network. +### PiTR + +Point in Time Recovery (PiTR) is a database feature that allows the user to restore to a specific point in time (for example just before an accidental `DELETE` statement). See [TiDB Log Backup and PITR Architecture](/br/br-log-architecture.md) for more details. + ### Point Get Point get means reading a single row of data by a unique index or primary index, the returned resultset is up to one row. @@ -125,6 +229,10 @@ In most cases, when executing SQL statements, the optimizer only uses statistics ## Q +### QPS + +Queries Per Second (QPS) is a performance metric of a database service. + ### Quota Limiter Quota Limiter is an experimental feature introduced in TiDB v6.0.0. If the machine on which TiKV is deployed has limited resources, for example, with only 4v CPU and 16 G memory, and the foreground of TiKV processes too many read and write requests, the CPU resources used by the background are occupied to help process such requests, which affects the performance stability of TiKV. To avoid this situation, the [quota-related configuration items](/tikv-configuration-file.md#quota) can be set to limit the CPU resources to be used by the foreground. @@ -135,6 +243,10 @@ Quota Limiter is an experimental feature introduced in TiDB v6.0.0. If the machi Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For details, see [Raft Engine](/tikv-configuration-file.md#raft-engine). +### RAG + +Retrieval-Augmented Generation (RAG). See [Vector Search Overview](/vector-search-overview.md#use-cases) for more details. + ### Region/peer/Raft group Region is the minimal piece of data storage in TiKV, each representing a range of data (256 MiB by default). 
Each Region has three replicas by default. A replica of a Region is called a peer. Multiple peers of the same Region replicate data via the Raft consensus algorithm, so peers are also members of a Raft instance. TiKV uses Multi-Raft to manage data. That is, for each Region, there is a corresponding, isolated Raft group. @@ -145,10 +257,18 @@ Regions are generated as data writes increase. The process of splitting is calle The mechanism of Region split is to use one initial Region to cover the entire key space, and generate new Regions through splitting existing ones every time the size of the Region or the number of keys has reached a threshold. -### restore +### Restore Restore is the reverse of the backup operation. It is the process of bringing back the system to an earlier state by retrieving data from a prepared backup. +### RPC + +Remote Procedure Call (RPC) is a way for software components to communicate. In a TiDB cluster gRPC standard is used for communication between TiKV and TiDB. + +### RU + +Request Unit (RU) is used in TiDB to describe the unit for the resource usage. This is used with [Resource Control](/tidb-resource-control.md) to manage resource usage. + ## S ### scheduler @@ -160,6 +280,10 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i - `hot-region-scheduler`: Balances the distribution of hot Regions - `evict-leader-{store-id}`: Evicts all leaders of a node (often used for rolling upgrades) +### SST + +Static Sorted Table is the store format of RocksDB. + ### Store A store refers to the storage node in the TiKV cluster (an instance of `tikv-server`). Each store has a corresponding TiKV instance. @@ -170,6 +294,20 @@ A store refers to the storage node in the TiKV cluster (an instance of `tikv-ser Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiKV node in a specified time range. For details, see [Top SQL user document](/dashboard/top-sql.md). 
+### TPS + +Transactions Per Second (TPS) is a performance metric of a database. + ### TSO Because TiKV is a distributed storage system, it requires a global timing service, Timestamp Oracle (TSO), to assign a monotonically increasing timestamp. In TiKV, such a feature is provided by PD, and in Google [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf), this feature is provided by multiple atomic clocks and GPS. For details, see [TSO](/tso.md). + +## U + +### URI + +Uniform Resource Identifier (URI) is a uniform way of describing a resource. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information. + +### UUID + +Universally Unique Identifier (UUID) is a 128-bit (16 byte) generated ID that can be used to identify records in a database. See [UUID](/best-practices/uuid.md) for more information on how UUID's are used in TiDB. \ No newline at end of file From f3186ce0c5bc8d4da73d934dabb5c143f002bb02 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Tue, 29 Oct 2024 10:51:16 +0100 Subject: [PATCH 02/29] Apply suggestions from code review Co-authored-by: Mattias Jonsson --- glossary.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/glossary.md b/glossary.md index 85e7f478df7c1..8cbe052f7b797 100644 --- a/glossary.md +++ b/glossary.md @@ -64,7 +64,7 @@ A Common Table Expression (CTE) is part of the SQL standard and uses [`WITH`](/s ### DDL -Data Definition Language (DDL) is the part of the SQL standard that deals with creating, modifying and deleting tables and other objects. +Data Definition Language (DDL) is the part of the SQL standard that deals with creating, modifying and deleting tables, indexes, columns and other objects. 
### DM @@ -114,7 +114,7 @@ Global Transactions ID's (GTIDs) are used by recent MySQL versions to indicate w ### HTAP -Hybrid Transactional Analytical Process (HTAP) is a database feature that allows both OLTP and OLAP workloads on the same database. For TiDB the HTAP feature is provided by using both TiKV for row storage and TiFlash for columnar storeage. See [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies) for more information. +Hybrid Transactional Analytical Process (HTAP) is a database feature that allows both OLTP and OLAP workloads on the same database. For TiDB the HTAP feature is provided by using both TiKV for row storage and TiFlash for columnar storage. See [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies) for more information. ## I From e4c82a575eb38b6310e8892f383e33ef46201451 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Tue, 29 Oct 2024 10:52:32 +0100 Subject: [PATCH 03/29] Apply suggestions from code review Co-authored-by: Mattias Jonsson --- glossary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.md b/glossary.md index 8cbe052f7b797..e9dfdfbc83a69 100644 --- a/glossary.md +++ b/glossary.md @@ -108,7 +108,7 @@ Garbage Collection (GC) is the process to cleanup unused resources. See [GC](/ga ### GTID -Global Transactions ID's (GTIDs) are used by recent MySQL versions to indicate what transactions have been replicated and which have not. This information can be used by DM. +Global Transactions ID's (GTIDs) are used by recent MySQL versions binary log to indicate what transactions have been replicated and which have not. This information can be used by DM. 
## H From 7379630a4d24bf430271d3ba15864a9faed3da0b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Tue, 29 Oct 2024 11:49:13 +0100 Subject: [PATCH 04/29] Update glossary.md Co-authored-by: Mattias Jonsson --- glossary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.md b/glossary.md index e9dfdfbc83a69..4d7c47fda6f08 100644 --- a/glossary.md +++ b/glossary.md @@ -282,7 +282,7 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i ### SST -Static Sorted Table is the store format of RocksDB. +Static Sorted Table, Sorted String Table or Sorted Sequence Table (SST) is the file storage format of RocksDB. ### Store From 9fbf684df561ef0a420f99ee89cced22cdbef308 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Tue, 29 Oct 2024 11:50:58 +0100 Subject: [PATCH 05/29] Apply suggestions from code review Co-authored-by: Mattias Jonsson --- glossary.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/glossary.md b/glossary.md index 4d7c47fda6f08..cdb6250156069 100644 --- a/glossary.md +++ b/glossary.md @@ -138,7 +138,7 @@ Key Management Service (KMS) allows the storage and retrieval of secret keys in ### KV -Key-Value (KV) is a way storing information that allows easy retrieval by specifiying the key. Multiple values can be stored under a single key by encoding them. TiKV is implementing this. +Key-Value (KV) is a way storing information that allows easy store and retrieval by specifying the key. Multiple values can be stored under a single key by encoding them. TiKV is implementing this by TiDB mapping tables and indexes into Key-Value entries. ## L @@ -148,7 +148,7 @@ Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing ### LTS -Long Term Support (LTS) are software versions that are considered stable and are supported for a long term. See [TiDB Versioning](/releases/versioning.md) for more details. 
+Long Term Support (LTS) are software versions that are well tested, production ready and are supported for a long term. See [TiDB Versioning](/releases/versioning.md) for more details. ### leader/follower/learner @@ -176,7 +176,7 @@ The "original value" in the incremental change log output by TiCDC. You can spec ### OLTP -OnLine Transaction Processing (OLTP) are describing database workloads that mostly deal with transactioonal workloads like inserting, updating and deleting small sets of records. +OnLine Transaction Processing (OLTP) are describing database workloads that mostly deal with transactional workloads like selecting, inserting, updating and deleting small sets of records. ## OOM @@ -263,7 +263,7 @@ Restore is the reverse of the backup operation. It is the process of bringing ba ### RPC -Remote Procedure Call (RPC) is a way for software components to communicate. In a TiDB cluster gRPC standard is used for communication between TiKV and TiDB. +Remote Procedure Call (RPC) is a way for software components to communicate. In a TiDB cluster gRPC standard is used for communication between different components such as TiDB, TiKV and TiFlash. ### RU From bd239747595857492a62ee25bb3f45adefdcd1d4 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Fri, 1 Nov 2024 15:49:10 +0800 Subject: [PATCH 06/29] refine wording --- glossary.md | 66 ++++++++++++++++++++++++++--------------------------- 1 file changed, 33 insertions(+), 33 deletions(-) diff --git a/glossary.md b/glossary.md index cdb6250156069..3ea4bfd1b911e 100644 --- a/glossary.md +++ b/glossary.md @@ -32,7 +32,7 @@ Baseline Capturing captures queries that meet capturing conditions and create bi ### BR -BR is the Backup and Restore tool for TiDB. See [BR Overview](/br/backup-and-restore-overview.md) for more information. +BR is the Backup and Restore tool for TiDB. For more information, see [BR Overview](/br/backup-and-restore-overview.md). 
### Bucket @@ -46,7 +46,7 @@ With the cached table feature, TiDB loads the data of an entire table into the m ### CF -CF is short for Column Family as used by RocksDB / TiKV. +In RocksDB and TiKV, a Column Family (CF) represents a logical grouping of key-value pairs within a database. ### Coalesce Partition @@ -58,33 +58,33 @@ Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource over ### CTE -A Common Table Expression (CTE) is part of the SQL standard and uses [`WITH`](/sql-statements/sql-statement-with.md) statements. +A Common Table Expression (CTE) enables you to define a temporary result set that can be referred multiple times within a SQL statement using the [`WITH`](/sql-statements/sql-statement-with.md) clause. For more information, see [Common Table Expression](/develop/dev-guide-use-common-table-expression.md). ## D ### DDL -Data Definition Language (DDL) is the part of the SQL standard that deals with creating, modifying and deleting tables, indexes, columns and other objects. +Data Definition Language (DDL) statements enables you to create, modify, and drop tables, indexes, columns, and other database objects. ### DM -Data Migration is the tool that allows MySQL to TiDB migration by reading data from a source instance and applying it to a target MySQL instance. See [DM Overview](/dm/dm-overview.md) for more information. +Data Migration (DM) is a tool for migrating data from MySQL-compatible databases into TiDB. It reads data from an instance of MySQL-compatible database and applies it to a TiDB target instance. For more information, see [DM Overview](/dm/dm-overview.md). ### DML -Data Modification Language (DML) is the part of the SQL standard that deals with inserting, updating and deleting rows in tables. +Data Modification Language (DML) statements enables you to with insert, update, and delete rows in tables. 
### DMR -Development Milestone Release (DMR) is a version of TiDB that provides users with the latest features but doesn't provide long term support. See [TiDB Versioning](/releases/versioning.md) for more information. +Development Milestone Release (DMR) is a TiDB version that introduces the latest features but does not offer long-term support. For more information, see [TiDB Versioning](/releases/versioning.md). ### DR -Disaster Recovery (DR) describes solutions that can be used to recover from a disaster in the future. This includes things like backups and standby clusters. +Disaster Recovery (DR) includes solutions that can be used to recover data from a disaster in the future. These solutions typically involve backups and standby clusters. For more information, see [Overview of TiDB Disaster Recovery Solutions](dr-solution-introduction). ### DXF -Distributed eXecution Framework (DXF) is the framework used by TiDB to speedup index creation and data import by distributing tasks over all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details +Distributed eXecution Framework (DXF) is the framework used by TiDB for accelerating index creation and data import by distributing tasks over all available resources. For more information, see [DXF Introduction](/tidb-distributed-execution-framework.md). ### Dynamic Pruning @@ -94,33 +94,33 @@ Dynamic pruning mode is one of the modes that TiDB accesses partitioned tables. ### EC2 -Elastic Compute Cloud (EC2) is an AWS service that provides compute resources. This can be used with TiUP to run a TiDB Cluster. +[Elastic Compute Cloud (EC2)](https://aws.amazon.com/pm/ec2/) is an AWS service that provides scalable compute resources. It can be used with TiUP to deploy and manage a TiDB cluster. ## G ### GA -General Available (GA) is the first non-beta version of a software product. 
+If a feature is General Available (GA), it indicates it is fully tested and can be used in production environments. Note that even if a feature is GA in a [DMR](#dmr) version, it is recommended to use the feature in production environments in a later [LTS](#lts) version. ### GC -Garbage Collection (GC) is the process to cleanup unused resources. See [GC](/garbage-collection-overview.md) for the GC process of TiKV. +Garbage Collection (GC) is a process that clears obsolete data to free up resources. For information on TiKV GC process, see [Garbage Collection overview](/garbage-collection-overview.md). ### GTID -Global Transactions ID's (GTIDs) are used by recent MySQL versions binary log to indicate what transactions have been replicated and which have not. This information can be used by DM. +Global Transaction Identifiers (GTIDs) are unique transaction IDs used in MySQL binary logs to track which transactions have been replicated. [Data Migration (DM)](/dm/dm-overview.md) uses these IDs to ensure consistent replication. ## H ### HTAP -Hybrid Transactional Analytical Process (HTAP) is a database feature that allows both OLTP and OLAP workloads on the same database. For TiDB the HTAP feature is provided by using both TiKV for row storage and TiFlash for columnar storage. See [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies) for more information. +Hybrid Transactional and Analytical Processing (HTAP) is a database feature that enables both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) workloads within the same database. For TiDB, the HTAP feature is provided by using TiKV for row storage and TiFlash for columnar storage. For more information, see [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies). 
## I ### IMDS -Instance Metadata Service (IMDS) is a AWS service that can be used to manage EC2 instances. See [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) for more information. +Instance Metadata Service (IMDS) is an AWS service designed to manage and retrieve metadata for [EC2](#ec2) instances. For more information, see [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html). ### Index Merge @@ -134,21 +134,21 @@ The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. When ### KMS -Key Management Service (KMS) allows the storage and retrieval of secret keys in a secure way. Examples of this are the AWS KMS, GCP KMS and HashiCorp Vault. Various TiDB components can use this to manage the keys that are used for storage encryption and related services. +Key Management Service (KMS) enables the storage and retrieval of secret keys in a secure way. Examples include AWS KMS, GCP KMS, and HashiCorp Vault. Various TiDB components can use KMS to manage keys for storage encryption and related services. ### KV -Key-Value (KV) is a way storing information that allows easy store and retrieval by specifying the key. Multiple values can be stored under a single key by encoding them. TiKV is implementing this by TiDB mapping tables and indexes into Key-Value entries. +Key-Value (KV) is a way of storing information by associating values with unique keys, allowing quick data retrieval. TiDB uses TiKV to map tables and indexes into key-value pairs, enabling efficient data storage and access across the database. ## L ### LDAP -Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing a directory with information. This is often used to store information on accounts. This is used in TiDB by [LDAP authentication plugins](/security-compatibility-with-mysql.md#authentication-plugin-status). 
+Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing a directory with information. It is commonly used for account and user data management. TiDB supports LDAP via [LDAP authentication plugins](/security-compatibility-with-mysql.md#authentication-plugin-status). ### LTS -Long Term Support (LTS) are software versions that are well tested, production ready and are supported for a long term. See [TiDB Versioning](/releases/versioning.md) for more details. +Long Term Support (LTS) refers to software versions that are extensively tested and maintained for extended periods. For more information, see [TiDB Versioning](/releases/versioning.md). ### leader/follower/learner @@ -168,7 +168,7 @@ Starting from v5.0, TiDB introduces Massively Parallel Processing (MPP) architec ### OLAP -OnLine Analytical Processing (OLAP) are describing database workloads that mostly deal with analytical workloads like reporting. The characteristics of this is read heavy queries that process many rows. +Online Analytical Processing (OLAP) refers to database workloads focused on analytical tasks, such as data reporting and complex queries. OLAP is characterized by read-heavy queries that process large volumes of data across many rows. ### Old value @@ -176,11 +176,11 @@ The "original value" in the incremental change log output by TiCDC. You can spec ### OLTP -OnLine Transaction Processing (OLTP) are describing database workloads that mostly deal with transactional workloads like selecting, inserting, updating and deleting small sets of records. +Online Transaction Processing (OLTP) refers to database workloads focused on transactional tasks, such as selecting, inserting, updating, and deleting small sets of records. ## OOM -Out of Memory (OOM) is a situation where a system fails due to a a lack of available memory. See [Troubleshoot TiDB OOM Issues](/troubleshoot-tidb-oom.md) for more details. 
+Out of Memory (OOM) is a situation where a system fails due to insufficient memory. For more information, see [Troubleshoot TiDB OOM Issues](/troubleshoot-tidb-oom.md). ### Operator @@ -209,15 +209,15 @@ Currently, available steps generated by PD include: ### PD -Placement Driver (PD) is an important component of the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) that is responsible to store metadata and run the [TSO](/tso.md) that hands out timestamps that are used for transactions. It also orchestrates the data placement on TiKV and runs the [TiDB Dashboard](/dashboard/dashboard-overview.md). +Placement Driver (PD) is a core component in the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) responsible for storing metadata, assigning [Timestamp Oracle (TSO)](/tso.md) for transaction timestamps, orchestrating data placement on TiKV, and running [TiDB Dashboard](/dashboard/dashboard-overview.md). For more information, see [TiDB Scheduling](/tidb-scheduling.md). ### pending/down "Pending" and "down" are two special states of a peer. Pending indicates that the Raft log of followers or learners is vastly different from that of leader. Followers in pending cannot be elected as leader. "Down" refers to a state that a peer ceases to respond to leader for a long time, which usually means the corresponding node is down or isolated from the network. -### PiTR +### PITR -Point in Time Recovery (PiTR) is a database feature that allows the user to restore to a specific point in time (for example just before an accidental `DELETE` statement). See [TiDB Log Backup and PITR Architecture](/br/br-log-architecture.md) for more details. +Point in Time Recovery (PITR) enables you to restore data to a specific point in time (for example, just before an unintended `DELETE` statement). For more information, see [TiDB Log Backup and PITR Architecture](/br/br-log-architecture.md). 
### Point Get @@ -231,7 +231,7 @@ In most cases, when executing SQL statements, the optimizer only uses statistics ### QPS -Queries Per Second (QPS) is a performance metric of a database service. +Queries Per Second (QPS) is the number of queries a database service handles per second, serving as a key performance metric for database throughput. ### Quota Limiter @@ -245,7 +245,7 @@ Raft Engine is an embedded persistent storage engine with a log-structured desig ### RAG -Retrieval-Augmented Generation (RAG). See [Vector Search Overview](/vector-search-overview.md#use-cases) for more details. +Retrieval-Augmented Generation (RAG) is an architecture designed to optimize the output of Large Language Models (LLMs). For more information, See [Vector Search Overview](/vector-search-overview.md#use-cases). ### Region/peer/Raft group @@ -263,11 +263,11 @@ Restore is the reverse of the backup operation. It is the process of bringing ba ### RPC -Remote Procedure Call (RPC) is a way for software components to communicate. In a TiDB cluster gRPC standard is used for communication between different components such as TiDB, TiKV and TiFlash. +Remote Procedure Call (RPC) is a communication way between software components. In a TiDB cluster, the gRPC standard is used for communication between different components such as TiDB, TiKV, and TiFlash. ### RU -Request Unit (RU) is used in TiDB to describe the unit for the resource usage. This is used with [Resource Control](/tidb-resource-control.md) to manage resource usage. +Request Unit (RU) is a unified abstraction unit for the resource usage in TiDB. It is used with [Resource Control](/tidb-resource-control.md) to manage resource usage. ## S @@ -282,7 +282,7 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i ### SST -Static Sorted Table, Sorted String Table or Sorted Sequence Table (SST) is the file storage format of RocksDB. 
+Static Sorted Table, Sorted String Table, or Sorted Sequence Table (SST) is a file storage format used in RocksDB. ### Store @@ -296,7 +296,7 @@ Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiK ### TPS -Transactions Per Second (TPS) is a performance metric of a database. +Transactions Per Second (TPS) is the number of transactions a database processes per second, serving as a key metric for measuring database performance and throughput. ### TSO @@ -306,8 +306,8 @@ Because TiKV is a distributed storage system, it requires a global timing servic ### URI -Uniform Resource Identifier (URI) is a uniform way of describing a resource. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information. +Uniform Resource Identifier (URI) is a standardized format for identifying a resource. For more information, see [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia. ### UUID -Universally Unique Identifier (UUID) is a 128-bit (16 byte) generated ID that can be used to identify records in a database. See [UUID](/best-practices/uuid.md) for more information on how UUID's are used in TiDB. \ No newline at end of file +Universally Unique Identifier (UUID) is a 128-bit (16-byte) generated ID used to uniquely identify records in a database. For more information, see [UUID](/best-practices/uuid.md). \ No newline at end of file From fc1402c68369213cfaef1f3c6d0f3d9c9a8cb0ce Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Fri, 1 Nov 2024 16:02:03 +0800 Subject: [PATCH 07/29] GCP KMS -> Google Cloud KMS --- glossary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.md b/glossary.md index 3ea4bfd1b911e..9172dbdbbe7dd 100644 --- a/glossary.md +++ b/glossary.md @@ -134,7 +134,7 @@ The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. 
When ### KMS -Key Management Service (KMS) enables the storage and retrieval of secret keys in a secure way. Examples include AWS KMS, GCP KMS, and HashiCorp Vault. Various TiDB components can use KMS to manage keys for storage encryption and related services. +Key Management Service (KMS) enables the storage and retrieval of secret keys in a secure way. Examples include AWS KMS, Google Cloud KMS, and HashiCorp Vault. Various TiDB components can use KMS to manage keys for storage encryption and related services. ### KV From c3c192a869c8f1207d3efff3d4082e4e95509924 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 08:12:33 +0100 Subject: [PATCH 08/29] Update based on review --- glossary.md | 76 ++++++++++++++++++++++++++--------------------------- 1 file changed, 38 insertions(+), 38 deletions(-) diff --git a/glossary.md b/glossary.md index 9172dbdbbe7dd..21891f1a88d1e 100644 --- a/glossary.md +++ b/glossary.md @@ -30,7 +30,7 @@ Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabl Baseline Capturing captures queries that meet capturing conditions and create bindings for them. It is used for [preventing regression of execution plans during an upgrade](/sql-plan-management.md#prevent-regression-of-execution-plans-during-an-upgrade). -### BR +### Backup and Restore (BR) BR is the Backup and Restore tool for TiDB. For more information, see [BR Overview](/br/backup-and-restore-overview.md). @@ -44,7 +44,7 @@ A [Region](#regionpeerraft-group) is logically divided into several small ranges With the cached table feature, TiDB loads the data of an entire table into the memory of the TiDB server, and TiDB directly gets the table data from the memory without accessing TiKV, which improves the read performance. -### CF +### Column Family (CF) In RocksDB and TiKV, a Column Family (CF) represents a logical grouping of key-value pairs within a database. 
@@ -56,35 +56,35 @@ Coalesce Partition is a way of decreasing the number of partitions in a Hash or Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For details, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md). -### CTE +### Common Table Expression (CTE) A Common Table Expression (CTE) enables you to define a temporary result set that can be referred multiple times within a SQL statement using the [`WITH`](/sql-statements/sql-statement-with.md) clause. For more information, see [Common Table Expression](/develop/dev-guide-use-common-table-expression.md). ## D -### DDL +### Data Definition Language (DDL) -Data Definition Language (DDL) statements enables you to create, modify, and drop tables, indexes, columns, and other database objects. +Data Definition Language (DDL) is the part of the SQL standard that deals with creating, modifying and deleting tables and other objects. For details, see [DDL Introduction](/ddl-introduction.md). -### DM +### Data Migration (DM) -Data Migration (DM) is a tool for migrating data from MySQL-compatible databases into TiDB. It reads data from an instance of MySQL-compatible database and applies it to a TiDB target instance. For more information, see [DM Overview](/dm/dm-overview.md). +Data Migration (DM) is a tool for migrating data from MySQL-compatible databases into TiDB. DM reads data from a MySQL-compatible database instance and applies it to a TiDB target instance. For more information, see [DM Overview](/dm/dm-overview.md). 
-### DML +### Data Modification Language (DML) -Data Modification Language (DML) statements enables you to with insert, update, and delete rows in tables. +Data Modification Language (DML) is the part of the SQL standard that describes statements which enable you to insert, update, and delete rows in tables. -### DMR +### Development Milestone Release (DMR) -Development Milestone Release (DMR) is a TiDB version that introduces the latest features but does not offer long-term support. For more information, see [TiDB Versioning](/releases/versioning.md). +Development Milestone Releases (DMR) are TiDB releases that introduce the latest features but do not offer long-term support. For more information, see [TiDB Versioning](/releases/versioning.md). -### DR +### Disaster Recovery (DR) -Disaster Recovery (DR) includes solutions that can be used to recover data from a disaster in the future. These solutions typically involve backups and standby clusters. For more information, see [Overview of TiDB Disaster Recovery Solutions](dr-solution-introduction). +Disaster Recovery (DR) includes solutions that can be used to recover data and services from a disaster in the future. TiDB offers a variety of solutions for delivering Disaster Recovery including backups and replication to standby clusters. For more information, see [Overview of TiDB Disaster Recovery Solutions](/dr-solution-introduction.md). -### DXF +### Distributed eXecution Framework (DXF) -Distributed eXecution Framework (DXF) is the framework used by TiDB for accelerating index creation and data import by distributing tasks over all available resources. For more information, see [DXF Introduction](/tidb-distributed-execution-framework.md). +Distributed eXecution Framework (DXF) is the framework used by TiDB to distribute tasks across the TiDB cluster. 
DXF is designed to efficiently use the cluster resources to execute tasks (like index creation or data import) while controlling the resource usage and impact on core business transactions. For more information, see [DXF Introduction](/tidb-distributed-execution-framework.md). ### Dynamic Pruning @@ -98,21 +98,21 @@ Dynamic pruning mode is one of the modes that TiDB accesses partitioned tables. ## G -### GA +### General Availability (GA) -If a feature is General Available (GA), it indicates it is fully tested and can be used in production environments. Note that even if a feature is GA in a [DMR](#dmr) version, it is recommended to use the feature in production environments in a later [LTS](#lts) version. +General Availability (GA) of a feature is when it is, fully tested and is Generally Available for use in production environments. TiDB features may be released as Generally Available in both [DMR](#development-milestone-release-dmr) and [LTS](#long-term-support-lts) releases. However, as TiDB does not provide patch releases based on DMR it is generally recommended to use the LTS product release for production use. -### GC +### Garbage Collection (GC) Garbage Collection (GC) is a process that clears obsolete data to free up resources. For information on TiKV GC process, see [Garbage Collection overview](/garbage-collection-overview.md). -### GTID +### Global Transaction Identifiers (GTIDs) Global Transaction Identifiers (GTIDs) are unique transaction IDs used in MySQL binary logs to track which transactions have been replicated. [Data Migration (DM)](/dm/dm-overview.md) uses these IDs to ensure consistent replication. ## H -### HTAP +### Hybrid Transactional and Analytical Processing (HTAP) Hybrid Transactional and Analytical Processing (HTAP) is a database feature that enables both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) workloads within the same database. 
For TiDB, the HTAP feature is provided by using TiKV for row storage and TiFlash for columnar storage. For more information, see [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies). @@ -132,25 +132,25 @@ The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. When ## K -### KMS +### Key Management Service (KMS) Key Management Service (KMS) enables the storage and retrieval of secret keys in a secure way. Examples include AWS KMS, Google Cloud KMS, and HashiCorp Vault. Various TiDB components can use KMS to manage keys for storage encryption and related services. -### KV +### Key-Value (KV) Key-Value (KV) is a way of storing information by associating values with unique keys, allowing quick data retrieval. TiDB uses TiKV to map tables and indexes into key-value pairs, enabling efficient data storage and access across the database. ## L -### LDAP +### Lightweight Directory Access Protocol (LDAP) Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing a directory with information. It is commonly used for account and user data management. TiDB supports LDAP via [LDAP authentication plugins](/security-compatibility-with-mysql.md#authentication-plugin-status). -### LTS +### Long Term Support (LTS) Long Term Support (LTS) refers to software versions that are extensively tested and maintained for extended periods. For more information, see [TiDB Versioning](/releases/versioning.md). -### leader/follower/learner +### Leader/Follower/Learner Leader/Follower/Learner each corresponds to a role in a Raft group of [peers](#regionpeerraft-group). The leader services all client requests and replicates data to the followers. If the group leader fails, one of the followers will be elected as the new leader. Learners are non-voting followers that only serves in the process of replica addition. 
@@ -166,7 +166,7 @@ Starting from v5.0, TiDB introduces Massively Parallel Processing (MPP) architec ## O -### OLAP +### Online Analytical Processing (OLAP) Online Analytical Processing (OLAP) refers to database workloads focused on analytical tasks, such as data reporting and complex queries. OLAP is characterized by read-heavy queries that process large volumes of data across many rows. @@ -174,11 +174,11 @@ Online Analytical Processing (OLAP) refers to database workloads focused on anal The "original value" in the incremental change log output by TiCDC. You can specify whether the incremental change log output by TiCDC contains the "original value". -### OLTP +### Online Transaction Processing (OLTP) Online Transaction Processing (OLTP) refers to database workloads focused on transactional tasks, such as selecting, inserting, updating, and deleting small sets of records. -## OOM +## Out of Memory (OOM) Out of Memory (OOM) is a situation where a system fails due to insufficient memory. For more information, see [Troubleshoot TiDB OOM Issues](/troubleshoot-tidb-oom.md). @@ -207,7 +207,7 @@ Currently, available steps generated by PD include: [Partitioning](/partitioned-table.md) refers to physically dividing a table into smaller table partitions, which can be done by partition methods such as RANGE, LIST, HASH, and KEY partitioning. -### PD +### Placement Driver (PD) Placement Driver (PD) is a core component in the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) responsible for storing metadata, assigning [Timestamp Oracle (TSO)](/tso.md) for transaction timestamps, orchestrating data placement on TiKV, and running [TiDB Dashboard](/dashboard/dashboard-overview.md). For more information, see [TiDB Scheduling](/tidb-scheduling.md). @@ -215,7 +215,7 @@ Placement Driver (PD) is a core component in the [TiDB Architecture](/tidb-archi "Pending" and "down" are two special states of a peer. 
Pending indicates that the Raft log of followers or learners is vastly different from that of leader. Followers in pending cannot be elected as leader. "Down" refers to a state that a peer ceases to respond to leader for a long time, which usually means the corresponding node is down or isolated from the network. -### PITR +### Point in Time Recovery (PITR) Point in Time Recovery (PITR) enables you to restore data to a specific point in time (for example, just before an unintended `DELETE` statement). For more information, see [TiDB Log Backup and PITR Architecture](/br/br-log-architecture.md). @@ -261,11 +261,11 @@ The mechanism of Region split is to use one initial Region to cover the entire k Restore is the reverse of the backup operation. It is the process of bringing back the system to an earlier state by retrieving data from a prepared backup. -### RPC +### Remote Procedure Call (RPC) Remote Procedure Call (RPC) is a communication way between software components. In a TiDB cluster, the gRPC standard is used for communication between different components such as TiDB, TiKV, and TiFlash. -### RU +### Request Unit (RU) Request Unit (RU) is a unified abstraction unit for the resource usage in TiDB. It is used with [Resource Control](/tidb-resource-control.md) to manage resource usage. @@ -280,9 +280,9 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i - `hot-region-scheduler`: Balances the distribution of hot Regions - `evict-leader-{store-id}`: Evicts all leaders of a node (often used for rolling upgrades) -### SST +### Static Sorted Table / Sorted String Table (SST) -Static Sorted Table, Sorted String Table, or Sorted Sequence Table (SST) is a file storage format used in RocksDB. +Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a component used by the [TiKV Storage Engine](/tikv-overview.md)). 
### Store @@ -294,20 +294,20 @@ A store refers to the storage node in the TiKV cluster (an instance of `tikv-ser Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiKV node in a specified time range. For details, see [Top SQL user document](/dashboard/top-sql.md). -### TPS +### Transactions Per Second (TPS) Transactions Per Second (TPS) is the number of transactions a database processes per second, serving as a key metric for measuring database performance and throughput. -### TSO +### Timestamp Oracle (TSO) Because TiKV is a distributed storage system, it requires a global timing service, Timestamp Oracle (TSO), to assign a monotonically increasing timestamp. In TiKV, such a feature is provided by PD, and in Google [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf), this feature is provided by multiple atomic clocks and GPS. For details, see [TSO](/tso.md). ## U -### URI +### Uniform Resource Identifier (URI) Uniform Resource Identifier (URI) is a standardized format for identifying a resource. For more information, see [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia. -### UUID +### Universally Unique Identifier (UUID) Universally Unique Identifier (UUID) is a 128-bit (16-byte) generated ID used to uniquely identify records in a database. For more information, see [UUID](/best-practices/uuid.md). \ No newline at end of file From d0ef6ee1ea8b08ff47ddfadaa20812f8e62ebcc0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 08:19:18 +0100 Subject: [PATCH 09/29] Updated RocksDB link --- glossary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.md b/glossary.md index 21891f1a88d1e..cea010756e4bb 100644 --- a/glossary.md +++ b/glossary.md @@ -282,7 +282,7 @@ Schedulers are components in PD that generate scheduling tasks. 
Each scheduler i ### Static Sorted Table / Sorted String Table (SST) -Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a component used by the [TiKV Storage Engine](/tikv-overview.md)). +Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a component used by the [TiKV Storage Engine](/storage-engine/rocksdb-overview.md). ### Store From 539af48534bdf529bb30c9cf944639fbdb5169b7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 08:30:57 +0100 Subject: [PATCH 10/29] Consistency updates --- glossary.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/glossary.md b/glossary.md index cea010756e4bb..1353959350175 100644 --- a/glossary.md +++ b/glossary.md @@ -118,7 +118,7 @@ Hybrid Transactional and Analytical Processing (HTAP) is a database feature that ## I -### IMDS +### Instance Metadata Service (IMDS) Instance Metadata Service (IMDS) is an AWS service designed to manage and retrieve metadata for [EC2](#ec2) instances. For more information, see [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html). @@ -156,7 +156,7 @@ Leader/Follower/Learner each corresponds to a role in a Raft group of [peers](#r ## M -### MPP +### Massively Parallel Processing (MPP) Starting from v5.0, TiDB introduces Massively Parallel Processing (MPP) architecture through TiFlash nodes, which shares the execution workloads of large join queries among TiFlash nodes. When the MPP mode is enabled, TiDB, based on cost, determines whether to use the MPP framework to perform the calculation. In the MPP mode, the join keys are redistributed through the Exchange operation while being calculated, which distributes the calculation pressure to each TiFlash node and speeds up the calculation. For more information, see [Use TiFlash MPP Mode](/tiflash/use-tiflash-mpp-mode.md). 
@@ -229,7 +229,7 @@ In most cases, when executing SQL statements, the optimizer only uses statistics ## Q -### QPS +### Queries Per Second (QPS) Queries Per Second (QPS) is the number of queries a database service handles per second, serving as a key performance metric for database throughput. @@ -243,7 +243,7 @@ Quota Limiter is an experimental feature introduced in TiDB v6.0.0. If the machi Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For details, see [Raft Engine](/tikv-configuration-file.md#raft-engine). -### RAG +### Retrieval-Augmented Generation (RAG) Retrieval-Augmented Generation (RAG) is an architecture designed to optimize the output of Large Language Models (LLMs). For more information, See [Vector Search Overview](/vector-search-overview.md#use-cases). From b07f67f9a0aaa77db6165ce8cf6bb41a2a2c068a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 08:32:10 +0100 Subject: [PATCH 11/29] Update CTE --- glossary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.md b/glossary.md index 1353959350175..d0c3a1b24a13c 100644 --- a/glossary.md +++ b/glossary.md @@ -58,7 +58,7 @@ Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource over ### Common Table Expression (CTE) -A Common Table Expression (CTE) enables you to define a temporary result set that can be referred multiple times within a SQL statement using the [`WITH`](/sql-statements/sql-statement-with.md) clause. For more information, see [Common Table Expression](/develop/dev-guide-use-common-table-expression.md). +A Common Table Expression (CTE) enables you to define a temporary result set that can be referred to multiple times within a SQL statement using the [`WITH`](/sql-statements/sql-statement-with.md) clause. 
For more information, see [Common Table Expression](/develop/dev-guide-use-common-table-expression.md). ## D From ae0e2bfb4d031ef26f617ffe6e8636b26f604757 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 08:40:38 +0100 Subject: [PATCH 12/29] Fix issues found by linters --- br/br-snapshot-guide.md | 6 +++--- br/br-snapshot-manual.md | 2 +- latency-breakdown.md | 2 +- tiflash/use-tiflash-mpp-mode.md | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/br/br-snapshot-guide.md b/br/br-snapshot-guide.md index d4b0ec67ad49b..4ef4a957630bc 100644 --- a/br/br-snapshot-guide.md +++ b/br/br-snapshot-guide.md @@ -33,7 +33,7 @@ tiup br backup full --pd "${PD_IP}:2379" \ In the preceding command: -- `--backupts`: The time point of the snapshot. The format can be [TSO](/glossary.md#tso) or timestamp, such as `400036290571534337` or `2018-05-11 01:42:23 +08:00`. If the data of this snapshot is garbage collected, the `tiup br backup` command returns an error and `br` exits. When backing up using a timestamp, it is recommended to specify the time zone as well. Otherwise, `br` uses the local time zone to construct the timestamp by default, which might lead to an incorrect backup time point. If you leave this parameter unspecified, `br` picks the snapshot corresponding to the backup start time. +- `--backupts`: The time point of the snapshot. The format can be [TSO](/tso.md) or timestamp, such as `400036290571534337` or `2018-05-11 01:42:23 +08:00`. If the data of this snapshot is garbage collected, the `tiup br backup` command returns an error and `br` exits. When backing up using a timestamp, it is recommended to specify the time zone as well. Otherwise, `br` uses the local time zone to construct the timestamp by default, which might lead to an incorrect backup time point. If you leave this parameter unspecified, `br` picks the snapshot corresponding to the backup start time. 
- `--storage`: The storage address of the backup data. Snapshot backup supports Amazon S3, Google Cloud Storage, and Azure Blob Storage as backup storage. The preceding command uses Amazon S3 as an example. For more details, see [URI Formats of External Storage Services](/external-storage-uri.md). - `--ratelimit`: The maximum speed **per TiKV** performing backup tasks. The unit is in MiB/s. @@ -129,8 +129,8 @@ tiup br restore full \ ### Restore tables in the `mysql` schema -- Starting from BR v5.1.0, when you back up snapshots, BR automatically backs up the **system tables** in the `mysql` schema, but does not restore these system tables by default. -- Starting from v6.2.0, BR lets you specify `--with-sys-table` to restore **data in some system tables**. +- Starting from BR v5.1.0, when you back up snapshots, BR automatically backs up the **system tables** in the `mysql` schema, but does not restore these system tables by default. +- Starting from v6.2.0, BR lets you specify `--with-sys-table` to restore **data in some system tables**. - Starting from v7.6.0, BR enables `--with-sys-table` by default, which means that BR restores **data in some system tables** by default. **BR can restore data in the following system tables:** diff --git a/br/br-snapshot-manual.md b/br/br-snapshot-manual.md index e0c52559887a5..a59c7fca90675 100644 --- a/br/br-snapshot-manual.md +++ b/br/br-snapshot-manual.md @@ -42,7 +42,7 @@ tiup br backup full \ In the preceding command: -- `--backupts`: The time point of the snapshot. The format can be [TSO](/glossary.md#tso) or timestamp, such as `400036290571534337` or `2024-06-28 13:30:00 +08:00`. If the data of this snapshot is garbage collected, the `tiup br backup` command returns an error and 'br' exits. If you leave this parameter unspecified, `br` picks the snapshot corresponding to the backup start time. +- `--backupts`: The time point of the snapshot. 
The format can be [TSO](/tso.md) or timestamp, such as `400036290571534337` or `2024-06-28 13:30:00 +08:00`. If the data of this snapshot is garbage collected, the `tiup br backup` command returns an error and 'br' exits. If you leave this parameter unspecified, `br` picks the snapshot corresponding to the backup start time. - `--ratelimit`: The maximum speed **per TiKV** performing backup tasks. The unit is in MiB/s. - `--log-file`: The target file where `br` log is written. diff --git a/latency-breakdown.md b/latency-breakdown.md index bd037016f8e54..56fb881e7faf1 100644 --- a/latency-breakdown.md +++ b/latency-breakdown.md @@ -104,7 +104,7 @@ tidb_session_execute_duration_seconds{type="general"} = read value duration ``` -`pd_client_cmd_handle_cmds_duration_seconds{type="wait"}` records the duration of fetching [TSO (Timestamp Oracle)](/glossary.md#tso) from PD. When reading in an auto-commit transaction mode with a clustered primary index or from a snapshot, the value will be zero. +`pd_client_cmd_handle_cmds_duration_seconds{type="wait"}` records the duration of fetching [TSO (Timestamp Oracle)](/tso.md) from PD. When reading in an auto-commit transaction mode with a clustered primary index or from a snapshot, the value will be zero. The `read handle duration` and `read value duration` are calculated as: diff --git a/tiflash/use-tiflash-mpp-mode.md b/tiflash/use-tiflash-mpp-mode.md index 7f09ca3be0807..ac87a49adc83f 100644 --- a/tiflash/use-tiflash-mpp-mode.md +++ b/tiflash/use-tiflash-mpp-mode.md @@ -7,7 +7,7 @@ summary: Learn the MPP mode of TiFlash and how to use it. -This document introduces the [Massively Parallel Processing (MPP)](/glossary.md#mpp) mode of TiFlash and how to use it. +This document introduces the [Massively Parallel Processing (MPP)](/glossary.md#massively-parallel-processing-mpp) mode of TiFlash and how to use it. 
From d5daf41df2ebf51327a61b807ddadbb5be468449 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 08:49:17 +0100 Subject: [PATCH 13/29] Fix link --- tiflash/tiflash-mintso-scheduler.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tiflash/tiflash-mintso-scheduler.md b/tiflash/tiflash-mintso-scheduler.md index 6cb5eda77e866..4cb237cd66b27 100644 --- a/tiflash/tiflash-mintso-scheduler.md +++ b/tiflash/tiflash-mintso-scheduler.md @@ -5,7 +5,7 @@ summary: Learn the implementation principles of the TiFlash MinTSO Scheduler. # TiFlash MinTSO Scheduler -The TiFlash MinTSO scheduler is a distributed scheduler for [MPP](/glossary.md#mpp) tasks in TiFlash. This document describes the implementation principles of the TiFlash MinTSO scheduler. +The TiFlash MinTSO scheduler is a distributed scheduler for [MPP](/glossary.md#massively-parallel-processing-mpp) tasks in TiFlash. This document describes the implementation principles of the TiFlash MinTSO scheduler. 
## Background From 41897b1a7140aae4392bc02f65721f2d17199e02 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 09:25:39 +0100 Subject: [PATCH 14/29] Add script to check glossary --- scripts/check-glossary.py | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100755 scripts/check-glossary.py diff --git a/scripts/check-glossary.py b/scripts/check-glossary.py new file mode 100755 index 0000000000000..82e872e010f0a --- /dev/null +++ b/scripts/check-glossary.py @@ -0,0 +1,31 @@ +#!/bin/python3 +import sys +from difflib import unified_diff + +print("Checking alphabetic sorting of glossary.md") + +with open("glossary.md") as fh: + # Extract the lines that start with ### into itemsA (unsorted) + itemsA = "" + for line in fh.readlines(): + if line.startswith("###"): + itemsA += line + fh.seek(0) + + # Extract the lines that start with ### into itemsB (sorted) + itemsB = "" + for line in sorted(fh.readlines()): + if line.startswith("###"): + itemsB += line + + if itemsA == itemsB: + print("result: OK") + sys.exit(0) + + print("result: differences found, see diff for details") + # diff itemsA and itemsB + diff = unified_diff( + itemsA.splitlines(keepends=True), itemsB.splitlines(keepends=True), fromfile="before", tofile="after" + ) + sys.stdout.writelines(diff) + sys.exit(1) From 288dee73c705917442f1dc1b27900541f7d6a980 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 09:44:55 +0100 Subject: [PATCH 15/29] Sort entries --- glossary.md | 89 ++++++++++++++++++++++++++--------------------------- 1 file changed, 43 insertions(+), 46 deletions(-) diff --git a/glossary.md b/glossary.md index d0c3a1b24a13c..779ba32a7c632 100644 --- a/glossary.md +++ b/glossary.md @@ -22,17 +22,17 @@ ACID refers to the four key properties of a transaction: atomicity, consistency, ## B -### Batch Create Table +### Backup and Restore (BR) -Batch Create Table is a feature introduced in TiDB 
v6.0.0. This feature is enabled default. When restoring data with a large number of tables (nearly 50000) using BR (Backup & Restore), the feature can greatly speed up the restore process by creating tables in batches. For details, see [Batch Create Table](/br/br-batch-create-table.md). +BR is the Backup and Restore tool for TiDB. For more information, see [BR Overview](/br/backup-and-restore-overview.md). ### Baseline Capturing Baseline Capturing captures queries that meet capturing conditions and create bindings for them. It is used for [preventing regression of execution plans during an upgrade](/sql-plan-management.md#prevent-regression-of-execution-plans-during-an-upgrade). -### Backup and Restore (BR) +### Batch Create Table -BR is the Backup and Restore tool for TiDB. For more information, see [BR Overview](/br/backup-and-restore-overview.md). +Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabled default. When restoring data with a large number of tables (nearly 50000) using BR (Backup & Restore), the feature can greatly speed up the restore process by creating tables in batches. For details, see [Batch Create Table](/br/br-batch-create-table.md). ### Bucket @@ -44,22 +44,22 @@ A [Region](#regionpeerraft-group) is logically divided into several small ranges With the cached table feature, TiDB loads the data of an entire table into the memory of the TiDB server, and TiDB directly gets the table data from the memory without accessing TiKV, which improves the read performance. -### Column Family (CF) - -In RocksDB and TiKV, a Column Family (CF) represents a logical grouping of key-value pairs within a database. - ### Coalesce Partition Coalesce Partition is a way of decreasing the number of partitions in a Hash or Key partitioned table. For more information, see [Manage Hash and Key partitions](/partitioned-table.md#manage-hash-and-key-partitions). 
-### Continuous Profiling +### Column Family (CF) -Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For details, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md). +In RocksDB and TiKV, a Column Family (CF) represents a logical grouping of key-value pairs within a database. ### Common Table Expression (CTE) A Common Table Expression (CTE) enables you to define a temporary result set that can be referred to multiple times within a SQL statement using the [`WITH`](/sql-statements/sql-statement-with.md) clause. For more information, see [Common Table Expression](/develop/dev-guide-use-common-table-expression.md). +### Continuous Profiling + +Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For details, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md). + ## D ### Data Definition Language (DDL) @@ -98,14 +98,14 @@ Dynamic pruning mode is one of the modes that TiDB accesses partitioned tables. ## G -### General Availability (GA) - -General Availability (GA) of a feature is when it is, fully tested and is Generally Available for use in production environments. TiDB features may be released as Generally Available in both [DMR](#development-milestone-release-dmr) and [LTS](#long-term-support-lts) releases. 
However, as TiDB does not provide patch releases based on DMR it is generally recommended to use the LTS product release for production use. - ### Garbage Collection (GC) Garbage Collection (GC) is a process that clears obsolete data to free up resources. For information on TiKV GC process, see [Garbage Collection overview](/garbage-collection-overview.md). +### General Availability (GA) + +General Availability (GA) of a feature is when it is, fully tested and is Generally Available for use in production environments. TiDB features may be released as Generally Available in both [DMR](#development-milestone-release-dmr) and [LTS](#long-term-support-lts) releases. However, as TiDB does not provide patch releases based on DMR it is generally recommended to use the LTS product release for production use. + ### Global Transaction Identifiers (GTIDs) Global Transaction Identifiers (GTIDs) are unique transaction IDs used in MySQL binary logs to track which transactions have been replicated. [Data Migration (DM)](/dm/dm-overview.md) uses these IDs to ensure consistent replication. @@ -118,17 +118,17 @@ Hybrid Transactional and Analytical Processing (HTAP) is a database feature that ## I -### Instance Metadata Service (IMDS) +### In-Memory Pessimistic Lock -Instance Metadata Service (IMDS) is an AWS service designed to manage and retrieve metadata for [EC2](#ec2) instances. For more information, see [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html). +The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. When this feature is enabled, pessimistic locks are usually stored in the memory of the Region leader only, and are not persisted to disk or replicated through Raft to other replicas. This feature can greatly reduce the overhead of acquiring pessimistic locks and improve the throughput of pessimistic transactions. ### Index Merge Index Merge is a method introduced in TiDB v4.0 to access tables. 
Using this method, the TiDB optimizer can use multiple indexes per table and merge the results returned by each index. In some scenarios, this method makes the query more efficient by avoiding full table scans. Since v5.4, Index Merge has become a GA feature. -### In-Memory Pessimistic Lock +### Instance Metadata Service (IMDS) -The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. When this feature is enabled, pessimistic locks are usually stored in the memory of the Region leader only, and are not persisted to disk or replicated through Raft to other replicas. This feature can greatly reduce the overhead of acquiring pessimistic locks and improve the throughput of pessimistic transactions. +Instance Metadata Service (IMDS) is an AWS service designed to manage and retrieve metadata for [EC2](#ec2) instances. For more information, see [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html). ## K @@ -142,6 +142,10 @@ Key-Value (KV) is a way of storing information by associating values with unique ## L +### Leader/Follower/Learner + +Leader/Follower/Learner each corresponds to a role in a Raft group of [peers](#regionpeerraft-group). The leader services all client requests and replicates data to the followers. If the group leader fails, one of the followers will be elected as the new leader. Learners are non-voting followers that only serves in the process of replica addition. + ### Lightweight Directory Access Protocol (LDAP) Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing a directory with information. It is commonly used for account and user data management. TiDB supports LDAP via [LDAP authentication plugins](/security-compatibility-with-mysql.md#authentication-plugin-status). 
@@ -150,10 +154,6 @@ Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing Long Term Support (LTS) refers to software versions that are extensively tested and maintained for extended periods. For more information, see [TiDB Versioning](/releases/versioning.md). -### Leader/Follower/Learner - -Leader/Follower/Learner each corresponds to a role in a Raft group of [peers](#regionpeerraft-group). The leader services all client requests and replicates data to the followers. If the group leader fails, one of the followers will be elected as the new leader. Learners are non-voting followers that only serves in the process of replica addition. - ## M ### Massively Parallel Processing (MPP) @@ -166,14 +166,14 @@ Starting from v5.0, TiDB introduces Massively Parallel Processing (MPP) architec ## O -### Online Analytical Processing (OLAP) - -Online Analytical Processing (OLAP) refers to database workloads focused on analytical tasks, such as data reporting and complex queries. OLAP is characterized by read-heavy queries that process large volumes of data across many rows. - ### Old value The "original value" in the incremental change log output by TiCDC. You can specify whether the incremental change log output by TiCDC contains the "original value". +### Online Analytical Processing (OLAP) + +Online Analytical Processing (OLAP) refers to database workloads focused on analytical tasks, such as data reporting and complex queries. OLAP is characterized by read-heavy queries that process large volumes of data across many rows. + ### Online Transaction Processing (OLTP) Online Transaction Processing (OLTP) refers to database workloads focused on transactional tasks, such as selecting, inserting, updating, and deleting small sets of records. 
@@ -207,22 +207,22 @@ Currently, available steps generated by PD include: [Partitioning](/partitioned-table.md) refers to physically dividing a table into smaller table partitions, which can be done by partition methods such as RANGE, LIST, HASH, and KEY partitioning. +### Pending/Down + +"Pending" and "down" are two special states of a peer. Pending indicates that the Raft log of followers or learners is vastly different from that of leader. Followers in pending cannot be elected as leader. "Down" refers to a state that a peer ceases to respond to leader for a long time, which usually means the corresponding node is down or isolated from the network. + ### Placement Driver (PD) Placement Driver (PD) is a core component in the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) responsible for storing metadata, assigning [Timestamp Oracle (TSO)](/tso.md) for transaction timestamps, orchestrating data placement on TiKV, and running [TiDB Dashboard](/dashboard/dashboard-overview.md). For more information, see [TiDB Scheduling](/tidb-scheduling.md). -### pending/down +### Point Get -"Pending" and "down" are two special states of a peer. Pending indicates that the Raft log of followers or learners is vastly different from that of leader. Followers in pending cannot be elected as leader. "Down" refers to a state that a peer ceases to respond to leader for a long time, which usually means the corresponding node is down or isolated from the network. +Point get means reading a single row of data by a unique index or primary index, the returned resultset is up to one row. ### Point in Time Recovery (PITR) Point in Time Recovery (PITR) enables you to restore data to a specific point in time (for example, just before an unintended `DELETE` statement). For more information, see [TiDB Log Backup and PITR Architecture](/br/br-log-architecture.md). 
-### Point Get - -Point get means reading a single row of data by a unique index or primary index, the returned resultset is up to one row. - ### Predicate columns In most cases, when executing SQL statements, the optimizer only uses statistics of some columns (such as columns in the `WHERE`, `JOIN`, `ORDER BY`, and `GROUP BY` statements). These used columns are called predicate columns. For details, see [Collect statistics on some columns](/statistics.md#collect-statistics-on-some-columns). @@ -243,13 +243,6 @@ Quota Limiter is an experimental feature introduced in TiDB v6.0.0. If the machi Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For details, see [Raft Engine](/tikv-configuration-file.md#raft-engine). -### Retrieval-Augmented Generation (RAG) - -Retrieval-Augmented Generation (RAG) is an architecture designed to optimize the output of Large Language Models (LLMs). For more information, See [Vector Search Overview](/vector-search-overview.md#use-cases). - -### Region/peer/Raft group - -Region is the minimal piece of data storage in TiKV, each representing a range of data (256 MiB by default). Each Region has three replicas by default. A replica of a Region is called a peer. Multiple peers of the same Region replicate data via the Raft consensus algorithm, so peers are also members of a Raft instance. TiKV uses Multi-Raft to manage data. That is, for each Region, there is a corresponding, isolated Raft group. ### Region split @@ -257,9 +250,9 @@ Regions are generated as data writes increase. The process of splitting is calle The mechanism of Region split is to use one initial Region to cover the entire key space, and generate new Regions through splitting existing ones every time the size of the Region or the number of keys has reached a threshold. 
-### Restore +### Region/peer/Raft group -Restore is the reverse of the backup operation. It is the process of bringing back the system to an earlier state by retrieving data from a prepared backup. +Region is the minimal piece of data storage in TiKV, each representing a range of data (256 MiB by default). Each Region has three replicas by default. A replica of a Region is called a peer. Multiple peers of the same Region replicate data via the Raft consensus algorithm, so peers are also members of a Raft instance. TiKV uses Multi-Raft to manage data. That is, for each Region, there is a corresponding, isolated Raft group. ### Remote Procedure Call (RPC) @@ -269,9 +262,13 @@ Remote Procedure Call (RPC) is a communication way between software components. Request Unit (RU) is a unified abstraction unit for the resource usage in TiDB. It is used with [Resource Control](/tidb-resource-control.md) to manage resource usage. +### Restore + +Restore is the reverse of the backup operation. It is the process of bringing back the system to an earlier state by retrieving data from a prepared backup. + ## S -### scheduler +### Scheduler Schedulers are components in PD that generate scheduling tasks. Each scheduler in PD runs independently and serves different purposes. The commonly used schedulers are: @@ -290,6 +287,10 @@ A store refers to the storage node in the TiKV cluster (an instance of `tikv-ser ## T +### Timestamp Oracle (TSO) + +Because TiKV is a distributed storage system, it requires a global timing service, Timestamp Oracle (TSO), to assign a monotonically increasing timestamp. In TiKV, such a feature is provided by PD, and in Google [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf), this feature is provided by multiple atomic clocks and GPS. For details, see [TSO](/tso.md). + ### Top SQL Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiKV node in a specified time range. 
For details, see [Top SQL user document](/dashboard/top-sql.md). @@ -298,10 +299,6 @@ Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiK Transactions Per Second (TPS) is the number of transactions a database processes per second, serving as a key metric for measuring database performance and throughput. -### Timestamp Oracle (TSO) - -Because TiKV is a distributed storage system, it requires a global timing service, Timestamp Oracle (TSO), to assign a monotonically increasing timestamp. In TiKV, such a feature is provided by PD, and in Google [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf), this feature is provided by multiple atomic clocks and GPS. For details, see [TSO](/tso.md). - ## U ### Uniform Resource Identifier (URI) From 633306b1985118d9f24350cf60f79d87a466a832 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 09:49:39 +0100 Subject: [PATCH 16/29] Ignore differences in case --- scripts/check-glossary.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/check-glossary.py b/scripts/check-glossary.py index 82e872e010f0a..0d19c0ee0cda1 100755 --- a/scripts/check-glossary.py +++ b/scripts/check-glossary.py @@ -14,7 +14,7 @@ # Extract the lines that start with ### into itemsB (sorted) itemsB = "" - for line in sorted(fh.readlines()): + for line in sorted(fh.readlines(), key=str.casefold): if line.startswith("###"): itemsB += line From 1a32f1fac8837acdd2157d993f154c95d88a1ece Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 10:56:27 +0100 Subject: [PATCH 17/29] Fix MD012 --- glossary.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/glossary.md b/glossary.md index 779ba32a7c632..33e1927a0108b 100644 --- a/glossary.md +++ b/glossary.md @@ -243,7 +243,6 @@ Quota Limiter is an experimental feature introduced in TiDB v6.0.0. 
If the machi Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For details, see [Raft Engine](/tikv-configuration-file.md#raft-engine). - ### Region split Regions are generated as data writes increase. The process of splitting is called Region split. @@ -307,4 +306,4 @@ Uniform Resource Identifier (URI) is a standardized format for identifying a res ### Universally Unique Identifier (UUID) -Universally Unique Identifier (UUID) is a 128-bit (16-byte) generated ID used to uniquely identify records in a database. For more information, see [UUID](/best-practices/uuid.md). \ No newline at end of file +Universally Unique Identifier (UUID) is a 128-bit (16-byte) generated ID used to uniquely identify records in a database. For more information, see [UUID](/best-practices/uuid.md). From 3d2c76103a54273654293d53a6d5e2e2002977f5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 11:07:19 +0100 Subject: [PATCH 18/29] Format script with black --- scripts/check-glossary.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/scripts/check-glossary.py b/scripts/check-glossary.py index 0d19c0ee0cda1..7a0824769a214 100755 --- a/scripts/check-glossary.py +++ b/scripts/check-glossary.py @@ -25,7 +25,10 @@ print("result: differences found, see diff for details") # diff itemsA and itemsB diff = unified_diff( - itemsA.splitlines(keepends=True), itemsB.splitlines(keepends=True), fromfile="before", tofile="after" + itemsA.splitlines(keepends=True), + itemsB.splitlines(keepends=True), + fromfile="before", + tofile="after", ) sys.stdout.writelines(diff) sys.exit(1) From 18dcc30b038dde6e36f024161470f5a8a934c5d5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 11:27:54 +0100 Subject: [PATCH 19/29] Link between glossaries --- dm/dm-glossary.md | 2 
++ glossary.md | 7 +++++++ tidb-lightning/tidb-lightning-glossary.md | 2 ++ 3 files changed, 11 insertions(+) diff --git a/dm/dm-glossary.md b/dm/dm-glossary.md index a92a834e43a4d..3c1fdafeb5a03 100644 --- a/dm/dm-glossary.md +++ b/dm/dm-glossary.md @@ -8,6 +8,8 @@ aliases: ['/docs/tidb-data-migration/dev/glossary/'] This document lists the terms used in the logs, monitoring, configurations, and documentation of TiDB Data Migration (DM). +For TiDB-related terms and definitions, refer to [TiDB glossary](/glossary.md). + ## B ### Binlog diff --git a/glossary.md b/glossary.md index 33e1927a0108b..4339b3533d8b7 100644 --- a/glossary.md +++ b/glossary.md @@ -6,6 +6,13 @@ aliases: ['/docs/dev/glossary/'] # Glossary +This is the general glossary describing terms related to the TiDB platform. + +Other available glossaries: +- [DM Glossary](/dm/dm-glossary.md) +- [TiCDC Glossary](/ticdc/ticdc-glossary.md) +- [TiDB Lightning Glossary](/tidb-lightning/tidb-lightning-glossary.md) + ## A ### ACID diff --git a/tidb-lightning/tidb-lightning-glossary.md b/tidb-lightning/tidb-lightning-glossary.md index 3645dc3997970..ded5bcd5e276b 100644 --- a/tidb-lightning/tidb-lightning-glossary.md +++ b/tidb-lightning/tidb-lightning-glossary.md @@ -8,6 +8,8 @@ aliases: ['/docs/dev/tidb-lightning/tidb-lightning-glossary/','/docs/dev/referen This page explains the special terms used in TiDB Lightning's logs, monitoring, configurations, and documentation. +For TiDB-related terms and definitions, refer to [TiDB glossary](/glossary.md). 
+ ## A From bda5a82fcc7280910786c139ff65644a044be5ad Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Wed, 6 Nov 2024 12:01:20 +0100 Subject: [PATCH 20/29] Fix MD032 --- glossary.md | 1 + 1 file changed, 1 insertion(+) diff --git a/glossary.md b/glossary.md index 4339b3533d8b7..b2e8e0052b87e 100644 --- a/glossary.md +++ b/glossary.md @@ -9,6 +9,7 @@ aliases: ['/docs/dev/glossary/'] This is the general glossary describing terms related to the TiDB platform. Other available glossaries: + - [DM Glossary](/dm/dm-glossary.md) - [TiCDC Glossary](/ticdc/ticdc-glossary.md) - [TiDB Lightning Glossary](/tidb-lightning/tidb-lightning-glossary.md) From 075afc4e219de236210ed7a76dfa61f4b1272935 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Thu, 7 Nov 2024 14:24:16 +0800 Subject: [PATCH 21/29] minor wording updates --- dm/dm-glossary.md | 2 +- glossary.md | 16 ++++++++-------- ticdc/ticdc-glossary.md | 2 +- tidb-lightning/tidb-lightning-glossary.md | 2 +- 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/dm/dm-glossary.md b/dm/dm-glossary.md index 3c1fdafeb5a03..b301baf00f797 100644 --- a/dm/dm-glossary.md +++ b/dm/dm-glossary.md @@ -8,7 +8,7 @@ aliases: ['/docs/tidb-data-migration/dev/glossary/'] This document lists the terms used in the logs, monitoring, configurations, and documentation of TiDB Data Migration (DM). -For TiDB-related terms and definitions, refer to [TiDB glossary](/glossary.md). +For TiDB-related terms and definitions, see [TiDB glossary](/glossary.md). ## B diff --git a/glossary.md b/glossary.md index b2e8e0052b87e..e43f06ed130a2 100644 --- a/glossary.md +++ b/glossary.md @@ -6,7 +6,7 @@ aliases: ['/docs/dev/glossary/'] # Glossary -This is the general glossary describing terms related to the TiDB platform. +This glossary provides definitions for key terms related to the TiDB platform. 
Other available glossaries: @@ -72,15 +72,15 @@ Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource over ### Data Definition Language (DDL) -Data Definition Language (DDL) is the part of the SQL standard that deals with creating, modifying and deleting tables and other objects. For details, see [DDL Introduction](/ddl-introduction.md). +Data Definition Language (DDL) is a part of the SQL standard that deals with creating, modifying, and dropping tables and other objects. For more information, see [DDL Introduction](/ddl-introduction.md). ### Data Migration (DM) -Data Migration (DM) is a tool for migrating data from MySQL-compatible databases into TiDB. DM reads data from a MySQL-compatible database instance and applies it to a TiDB target instance. For more information, see [DM Overview](/dm/dm-overview.md). +Data Migration (DM) is a tool for migrating data from MySQL-compatible databases into TiDB. DM reads data from a MySQL-compatible database instance and applies it to a TiDB target instance. For more information, see [DM Overview](/dm/dm-overview.md). ### Data Modification Language (DML) -Data Modification Language (DML) is the part of the SQL standard that describes statements which enable you to insert, update, and delete rows in tables. +Data Modification Language (DML) is a part of the SQL standard that deals with inserting, updating, and dropping rows in tables. ### Development Milestone Release (DMR) @@ -88,11 +88,11 @@ Development Milestone Releases (DMR) are TiDB releases that introduce the latest ### Disaster Recovery (DR) -Disaster Recovery (DR) includes solutions that can be used to recover data and services from a disaster in the future. TiDB offers a variety of solutions for delivering Disaster Recovery including backups and replication to standby clusters. For more information, see [Overview of TiDB Disaster Recovery Solutions](/dr-solution-introduction.md). 
+Disaster Recovery (DR) includes solutions that can be used to recover data and services from a disaster in the future. TiDB offers various Disaster Recovery solutions such as backups and replication to standby clusters. For more information, see [Overview of TiDB Disaster Recovery Solutions](/dr-solution-introduction.md). ### Distributed eXecution Framework (DXF) -Distributed eXecution Framework (DXF) is the framework used by TiDB to distribute tasks across the TiDB cluster. DXF is designed to efficiently use the cluster resources to execute tasks (like index creation or data import) while controlling the resource usage and impact on core business transactions. For more information, see [DXF Introduction](/tidb-distributed-execution-framework.md). +Distributed eXecution Framework (DXF) is the framework used by TiDB to distribute tasks across a TiDB cluster. DXF is designed to efficiently use cluster resources to execute tasks (such as index creation or data import) while controlling the resource usage and impact on core business transactions. For more information, see [DXF Introduction](/tidb-distributed-execution-framework.md). ### Dynamic Pruning @@ -112,7 +112,7 @@ Garbage Collection (GC) is a process that clears obsolete data to free up resour ### General Availability (GA) -General Availability (GA) of a feature is when it is, fully tested and is Generally Available for use in production environments. TiDB features may be released as Generally Available in both [DMR](#development-milestone-release-dmr) and [LTS](#long-term-support-lts) releases. However, as TiDB does not provide patch releases based on DMR it is generally recommended to use the LTS product release for production use. +General Availability (GA) of a feature means the feature is fully tested and is Generally Available for use in production environments. TiDB features can be released as GA in both [DMR](#development-milestone-release-dmr) and [LTS](#long-term-support-lts) releases. 
However, as TiDB does not provide patch releases for DMR it is generally recommended to use the LTS release for production use. ### Global Transaction Identifiers (GTIDs) @@ -286,7 +286,7 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i ### Static Sorted Table / Sorted String Table (SST) -Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a component used by the [TiKV Storage Engine](/storage-engine/rocksdb-overview.md). +Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a component used by the [TiKV Storage Engine](/storage-engine/rocksdb-overview.md)). ### Store diff --git a/ticdc/ticdc-glossary.md b/ticdc/ticdc-glossary.md index b4debae89e9b9..b054da51db7db 100644 --- a/ticdc/ticdc-glossary.md +++ b/ticdc/ticdc-glossary.md @@ -7,7 +7,7 @@ summary: Learn the terms about TiCDC and their definitions. This glossary provides TiCDC-related terms and definitions. These terms appears in TiCDC logs, monitoring metrics, configurations, and documents. -For TiDB-related terms and definitions, refer to [TiDB glossary](/glossary.md). +For TiDB-related terms and definitions, see [TiDB glossary](/glossary.md). ## C diff --git a/tidb-lightning/tidb-lightning-glossary.md b/tidb-lightning/tidb-lightning-glossary.md index ded5bcd5e276b..5bef9f13ccfce 100644 --- a/tidb-lightning/tidb-lightning-glossary.md +++ b/tidb-lightning/tidb-lightning-glossary.md @@ -8,7 +8,7 @@ aliases: ['/docs/dev/tidb-lightning/tidb-lightning-glossary/','/docs/dev/referen This page explains the special terms used in TiDB Lightning's logs, monitoring, configurations, and documentation. -For TiDB-related terms and definitions, refer to [TiDB glossary](/glossary.md). +For TiDB-related terms and definitions, see [TiDB glossary](/glossary.md). 
From 091940d9cd280df44537d0b8724a1774866bb733 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Fri, 8 Nov 2024 09:04:51 +0100 Subject: [PATCH 22/29] Update glossary.md Co-authored-by: Grace Cai --- glossary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.md b/glossary.md index e43f06ed130a2..3df0b84592b40 100644 --- a/glossary.md +++ b/glossary.md @@ -10,7 +10,7 @@ This glossary provides definitions for key terms related to the TiDB platform. Other available glossaries: -- [DM Glossary](/dm/dm-glossary.md) +- [TiDB Data Migration Glossary](/dm/dm-glossary.md) - [TiCDC Glossary](/ticdc/ticdc-glossary.md) - [TiDB Lightning Glossary](/tidb-lightning/tidb-lightning-glossary.md) From 64f41243256a6599ae73ddcccc29d760e757d755 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Fri, 8 Nov 2024 09:11:21 +0100 Subject: [PATCH 23/29] Remove AWS related items --- glossary.md | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/glossary.md b/glossary.md index 3df0b84592b40..3efc0dd2f252e 100644 --- a/glossary.md +++ b/glossary.md @@ -98,12 +98,6 @@ Distributed eXecution Framework (DXF) is the framework used by TiDB to distribut Dynamic pruning mode is one of the modes that TiDB accesses partitioned tables. In dynamic pruning mode, each operator supports direct access to multiple partitions. Therefore, TiDB no longer uses Union. Omitting the Union operation can improve the execution efficiency and avoid the problem of Union concurrent execution. -## E - -### EC2 - -[Elastic Compute Cloud (EC2)](https://aws.amazon.com/pm/ec2/) is an AWS service that provides scalable compute resources. It can be used with TiUP to deploy and manage a TiDB cluster. - ## G ### Garbage Collection (GC) @@ -134,10 +128,6 @@ The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. When Index Merge is a method introduced in TiDB v4.0 to access tables. 
Using this method, the TiDB optimizer can use multiple indexes per table and merge the results returned by each index. In some scenarios, this method makes the query more efficient by avoiding full table scans. Since v5.4, Index Merge has become a GA feature.
 
-### Instance Metadata Service (IMDS)
-
-Instance Metadata Service (IMDS) is an AWS service designed to manage and retrieve metadata for [EC2](#ec2) instances. For more information, see [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html).
-
 ## K
 
 ### Key Management Service (KMS)

From 78b9fbaba6fb05a69c45342bc77db31d2358dacd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Mon, 11 Nov 2024 10:21:08 +0100
Subject: [PATCH 24/29] Update glossary.md

Co-authored-by: Grace Cai
---
 glossary.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/glossary.md b/glossary.md
index 3efc0dd2f252e..c8a451608c975 100644
--- a/glossary.md
+++ b/glossary.md
@@ -176,7 +176,7 @@ Online Analytical Processing (OLAP) refers to database workloads focused on anal
 
 Online Transaction Processing (OLTP) refers to database workloads focused on transactional tasks, such as selecting, inserting, updating, and deleting small sets of records.
 
-## Out of Memory (OOM)
+### Out of Memory (OOM)
 
 Out of Memory (OOM) is a situation where a system fails due to insufficient memory. For more information, see [Troubleshoot TiDB OOM Issues](/troubleshoot-tidb-oom.md).
 

From ab2d8e1dd6f9755cd402667a781239d5c66eedd7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Wed, 13 Nov 2024 07:21:16 +0100
Subject: [PATCH 25/29] Update glossary.md

Co-authored-by: Grace Cai
---
 glossary.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/glossary.md b/glossary.md
index c8a451608c975..f8afe6f1f6d23 100644
--- a/glossary.md
+++ b/glossary.md
@@ -276,7 +276,7 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i
 
 ### Static Sorted Table / Sorted String Table (SST)
 
-Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a component used by the [TiKV Storage Engine](/storage-engine/rocksdb-overview.md)).
+Static Sorted Table or Sorted String Table is a file storage format used in RocksDB (a storage engine used by [TiKV](/storage-engine/rocksdb-overview.md)).
 
 ### Store
 

From d2554637fdf54113d2fdbebc87156c3bfc07a71a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Wed, 13 Nov 2024 10:52:48 +0100
Subject: [PATCH 26/29] Apply suggestions from code review

Co-authored-by: Grace Cai
---
 glossary.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/glossary.md b/glossary.md
index f8afe6f1f6d23..52ce29a4882cf 100644
--- a/glossary.md
+++ b/glossary.md
@@ -30,9 +30,11 @@ ACID refers to the four key properties of a transaction: atomicity, consistency,
 
 ## B
 
-### Backup and Restore (BR)
+### Backup & Restore (BR)
 
-BR is the Backup and Restore tool for TiDB. For more information, see [BR Overview](/br/backup-and-restore-overview.md).
+BR is the backup and restore tool for TiDB. For more information, see [BR Overview](/br/backup-and-restore-overview.md).
+
+`br` is the [br command line tool](/br/use-br-command-line-tool.md) used for backups or restores in TiDB.
 
 ### Baseline Capturing
 

From 4b9c2df0f0854b18057175f147e48e1eb6fab8d9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Wed, 13 Nov 2024 12:04:18 +0100
Subject: [PATCH 27/29] Update glossary.md

Co-authored-by: xixirangrang
---
 glossary.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/glossary.md b/glossary.md
index 52ce29a4882cf..780b2e359f977 100644
--- a/glossary.md
+++ b/glossary.md
@@ -42,7 +42,7 @@ Baseline Capturing captures queries that meet capturing conditions and create bi
 
 ### Batch Create Table
 
-Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabled default. When restoring data with a large number of tables (nearly 50000) using BR (Backup & Restore), the feature can greatly speed up the restore process by creating tables in batches. For details, see [Batch Create Table](/br/br-batch-create-table.md).
+Batch Create Table is a feature introduced in TiDB v6.0.0. This feature is enabled by default. When restoring data with a large number of tables (nearly 50000) using BR (Backup & Restore), the feature can greatly speed up the restore process by creating tables in batches. For details, see [Batch Create Table](/br/br-batch-create-table.md).
 
 ### Bucket
 

From 6e9c2953b1141ecfffea54ca44676936167401c6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Thu, 14 Nov 2024 07:41:24 +0100
Subject: [PATCH 28/29] Apply suggestions from code review

Co-authored-by: Grace Cai
Co-authored-by: xixirangrang
---
 glossary.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/glossary.md b/glossary.md
index 780b2e359f977..47560651e3dfe 100644
--- a/glossary.md
+++ b/glossary.md
@@ -94,7 +94,7 @@ Disaster Recovery (DR) includes solutions that can be used to recover data and s
 
 ### Distributed eXecution Framework (DXF)
 
-Distributed eXecution Framework (DXF) is the framework used by TiDB to distribute tasks across a TiDB cluster. DXF is designed to efficiently use cluster resources to execute tasks (such as index creation or data import) while controlling the resource usage and impact on core business transactions. For more information, see [DXF Introduction](/tidb-distributed-execution-framework.md).
+Distributed eXecution Framework (DXF) is the framework used by TiDB to centrally schedule certain tasks (such as creating indexes or importing data) and execute them in a distributed manner. DXF is designed to efficiently use cluster resources while controlling resource usage and reducing the impact on core business transactions. For more information, see [DXF Introduction](/tidb-distributed-execution-framework.md).
 
 ### Dynamic Pruning
 
@@ -104,7 +104,7 @@ Dynamic pruning mode is one of the modes that TiDB accesses partitioned tables.
 
 ### Garbage Collection (GC)
 
-Garbage Collection (GC) is a process that clears obsolete data to free up resources. For information on TiKV GC process, see [Garbage Collection overview](/garbage-collection-overview.md).
+Garbage Collection (GC) is a process that clears obsolete data to free up resources. For information on TiKV GC process, see [GC Overview](/garbage-collection-overview.md).
 
 ### General Availability (GA)
 
@@ -243,13 +243,13 @@ Quota Limiter is an experimental feature introduced in TiDB v6.0.0. If the machi
 
 Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For details, see [Raft Engine](/tikv-configuration-file.md#raft-engine).
 
-### Region split
+### Region Split
 
-Regions are generated as data writes increase. The process of splitting is called Region split.
+A region in a TiKV cluster is not divided at the beginning, but is gradually split as data is written to it. The process is called Region split.
 
 The mechanism of Region split is to use one initial Region to cover the entire key space, and generate new Regions through splitting existing ones every time the size of the Region or the number of keys has reached a threshold.
 
-### Region/peer/Raft group
+### Region/Peer/Raft Group
 
 Region is the minimal piece of data storage in TiKV, each representing a range of data (256 MiB by default). Each Region has three replicas by default. A replica of a Region is called a peer. Multiple peers of the same Region replicate data via the Raft consensus algorithm, so peers are also members of a Raft instance. TiKV uses Multi-Raft to manage data. That is, for each Region, there is a corresponding, isolated Raft group.
 

From fd4d6ca563b7992ad6004293a7c733b0ab9d2c58 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Thu, 14 Nov 2024 17:49:00 +0800
Subject: [PATCH 29/29] minor punctuation changes

---
 glossary.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/glossary.md b/glossary.md
index 47560651e3dfe..51e612e6fb759 100644
--- a/glossary.md
+++ b/glossary.md
@@ -245,7 +245,7 @@ Raft Engine is an embedded persistent storage engine with a log-structured desig
 
 ### Region Split
 
-A region in a TiKV cluster is not divided at the beginning, but is gradually split as data is written to it. The process is called Region split.
+A region in a TiKV cluster is not divided at the beginning but is gradually split as data is written to it. The process is called Region split.
 
 The mechanism of Region split is to use one initial Region to cover the entire key space, and generate new Regions through splitting existing ones every time the size of the Region or the number of keys has reached a threshold.