glossary: Add more abbreviations #19213

Merged
merged 30 commits into from
Nov 15, 2024
Commits
956145c
glossary: Add more abbreviations
dveeden Oct 25, 2024
f3186ce
Apply suggestions from code review
dveeden Oct 29, 2024
e4c82a5
Apply suggestions from code review
dveeden Oct 29, 2024
7379630
Update glossary.md
dveeden Oct 29, 2024
9fbf684
Apply suggestions from code review
dveeden Oct 29, 2024
bd23974
refine wording
qiancai Nov 1, 2024
fc1402c
GCP KMS -> Google Cloud KMS
qiancai Nov 1, 2024
c3c192a
Update based on review
dveeden Nov 6, 2024
d0ef6ee
Updated RocksDB link
dveeden Nov 6, 2024
539af48
Consistency updates
dveeden Nov 6, 2024
b07f67f
Update CTE
dveeden Nov 6, 2024
ae0e2bf
Fix issues found by linters
dveeden Nov 6, 2024
7af7c76
Merge remote-tracking branch 'upstream/master' into glossary_oct24_1
dveeden Nov 6, 2024
d5daf41
Fix link
dveeden Nov 6, 2024
41897b1
Add script to check glossary
dveeden Nov 6, 2024
288dee7
Sort entries
dveeden Nov 6, 2024
633306b
Ignore differences in case
dveeden Nov 6, 2024
1a32f1f
Fix MD012
dveeden Nov 6, 2024
3d2c761
Format script with black
dveeden Nov 6, 2024
18dcc30
Link between glossaries
dveeden Nov 6, 2024
bda5a82
Fix MD032
dveeden Nov 6, 2024
075afc4
minor wording updates
qiancai Nov 7, 2024
091940d
Update glossary.md
dveeden Nov 8, 2024
64f4124
Remove AWS related items
dveeden Nov 8, 2024
78b9fba
Update glossary.md
dveeden Nov 11, 2024
ab2d8e1
Update glossary.md
dveeden Nov 13, 2024
d255463
Apply suggestions from code review
dveeden Nov 13, 2024
4b9c2df
Update glossary.md
dveeden Nov 13, 2024
6e9c295
Apply suggestions from code review
dveeden Nov 14, 2024
fd4d6ca
minor punctuation changes
qiancai Nov 14, 2024
140 changes: 139 additions & 1 deletion glossary.md
@@ -30,6 +30,10 @@

Baseline Capturing captures queries that meet capturing conditions and creates bindings for them. It is used for [preventing regression of execution plans during an upgrade](/sql-plan-management.md#prevent-regression-of-execution-plans-during-an-upgrade).

### BR

BR is the Backup and Restore tool for TiDB. See [BR Overview](/br/backup-and-restore-overview.md) for more information.

### Bucket

A [Region](#regionpeerraft-group) is logically divided into several small ranges called buckets. TiKV collects query statistics by bucket and reports the bucket status to PD. For details, see the [Bucket design doc](https://github.com/tikv/rfcs/blob/master/text/0082-dynamic-size-region.md#bucket).
@@ -40,6 +44,10 @@

With the cached table feature, TiDB loads the data of an entire table into the memory of the TiDB server, and TiDB directly gets the table data from the memory without accessing TiKV, which improves the read performance.

### CF

Column Family (CF) is a RocksDB concept for logically partitioning data within one database. TiKV uses column families because it stores data in RocksDB.

### Coalesce Partition

Coalesce Partition is a way of decreasing the number of partitions in a Hash or Key partitioned table. For more information, see [Manage Hash and Key partitions](/partitioned-table.md#manage-hash-and-key-partitions).
@@ -48,14 +56,72 @@

Introduced in TiDB 5.3.0, Continuous Profiling is a way to observe resource overhead at the system call level. With the support of Continuous Profiling, TiDB provides performance insight as clear as directly looking into the database source code, and helps R&D and operation and maintenance personnel to locate the root cause of performance problems using a flame graph. For details, see [TiDB Dashboard Instance Profiling - Continuous Profiling](/dashboard/continuous-profiling.md).

### CTE

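A Common Table Expression (CTE) is a named temporary result set that is defined with the [`WITH`](/sql-statements/sql-statement-with.md) clause and can be referenced within the statement that follows it. CTEs are part of the SQL standard.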
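
As a rough illustration of the `WITH` clause, the following sketch runs a CTE through Python's built-in `sqlite3` module. SQLite is only a convenient stand-in here, and the `orders` table, its columns, and the threshold of 50 are made up for the example.

```python
import sqlite3

# SQLite is used only as a stand-in engine; the table and column names
# are hypothetical and exist only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 30), ("bob", 5), ("alice", 20), ("carol", 70)],
)

# The WITH clause names an intermediate result (totals) that the main
# query can then read from as if it were a table.
query = """
WITH totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total FROM totals WHERE total >= 50 ORDER BY customer
"""
big_spenders = conn.execute(query).fetchall()
print(big_spenders)
```

The CTE keeps the aggregation separate from the filter, which tends to read more clearly than a nested subquery.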

## D

### DDL

Data Definition Language (DDL) is the part of the SQL standard that deals with creating, modifying and deleting tables, indexes, columns and other objects.

### DM

Data Migration (DM) is a tool for migrating data from MySQL to TiDB. It reads data from a source MySQL instance and applies it to a target TiDB cluster. See [DM Overview](/dm/dm-overview.md) for more information.

### DML

Data Manipulation Language (DML) is the part of the SQL standard that deals with inserting, updating, and deleting rows in tables.

### DMR

Development Milestone Release (DMR) is a version of TiDB that provides users with the latest features but doesn't provide long-term support. See [TiDB Versioning](/releases/versioning.md) for more information.

### DR

Disaster Recovery (DR) refers to solutions for restoring data and services after a disaster. Examples include backups and standby clusters.

### DXF

Distributed eXecution Framework (DXF) is the framework used by TiDB to speed up index creation and data import by distributing tasks across all available resources. See [DXF Introduction](/tidb-distributed-execution-framework.md) for more details.

### Dynamic Pruning

Dynamic pruning mode is one of the modes in which TiDB accesses partitioned tables. In dynamic pruning mode, each operator supports direct access to multiple partitions. Therefore, TiDB no longer uses Union. Omitting the Union operation can improve the execution efficiency and avoid the problem of Union concurrent execution.

## E

### EC2
Collaborator commented:

Do we want to define EC2 in our glossary? Practically speaking, will people be looking to us to define EC2 for them in this doc? I worry that this would expand to defining a bunch of other third party terms if we go down this path.

Contributor Author replied:

"EC2" is used three times in the 8.4.0 release notes and at least 76 times elsewhere in our docs.

$ git grep EC2 | wc -l
76

Elastic Compute Cloud (EC2) is an AWS service that provides compute resources. This can be used with TiUP to run a TiDB Cluster.

## G

### GA

General Availability (GA) refers to the first non-beta version of a software product.

### GC

Garbage Collection (GC) is the process of cleaning up resources that are no longer used. See [GC](/garbage-collection-overview.md) for the GC process of TiKV.

### GTID

Global Transaction Identifiers (GTIDs) are used in the binary logs of recent MySQL versions to track which transactions have been replicated and which have not. DM can use this information.
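
To make the idea of tracking replicated transactions concrete, here is a minimal sketch that parses a GTID set written in MySQL's `source_uuid:start-end` text form and checks whether one transaction is covered by it. The helper names `parse_gtid_set` and `contains` are hypothetical, the UUID is an arbitrary example value, and tagged GTIDs from newer MySQL releases are not handled.

```python
def parse_gtid_set(text: str) -> dict:
    """Parse 'uuid:1-5:8,uuid2:3' into {uuid: [(1, 5), (8, 8)], ...}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        # The first field is the source server UUID; the rest are intervals.
        source_id, *intervals = part.split(":")
        ranges = result.setdefault(source_id.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single number such as '8' means the interval 8-8.
            ranges.append((int(start), int(end or start)))
    return result


def contains(gtid_set: dict, gtid: str) -> bool:
    """Return True if the single GTID 'uuid:n' falls inside the set."""
    source_id, txn = gtid.split(":")
    number = int(txn)
    return any(lo <= number <= hi for lo, hi in gtid_set.get(source_id.lower(), []))


SOURCE = "3e11fa47-71ca-11e1-9e33-c80aa9429562"  # arbitrary example UUID
executed = parse_gtid_set(f"{SOURCE}:1-5:8")
print(contains(executed, f"{SOURCE}:4"))
print(contains(executed, f"{SOURCE}:6"))
```

A replication tool can use a check like this to decide which transactions still need to be applied on the target.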

## H

### HTAP

Hybrid Transactional and Analytical Processing (HTAP) is a database feature that allows both OLTP and OLAP workloads on the same database. In TiDB, HTAP is provided by using TiKV for row storage and TiFlash for columnar storage. See [the definition of HTAP on the Gartner website](https://www.gartner.com/en/information-technology/glossary/htap-enabling-memory-computing-technologies) for more information.

## I

### IMDS
Collaborator commented:

Not sure why we want to include this third party abbreviation here

Contributor Author replied:

This is used in br/backup-and-restore-storages.md and in the TiDB v8.4.0 release notes.

Instance Metadata Service (IMDS) is an AWS service that provides metadata about an EC2 instance to software running on that instance. See [Instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) for more information.

### Index Merge

Index Merge is a method introduced in TiDB v4.0 to access tables. Using this method, the TiDB optimizer can use multiple indexes per table and merge the results returned by each index. In some scenarios, this method makes the query more efficient by avoiding full table scans. Since v5.4, Index Merge has become a GA feature.
@@ -64,8 +130,26 @@

The in-memory pessimistic lock is a new feature introduced in TiDB v6.0.0. When this feature is enabled, pessimistic locks are usually stored in the memory of the Region leader only, and are not persisted to disk or replicated through Raft to other replicas. This feature can greatly reduce the overhead of acquiring pessimistic locks and improve the throughput of pessimistic transactions.

## K

### KMS

Key Management Service (KMS) allows the storage and retrieval of secret keys in a secure way. Examples of this are the AWS KMS, GCP KMS and HashiCorp Vault. Various TiDB components can use this to manage the keys that are used for storage encryption and related services.

Check failure on line 137 in glossary.md (GitHub Actions / vale, reported by reviewdog): [Vale.Avoid] Avoid using 'GCP'. Location: glossary.md, line 137, column 129. Severity: ERROR.

### KV

Key-Value (KV) is a way of storing information that allows easy retrieval by specifying the key. Multiple values can be stored under a single key by encoding them. TiKV implements this model.
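
The idea of encoding several values under one key can be sketched as follows. A plain `dict` stands in for the key-value store, and JSON is used only because it is easy to read; TiKV and TiDB use their own encodings, so the key layout `user:42` and the helpers `put_row` and `get_row` are hypothetical.

```python
import json

# A plain dict stands in for a key-value store; keys and values are bytes.
store = {}


def put_row(key: bytes, columns: dict) -> None:
    # Encode several column values into one byte string stored under one key.
    store[key] = json.dumps(columns, sort_keys=True).encode("utf-8")


def get_row(key: bytes) -> dict:
    # Look up by key, then decode the value back into its columns.
    return json.loads(store[key].decode("utf-8"))


put_row(b"user:42", {"name": "Ada", "age": 36})
row = get_row(b"user:42")
print(row)
```

The store itself never needs to understand the columns; only the layer that encodes and decodes the value does.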

## L

### LDAP

Lightweight Directory Access Protocol (LDAP) is a standardized way of accessing a directory of information, which is often used to store account information. TiDB supports it through [LDAP authentication plugins](/security-compatibility-with-mysql.md#authentication-plugin-status).

### LTS

Long-Term Support (LTS) versions are software versions that are considered stable and are supported for an extended period. See [TiDB Versioning](/releases/versioning.md) for more details.

### leader/follower/learner

Leader/Follower/Learner each corresponds to a role in a Raft group of [peers](#regionpeerraft-group). The leader services all client requests and replicates data to the followers. If the group leader fails, one of the followers will be elected as the new leader. Learners are non-voting followers that only serve in the process of replica addition.
@@ -82,10 +166,22 @@

## O

### OLAP

Online Analytical Processing (OLAP) describes database workloads that are mostly analytical, such as reporting. These workloads are characterized by read-heavy queries that process many rows.

Check warning on line 171 in glossary.md (GitHub Actions / vale, reported by reviewdog): [PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion. Location: glossary.md, line 171, column 193. Severity: INFO.

### Old value

The "original value" in the incremental change log output by TiCDC. You can specify whether the incremental change log output by TiCDC contains the "original value".

### OLTP

Online Transaction Processing (OLTP) describes database workloads that are mostly transactional, such as inserting, updating, and deleting small sets of records.

### OOM

Out of Memory (OOM) is a situation where a system fails due to a lack of available memory. See [Troubleshoot TiDB OOM Issues](/troubleshoot-tidb-oom.md) for more details.

### Operator

An operator is a collection of actions that applies to a Region for scheduling purposes. Operators perform scheduling tasks such as "migrate the leader of Region 2 to Store 5" and "migrate replicas of Region 2 to Store 1, 4, 5".
@@ -111,10 +207,18 @@

[Partitioning](/partitioned-table.md) refers to physically dividing a table into smaller table partitions, which can be done by partition methods such as RANGE, LIST, HASH, and KEY partitioning.

### PD

Placement Driver (PD) is a core component of the [TiDB Architecture](/tidb-architecture.md#placement-driver-pd-server) that is responsible for storing metadata and running the [TSO](/tso.md) service, which hands out the timestamps used for transactions. It also orchestrates data placement on TiKV and runs the [TiDB Dashboard](/dashboard/dashboard-overview.md).

### pending/down

"Pending" and "down" are two special states of a peer. "Pending" indicates that the Raft log of a follower or learner is vastly different from that of the leader. Followers in the pending state cannot be elected as leader. "Down" refers to a state in which a peer has not responded to the leader for a long time, which usually means the corresponding node is down or isolated from the network.

### PITR

Point in Time Recovery (PITR) is a database feature that allows the user to restore to a specific point in time (for example, just before an accidental `DELETE` statement). See [TiDB Log Backup and PITR Architecture](/br/br-log-architecture.md) for more details.
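
Conceptually, point-in-time recovery combines a full snapshot with a log of later changes and replays the log only up to the chosen moment. The sketch below models that idea with in-memory structures; the function `restore_to`, the tuple layout `(timestamp, key, value)`, and the use of `None` for a delete are all assumptions made for illustration and do not describe how BR implements the feature.

```python
def restore_to(snapshot: dict, change_log: list, target_ts: int) -> dict:
    """Rebuild state as of target_ts from a snapshot plus a change log.

    Each log entry is (timestamp, key, value); value None means delete.
    """
    state = dict(snapshot)  # start from the full backup
    for ts, key, value in sorted(change_log, key=lambda entry: entry[0]):
        if ts > target_ts:
            break  # stop replaying once the target time is passed
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


snapshot = {"a": 1}
change_log = [(10, "b", 2), (20, "a", None), (30, "c", 3)]

before_delete = restore_to(snapshot, change_log, target_ts=15)
after_delete = restore_to(snapshot, change_log, target_ts=25)
print(before_delete, after_delete)
```

Choosing `target_ts=15` recovers the data as it was just before the delete at timestamp 20, which mirrors the accidental `DELETE` scenario in the definition.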

### Point Get

Point Get means reading a single row of data through a unique index or primary key. The returned result set contains at most one row.
@@ -125,6 +229,10 @@

## Q

### QPS

Queries Per Second (QPS) is a performance metric of a database service.

### Quota Limiter

Quota Limiter is an experimental feature introduced in TiDB v6.0.0. If the machine on which TiKV is deployed has limited resources, for example, with only 4v CPU and 16 G memory, and the foreground of TiKV processes too many read and write requests, the CPU resources used by the background are occupied to help process such requests, which affects the performance stability of TiKV. To avoid this situation, the [quota-related configuration items](/tikv-configuration-file.md#quota) can be set to limit the CPU resources to be used by the foreground.
@@ -135,6 +243,10 @@

Raft Engine is an embedded persistent storage engine with a log-structured design. It is built for TiKV to store multi-Raft logs. Since v5.4, TiDB supports using Raft Engine as the log storage engine. For details, see [Raft Engine](/tikv-configuration-file.md#raft-engine).

### RAG

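Retrieval-Augmented Generation (RAG) is a technique that supplements the prompt of a large language model with relevant information retrieved from an external data source. See [Vector Search Overview](/vector-search-overview.md#use-cases) for more details.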

### Region/peer/Raft group

Region is the minimal piece of data storage in TiKV, each representing a range of data (256 MiB by default). Each Region has three replicas by default. A replica of a Region is called a peer. Multiple peers of the same Region replicate data via the Raft consensus algorithm, so peers are also members of a Raft instance. TiKV uses Multi-Raft to manage data. That is, for each Region, there is a corresponding, isolated Raft group.
@@ -145,10 +257,18 @@

The mechanism of Region split is to use one initial Region to cover the entire key space, and generate new Regions through splitting existing ones every time the size of the Region or the number of keys has reached a threshold.
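
The splitting mechanism described above can be sketched with a toy model in which each Region is a sorted list of keys. The threshold `MAX_KEYS`, the midpoint split, and the `put` helper are assumptions for illustration only; TiKV's real thresholds cover both size and key count, and its split logic is more involved.

```python
import bisect

MAX_KEYS = 4  # hypothetical split threshold, kept tiny so splits happen quickly

# Each Region is modeled as a sorted list of keys. One initial Region
# covers the entire key space.
regions = [[]]


def put(key: str) -> None:
    # Route the key to the Region whose range contains it: count how many
    # Regions (after the first) start at or before the key.
    starts = [region[0] for region in regions[1:]]
    idx = bisect.bisect_right(starts, key)
    region = regions[idx]
    bisect.insort(region, key)
    # Split the Region in the middle once it grows past the threshold.
    if len(region) > MAX_KEYS:
        mid = len(region) // 2
        regions[idx:idx + 1] = [region[:mid], region[mid:]]


for k in "hbkdamfcjeilg":
    put(k)

print(regions)
```

Because every split keeps the two halves adjacent and ordered, the Regions together still cover the key space without gaps or overlap.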

-### restore
+### Restore

Restore is the reverse of the backup operation. It is the process of bringing the system back to an earlier state by retrieving data from a prepared backup.

### RPC

Remote Procedure Call (RPC) is a way for software components to communicate. In a TiDB cluster, gRPC is used for communication between TiDB and TiKV.

### RU

Request Unit (RU) is the unit that TiDB uses to measure resource usage. It is used with [Resource Control](/tidb-resource-control.md) to manage resource usage.

## S

### scheduler
@@ -160,6 +280,10 @@
- `hot-region-scheduler`: Balances the distribution of hot Regions
- `evict-leader-{store-id}`: Evicts all leaders of a node (often used for rolling upgrades)

### SST

Static Sorted Table (SST) is the file storage format used by RocksDB.

### Store

A store refers to the storage node in the TiKV cluster (an instance of `tikv-server`). Each store has a corresponding TiKV instance.
Collaborator commented:

As we are adding the PD component to this glossary, should we also add the TiDB Server, TiFlash Server, and TiKV Server components for completeness?

Suggested additions:

TiDB Server

The TiDB server is a stateless SQL layer that exposes the connection endpoint of the MySQL protocol to the outside. The TiDB server receives SQL requests, performs SQL parsing and optimization, and ultimately generates a distributed execution plan.

TiFlash Server

The TiFlash server is a special type of storage server. Unlike ordinary TiKV nodes, TiFlash stores data by column, mainly designed to accelerate analytical processing.

TiKV Server

The TiKV server is responsible for storing data. TiKV is a distributed transactional key-value storage engine.

@@ -170,6 +294,20 @@

Top SQL helps locate SQL queries that contribute to a high load of a TiDB or TiKV node in a specified time range. For details, see [Top SQL user document](/dashboard/top-sql.md).

### TPS

Transactions Per Second (TPS) is a performance metric of a database.

### TSO

Because TiKV is a distributed storage system, it requires a global timing service, Timestamp Oracle (TSO), to assign a monotonically increasing timestamp. In TiKV, such a feature is provided by PD, and in Google [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf), this feature is provided by multiple atomic clocks and GPS. For details, see [TSO](/tso.md).
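
A single-process sketch of a timestamp oracle is shown below. It assumes a timestamp layout of a physical millisecond clock combined with a logical counter, and the 18-bit counter width and the `TimestampOracle` class are assumptions of this example. It ignores everything that makes the real service hard, such as persistence, leader election, and network calls.

```python
import threading
import time

LOGICAL_BITS = 18  # assumed width of the logical counter in this sketch


class TimestampOracle:
    """Hands out strictly increasing integer timestamps."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock  # returns physical time in milliseconds
        self._lock = threading.Lock()
        self._physical = 0
        self._logical = 0

    def next_ts(self) -> int:
        with self._lock:
            now = self._clock()
            if now > self._physical:
                # Physical time moved forward: restart the logical counter.
                self._physical, self._logical = now, 0
            else:
                # Same millisecond (or clock went backwards): bump the counter.
                self._logical += 1
                if self._logical >= (1 << LOGICAL_BITS):
                    # Counter exhausted: borrow the next physical tick.
                    self._physical += 1
                    self._logical = 0
            return (self._physical << LOGICAL_BITS) | self._logical


oracle = TimestampOracle()
first, second = oracle.next_ts(), oracle.next_ts()
print(first < second)
```

The logical counter is what keeps timestamps increasing even when many requests arrive within the same millisecond.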

## U

### URI

Uniform Resource Identifier (URI) is a uniform way of describing a resource. See [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) on Wikipedia for more information.
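
The parts of a URI can be inspected with Python's standard `urllib.parse` module, as in this short sketch. The URI itself is a made-up example under `example.com`.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URI, made up for illustration.
uri = "https://docs.example.com/glossary?lang=en#uri"

parts = urlsplit(uri)
print(parts.scheme)           # the access scheme
print(parts.netloc)           # the authority (host)
print(parts.path)             # the path to the resource
print(parse_qs(parts.query))  # query parameters as a dict of lists
print(parts.fragment)         # the fragment within the resource
```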

### UUID

Universally Unique Identifier (UUID) is a 128-bit (16-byte) generated ID that can be used to identify records in a database. See [UUID](/best-practices/uuid.md) for more information on how UUIDs are used in TiDB.
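
The size of a UUID is easy to check with Python's standard `uuid` module. This sketch generates a random (version 4) UUID and compares its 16-byte binary form with its 36-character text form; the variable names are only illustrative.

```python
import uuid

u = uuid.uuid4()           # random (version 4) UUID
text_form = str(u)         # canonical text form with hyphens
binary_form = u.bytes      # raw 128-bit value

print(text_form)
print(len(text_form))      # number of characters in the text form
print(len(binary_form))    # number of bytes in the binary form

# The binary form round-trips back to the same UUID.
restored = uuid.UUID(bytes=binary_form)
```

The binary form is less than half the length of the text form, which matters when UUIDs are used as keys.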