WIP

mathesar-foundation · Nov 27, 2024 · 767b891
1 parent 4ce1086
commit 767b891
Showing 7 changed files with 197 additions and 118 deletions.
diff --git a/docs/docs/user-guide/databases.md b/docs/docs/user-guide/databases.md
@@ -8,7 +8,7 @@ If you're using Mathesar as a spreadsheet alternative, you might be curious what
 
 Within a database, you can have multiple [tables](./tables.md) &mdash; much like you might have multiple _sheets_ within a spreadsheet. And within each table in your database, you'll have rows and columns, similar to a spreadsheet. But while a spreadsheet gives you a blank canvas to freely enter any data into any cell you choose, a database is more structured. Rows and columns must be explicitly added before you can enter data, and each column must have a name and a [data type](./data-types.md). In a database, rows are sometimes called "records".
 
-The biggest superpower of databases (specifically _relational_ databases like PostgreSQL) is the ability for cells to reference records from another table. In PostgreSQL, this concept is called a "foreign key constraint". Mathesar exposes this concept as ["references"](./references.md). If you've used `VLOOKUP` in a spreadsheet, you'll love using references in Mathesar!
+The biggest superpower of databases (specifically _relational_ databases like PostgreSQL) is the ability for cells to reference records from another table. In PostgreSQL, this concept is called a "foreign key constraint", and Mathesar leverages it so you can model your data with [relationships](./relationships.md). If you've ever used `VLOOKUP` in a spreadsheet, you'll love using relationships in Mathesar!
 
 ## Connecting a database {:#connection}
 
@@ -89,7 +89,7 @@ Separate from your connected PostgreSQL database, Mathesar also maintains an int
         </li>
         <li>The rows, columns, and cells within those tables</li>
         <li>
-          <a href="/user-guide/references/">References</a>
+          <a href="/user-guide/relationships/">Relationships</a>
           between those tables
         </li>
         <li>

diff --git a/docs/docs/user-guide/index.md b/docs/docs/user-guide/index.md
@@ -2,7 +2,7 @@
 
 ## How Mathesar works
 
-Mathesar is a web application that gives you a spreadsheet-like interface to one or more PostgreSQL [databases](./databases.md). It lets technical and non-technical users collaborate directly with the same relational data, providing user-friendly access to your database's [schemas](./schemas.md), [tables](./tables.md), [foreign key references](./references.md), and so on &mdash; all comfortably within the limits of the PostgreSQL [privileges](./access-control.md) for the PostgreSQL role that you give to Mathesar. Your data appears in Mathesar exactly as it is structured in PostgreSQL, with some additional convenience features ease the process of working with related data while keeping it normalized.
+Mathesar is a web application that gives you a spreadsheet-like interface to one or more PostgreSQL [databases](./databases.md). It lets technical and non-technical users collaborate directly with the same relational data, providing user-friendly access to your database's [schemas](./schemas.md), [tables](./tables.md), [relationships](./relationships.md), and so on &mdash; all comfortably within the limits of the PostgreSQL [privileges](./access-control.md) for the PostgreSQL role that you give to Mathesar. Your data appears in Mathesar exactly as it is structured in PostgreSQL, with some additional convenience features that ease the process of working with related data while keeping it normalized.
 
 You can use Mathesar with PostgreSQL data sets you already have. Point it at your database, and you'll have a powerful GUI admin tool to help with data entry, analytics, and internal back-office processes.
 

diff --git a/docs/docs/user-guide/metadata.md b/docs/docs/user-guide/metadata.md
@@ -12,7 +12,7 @@ For each table, the following optional configurations are stored as metadata:
 
 - **Record summary template**
 
-    The template used to generate [record summaries](./references.md#record-summaries). This allows links to records in the table to be summarized into short human-readable pieces of text.
+    The template used to generate [record summaries](./relationships.md#record-summaries). This allows links to records in the table to be summarized into short human-readable pieces of text.
 
     Without any metadata, the record summary will be generated using the first text-like column of the table if possible.
 

diff --git a/docs/docs/user-guide/references.md b/docs/docs/user-guide/references.md
diff --git a/docs/docs/user-guide/relationships.md b/docs/docs/user-guide/relationships.md
@@ -0,0 +1,188 @@
+# Relationships
+
+Relationships allow a single cell in one table to reference a row in another table. When one table references another in this manner, the two tables are said to be "related". This is a core feature of relational databases, and it allows us to model complex data structures using multiple tables.
+
+## Example
+
+Let's say we are maintaining an address book of people and their contact info...
+
+???+ failure "Without Relationships"
+
+    Associating multiple email addresses with one person is tricky! We might try the following approaches:
+
+    - Option 1: Email addresses combined into a single column:
+
+        | name | emails |
+        | - | - |
+        | Alice Roberts | [email protected], [email protected] |
+        | Bob Davis | [email protected] |
+
+    - Option 2: Email addresses spread across multiple columns
+
+        | name | email_1 | email_2 |
+        | - | - | - |
+        | Alice Roberts | [email protected] | [email protected] |
+        | Bob Davis | [email protected] | |
+
+    - Option 3: People spread across multiple rows
+
+        | name | email |
+        | - | - |
+        | Alice Roberts | [email protected] |
+        | Alice Roberts | [email protected] |
+        | Bob Davis | [email protected] |
+
+    None of these options are ideal. They make it difficult to query the data, and they make it easy to introduce errors.
+
+???+ success "With Relationships"
+
+    We can create _two_ tables:
+
+    The `people` table:
+
+    | id | name |
+    | - | - |
+    | 1 | Alice Roberts |
+    | 2 | Bob Davis |
+
+    The `emails` table:
+
+    | id | email | person |
+    | - | - | - |
+    | 1 | [email protected] | 1 |
+    | 2 | [email protected] | 1 |
+    | 3 | [email protected] | 2 |
+
+    And we configure the `person` column to **reference** the `id` column in the `people` table, ensuring that all the references are valid. The database handles this validation for us, and even prevents us from deleting a person without deleting their associated email addresses too.
+
+## Normalization
+
+This practice of modeling data through multiple related tables is called **data normalization**, and it's why a database will typically have its data spread across many tables, each with its own unique column structure, and very few tables providing much use or value in isolation. Normalized data structures are more efficient to query and update, and they help to ensure data integrity by reducing redundancy and minimizing the risk of inconsistencies. But they can also be more cumbersome to work with manually due to the indirection inherent in having data spread across multiple tables. Mathesar helps you manage this complexity by providing a user-friendly interface to work with normalized data.
+
+## Foreign key constraints in PostgreSQL
+
+In PostgreSQL, references are called "[foreign key constraints](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-FK)", or simply "foreign keys". These constraints are set on the table to ensure that the data in the referencing column always points to a valid row in the referenced table.
+
+## Reference columns in Mathesar
+
+Mathesar identifies reference columns in your database by looking for foreign key constraints set in PostgreSQL. And when you create a reference column in Mathesar, it will automatically create the necessary foreign key constraint in PostgreSQL.
+
+As noted below, reference columns get some extra features too!
+
+### Record summaries {:#record-summaries}
+
+Without Mathesar, reference cells are typically rather opaque. Often they contain only an id number, which is not very helpful when you're trying to understand the data.
+
+Mathesar helps solve this problem by providing a feature called "record summaries" which allows you to see a short text summary of the referenced record directly in the referencing cell. By default, the record summary will be the value of the first text-like column in the referenced table. You can customize the record summary to show any columns and text you choose.
+
+To customize a record summary, you can either:
+
+- Start from the referenced table, and:
+
+    1. Go to the table page of the referenced table.
+    1. In the table inspector on the right, click on the "Table" tab.
+    1. Find the "Record Summary" section below.
+
+    *or*
+
+- Start from a reference column, and:
+
+    1. Go to the table page containing the reference column.
+    1. Select the reference column or a cell within it.
+    1. In the table inspector on the right, click on the "Column" tab.
+    1. Find the "Linked Record Summary" section below.
+
+### Record selector
+
+Reference columns also provide a "record selector" tool which helps you search through referenced records when modifying reference values. It allows you to search on all columns from the referenced table and will use fuzzy logic to find the most relevant records. You can even create new records directly from the record selector.
+
+### Limitations of Mathesar's reference columns
+
+- Mathesar does not support "composite" foreign keys &mdash; foreign keys that reference _multiple_ columns in the referenced table at once.
+
+- Some PostgreSQL databases might contain normalized data which is implicitly structured to utilize the concept of references but which lacks the foreign key constraints necessary to ensure data integrity. Mathesar will not treat such columns as references. It only recognizes foreign key columns as references.
+
+## Relationship types and patterns
+
+### One-to-many relationships
+
+To illustrate a one-to-many relationship we'll re-use our example above.
+
+- We'll have a `people` table as follows:
+
+    | id | name |
+    | - | - |
+    | 1 | Alice Roberts |
+    | 2 | Bob Davis |
+
+- And an `emails` table as follows:
+
+    | id | email | person |
+    | - | - | - |
+    | 1 | [email protected] | `Alice Roberts` |
+    | 2 | [email protected] | `Alice Roberts` |
+    | 3 | [email protected] | `Bob Davis` |
+
+    !!! note
+        Here the reference column, `person`, displays with formatting to mimic Mathesar's record summaries feature.
+
+Now **one** person can have **many** email addresses, hence the name "one-to-many".
+
+### Many-to-one relationships
+
+A many-to-one relationship is structurally equivalent to a one-to-many relationship, but with the perspective reversed. The two terms are often used interchangeably.
+
+### Many-to-many relationships
+
+Continuing our address book example, let's pretend we'd like to apply tags to our contacts. For example, we'd like to:
+
+- Tag Alice Roberts as "colleague"
+- Tag Bob Davis as "friend" and "colleague"
+
+We can use three tables to model this relationship:
+
+- A `people` table (as before):
+
+    | id | name |
+    | - | - |
+    | 1 | Alice Roberts |
+    | 2 | Bob Davis |
+
+- A new `tags` table:
+
+    | id | tag |
+    | - | - |
+    | 1 | colleague |
+    | 2 | friend |
+
+- And a new `people_tags` table (sometimes referred to as a "join table" or "mapping table"):
+
+    | id | person | tag |
+    | - | - | - |
+    | 1 | `Alice Roberts` | `colleague` |
+    | 2 | `Bob Davis` | `friend` |
+    | 3 | `Bob Davis` | `colleague` |
+
+Now people can have **many** tags and tags can have **many** people, hence the name "many-to-many".
+
+### Other types of relationships
+
+More esoteric relationships are possible too. For example:
+
+- One-to-one relationships can be created by applying a unique constraint to the reference column. This is sometimes useful in more complex situations.
+- Hierarchical data structures can be modeled using self-referential relationships.
+- Polymorphic relationships can be modeled through a [variety of different patterns](https://hashrocket.com/blog/posts/modeling-polymorphic-associations-in-a-relational-database).
+
+## Creating relationships
+
+1. First, create the tables you want to relate.
+1. From the table page of either table, open the "Table" tab within the table inspector, and find the "Relationships" section.
+1. Click on the "Create relationship" button, and follow the prompts.
+
+Alternatively, you can manually add a foreign key constraint to an existing column with the following steps:
+
+1. Open the "Table" tab within the table inspector.
+1. Open the "Advanced" section at the bottom.
+1. Click on the "Constraints" button.
+1. Next to "Foreign Key", click on "Add".
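
To make the foreign key behavior described in `relationships.md` concrete, here is a minimal sketch of the `people` and `emails` tables. It is an illustration under stated assumptions, not part of this commit: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server, the helper names (`build_address_book`, `is_rejected`, `emails_with_names`) are hypothetical, and the `example.com` addresses are placeholders.

```python
import sqlite3


def build_address_book() -> sqlite3.Connection:
    """Create the people and emails tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # Assumption: SQLite stands in for PostgreSQL. SQLite only enforces
    # foreign keys once this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # The person column references people.id: this is the foreign key.
    conn.execute(
        "CREATE TABLE emails ("
        " id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL,"
        " person INTEGER NOT NULL REFERENCES people (id))"
    )
    conn.executemany(
        "INSERT INTO people (id, name) VALUES (?, ?)",
        [(1, "Alice Roberts"), (2, "Bob Davis")],
    )
    # Placeholder addresses, one row per email address.
    conn.executemany(
        "INSERT INTO emails (email, person) VALUES (?, ?)",
        [
            ("alice@example.com", 1),
            ("alice.roberts@example.com", 1),
            ("bob@example.com", 2),
        ],
    )
    return conn


def is_rejected(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> bool:
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


def emails_with_names(conn: sqlite3.Connection) -> list:
    """Join each email to the person it references, ordered by email id."""
    return conn.execute(
        "SELECT emails.email, people.name"
        " FROM emails JOIN people ON people.id = emails.person"
        " ORDER BY emails.id"
    ).fetchall()


if __name__ == "__main__":
    conn = build_address_book()
    # An email pointing at a person that does not exist is refused.
    print(is_rejected(conn, "INSERT INTO emails (email, person) VALUES (?, ?)",
                      ("nobody@example.com", 99)))
    # A person who still has email addresses cannot be deleted.
    print(is_rejected(conn, "DELETE FROM people WHERE id = ?", (1,)))
    for row in emails_with_names(conn):
        print(row)
```

In PostgreSQL the same `REFERENCES people (id)` column clause creates the foreign key constraint, and no pragma is needed because enforcement is always on. The join in `emails_with_names` is roughly the lookup a reader would otherwise do by hand to turn an opaque id into a name.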
+
diff --git a/docs/docs/user-guide/tables.md b/docs/docs/user-guide/tables.md
@@ -2,11 +2,7 @@
 
 ## What is a table
 
-All relational databases, including PostgreSQL, organize data into tables containing rows, columns, and cells. Much like a single spreadsheet might have multiple _sheets_ within it, a single database will typically have several &mdash; or sometimes several _dozen_ &mdash; tables within it. Unlike most spreadsheets though, database tables are usually highly interconnected. In a database, [references](./references.md) offer a robust mechanism for one cell to reference one record in another table. By leveraging these references, we can unlock the ability to model complex data structures via multiple linked tables.
-
-## Normalization
-
-This practice of modeling data through multiple linked tables is called **data normalization**, and it's why a database will typically have its data spread across many tables, each with their own unique column structure, and very few tables providing much use or value in isolation. Normalized data structures are more efficient to query and update, and they help to ensure data integrity by reducing redundancy and minimizing the risk of inconsistencies. But they can also be more complex to work with, especially for users who are accustomed to the simplicity of spreadsheets. Mathesar aims to bridge this gap by providing a user-friendly interface to normalized data structures.
+All relational databases, including PostgreSQL, organize data into tables containing rows, columns, and cells. Much like a single spreadsheet might have multiple _sheets_ within it, a single database will typically have several &mdash; or sometimes several _dozen_ &mdash; tables within it. Unlike most spreadsheets though, database tables are usually highly interconnected. In a database, [relationships](./relationships.md) offer a robust mechanism for one cell to reference one record in another table. By leveraging relationships, we can unlock the ability to model complex data structures via multiple linked tables.
 
 ## Managing tables
 
@@ -29,10 +25,10 @@ Keep in mind that your ability to alter tables may be limited by [access control
 
     - `SELECT` - Allows reading data from the table
     - `INSERT` - Allows creation of new records within the table.
-    - `UPDATE` - Allow creation of new records within the table.
-    - `DELETE` - Allow creation of new records within the table.
+    - `UPDATE` - Allows updating existing records within the table.
+    - `DELETE` - Allows deletion of records from the table.
     - `TRUNCATE` - Allows the deletion of all records from the table at once
-    - `REFERENCES` - Allow creation of new records within the table.
+    - `REFERENCES` - Allows creation of foreign key constraints that [reference](./relationships.md) the table.
     - `TRIGGER` - Allow creation of triggers on the table.
 
 See the [PostgreSQL docs](https://www.postgresql.org/docs/17/ddl-priv.html) for more info.
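
The table privileges listed in the `tables.md` hunk above are granted with PostgreSQL's `GRANT` statement. As a rough sketch only, the helper below builds such a statement as a string. The function names and the `mathesar_user` role are hypothetical, and the accepted privileges are deliberately limited to the seven named in the list.

```python
# Privileges named in the tables.md list, with a short description of each.
TABLE_PRIVILEGES = {
    "SELECT": "read data from the table",
    "INSERT": "create new records",
    "UPDATE": "update existing records",
    "DELETE": "delete records",
    "TRUNCATE": "delete all records at once",
    "REFERENCES": "create foreign key constraints that reference the table",
    "TRIGGER": "create triggers on the table",
}


def quote_identifier(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_grant(privileges: list, table: str, role: str) -> str:
    """Build a GRANT statement for the given table privileges.

    Hypothetical helper: it only assembles text and never talks to a database.
    """
    if not privileges:
        raise ValueError("at least one privilege is required")
    normalized = [p.upper() for p in privileges]
    unknown = [p for p in normalized if p not in TABLE_PRIVILEGES]
    if unknown:
        raise ValueError(f"unknown table privileges: {unknown}")
    privilege_list = ", ".join(normalized)
    return (
        f"GRANT {privilege_list} ON TABLE {quote_identifier(table)}"
        f" TO {quote_identifier(role)};"
    )


if __name__ == "__main__":
    # REFERENCES lets the role create foreign keys that point at people.
    print(build_grant(["select", "references"], "people", "mathesar_user"))
```

Granting `REFERENCES` on `people` is what would let a role create a column in another table, such as `emails.person`, that points back at it.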