Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pagination): re-importing from AIP-158 #168

Merged
merged 3 commits into from
May 16, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions aep/general/0003/aep.md.j2
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,12 @@ Examples of complexities that declarative clients abstract away include:

[Terraform][] is an example of such a client.

### Schema

A schema describes the structure of the request or response of an [API
method](#api-method), or a [resource](#api-resource). It refers both to an
OpenAPI schema as well as a protobuf message.

### User

A human being which is using an API directly, such as with cURL. This term is
Expand Down
189 changes: 76 additions & 113 deletions aep/general/0158/aep.md.j2
Original file line number Diff line number Diff line change
@@ -1,100 +1,105 @@
# Pagination

APIs often need to provide collections of data, most commonly in the
[List][aip-132] standard method. However, collections can often be arbitrarily
sized, and also often grow over time, increasing lookup time as well as the
size of the responses being sent over the wire. Therefore, it is important that
collections be paginated.
APIs often need to provide collections of data, most commonly in the [List][]
standard method. However, collections can often be arbitrarily sized, and also
often grow over time, increasing lookup time as well as the size of the
responses being sent over the wire. Therefore, it is important that collections
be paginated.

## Guidance

Operations returning collections of data **must** provide pagination _at the
outset_, as it is a [backwards-incompatible change](#backwards-compatibility)
to add pagination to an existing method.
Methods returning collections of data **must** provide pagination _at the outset_,
as it is a [backwards-incompatible change](#backwards-compatibility) to add
pagination to an existing method.

```typescript
{% tab proto %}

```proto
// The request structure for listing books.
interface ListBooksRequest {
message ListBooksRequest {
// The parent, which owns this collection of books.
// Format: publishers/{publisher}
parent: string;
string parent = 1 [
(google.api.field_behavior) = REQUIRED,
(google.api.resource_reference) = {
child_type: "library.googleapis.com/Book"
}];

// The maximum number of books to return. The service may return fewer than
// this value.
// If unspecified, at most 50 books will be returned.
// The maximum value is 1000; values above 1000 will be coerced to 1000.
maxPageSize: bigint;
int32 max_page_size = 2;

// A page token, received from a previous `ListBooks` call.
// Provide this to retrieve the subsequent page.
//
// When paginating, all other parameters provided to `ListBooks` must match
// the call that provided the page token.
pageToken: string;
string page_token = 3;
}

// The response structure from listing books.
interface ListBooksResponse {
message ListBooksResponse {
// The books from the specified publisher.
books: Book[];
repeated Book books = 1;

// A token that can be sent as `page_token` to retrieve the next page.
// If this field is omitted, there are no subsequent pages.
nextPageToken: string;
string next_page_token = 2;
}
```

- Request definitions for collections **should** define an
`int32 max_page_size` field, allowing users to specify the maximum number of
results to return.
- The field containing pagination results **should** be the first field in
the message and have a field number of `1`.

{% tab oas %}

**Note:** OAS example not yet written.

{% endtabs %}

- Request messages for collections **should** define an integer (`int32` for
protobuf) `max_page_size` field, allowing users to specify the maximum number of
results to return.
- If the user does not specify `max_page_size` (or specifies `0`), the API
chooses an appropriate default, which the API **should** document. The API
**must not** return an error.
- If the user specifies `max_page_size` greater than the maximum permitted by
the service, the service **should** coerce down to the maximum permitted
page size.
- If the user specifies a negative value for `max_page_size`, the API
**must** return a `400 Bad Request` error.
- The service **should** the number of results requested, unless the end of
the collection is reached.
- However, occasionally this is infeasible, especially within expected time
limits. In these cases, the service **may** return fewer results than the
number requested (including zero results), even if not at the end of the
collection.
- Request definitions for collections **should** define a `string page_token`
- If the user specifies `max_page_size` greater than the maximum permitted by the
API, the API **should** coerce down to the maximum permitted page size.
- If the user specifies a negative value for `max_page_size`, the API **must**
send an `INVALID_ARGUMENT` error.
- The API **may** return fewer results than the number requested (including
zero results), even if not at the end of the collection.
- Request schemas for collections **should** define a `string page_token`
field, allowing users to advance to the next page in the collection.
- If the user changes the `max_page_size` in a request for subsequent pages,
the service **must** honor the new page size.
- The user is expected to keep all other arguments to the operation request
the same; if any arguments are different, the API **should** send a
`400 Bad Request` error.
- The `page_token` field **must not** be required.
- If the user changes the `max_page_size` in a request for subsequent pages, the
service **must** honor the new page size.
- The user is expected to keep all other arguments to the method the same; if
any arguments are different, the API **should** send an `INVALID_ARGUMENT`
error.
- The response **must not** be a streaming response.
- Services **may** support using page tokens across versions of a service, but
are not required to do so.
- Response definitions for collections **must** define a
- Response messages for collections **must** define a
`string next_page_token` field, providing the user with a page token that may
be used to retrieve the next page.
- The field containing pagination results **should** be the first field
specified. It **should** be a repeated field containing a list of resources
constituting a single page of results.
- The field containing pagination results **must** be a repeated
field containing a list of resources constituting a single page of results.
- If the end of the collection has been reached, the `next_page_token` field
**must** be empty. This is the _only_ way to communicate
"end-of-collection" to users.
- If the end of the collection has not been reached (or if the API can not
determine in time), the service **must** provide a `next_page_token`.
- Response definitions **may** include a `string next_page_url` field
containing the full URL for the next page.
- Response definitions for collections **may** provide an `int32 total_size`
field, providing the user with the total number of items in the list.
- This total **may** be an estimate (but the API **should** explicitly
document that).

[rfc-8288]: https://tools.ietf.org/html/rfc8288
determine in time), the API **must** provide a `next_page_token`.
- Response messages for collections **may** provide an integer (`int32` for
protobuf) `total_size` field, providing the user with the total number of items
in the list.
- This total **may** be an estimate. If so, the API **should** explicitly
document that.

### Skipping results

The request definition for a paginatied operation **may** define an
`int32 skip` field to allow the user to skip results.
The request definition for a paginated operation **may** define an integer
(`int32` for protobuf ) `skip` field to allow the user to skip results.

The `skip` value **must** refer to the number of individual resources to skip,
not the number of pages.
Expand All @@ -111,9 +116,9 @@ If a `skip` value is provided that causes the cursor to move past the end of
the collection of results, the response **must** be `200 OK` with an empty
result set, and not provide a `next_page_token`.

### Opacity
### Page Token Opacity

Page tokens provided by services **must** be opaque (but URL-safe) strings, and
Page tokens provided by APIs **must** be opaque (but URL-safe) strings, and
**must not** be user-parseable. This is because if users are able to
deconstruct these, _they will do so_. This effectively makes the implementation
details of your API's pagination become part of the API surface, and it becomes
Expand All @@ -133,32 +138,21 @@ authorization to the underlying resources, and authorization **must** be
performed on the request as with any other regardless of the presence of a page
token.

### Expiring page tokens

Many services store page tokens in a database internally. In this situation,
the service **may** expire page tokens a reasonable time after they have been
sent, in order not to needlessly store large amounts of data that is unlikely
to be used. It is not necessary to document this behavior.

**Note:** While a reasonable time may vary between services, a good rule of
thumb is three days.

### Consistency
### Page Token Expiration

When discussing pagination, consistency refers to the question of what to do if
the underlying collection is modified while pagination is in progress. The most
common way that this occurs is for a resource to be added or deleted in a place
that the pagination cursor has already passed.
Many APIs store page tokens in a database internally. In this situation, APIs
**may** expire page tokens a reasonable time after they have been sent, in
order not to needlessly store large amounts of data that is unlikely to be
used. It is not necessary to document this behavior.

Services **may** choose to be strongly consistent by approximating the
"repeatable read" behavior in databases, and returning exactly the records that
exist at the time that pagination begins.
**Note:** While a reasonable time may vary between APIs, a good rule of thumb
is three days.

### Backwards compatibility

Adding pagination to an existing operation is a backwards-incompatible change.
This may seem strange; adding fields to interface definitions is generally
backwards compatible. However, this change is _behaviorally_ incompatible.
Adding pagination to an existing method is a backwards-incompatible change. This
may seem strange; adding fields to proto messages is generally backwards
compatible. However, this change is _behaviorally_ incompatible.

Consider a user whose collection has 75 resources, and who has already written
and deployed code. If the API later adds pagination fields, and sets the
Expand All @@ -167,48 +161,17 @@ now is only getting the first 50 (and does not know to advance pagination).
Even if the API set a higher default limit, such as 100, the user's collection
could grow, and _then_ the code would break.

For this reason, it is important to always add pagination to operations
returning collections _up front_; they are consistently important, and they can
not be added later without causing problems for existing users.
For this reason, it is important to always add pagination to RPCs returning
collections _up front_; they are consistently important, and they can not be
added later without causing problems for existing users.

**Warning:** This also entails that, in addition to presenting the pagination
fields, they **must** be _actually implemented_ with a non-infinite default
value. Implementing an in-memory version (which might fetch everything then
paginate) is reasonable for initially-small collections.

## Implementation
[list]: ./list

Page tokens **should** be versioned independently of the public API, so that
toumorokoshi marked this conversation as resolved.
Show resolved Hide resolved
page tokens can be used with any version of the service.

The simplest form of a page token only requires an offset. However, offsets
pose challenges when a distributed database is introduced, so a more robust
page token needs to store the information needed to find a "logical" position
in the database. The simplest way to do this is to include relevant data from
the last result returned. Primarily, this means the resource ID, but also
includes any other fields from the resource used to sort the results (for the
event where the resource is changed or deleted).

This information is from the resource itself, and therefore might be sensitive.
Sensitive data **must** be encrypted before being used in a page token.
Therefore, the token also includes the date it was created, to allow for the
potential need to rotate the encryption key.

This yields the following interface, which **may** be base64 encoded and used
as a page token:

```typescript
interface PageTokenSecrets {
// The ID of the most recent resource returned.
lastId: string;

// Any index data needed, generally 1:1 with the fields used for ordering.
indexData: Buffer[];

// When this token was minted.
createTime: Date;
}
```
## Changelog

**Note:** This section does not preclude alternative page token implementations
provided they conform to the guidelines discussed in this document.
- **2024-04-14**: Imported from https://google.aip.dev/158.
1 change: 1 addition & 0 deletions aep/general/0158/aep.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ id: 158
state: approved
slug: pagination
created: 2019-02-18
updated: 2024-04-14
placement:
category: design-patterns
order: 60
Loading