Features:
- Supports Binary and String key types
- Generates AES256 encrypted and authenticated pagination tokens
- Works with TypeScript type guards natively
- Ensures a minimum number of items when using a FilterExpression
- Compatible with AWS SDK v2 and v3
- Supports pagination over segmented parallel scans
Pagination in NoSQL stores such as DynamoDB can be challenging. This library provides a developer-friendly interface around the DynamoDB Query and Scan APIs.
It generates an encrypted and authenticated pagination token that can be shared with an untrustworthy client (like the browser or a mobile app) without disclosing potentially sensitive data, while protecting the integrity of the token.
Why is the pagination token encrypted?
When researching pagination with DynamoDB, you will come across blog posts and libraries that recommend JSON-encoding the LastEvaluatedKey attribute (or even the whole query command). This is dangerous!
The token is sent to a client, which can decode it, read the values of the partition and sort key, and even modify it, making the application vulnerable to NoSQL injection.
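To make the risk concrete, here is a minimal sketch of the naive approach described above. The key values and variable names are made up for illustration and nothing here is part of this library.

```typescript
// Server side: the naive approach simply Base64-encodes the LastEvaluatedKey.
// The key values below are hypothetical.
const lastEvaluatedKey = { PK: 'U#ABC', SK: 'ORDER#0042' };
const naiveToken = Buffer.from(JSON.stringify(lastEvaluatedKey)).toString('base64url');

// Client side: no secret is needed to read the token ...
const decoded = JSON.parse(Buffer.from(naiveToken, 'base64url').toString('utf8'));
console.log(decoded.PK); // prints "U#ABC"

// ... or to point it at another user's partition and send it back.
decoded.PK = 'U#XYZ';
const forgedToken = Buffer.from(JSON.stringify(decoded)).toString('base64url');
```

A server that feeds such a token straight into ExclusiveStartKey has no way to tell that it was tampered with.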
How is the pagination token encrypted?
The encryption key passed to the paginator is used to derive an encryption and a signing key using an HMAC.
The LastEvaluatedKey attribute is first flattened by length-encoding its datatypes and values. The encoded key is then encrypted with the encryption key using AES-256 in CBC mode with a randomly generated IV.
The additional authenticated data (AAD), the IV, the ciphertext and an int64 of the length of the AAD are concatenated to form the message to be signed.
The encrypted and signed pagination token is then returned by concatenating the IV, ciphertext and the first 16 bytes of the HMAC-SHA256 of the message using the signing key.
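The steps above can be sketched with Node's built-in crypto module. This is an illustration under stated assumptions, not the library's actual code: the derivation labels ('encryption', 'signing'), the big-endian byte order of the length field, the Base64 URL output, and the function names sealToken and openToken are all assumptions. The length-encoding of the LastEvaluatedKey is left out, so the payload is just an opaque buffer.

```typescript
import { createCipheriv, createDecipheriv, createHmac, randomBytes, timingSafeEqual } from 'crypto';

// Derive two independent keys from the master key with an HMAC.
// The labels are assumptions made for this sketch.
function deriveKeys(key: Buffer): { encKey: Buffer; macKey: Buffer } {
  return {
    encKey: createHmac('sha256', key).update('encryption').digest(),
    macKey: createHmac('sha256', key).update('signing').digest(),
  };
}

// Message to sign: AAD || IV || ciphertext || int64(length of AAD).
// The tag is the first 16 bytes of the HMAC-SHA256 of that message.
function computeTag(macKey: Buffer, aad: Buffer, iv: Buffer, ciphertext: Buffer): Buffer {
  const aadLength = Buffer.alloc(8);
  aadLength.writeBigInt64BE(BigInt(aad.length)); // byte order is an assumption
  return createHmac('sha256', macKey)
    .update(Buffer.concat([aad, iv, ciphertext, aadLength]))
    .digest()
    .subarray(0, 16);
}

function sealToken(key: Buffer, payload: Buffer, aad: Buffer = Buffer.alloc(0)): string {
  const { encKey, macKey } = deriveKeys(key);
  const iv = randomBytes(16);
  const cipher = createCipheriv('aes-256-cbc', encKey, iv);
  const ciphertext = Buffer.concat([cipher.update(payload), cipher.final()]);
  // Token: IV || ciphertext || truncated tag
  return Buffer.concat([iv, ciphertext, computeTag(macKey, aad, iv, ciphertext)]).toString('base64url');
}

function openToken(key: Buffer, token: string, aad: Buffer = Buffer.alloc(0)): Buffer {
  const { encKey, macKey } = deriveKeys(key);
  const raw = Buffer.from(token, 'base64url');
  // 16-byte IV + at least one 16-byte cipher block + 16-byte tag
  if (raw.length < 48) throw new Error('invalid token');
  const iv = raw.subarray(0, 16);
  const ciphertext = raw.subarray(16, raw.length - 16);
  const receivedTag = raw.subarray(raw.length - 16);
  // Verify the signature before decrypting anything.
  if (!timingSafeEqual(receivedTag, computeTag(macKey, aad, iv, ciphertext))) {
    throw new Error('invalid token');
  }
  const decipher = createDecipheriv('aes-256-cbc', encKey, iv);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

Because the AAD is part of the signed message but not of the token, a token sealed with one AAD is rejected when it is opened with a different one.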
"Dance like nobody is watching. Encrypt like everyone is." -- Werner Vogels
import { Paginator } from '@emdgroup/dynamodb-paginator';
import { DynamoDB } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
import * as crypto from 'crypto';
const client = DynamoDBDocument.from(new DynamoDB({}));
// persist the key in the SSM parameter store or similar
const key = crypto.randomBytes(32);
const paginateQuery = Paginator.createQuery({
key,
client,
});
const paginator = paginateQuery({
TableName: 'MyTable',
KeyConditionExpression: 'PK = :pk',
ExpressionAttributeValues: {
':pk': 'U#ABC',
},
});
Use for await...of syntax:
for await (const item of paginator) {
// do something with item
// only work on the first 50 items,
// then generate a pagination token and break.
if (paginator.count === 50) {
console.log(paginator.nextToken);
break;
}
}
paginator.requestCount; // number of requests to DynamoDB
Use await all() syntax:
const items = await paginator.limit(50).all();
paginator.nextToken;
const nextPaginator = paginator.from(paginator.nextToken);
await nextPaginator.all(); // up to 50 more items
Use TypeScript guards to filter for items:
interface User {
PK: string;
SK: string;
}
function isUser(arg: Record<string, unknown>): arg is User {
return typeof arg.PK === 'string' &&
typeof arg.SK === 'string' &&
arg.PK.startsWith('U#');
}
for await (const user of paginator.filter(isUser)) {
// user is of type User
}
The Paginator class is a factory for the PaginationResponse object. This class is instantiated with a 32-byte key and the DynamoDB document client (versions 2 and 3 of the AWS SDK are supported).
const paginateQuery = Paginator.createQuery({
key: () => Promise.resolve(crypto.randomBytes(32)),
client: documentClient,
});
To create a paginator over a scan operation, use createScan.
const paginateScan = Paginator.createScan({
key: () => Promise.resolve(crypto.randomBytes(32)),
client: documentClient,
});
This library also supports pagination over segmented parallel scans. This is useful when you have a large table and want to parallelize the scan operation to reduce the time it takes to scan the whole table.
To create a paginator over a segmented scan operation, use createParallelScan.
const paginateParallelScan = Paginator.createParallelScan({
key: () => Promise.resolve(crypto.randomBytes(32)),
client: documentClient,
});
Then, create a paginator and pass the segments parameter.
const paginator = paginateParallelScan({
TableName: 'MyTable',
Limit: 250,
}, { segments: 10 });
await paginator.all();
The scan will be executed in parallel over 10 segments. The paginator returns the items in the order in which DynamoDB delivers them, so items from different segments may arrive interleaved. Refer to the following waterfall diagram for an example. The parallel scan was executed over a high-latency connection to better illustrate the variability in the requests and responses. Even though the Limit is set to 250, DynamoDB will occasionally return fewer than 250 items per segment. The paginator will continue to request items until all segments have been exhausted.
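The interleaving can be illustrated with a small, self-contained sketch that merges several asynchronous sources and yields each item as soon as any source produces one. Both mergeSegments and fakeSegment are hypothetical helpers written for this example. They do not talk to DynamoDB and are not how the library is necessarily implemented.

```typescript
// Hypothetical helper: merges several async iterators and yields items in
// arrival order, regardless of which iterator produced them.
async function* mergeSegments<T>(segments: AsyncIterator<T>[]): AsyncGenerator<T, void, void> {
  const pending = new Map<number, Promise<{ index: number; result: IteratorResult<T> }>>();
  // Request the next item of a segment and remember the in-flight promise.
  const advance = (index: number): void => {
    pending.set(index, segments[index].next().then((result) => ({ index, result })));
  };
  segments.forEach((_, index) => advance(index));

  while (pending.size > 0) {
    // Wait for whichever segment answers first.
    const { index, result } = await Promise.race(pending.values());
    if (result.done) {
      pending.delete(index); // this segment is exhausted
    } else {
      advance(index);
      yield result.value;
    }
  }
}

// Simulated segment with a fixed latency per item (stands in for a Scan segment).
async function* fakeSegment(name: string, delayMs: number, count: number): AsyncGenerator<string, void, void> {
  for (let i = 0; i < count; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    yield `${name}-${i}`;
  }
}
```

Iterating over mergeSegments([fakeSegment('a', 50, 3), fakeSegment('b', 10, 3)]) with for await yields every item of both segments, with the faster segment's items tending to arrive first.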
• new Paginator(args)
Use one of the static factory functions (createQuery(), createScan() or createParallelScan()) instead of the constructor.
Name | Type |
---|---|
args | PaginatorOptions |
▸ Static createParallelScan(args): <T>(scan: ScanCommandInput, opts: PaginateQueryOptions<T> & { segments: number }) => ParallelPaginationResponse<T>
Returns a function that accepts a DynamoDB Scan command and returns an instance of ParallelPaginationResponse.
Name | Type |
---|---|
args | PaginatorOptions |
fn
▸ <T>(scan, opts): ParallelPaginationResponse<T>
Accepts a DynamoDB Scan command and returns an instance of ParallelPaginationResponse.
Name | Type |
---|---|
T | extends AttributeMap |
Name | Type |
---|---|
scan | ScanCommandInput |
opts | PaginateQueryOptions<T> & { segments: number } |
ParallelPaginationResponse<T>
▸ Static createQuery(args): <T>(query: QueryCommandInput, opts?: PaginateQueryOptions<T>) => PaginationResponse<T>
Returns a function that accepts a DynamoDB Query command and returns an instance of PaginationResponse.
Name | Type |
---|---|
args | PaginatorOptions |
fn
▸ <T>(query, opts?): PaginationResponse<T>
Accepts a DynamoDB Query command and returns an instance of PaginationResponse.
Name | Type |
---|---|
T | extends AttributeMap |
Name | Type |
---|---|
query | QueryCommandInput |
opts? | PaginateQueryOptions<T> |
▸ Static createScan(args): <T>(scan: ScanCommandInput, opts?: PaginateQueryOptions<T>) => PaginationResponse<T>
Returns a function that accepts a DynamoDB Scan command and returns an instance of PaginationResponse.
Name | Type |
---|---|
args | PaginatorOptions |
fn
▸ <T>(scan, opts?): PaginationResponse<T>
Accepts a DynamoDB Scan command and returns an instance of PaginationResponse.
Name | Type |
---|---|
T | extends AttributeMap |
Name | Type |
---|---|
scan | ScanCommandInput |
opts? | PaginateQueryOptions<T> |
• client: DynamoDBDocumentClientV2 | DynamoDBDocumentClient
AWS SDK v2 or v3 DynamoDB Document Client.
• Optional indexes: Record<string, [partitionKey: string, sortKey?: string]> | (index: string) => [partitionKey: string, sortKey?: string]
Object that resolves an index name to the partition and sort key for that index. Also accepts a function that builds the names based on the index name.
Defaults to (index) => [`${index}PK`, `${index}SK`].
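As an illustration of the two accepted shapes, the sketch below resolves an index name to its key attribute names. The helper resolveIndexKeys, the type aliases, and the behavior for an unknown index (throwing) are assumptions made for this example. Only the default resolver is taken from the description above.

```typescript
type KeySchema = [partitionKey: string, sortKey?: string];
type IndexResolver = Record<string, KeySchema> | ((index: string) => KeySchema);

// The documented default: derive the attribute names from the index name.
const defaultIndexes = (index: string): KeySchema => [`${index}PK`, `${index}SK`];

// Hypothetical helper showing how either shape could be resolved.
function resolveIndexKeys(index: string, indexes: IndexResolver = defaultIndexes): KeySchema {
  if (typeof indexes === 'function') {
    return indexes(index);
  }
  const schema = indexes[index];
  if (!schema) {
    // Assumption for this sketch: an unknown index is treated as an error.
    throw new Error(`No key schema configured for index "${index}"`);
  }
  return schema;
}

resolveIndexKeys('GSI1'); // ['GSI1PK', 'GSI1SK']
resolveIndexKeys('ByEmail', { ByEmail: ['Email'] }); // ['Email']
```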
• key: CipherKey | Promise<CipherKey> | () => CipherKey | Promise<CipherKey>
A 32-byte encryption key (e.g. crypto.randomBytes(32)). The key parameter also accepts a Promise that resolves to a key, or a function that returns a key or a Promise of a key.
If a function is passed, it is called lazily and only once. The call happens concurrently with the first query request to DynamoDB to reduce the overall latency of that first query. The key is then cached and the function is not called again.
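The "call once, then cache" behavior can be sketched as a small memoizing wrapper. This is an assumption about one way to get that behavior, not the library's code. The name lazyKey is hypothetical, and Buffer stands in for the broader CipherKey type.

```typescript
// Buffer is used here as a simplification of CipherKey.
type KeyInput = Buffer | Promise<Buffer> | (() => Buffer | Promise<Buffer>);

// Hypothetical helper: normalizes all three input shapes to a function that
// returns a Promise, invoking a key provider at most once.
function lazyKey(input: KeyInput): () => Promise<Buffer> {
  let cached: Promise<Buffer> | undefined;
  return () => {
    if (cached === undefined) {
      // Cache the promise itself so that concurrent callers share one lookup.
      cached = Promise.resolve(typeof input === 'function' ? input() : input);
    }
    return cached;
  };
}

// Usage: the provider (for example a parameter store lookup) runs on first use only.
let lookups = 0;
const getKey = lazyKey(async () => {
  lookups += 1;
  return Buffer.alloc(32); // placeholder key for the example
});
```

Since nothing is fetched until getKey() is first called, the lookup can be started alongside the first DynamoDB request rather than before it.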
• Optional schema: [partitionKey: string, sortKey?: string]
Names for the partition and sort key of the table. Defaults to ['PK', 'SK'].
The PaginationResponse class implements the query result iterator. It has a number of utility functions such as peek() and all() to simplify common usage patterns.
The iterator can be interrupted and resumed at any time. It stops producing items once the end of the query is reached or the provided limit has been reached.
Name | Type |
---|---|
T | extends AttributeMap = AttributeMap |
• count: number
Number of items yielded
• get consumedCapacity(): number
Total consumed capacity for query
number
• get finished(): boolean
Returns true if all items for this query have been returned from DynamoDB.
boolean
• get nextToken(): undefined | string
Token to resume query operation from the current position. The token is generated from the LastEvaluatedKey attribute provided by DynamoDB and then AES256 encrypted such that it can safely be provided to an untrustworthy client (such as a user's browser or mobile app). The token is Base64 URL encoded, which means that it only contains URL-safe characters and does not require further encoding.
The encryption is necessary to prevent leaking sensitive information that can be included in the LastEvaluatedKey provided by DynamoDB. It also prevents a client from modifying the token and therefore manipulating the query execution (NoSQL injection).
The length of the token depends on the length of the values for the partition and sort key of the table or index that you are querying. The token length is at least 42 characters.
undefined | string
• get requestCount(): number
Number of requests made to DynamoDB
number
• get scannedCount(): number
Number of items scanned by DynamoDB
number
▸ [asyncIterator](): AsyncGenerator<T, void, void>
for await (const item of items) {
// work with item
}
AsyncGenerator<T, void, void>
▸ all(): Promise<T[]>
Return all items from the query (up to limit items). This is potentially dangerous and expensive, as the query will keep making requests to DynamoDB until there are no more items. It is recommended to pair all() with a limit() to prevent runaway query execution.
Promise<T[]>
▸ filter<K>(predicate): PaginationResponse<K>
Filter results by a predicate function
Name | Type |
---|---|
K | extends AttributeMap |
Name | Type |
---|---|
predicate | (arg: AttributeMap) => arg is K |
▸ from<L>(nextToken): L
Start returning results starting from nextToken
Name | Type |
---|---|
L | extends PaginationResponse<T, L> |
Name | Type |
---|---|
nextToken | undefined | string |
L
▸ limit<L>(limit): L
Limit the number of results to limit. Will return at least limit results even when using FilterExpressions.
Name | Type |
---|---|
L | extends PaginationResponse<T, L> |
Name | Type |
---|---|
limit | number |
L
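One plausible reading of this guarantee is that pages are requested until enough items have survived the filter, rather than stopping after a single request. The sketch below shows that loop against an in-memory stand-in for DynamoDB. The names Page, collectUpTo and fetchEvenPage and the numeric cursor are hypothetical, and the library's real bookkeeping (for example, producing a token for the exact resume position) is not modeled.

```typescript
interface Page<T> {
  items: T[];                      // items that survived the filter
  nextCursor: number | undefined;  // undefined once the data is exhausted
}

// Hypothetical helper: keeps fetching pages until `limit` items are collected
// or there is nothing left to read.
async function collectUpTo<T>(
  fetchPage: (cursor?: number) => Promise<Page<T>>,
  limit: number,
): Promise<T[]> {
  const collected: T[] = [];
  let cursor: number | undefined;
  do {
    const page = await fetchPage(cursor);
    collected.push(...page.items);
    cursor = page.nextCursor;
  } while (collected.length < limit && cursor !== undefined);
  return collected.slice(0, limit);
}

// In-memory stand-in: reads 3 rows per request and keeps only even numbers,
// so a single page can contain fewer matches than requested (or none).
const rows = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
async function fetchEvenPage(cursor = 0): Promise<Page<number>> {
  const next = cursor + 3;
  return {
    items: rows.slice(cursor, next).filter((n) => n % 2 === 0),
    nextCursor: next < rows.length ? next : undefined,
  };
}
```

With this stand-in, collectUpTo(fetchEvenPage, 4) needs three requests and resolves to [2, 4, 6, 8].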
▸ peek(): Promise<undefined | T>
Returns the first item in the query without advancing the iterator. peek() can also be used to "prime" the iterator. It will immediately make a request to DynamoDB and fill the iterator's cache with the first page of results. This can be useful if you have other concurrent asynchronous requests:
const items = paginateQuery(...);
await Promise.all([
items.peek(),
doSomethingElse(),
]);
for await (const item of items) {
// the first page of items has already been pre-fetched so they are available immediately
}
peek can be invoked inside a for await loop. peek returns undefined if there are no more items returned or if the limit has been reached.
for await (const item of items) {
const next = await items.peek();
if (!next) {
// we've reached the last item
}
}
peek() does not increment the count attribute.
Promise<undefined | T>
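The peek semantics described above (look ahead without advancing, leave count untouched) can be sketched as a thin wrapper around any async iterator. PeekableIterator is a hypothetical class written for this example. It buffers a single item, whereas the library is described as caching a whole page.

```typescript
// Hypothetical wrapper: adds peek() to an arbitrary async iterator.
class PeekableIterator<T> {
  private readonly source: AsyncIterator<T>;
  private buffered: IteratorResult<T> | undefined;
  count = 0; // number of items yielded via next(), not via peek()

  constructor(source: AsyncIterator<T>) {
    this.source = source;
  }

  // Fetch the upcoming result if needed, but keep it for the next call to next().
  async peek(): Promise<T | undefined> {
    if (this.buffered === undefined) {
      this.buffered = await this.source.next();
    }
    return this.buffered.done ? undefined : this.buffered.value;
  }

  // Hand out the buffered result first, then fall back to the source.
  async next(): Promise<IteratorResult<T>> {
    const result = this.buffered ?? (await this.source.next());
    this.buffered = undefined;
    if (!result.done) {
      this.count += 1;
    }
    return result;
  }

  [Symbol.asyncIterator](): this {
    return this;
  }
}
```

Calling peek() before a for await loop returns the first item, and the loop then still starts with that same item.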
Name | Type |
---|---|
T | extends AttributeMap |
• Optional context: string | Buffer
The context defines the additional authenticated data (AAD) that is used to generate the signature for the pagination token. It is optional but recommended because it adds an additional layer of authentication to the pagination token. Pagination tokens are tied to the context, and replaying them in other contexts will fail. Good examples for the context are a user ID or a session ID concatenated with the purpose of the query, such as ListPets. The context cannot be extracted from the pagination token and can therefore contain sensitive data.
• Optional filter: (arg: AttributeMap) => arg is T
▸ (arg): arg is T
Filter results by a predicate function
Name | Type |
---|---|
arg | AttributeMap |
arg is T
• Optional from: string
Start returning results starting from nextToken
• Optional limit: number
Limit the number of results to limit. Will return at least limit results even when using FilterExpressions.