Skip to content

Latest commit

 

History

History
 
 

id-compressor

@fluidframework/id-compressor

A library which generates small number representations of arbitrary non-colliding Version 4 UUIDs ("stable IDs") across multiple sessions in a network. This scheme enables a distributed application to utilize the global uniqueness guarantees of UUIDs while maintaining the performance advantages of small integers.

Using Fluid Framework libraries

When taking a dependency on a Fluid Framework library's public APIs, we recommend using a ^ (caret) version range, such as ^1.3.4. While Fluid Framework libraries may use different ranges with interdependencies between other Fluid Framework libraries, library consumers should always prefer ^.

If using any of Fluid Framework's unstable APIs (for example, its beta APIs), we recommend using a more constrained version range, such as ~.

Installation

To get started, install the package by running the following command:

npm i @fluidframework/id-compressor

Importing from this package

This package leverages package.json exports to separate its APIs by support level. For more information on the related support guarantees, see API Support Levels.

To access the public (SemVer) APIs, import via @fluidframework/id-compressor like normal.

To access the legacy APIs, import via @fluidframework/id-compressor/legacy.

API Documentation

API documentation for @fluidframework/id-compressor is available at https://fluidframework.com/docs/apis/id-compressor.

Overview

The distributed ID allocation scheme allows clients to use, store, and reference small numbers that map to UUIDs rather than the UUIDs themselves. The primary benefits are improved memory efficiency and reduced storage size, as well as improved runtime performance when used as keys in collections.

The scheme requires a total order for ID allocation, so the allocator is designed to work within a service that provides centralized total order broadcast.

Sessions

A session can be thought of as a unique identifier for a single allocator. Even if an allocator is taken offline, serialized to disk, and rehydrated back, it's considered the same session if it has the same session ID.

Document

The set of IDs generated by all Sessions contributing to the same distributed collection (e.g. an individual collaborative document). Uniqueness of allocated IDs is limited to the document context (unless in their UUID form), and thus IDs in their compressed form are not portable between documents.

Generation API

New IDs are generated via a synchronous API on the allocator. The API returns a small number representation of the UUID with the following guarantees:

  • The ID can be synchronously converted into a full UUID at any time. This is required if global uniqueness (across documents) is needed.
  • Prior to the service being notified of the ID's creation, it is unique within the scope of the session (i.e. the new ID will not collide with other IDs from the same allocator, but it can collide with IDs from a different allocator in the same network).
  • After the service has been notified of the ID's creation and has determined a total ordering between all clients creating IDs, the ID will have a "final" form that is a small number representation that is unique across the document.

ID Spaces

The allocation scheme separates the small-number IDs into two "spaces":

  1. Session space: IDs are unique within the session scope, but are not guaranteed to be document unique without the accompanying context of their session ID.
  2. Op space: IDs are in their most final form, as unique as possible within the entire system. IDs that have been ordered by the central service ("finalized") are represented in their final form, while any other IDs that have not yet been finalized are left in their "session space" form.

Each of these spaces is represented by a discrete type. This allows for the direct encoding of an ID's uniqueness guarantees without contextual ambiguity. An ID can be "normalized" from one space to the other with a sychronous call.

Usage Guidelines

When delivering Allocated IDs to application authors for use in place of V4 UUIDs, follow these guidelines:

  • Expose only session space IDs to application authors. This means that application developers don't need to manage multiple ID spaces or worry about converting IDs between session space and op space. This simplifies the data management process and reduces the chances of errors or inconsistencies.

    • Note: Session space IDs are only unique within the originating session (for most applications this will mean within each individual client's app instance, as it is rare to hold handles to more than one allocator). This constrains usage to scopes where IDs are unique; for instance, application developers should never persist these IDs and should instead be directed to convert them to stable IDs first.
  • Use op space IDs for serialized forms, such as sending information over the wire or storing it in a file. Op space IDs are always generated in their final form when possible, which imparts a performance benefit by reducing normalization work when a client receives a network operation or rehydrates an allocator.

  • When serializing op space IDs, annotate the entire context (e.g., file or network operation) with the session ID. This is necessary because not all IDs in op space are in their final form. Any non-finalized IDs will still be in their session space form, unique only within that specific session. Annotating the entire context with the session ID ensures that recipients of the serialized data can correctly interpret and process the non-finalized IDs.

Efficiency Properties

UUID Generation

The allocator generates UUIDs in non-random ways to reduce entropy, optimizing storage of data. A session begins with a random UUID, and subsequent IDs are allocated sequentially. UUIDs generated with this strategy are less likely to collide than fully random UUIDs, and this fact can be leveraged to compact the on-disk and in-memory representations of the allocator.

UUID generation is O(1).

Decompression/recompression

Converting an allocated ID into its UUID form (decompression) and the reverse (recompression) are both O(logn) in the number of IDs created by the originating session.

Clustering

The sequential allocation approach allows the system to implement a clustering scheme.

As allocators across the network create IDs, they reserve blocks of positive ID space (clusters) for that allocator. The positive integer space is sharded across different clients, and session space IDs are mapped into these clusters.

Clusters are efficient to store, requiring only two integers: the base positive integer in the cluster and the count of reserved IDs in that cluster.

Normalization

Normalization is O(logn) in the number of IDs created by the originating session, as it requires a simple binary search on the clusters owned by that session.

Minimum Client Requirements

These are the platform requirements for the current version of Fluid Framework Client Packages. These requirements err on the side of being too strict since within a major version they can be relaxed over time, but not made stricter. For Long Term Support (LTS) versions this can require supporting these platforms for several years.

It is likely that other configurations will work, but they are not supported: if they stop working, we do not consider that a bug. If you would benefit from support for something not listed here, file an issue and the product team will evaluate your request. When making such a request please include if the configuration already works (and thus the request is just that it becomes officially supported), or if changes are required to get it working.

Supported Runtimes

  • NodeJs ^20.10.0 except that we will drop support for it when NodeJs 20 loses its upstream support on 2026-04-30, and will support a newer LTS version of NodeJS (22) at least 1 year before 20 is end-of-life. This same policy applies to NodeJS 22 when it is end of life (2027-04-30).
  • Modern browsers supporting the es2022 standard library: in response to asks we can add explicit support for using babel to polyfill to target specific standards or runtimes (meaning we can avoid/remove use of things that don't polyfill robustly, but otherwise target modern standards).

Supported Tools

  • TypeScript 5.4:
    • All strict options are supported.
    • strictNullChecks is required.
    • Configuration options deprecated in 5.0 are not supported.
    • exactOptionalPropertyTypes is currently not fully supported. If used, narrowing members of Fluid Framework types types using in, Reflect.has, Object.hasOwn or Object.prototype.hasOwnProperty should be avoided as they may incorrectly exclude undefined from the possible values in some cases.
  • webpack 5
    • We are not intending to be prescriptive about what bundler to use. Other bundlers which can handle ES Modules should work, but webpack is the only one we actively test.

Module Resolution

Node16, NodeNext, or Bundler resolution should be used with TypeScript compilerOptions to follow the Node.js v12+ ESM Resolution and Loading algorithm. Node10 resolution is not supported as it does not support Fluid Framework's API structuring pattern that is used to distinguish stable APIs from those that are in development.

Module Formats

  • ES Modules: ES Modules are the preferred way to consume our client packages (including in NodeJs) and consuming our client packages from ES Modules is fully supported.

  • CommonJs: Consuming our client packages as CommonJs is supported only in NodeJS and only for the cases listed below. This is done to accommodate some workflows without good ES Module support. If you have a workflow you would like included in this list, file an issue. Once this list of workflows motivating CommonJS support is empty, we may drop support for CommonJS one year after notice of the change is posted here.

Contribution Guidelines

There are many ways to contribute to Fluid.

Detailed instructions for working in the repo can be found in the Wiki.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

This project may contain Microsoft trademarks or logos for Microsoft projects, products, or services. Use of these trademarks or logos must follow Microsoft’s Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.

Help

Not finding what you're looking for in this README? Check out fluidframework.com.

Still not finding what you're looking for? Please file an issue.

Thank you!

Trademark

This project may contain Microsoft trademarks or logos for Microsoft projects, products, or services.

Use of these trademarks or logos must follow Microsoft's Trademark & Brand Guidelines.

Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.