This is a description of the file format.
This database is basically a hashtable with its 'underlying array' as the database file. It should thus have the following features:
- It should be able to have its sections mapped to memory. Each section should be of OS VM page size as done by lmdb. To get the OS VM page size in a cross-platform way, we could use what page_size crate used.
- All data is saved in bytes
- It should have the following major sections:
- a 100-byte header to hold metadata for the database
- a series of consecutive index blocks. They are
round_up(max_keys / (block_size / 8))
where(block_size / 8)
is items in each index block since each item is a 8-byte offset (offset is described below). - a series of key-value entries
- The 100-byte header, similar to sqlite contains:
Offset | Size | Description |
---|---|---|
0 | 16 | The header string: "Scdb versn 0.001" |
16 | 4 | block_size - the database page size in bytes. Must be a power of two, as got in a similar way to how page_size crate does it. |
20 | 8 | max_keys - maximum number of keys (saved as a 4 byte number). Defaults to 1000,000 (1 million) |
28 | 2 | redundant_blocks - number of redundant index blocks to cater for where all index blocks are filled up for a given hash. Defaults to 1. |
30 | 70 | Reserved for expansion. Must be zero. |
- The index blocks each contain offsets where an offset is how far in bits from the start of the file that you will find the corresponding key-value entry.
- Each key-value entry has the following parts all in binary format
SIZE <the 4 byte unsigned integer showing number of bits for this whole entry>
KEY SIZE <the 4 byte unsigned integer showing number of bits for this key>
KEY <the key>
IS_DELETED <the 1-byte unsigned integer showing 1 for deleted, 0 for not>
EXPIRY <the timestamp>
VALUE <the value in binary>
- Ideas are borrowed from rds file format , sqlite file format and lmdb file format