Data model for lightweight mode on MvStorage #3

pragmaxim · 2023-06-14T09:08:13Z

Schema for embedded database

Determining at query time whether box-related data has been spent puts huge pressure on the DB
=> let's do that at indexing time so that queries are real-time!

Shared

headerIdsByHeight:  Map[Height, Set[HeaderId]] // more than one in case of a fork-in-progress
blockByHeaderId:    Map[HeaderId, Block] // arbitrary block data (depends on performance)

Unspent/NonEmpty

utxosByAddress:     Map[Address, Map[BoxId, Value]] // this would be a clone of Node's utxo state
addressByUtxo:      Map[BoxId, Address] // non-empty address by utxo

Spent

allBoxesByCustomAddress:     Map[Address, Set[BoxId]] // all boxes for an address configured by a dApp developer

There can be many of these indexes, please provide your suggestions and use-cases!

Facts to consider:

A bare minimum of data, such as UtxoState, is going to be indexed for all data.
Arbitrary data, especially spent inputs (tens of millions of records), could be made configurable for specific addresses so that dApp developers can customize the explorer for their needs.

Eventually the HTTP API will allow retrieving anything that can be put together from these persistent Maps.
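
To illustrate the "decide spent/unspent at indexing time" idea, here is a minimal sketch. It uses in-memory mutable maps as stand-ins for the persistent maps, and `IndexSketch`, `createBox`, `spendBox` and `customAddresses` are hypothetical names that are not part of uexplorer. Under those assumptions, creating and spending a box could update the maps like this:

```scala
import scala.collection.mutable

object IndexSketch {
  type Address = String
  type BoxId   = String
  type Value   = Long

  // In-memory stand-ins for the persistent maps from the schema above
  val utxosByAddress          = mutable.Map.empty[Address, Map[BoxId, Value]]
  val addressByUtxo           = mutable.Map.empty[BoxId, Address]
  val allBoxesByCustomAddress = mutable.Map.empty[Address, Set[BoxId]]

  // Assumption: addresses a dApp developer configured for full box history
  val customAddresses: Set[Address] = Set("addrA")

  // Called while indexing a transaction output
  def createBox(boxId: BoxId, address: Address, value: Value): Unit = {
    val current = utxosByAddress.getOrElse(address, Map.empty[BoxId, Value])
    utxosByAddress.update(address, current + (boxId -> value))
    addressByUtxo.update(boxId, address)
    if (customAddresses.contains(address)) {
      val all = allBoxesByCustomAddress.getOrElse(address, Set.empty[BoxId])
      allBoxesByCustomAddress.update(address, all + boxId)
    }
  }

  // Called while indexing a transaction input: the box leaves the unspent maps,
  // but stays in allBoxesByCustomAddress if its address was configured
  def spendBox(boxId: BoxId): Unit =
    addressByUtxo.remove(boxId).foreach { address =>
      val remaining = utxosByAddress.getOrElse(address, Map.empty[BoxId, Value]) - boxId
      if (remaining.isEmpty) utxosByAddress.remove(address)
      else utxosByAddress.update(address, remaining)
    }
}
```

With this shape, a query such as "unspent boxes of an address" is a single lookup in `utxosByAddress`, with no spent/unspent check at query time.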

arobsn · 2023-06-16T15:21:07Z

Shared

headerById:                       Map[HeaderId, Value] // Probably best to only keep n headers? 
                                                       // Most dApps will only need the last 10 
                                                       // headers to be used as reduction context.
headerIdsByHeight:                Map[Height, Set[HeaderId]]

Unspent

boxById:                          Map[BoxId, Value]
boxIdsByContract:                 Map[ContractHex, Set[BoxId]]
boxIdsByContractTemplate:         Map[TemplateHex, Set[BoxId]] // Constant segregated contract template
boxIdsByCreationHeight:           Map[Height, Set[BoxId]]
boxIdsByR4:                       Map[RegisterHex, Set[BoxId]] // Non-empty R4 register
boxIdsByR5:                       Map[RegisterHex, Set[BoxId]] // Non-empty R5 register
boxIdsByR6:                       Map[RegisterHex, Set[BoxId]] // Non-empty R6 register
boxIdsByR7:                       Map[RegisterHex, Set[BoxId]] // Non-empty R7 register
boxIdsByR8:                       Map[RegisterHex, Set[BoxId]] // Non-empty R8 register
boxIdsByR9:                       Map[RegisterHex, Set[BoxId]] // Non-empty R9 register
boxIdsByTokenId:                  Map[TokenId, Set[BoxId]]
boxIdsByTransactionId:            Map[TransactionId, Set[BoxId]]

Spent

Spent boxes need all the Unspent maps plus the following:

mintingBoxIdsByTokenId:           Map[TokenId, Set[BoxId]] // EIP-4 only considers one minting box 
                                                           // per token, but protocol allows multiple 
                                                           // boxes in the same minting transaction, 
                                                           // so best to follow the protocol.
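
Since spent boxes reuse the same maps as unspent ones, one way to picture it is two instances of the same index set, with a box moving from one to the other when it is spent. The sketch below is only an assumption about how that could look: `Box`, `Indexes` and `State` are hypothetical, simplified types, and only three of the maps above are modelled.

```scala
object BoxStateSketch {
  type BoxId = String

  // Hypothetical, simplified box: only the fields needed by the indexes below
  final case class Box(id: BoxId, contractHex: String, tokenIds: Set[String])

  private def addTo(index: Map[String, Set[BoxId]], key: String, id: BoxId): Map[String, Set[BoxId]] =
    index.updated(key, index.getOrElse(key, Set.empty[BoxId]) + id)

  private def removeFrom(index: Map[String, Set[BoxId]], key: String, id: BoxId): Map[String, Set[BoxId]] = {
    val rest = index.getOrElse(key, Set.empty[BoxId]) - id
    if (rest.isEmpty) index - key else index.updated(key, rest)
  }

  // The same index set is instantiated once for unspent and once for spent boxes
  final case class Indexes(
      boxById: Map[BoxId, Box] = Map.empty,
      boxIdsByContract: Map[String, Set[BoxId]] = Map.empty,
      boxIdsByTokenId: Map[String, Set[BoxId]] = Map.empty
  ) {
    def add(box: Box): Indexes = Indexes(
      boxById.updated(box.id, box),
      addTo(boxIdsByContract, box.contractHex, box.id),
      box.tokenIds.foldLeft(boxIdsByTokenId)((acc, t) => addTo(acc, t, box.id))
    )
    def remove(box: Box): Indexes = Indexes(
      boxById - box.id,
      removeFrom(boxIdsByContract, box.contractHex, box.id),
      box.tokenIds.foldLeft(boxIdsByTokenId)((acc, t) => removeFrom(acc, t, box.id))
    )
  }

  final case class State(unspent: Indexes = Indexes(), spent: Indexes = Indexes()) {
    def create(box: Box): State = copy(unspent = unspent.add(box))

    // Spending moves the box from the unspent indexes to the spent ones
    def spend(id: BoxId): State = unspent.boxById.get(id) match {
      case Some(box) => State(unspent.remove(box), spent.add(box))
      case None      => this // unknown or already spent: nothing to do
    }
  }
}
```

The spent-only `mintingBoxIdsByTokenId` map would be an additional field on the spent side; it is left out here to keep the sketch short.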

Using hashes as contract and registers indexing keys

From a storage-efficiency point of view, it's better to use hashes rather than the content itself as indexing keys: a BLAKE2b256 hash is 32 bytes, whereas contracts and registers can be as big as the maximum box size (4 KB) minus the size of the required registers.
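
A small sketch of the fixed-size-key idea follows. Note the substitution: BLAKE2b256 is not available in the JDK's `MessageDigest`, so SHA-256 is used here purely as a stand-in because it also produces a 32-byte digest; a real implementation would need a BLAKE2b library. `HashKeySketch`, `hashKey` and `toHex` are hypothetical names.

```scala
import java.security.MessageDigest

object HashKeySketch {
  // Stand-in hash: SHA-256 instead of BLAKE2b256 (both yield 32 bytes)
  def hashKey(content: Array[Byte]): Array[Byte] =
    MessageDigest.getInstance("SHA-256").digest(content)

  // Lowercase base16 encoding, usable as a map key or URL segment
  def toHex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString
}
```

Whatever the size of the contract or register bytes passed to `hashKey`, the resulting key is always 32 bytes (64 hex characters), which is the storage argument made above.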

~~Average contract size: 121 bytes;~~
~~Average register size: 33 bytes, but it tends to grow with complex dApps like Paideia which stores entire boxes on its registers.~~

~~The BLAKE hashing algorithm is known for its speed and security and is extensively used on Ergo; however, indexing times must be taken into consideration.~~

Updated - `v1`

Replaced Hash by the base16 content of Contracts, Templates and Registers;
boxIdsByContractTemplate: Only index constant segregated contracts; and
Added creationHeight map.

pragmaxim · 2023-07-03T06:31:08Z

Copy/pasting some REST endpoints from @arobsn

get  /blocks/{blockId}  // block metadata and statistics
get  /boxes/{state}/tokens/{tokenId}/
get  /boxes/{state}/{boxId}/
get  /boxes/{state}/addresses/{address}/
get  /boxes/{state}/addresses/{address}/tokens/{tokenId}/
get  /boxes/{state}/contracts/{contractHex}/
get  /boxes/{state}/contracts/{contractHex}/tokens/{tokenId}/
get  /boxes/{state}/contracts/hashes/{contractHashHex}/
get  /boxes/{state}/contracts/hashes/{contractHashHex}/tokens/{tokenId}/
get  /boxes/{state}/contracts/templates/{contractTemplateHex}/
get  /boxes/{state}/contracts/templates/{contractTemplateHex}/tokens/{tokenId}/
get  /boxes/{state}/contracts/templates/hashes/{contractTemplateHashHex}/
get  /boxes/{state}/contracts/templates/hashes/{contractTemplateHashHex}/?R4=deadbeef&R5=cafe
get  /boxes/{state}/contracts/templates/hashes/{contractTemplateHashHex}/tokens/{tokenId}/
post /boxes/query/

// state = spent | unspent

get /tokens/{tokenId}/
get /tokens/{tokenId}/minting-box/
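
As a rough illustration of how such an endpoint could be answered from the maps proposed earlier in this thread, a combined route like `/boxes/{state}/contracts/{contractHex}/tokens/{tokenId}/` can be seen as an intersection of two index lookups. This is a sketch under assumptions: `QuerySketch`, `Indexes` and `boxesByContractAndToken` are hypothetical names, and plain immutable maps stand in for the persistent ones.

```scala
object QuerySketch {
  type BoxId = String

  // Two of the proposed maps, kept once per state (spent | unspent)
  final case class Indexes(
      boxIdsByContract: Map[String, Set[BoxId]],
      boxIdsByTokenId: Map[String, Set[BoxId]]
  )

  // Resolves /boxes/{state}/contracts/{contractHex}/tokens/{tokenId}/
  def boxesByContractAndToken(
      byState: Map[String, Indexes],
      state: String,
      contractHex: String,
      tokenId: String
  ): Set[BoxId] =
    byState.get(state) match {
      case Some(ix) =>
        val byContract = ix.boxIdsByContract.getOrElse(contractHex, Set.empty[BoxId])
        val byToken    = ix.boxIdsByTokenId.getOrElse(tokenId, Set.empty[BoxId])
        byContract.intersect(byToken)
      case None => Set.empty[BoxId] // unknown state value
    }
}
```

Register filters such as `?R4=deadbeef&R5=cafe` could be handled the same way, by intersecting with the `boxIdsByR4` and `boxIdsByR5` lookups.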

pragmaxim · 2023-07-03T06:55:31Z

I keep uexplorer on Scala 3. I'm currently spiking this on multiple tech stacks: I started with Slick, as I had experience with it, then tried Doobie, and ended up with zio-protoquill, zio-http and zio-json, which are all production-ready or very close to it. Eventually zio-protoquill could be replaced with zio-sql, which is currently in development.

There are two choices in the Scala ecosystem when it comes to SQL: the Typelevel stack and the ZIO stack. My bet is on ZIO, as the Typelevel stack is not well unified: one needs at least 10 different dependencies to put a simple CRUD app together, whereas in ZIO land you are good to go with just zio-protoquill, zio-http and zio-json. This pays off especially with Scala 3, as it is basically a first-class citizen in ZIO 2.0.

ross-weir mentioned this issue Jun 17, 2023

Improve efficiency of discovery of Resolvers / ReservedResolvers from fresh wallet bitdomains/contracts#19

Closed

This was referenced Jun 21, 2023

switch address to ergo-tree #7

Merged

Ergo tree template supernode resistant multimap #10

Merged
