Data model for lightweight mode on MvStorage #3

Open
pragmaxim opened this issue Jun 14, 2023 · 3 comments

Comments

@pragmaxim

pragmaxim commented Jun 14, 2023

Schema for embedded database

Determining at query time whether box-related data has been spent puts huge pressure on the DB
=> let's do that at indexing time so that queries are real-time!

Shared

headerIdsByHeight:  Map[Height, Set[HeaderId]] // more than one in case of a fork-in-progress
blockByHeaderId:    Map[HeaderId, Block] // arbitrary block data (depends on performance)

Unspent/NonEmpty

utxosByAddress:     Map[Address, Map[BoxId, Value]] // this would be a clone of Node's utxo state
addressByUtxo:      Map[BoxId, Address] // non-empty address by utxo

Spent

allBoxesByCustomAddress:     Map[Address, Set[BoxId]] // all boxes for a configured address by a dApp developer

There can be many of these indexes; please provide your suggestions and use cases!

Facts to consider:

  • a bare minimum of data, like UtxoState, is going to be indexed for all addresses
  • arbitrary data, especially spent inputs (tens of millions of records), could be configurable for specific addresses
    so that dApp developers can customize the explorer for their needs

Eventually the Http API will allow for retrieving anything that can be put together from these persistent Maps.
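
To make the "do it at indexing time" idea concrete, here is a minimal sketch of how utxosByAddress and addressByUtxo could be kept in sync while blocks are indexed. It uses immutable in-memory Maps instead of persistent ones, and `Output`, `Index`, `addOutput`, `spendInput` and `applyTx` are hypothetical names used only for illustration.

```scala
object UtxoIndexSketch {
  // Hypothetical, simplified types; the real explorer types will differ.
  type Address = String
  type BoxId   = String
  type Value   = Long

  final case class Output(boxId: BoxId, address: Address, value: Value)

  final case class Index(
      utxosByAddress: Map[Address, Map[BoxId, Value]] = Map.empty,
      addressByUtxo: Map[BoxId, Address] = Map.empty
  ) {
    // A new output becomes an unspent box of its address.
    def addOutput(o: Output): Index = {
      val boxes = utxosByAddress.getOrElse(o.address, Map.empty[BoxId, Value])
      copy(
        utxosByAddress = utxosByAddress.updated(o.address, boxes.updated(o.boxId, o.value)),
        addressByUtxo = addressByUtxo.updated(o.boxId, o.address)
      )
    }

    // Spending an input removes the box from both unspent maps.
    def spendInput(boxId: BoxId): Index =
      addressByUtxo.get(boxId) match {
        case None => this // unknown or already spent box: nothing to do
        case Some(address) =>
          val remaining = utxosByAddress.getOrElse(address, Map.empty[BoxId, Value]) - boxId
          copy(
            // Drop the address entirely once it holds no unspent boxes.
            utxosByAddress =
              if (remaining.isEmpty) utxosByAddress - address
              else utxosByAddress.updated(address, remaining),
            addressByUtxo = addressByUtxo - boxId
          )
      }

    // Apply one transaction: spend its inputs, then add its outputs.
    def applyTx(inputs: Seq[BoxId], outputs: Seq[Output]): Index = {
      val afterSpending = inputs.foldLeft(this)((idx, boxId) => idx.spendInput(boxId))
      outputs.foldLeft(afterSpending)((idx, out) => idx.addOutput(out))
    }
  }
}
```

With the maps maintained this way, a query for the unspent boxes of an address is a single lookup in utxosByAddress and needs no spent/unspent check at query time.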

@arobsn

arobsn commented Jun 16, 2023

Shared

headerById:                       Map[HeaderId, Value] // Probably best to only keep n headers? 
                                                       // Most of dApps will only need the last 10 
                                                       // headers to be used as reduction context.
headerIdsByHeight:                Map[Height, Set[HeaderId]]
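
A rough sketch of the "only keep n headers" idea, assuming the window is counted in heights rather than headers so that competing fork headers at the same height survive together. `Headers`, `keep` and `add` are hypothetical names, and an in-memory TreeMap stands in for the persistent map.

```scala
import scala.collection.immutable.TreeMap

object HeaderWindowSketch {
  type Height   = Int
  type HeaderId = String

  final case class Headers(
      byHeight: TreeMap[Height, Set[HeaderId]] = TreeMap.empty[Height, Set[HeaderId]],
      keep: Int = 10 // assumption: the last 10 heights are enough for most dApps
  ) {
    def add(height: Height, id: HeaderId): Headers = {
      val ids     = byHeight.getOrElse(height, Set.empty[HeaderId]) + id
      val updated = byHeight.updated(height, ids)
      // Keep only the newest `keep` heights; older entries are pruned.
      val minHeight = updated.lastKey - keep + 1
      copy(byHeight = updated.rangeFrom(minHeight))
    }
  }
}
```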

Unspent

boxById:                          Map[BoxId, Value]
boxIdsByContract:                 Map[ContractHex, Set[BoxId]]
boxIdsByContractTemplate:         Map[TemplateHex, Set[BoxId]] // Constant segregated contract template
boxIdsByCreationHeight:           Map[Height, Set[BoxId]]
boxIdsByR4:                       Map[RegisterHex, Set[BoxId]] // Non-empty R4 register
boxIdsByR5:                       Map[RegisterHex, Set[BoxId]] // Non-empty R5 register
boxIdsByR6:                       Map[RegisterHex, Set[BoxId]] // Non-empty R6 register
boxIdsByR7:                       Map[RegisterHex, Set[BoxId]] // Non-empty R7 register
boxIdsByR8:                       Map[RegisterHex, Set[BoxId]] // Non-empty R8 register
boxIdsByR9:                       Map[RegisterHex, Set[BoxId]] // Non-empty R9 register
boxIdsByTokenId:                  Map[TokenId, Set[BoxId]]
boxIdsByTransactionId:            Map[TransactionId, Set[BoxId]]

Spent

Spent boxes need all Unspent maps plus the following:

mintingBoxIdsByTokenId:           Map[TokenId, Set[BoxId]] // EIP-4 only considers one minting box 
                                                           // per token, but protocol allows multiple 
                                                           // boxes in the same minting transaction, 
                                                           // so best to follow the protocol.

Using hashes as contract and registers indexing keys

From a storage-efficiency point of view, it's better to use hashes instead of the content directly as indexing keys: a BLAKE2b256 hash is always 32 bytes, whereas contracts and registers can be as big as the maximum box size (4 KB) minus the required registers' size.

  • Average contract size: 121 bytes;
  • Average register size: 33 bytes, but it tends to grow with complex dApps like Paideia, which stores entire boxes in its registers.

The BLAKE hashing algorithm is known for its speed and security and is used extensively in Ergo; however, indexing times must be taken into consideration.
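
A small sketch of the fixed-size key idea. Note that it uses SHA-256 from the JDK as a stand-in, because the JDK does not ship BLAKE2b256; a real indexer would plug in a BLAKE2b256 implementation instead. `digest32`, `toHex`, `fromHex` and `hashedKey` are hypothetical helper names.

```scala
import java.security.MessageDigest

object HashedKeySketch {
  // Stand-in digest: SHA-256 also yields 32 bytes. Swap in BLAKE2b256 for real use.
  def digest32(content: Array[Byte]): Array[Byte] =
    MessageDigest.getInstance("SHA-256").digest(content)

  def toHex(bytes: Array[Byte]): String =
    bytes.map(b => "%02x".format(b & 0xff)).mkString

  def fromHex(hex: String): Array[Byte] =
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray

  // Index key for a base16-encoded contract or register: always 64 hex chars.
  def hashedKey(contentHex: String): String =
    toHex(digest32(fromHex(contentHex)))
}
```

Whatever the size of the contract or register, the key stays at 32 bytes; the cost is one extra hash computation per indexed value, which is the indexing-time trade-off mentioned above.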

Updated - v1

  • Replaced Hash with the base16 content of Contracts, Templates and Registers;
  • boxIdsByContractTemplate: Only index constant segregated contracts; and
  • Added creationHeight map.

@pragmaxim

pragmaxim commented Jul 3, 2023

Copy/pasting some REST endpoints from @arobsn

get  /blocks/{blockId}  // block metadata and statistics
get  /boxes/{state}/tokens/{tokenId}/
get  /boxes/{state}/{boxId}/
get  /boxes/{state}/addresses/{address}/
get  /boxes/{state}/addresses/{address}/tokens/{tokenId}/
get  /boxes/{state}/contracts/{contractHex}/
get  /boxes/{state}/contracts/{contractHex}/tokens/{tokenId}/
get  /boxes/{state}/contracts/hashes/{contractHashHex}/
get  /boxes/{state}/contracts/hashes/{contractHashHex}/tokens/{tokenId}/
get  /boxes/{state}/contracts/templates/{contractTemplateHex}/
get  /boxes/{state}/contracts/templates/{contractTemplateHex}/tokens/{tokenId}/
get  /boxes/{state}/contracts/templates/hashes/{contractTemplateHashHex}/
get  /boxes/{state}/contracts/templates/hashes/{contractTemplateHashHex}/?R4=deadbeef&R5=cafe
get  /boxes/{state}/contracts/templates/hashes/{contractTemplateHashHex}/tokens/{tokenId}/
post /boxes/query/

// state = spent | unspent

get /tokens/{tokenId}/
get /tokens/{tokenId}/minting-box/
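
As an illustration of how a few of these paths could be turned into lookups against the persistent Maps, here is a framework-free sketch that only parses the path. The `Query` case classes and `parse` are hypothetical and cover just the box-by-id, token and address routes.

```scala
object RouteSketch {
  sealed trait State
  case object Spent   extends State
  case object Unspent extends State

  sealed trait Query
  final case class BoxById(state: State, boxId: String) extends Query
  final case class BoxesByToken(state: State, tokenId: String) extends Query
  final case class BoxesByAddress(state: State, address: String, tokenId: Option[String]) extends Query

  // state = spent | unspent
  def parseState(s: String): Option[State] = s match {
    case "spent"   => Some(Spent)
    case "unspent" => Some(Unspent)
    case _         => None
  }

  // Splits the path into segments and matches the longer routes first.
  def parse(path: String): Option[Query] =
    path.split('/').filter(_.nonEmpty).toList match {
      case "boxes" :: st :: "addresses" :: address :: "tokens" :: tokenId :: Nil =>
        parseState(st).map(s => BoxesByAddress(s, address, Some(tokenId)))
      case "boxes" :: st :: "addresses" :: address :: Nil =>
        parseState(st).map(s => BoxesByAddress(s, address, None))
      case "boxes" :: st :: "tokens" :: tokenId :: Nil =>
        parseState(st).map(s => BoxesByToken(s, tokenId))
      case "boxes" :: st :: boxId :: Nil =>
        parseState(st).map(s => BoxById(s, boxId))
      case _ => None
    }
}
```

For example, `RouteSketch.parse("/boxes/unspent/tokens/abc/")` yields `Some(BoxesByToken(Unspent, "abc"))`, which would then be served from the unspent boxIdsByTokenId map.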

@pragmaxim

pragmaxim commented Jul 3, 2023

I'm keeping uexplorer on Scala 3. I'm currently spiking this on multiple tech stacks: I started with Slick, as I had experience with it, then Doobie, and ended up with zio-protoquill, zio-http and zio-json, which are all production-ready or very close to it. Eventually zio-protoquill could be replaced with zio-sql, which is currently in development.

There are 2 choices in the Scala ecosystem when it comes to SQL: the Typelevel stack and the ZIO stack. My bet is on ZIO, as the Typelevel stack is not really well unified. One needs at least 10 different dependencies to put a simple CRUD app together, whereas in ZIO land you are good to go with just zio-protoquill, zio-http and zio-json. This pays off especially when using Scala 3, as it is basically a first-class citizen in ZIO 2.0.
