
DBS R&D for large tables #84

Open
vkuznet opened this issue Sep 24, 2022 · 5 comments
vkuznet commented Sep 24, 2022

With the growth of DBS data, we need to perform R&D to address large tables:

  • store data in a different format, e.g. JSON rather than a single table
  • decouple data into another DB, e.g. put RunLumis into a different DB
  • a dedicated service for meta-data
  • table partitions
  • a DBS proxy server to perform concurrent queries over different periods of time
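To illustrate the first two ideas, the per-lumi rows of a relational table could be collapsed into a single JSON document per file. This is a minimal sketch, assuming a simplified document shape; the field names and grouping are hypothetical, not the actual DBS schema:

```python
import json

# Hypothetical relational rows: one (run, lumi) pair per row,
# loosely modeled on a FILE_LUMIS-style table layout.
rows = [
    {"file_id": 1, "run_num": 100, "lumi_section_num": 1},
    {"file_id": 1, "run_num": 100, "lumi_section_num": 2},
    {"file_id": 1, "run_num": 101, "lumi_section_num": 7},
]

def rows_to_document(rows):
    """Collapse per-lumi rows into one JSON document per file,
    grouping lumi section numbers under their run number."""
    doc = {"file_id": rows[0]["file_id"], "runs": {}}
    for r in rows:
        doc["runs"].setdefault(str(r["run_num"]), []).append(r["lumi_section_num"])
    return doc

# One document replaces three rows; a document store (e.g. MongoDB)
# would keep such documents natively.
print(json.dumps(rows_to_document(rows)))
```

Such a document could then live in a separate NoSQL DB, decoupled from the main DBS tables.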
@vkuznet vkuznet added the R&D label Sep 24, 2022
vkuznet commented Oct 26, 2022

Here is a brief plan of R&D activities we need to perform:

  • Create a new abstract NoSQL db layer
  • Try MongoDB first, then ElasticSearch
  • Copy the DBS file_lumis table to the NoSQL DB
  • Perform benchmarks for queries against the NoSQL DB, MongoDB vs ElasticSearch
  • Add APIs to insert, update, delete, and search meta-data
  • Integrate the NoSQL db layer with the rest of the DBS APIs, e.g.:
    • during insert we first insert into the NoSQL DB, fetch back the results, and if successful insert into the DBS DB
    • during search we concurrently search in the DBS DB and the NoSQL DB
  • Perform benchmarks for insert and search
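The concurrent-search step above can be sketched with a thread pool fanning the same query out to both backends. The two search functions here are stubs standing in for the real DBS (SQL) and NoSQL lookups; their names and return shapes are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub backends: a real implementation would run an Oracle query
# and a MongoDB/ElasticSearch query respectively.
def search_dbs_db(query):
    return [{"source": "dbs", "query": query}]

def search_nosql_db(query):
    return [{"source": "nosql", "query": query}]

def concurrent_search(query):
    """Fan the query out to both backends concurrently and merge results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(f, query) for f in (search_dbs_db, search_nosql_db)]
        results = []
        for fut in futures:
            results.extend(fut.result())
    return results

print(concurrent_search("run=100"))
```

The merge policy (union, de-duplication, ordering) is one of the things the benchmark step would need to settle.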

d-ylee commented Oct 31, 2022

As per our discussion, the reasoning for doing this is that the HTTP front end API has a 5-minute timeout. Injection of FileLumis is limited to 2-3M records per block before the timeout. Fetching also takes longer as the amount of data increases.

We first need to evaluate both MongoDB and ElasticSearch. This would require fetching FileLumis from current deployments and performing an injection.
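One straightforward way to stay under the timeout, independent of the backend chosen, is to split the FileLumis payload into bounded chunks before injection. A minimal sketch, with the 1M threshold taken from the discussion below (the function name is hypothetical):

```python
def chunk_records(records, max_per_chunk=1_000_000):
    """Split a list of FileLumis records into chunks no larger than
    max_per_chunk, so each injection call stays under the FE timeout."""
    for i in range(0, len(records), max_per_chunk):
        yield records[i:i + max_per_chunk]

records = list(range(2_500_000))  # stand-in for 2.5M FileLumis records
chunks = list(chunk_records(records))
print([len(c) for c in chunks])  # [1000000, 1000000, 500000]
```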

SQL for reference:

@amaltaro

@d-ylee @vkuznet based on the information above, should we try to limit block sizes - in terms of number of lumis - to 1M lumis at most? Maybe we even cap it at 0.5M lumis per block? Once we decide on the threshold, we should feed it back to this WMCore GH issue: dmwm/WMCore#10264

vkuznet commented Oct 31, 2022

It will certainly be helpful to put a limit on the number of lumis, since so far there is no limit, and as such there is the potential to go above the timeout on the FEs. Based on an initial benchmark of the time taken by the bulkblocks insertion API, it can stay within 5 minutes if the number of lumis does not exceed a few million, e.g. 2-3M. Therefore, a limit of 1M is good to have in place. To improve performance it would be better to limit it further to 0.5M, but I do not know if that would have any side effects on the DM side.

vkuznet commented Oct 31, 2022

In addition to the reasoning @d-ylee mentioned, this R&D will explore the possibility of adding more unstructured meta-data to DBS. Recently, we listened to I. Mandrichenko's talk "MetaCat - meta-data catalog for Rucio-based data management system", where he argued that Run conditions and File provenance meta-data can be stored as non-structured data in a NoSQL DB, which can provide better query performance than structured DBS information.
