
DBS R&D for large tables #84

Open
vkuznet opened this issue Sep 24, 2022 · 5 comments
vkuznet commented Sep 24, 2022

With the growth of DBS data, we need to perform R&D to address large tables:

  • store data in a different format, e.g. JSON rather than a single table
  • decouple data into another DB, e.g. put RunLumis into a different DB
  • a dedicated service for meta-data
  • table partitions
  • a DBS proxy server to perform concurrent queries over different periods of time
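To illustrate the first two ideas, the per-lumi rows of a relational table could be collapsed into a single JSON document per file. This is a minimal sketch, assuming a simplified document shape; the field names and grouping are hypothetical, not the actual DBS schema:

```python
import json

# Hypothetical relational rows: one (run, lumi) pair per row,
# loosely modeled on a FILE_LUMIS-style table layout.
rows = [
    {"file_id": 1, "run_num": 100, "lumi_section_num": 1},
    {"file_id": 1, "run_num": 100, "lumi_section_num": 2},
    {"file_id": 1, "run_num": 101, "lumi_section_num": 7},
]

def rows_to_document(rows):
    """Collapse per-lumi rows into one JSON document per file,
    grouping lumi section numbers under their run number."""
    doc = {"file_id": rows[0]["file_id"], "runs": {}}
    for r in rows:
        doc["runs"].setdefault(str(r["run_num"]), []).append(r["lumi_section_num"])
    return doc

# One document replaces three rows; a document store (e.g. MongoDB)
# would keep such documents natively.
print(json.dumps(rows_to_document(rows)))
```

Such a document could then live in a separate NoSQL DB, decoupled from the main DBS tables.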
@vkuznet vkuznet added the R&D label Sep 24, 2022
vkuznet commented Oct 26, 2022

Here is a brief plan of R&D activities we need to perform:

  • Create a new abstract NoSQL db layer
  • Try MongoDB first, then ElasticSearch
  • Copy the DBS file_lumis table to the NoSQL DB
  • Perform benchmarks for queries against the NoSQL DB, MongoDB vs ElasticSearch
  • Add APIs to insert, update, delete, and search meta-data
  • Integrate the NoSQL db layer with the rest of the DBS APIs, e.g.:
    • during insert we first insert into the NoSQL DB, fetch back the results, and if successful insert into the DBS DB
    • during search we concurrently search in the DBS DB and the NoSQL DB
  • Perform benchmarks for insert and search
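The concurrent-search step above can be sketched with a thread pool fanning the same query out to both backends. The two search functions here are stubs standing in for the real DBS (SQL) and NoSQL lookups; their names and return shapes are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub backends: a real implementation would run an Oracle query
# and a MongoDB/ElasticSearch query respectively.
def search_dbs_db(query):
    return [{"source": "dbs", "query": query}]

def search_nosql_db(query):
    return [{"source": "nosql", "query": query}]

def concurrent_search(query):
    """Fan the query out to both backends concurrently and merge results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(f, query) for f in (search_dbs_db, search_nosql_db)]
        results = []
        for fut in futures:
            results.extend(fut.result())
    return results

print(concurrent_search("run=100"))
```

The merge policy (union, de-duplication, ordering) is one of the things the benchmark step would need to settle.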

d-ylee commented Oct 31, 2022

As per our discussion, the reasoning for doing this is that the HTTP front end API has a 5-minute timeout. Injection of FileLumis is limited to 2-3M records per block before the timeout. Fetching also takes longer as the amount of data increases.

We first need to evaluate both MongoDB and ElasticSearch. This would require fetching FileLumis from current deployments and performing an injection.
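One straightforward way to stay under the timeout, independent of the backend chosen, is to split the FileLumis payload into bounded chunks before injection. A minimal sketch, with the 1M threshold taken from the discussion below (the function name is hypothetical):

```python
def chunk_records(records, max_per_chunk=1_000_000):
    """Split a list of FileLumis records into chunks no larger than
    max_per_chunk, so each injection call stays under the FE timeout."""
    for i in range(0, len(records), max_per_chunk):
        yield records[i:i + max_per_chunk]

records = list(range(2_500_000))  # stand-in for 2.5M FileLumis records
chunks = list(chunk_records(records))
print([len(c) for c in chunks])  # [1000000, 1000000, 500000]
```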

SQL for reference:

@amaltaro

@d-ylee @vkuznet based on the information above, should we try to limit block sizes - in terms of number of lumis - to 1M lumis at most? Maybe we even cap it at 0.5M lumis per block? Once we decide on the threshold, we should feed it back to this WMCore GH issue: dmwm/WMCore#10264

vkuznet commented Oct 31, 2022

It will certainly be helpful to put a limit on the number of lumis, since so far there is no limit, and as such there is the potential to go above the timeout on the FEs. Based on an initial benchmark of the time taken by the bulkblocks insertion API, it can stay within 5 minutes if the number of lumis does not exceed a few million, e.g. 2-3M. Therefore, a limit of 1M is good to have in place. To improve performance it would be better to limit it further to 0.5M, but I do not know if that would have any side effects on the DM side.

vkuznet commented Oct 31, 2022

In addition to the reasoning @d-ylee mentioned, this R&D will explore the possibility of adding more unstructured meta-data to DBS. Recently, we listened to I. Mandrichenko's talk "MetaCat - meta-data catalog for Rucio-based data management system", where he argued that Run conditions and File provenance meta-data can be stored as non-structured data in a NoSQL DB, which can provide better query performance than structured DBS information.
