Optimize Trigger Ntuples Trigger Storage #1189
Comments
Yeah, I'm not sure how feasible this is. Not many people will be comfortable making trigger-bit decisions at the ntuple level, especially those who are joining ATLAS now. In most analyses, I only see ~5-10 trigger decisions being stored. Storing 50 does seem like a lot... It's an interesting thought. If you already specify the list of triggers you want to store, is it possible to store a function that calculates the trigger bit given a series of trigger names, and then you can search for that?
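One way to read this suggestion is as a bitmask: with the trigger list fixed at configuration time, each event stores one integer whose bit i records whether trigger i fired. The following is only a sketch in plain Python; the function names and the trigger names are hypothetical and are not part of the existing code.

```python
def encode_trigger_bits(trigger_list, passed):
    """Pack the decisions for `trigger_list` into a single integer.

    Bit i is set when trigger_list[i] appears in `passed`.
    """
    passed = set(passed)
    word = 0
    for i, name in enumerate(trigger_list):
        if name in passed:
            word |= 1 << i
    return word


def trigger_passed(word, trigger_list, name):
    """Look up one decision; raises ValueError if `name` is not in the list."""
    return bool((word >> trigger_list.index(name)) & 1)


# Placeholder trigger names, for illustration only.
TRIGGERS = ["HLT_jetA", "HLT_fatjetB", "HLT_muonC"]
word = encode_trigger_bits(TRIGGERS, ["HLT_jetA", "HLT_muonC"])
```

The decoding only works if the reader uses exactly the same ordered list as the writer, so the list would have to travel with the ntuple in some form.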
Hi @kratsg, are you suggesting adding the output of this function in addition to what @kkrizka suggests? I feel that adding only this output would reduce the freedom of the user downstream to experiment with different trigger lists. This is especially true if common ntuples are produced in an analysis. If we go for this combined approach, may I suggest storing a vector for each trigger, where the first element is the trigger bit and the second the prescale?
@fscutti so no. What this effectively amounts to is requiring a consistent way of mapping input triggers to a fixed vector of trigger strings so that you just store a vector of prescales per event knowing that the order of the vector is well-defined... similarly with trigger bits for passing. The question really is, how do we sort/predetermine that order in an entirely generic / configurable way that doesn't place undue burden on the end user? An example is to provide a python script that parses the config.py/config.json someone uses, extracts the triggers, and provides the necessary order... but then keeping that up to date with the C++ code becomes somewhat hard to do. The other option might be to use a friend tree -- where the friend tree has a single row listing the trigger strings, and if you want to get the trigger names into your trees, just add a friend tree to link things up (join).
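The fixed-order idea can be sketched in plain Python as follows. The order is taken here to be the sorted, de-duplicated list of configured triggers, which is an assumption made for illustration; all function names and trigger names are hypothetical. The names are kept once (the role the single-row friend tree would play), while each event carries only aligned vectors of bits and prescales.

```python
def trigger_order(configured_triggers):
    """Deterministic order: sorted, de-duplicated trigger names."""
    return sorted(set(configured_triggers))


def pack_event(order, decisions, prescales, default_prescale=-1.0):
    """Return (bits, prescale_vector), both aligned with `order`.

    decisions: names of the triggers that fired in this event
    prescales: dict mapping trigger name to its prescale in this event
    default_prescale: placeholder for a missing prescale (an assumption)
    """
    decisions = set(decisions)
    bits = [name in decisions for name in order]
    ps = [float(prescales.get(name, default_prescale)) for name in order]
    return bits, ps


def unpack_event(order, bits, ps):
    """Rebuild name -> (passed, prescale), as a reader would after the join."""
    return {name: (b, p) for name, b, p in zip(order, bits, ps)}


# Placeholder names; the duplicate shows that the order is independent of the config layout.
order = trigger_order(["HLT_b", "HLT_a", "HLT_b"])
bits, ps = pack_event(order, {"HLT_b"}, {"HLT_a": 10.0, "HLT_b": 1.0})
event = unpack_event(order, bits, ps)
```

Sorting removes the dependence on how the config is written, but adding or removing a trigger still shifts the indices, so the name list has to be stored alongside every file rather than assumed.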
You could use an You could even map from an |
Edit isn't working.
|
Hi all, I was not proposing to have a single bit string for triggers. I was thinking of a different branch per trigger decision, similar to what was used in the Run 1 ntuples.
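As a rough illustration of the branch-per-trigger layout, the sketch below builds the flat set of values one event would fill, using the `triggername` / `triggername_prescale` naming from the issue description. The function name, the placeholder trigger names and the default prescale of 1.0 are assumptions, and no actual tree writing is shown.

```python
def per_trigger_branches(trigger_list, passed, prescales, default_prescale=1.0):
    """Return {branch_name: value} for one event.

    Each trigger gets a boolean branch named after the trigger and a
    float branch named '<trigger>_prescale'.
    """
    passed = set(passed)
    row = {}
    for name in trigger_list:
        row[name] = name in passed
        row[name + "_prescale"] = float(prescales.get(name, default_prescale))
    return row


# Placeholder trigger names, for illustration only.
row = per_trigger_branches(
    ["HLT_jetA", "HLT_muonC"],
    passed=["HLT_muonC"],
    prescales={"HLT_jetA": 25.0},
)
```

A lookup is then a direct branch read instead of a search through a list of strings, at the cost of a branch set that changes whenever the trigger list does.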
I am looking at reducing the size of my ntuples. I made some quick plots looking at the space different branches take (via `TBranch::GetTotalSize()`). I split the branches into categories based on the word before the first `_`. If the word is not jet, fatjet, muon, el or ph, then it is put into the event category.

I put the composition of my data ntuples at the bottom. The event category takes up about 20% of the ntuples. Of that, over half is taken up by `triggerNames` (I run with #1184 applied, the branch is `isPassedBitsNames` in master). Probably not too surprising, since each trigger is stored as a lengthy set of characters (up to 20 for the large-R jet triggers). If you have several triggers, things add up...

Might be worth rethinking how the trigger information is stored. My first thought is to have a boolean branch per trigger named `triggername` (or a float `triggername_prescale`). Similar to what the old `NTUP_COMMON` used. Might be faster, since one does not have to do a linear search through a list to determine a trigger decision. Not sure how nice this would be if the complete trigger list is not known at run time (i.e. triggers added/removed for the different data periods).

@kratsg @ntadej Thoughts? Maybe I am the only one who stores a lot of trigger decisions (~50)....