Changes to enable new DeepTauId training [12_4_X] #37985

Merged

@mbluj mbluj commented May 17, 2022

PR description:

This PR is a clone of #37892, i.e. it is created with the same development branch. Full description of the original PR is repeated below for convenience.

This PR adapts the DeepTauId producer to the new v2p5 training. The changes are modest, since the architecture and data preprocessing have not changed much structurally compared to the previous version. The changes include:

  • unification of the interface for scaling (aka standardization) of inputs across subversions and reading of scaling constants from a separate namespace,
  • functionality to drop specific global variables from a general list (TauBlockInputs),
  • adaptation of runTauIdMVA.py tool,
  • new deepTauId2018v2p5 producer added to the miniAOD workflow and its products (three raw tauIDs) stored in the slimmedTaus collection.
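
The first two items can be illustrated with a minimal Python sketch. The producer itself is written in C++; everything below (`GLOBAL_VARIABLES`, `SCALING`, `standardize`, `build_inputs`, and all numeric constants) is hypothetical and only shows the idea of keeping per-version scaling constants in one place, applying them through a single function, and dropping selected entries from a general input list.

```python
# Hypothetical general list of global (per-tau) input variables.
GLOBAL_VARIABLES = ["tau_pt", "tau_eta", "tau_mass"]

# Hypothetical per-version scaling constants: name -> (mean, std).
# The numbers are made up for illustration, not taken from any training.
SCALING = {
    "v2p1": {"tau_pt": (20.0, 10.0), "tau_eta": (0.0, 1.0), "tau_mass": (0.5, 0.5)},
    "v2p5": {"tau_pt": (25.0, 10.0), "tau_eta": (0.0, 1.0), "tau_mass": (0.5, 0.5)},
}


def standardize(value, mean, std, lo=-5.0, hi=5.0):
    """Shift and scale a raw input, then clamp the result to [lo, hi]."""
    scaled = (value - mean) / std
    return max(lo, min(hi, scaled))


def build_inputs(raw, version, dropped=()):
    """Return (name, scaled value) pairs for one tau.

    Variables named in `dropped` are removed from the general list,
    so one list can serve several training versions.
    """
    constants = SCALING[version]
    names = [n for n in GLOBAL_VARIABLES if n not in dropped]
    return [(n, standardize(raw[n], *constants[n])) for n in names]


raw_tau = {"tau_pt": 45.0, "tau_eta": 1.2, "tau_mass": 0.8}
inputs_v2p5 = build_inputs(raw_tau, "v2p5", dropped=("tau_mass",))
print(inputs_v2p5)
```

Selecting the constants by version key is what lets one code path serve several subversions; the clamping range is an assumption of this sketch.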

Full list of adaptations needed for the v2p5 can be found here: cms-tau-pog/TauMLTools/issues/99.

In addition, this PR includes changes to the DFIsolation producer (the first version of a DNN-based tauID), adapting it to previously overlooked changes in the base class.

Note: this PR requires the new training files added to cms-data by cms-data/RecoTauTag-TrainingFiles#8.

PR validation:

Standard tests, i.e. runTheMatrix.py -l limited -i all --ibeos, passed successfully.

Backward-compatibility checks: the updated CMSSW code fully reproduces the predictions of the previous version (v2p1) of deepTauId.
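
The PR does not say how the predictions were compared. One way such a check could be scripted is sketched below in Python; the function names, the tolerance, and the score values are assumptions for illustration, not results from this PR.

```python
def max_abs_diff(reference, candidate):
    """Largest absolute per-tau difference between two lists of scores."""
    if len(reference) != len(candidate):
        raise ValueError("score lists differ in length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def reproduces(reference, candidate, tol=1e-6):
    """True if every score agrees with the reference within `tol`."""
    return max_abs_diff(reference, candidate) <= tol


# Placeholder scores standing in for v2p1 outputs from the old and new code.
old_code_scores = [0.12, 0.98, 0.45]
new_code_scores = [0.12, 0.98, 0.45]
print(reproduces(old_code_scores, new_code_scores))
```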

If this PR is a backport, please specify the original PR and why you need to backport that PR:

Backport of #37892.

Enables the new training of deepTauID, which is meant to be the default for 2022 data taking.

@jpata

jpata commented May 18, 2022

@cmsbuild please test

@mbluj

mbluj commented May 18, 2022

Could it be that the test failures are caused by the new deepTau model files not yet being propagated to the IB, even though the corresponding cms-data PR is already merged (cms-data/RecoTauTag-TrainingFiles#8)?

@cmsbuild

-1

Failed Tests: UnitTests RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b457b3/24812/summary.html
COMMIT: 71bf7af
CMSSW: CMSSW_12_4_X_2022-05-17-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test runtestRecoTauTagRecoTau had ERRORS

RelVals

----- Begin Fatal Exception 18-May-2022 09:24:51 CEST-----------------------
An exception of category 'FileInPathError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=DeepTauId label='deepTau2018v2p5'
Exception Message:
edm::FileInPath unable to find file RecoTauTag/TrainingFiles/data/DeepTauId/deepTau_2018v2p5_core.pb anywhere in the search path.
The search path is defined by: CMSSW_SEARCH_PATH
${CMSSW_SEARCH_PATH} is: /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/poison:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/src:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/external/slc7_amd64_gcc10/data:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/poison:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/src:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/external/slc7_amd64_gcc10/data
Current directory is: /data/cmsbld/jenkins/workspace/ib-run-pr-relvals/matrix-results/136.8311_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 18-May-2022 09:24:52 CEST-----------------------
An exception of category 'FileInPathError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=DeepTauId label='deepTau2018v2p5'
Exception Message:
edm::FileInPath unable to find file RecoTauTag/TrainingFiles/data/DeepTauId/deepTau_2018v2p5_core.pb anywhere in the search path.
The search path is defined by: CMSSW_SEARCH_PATH
${CMSSW_SEARCH_PATH} is: /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/poison:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/src:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/external/slc7_amd64_gcc10/data:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/poison:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/src:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/external/slc7_amd64_gcc10/data
Current directory is: /data/cmsbld/jenkins/workspace/ib-run-pr-relvals/matrix-results/136.7611_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 18-May-2022 09:24:53 CEST-----------------------
An exception of category 'FileInPathError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=DeepTauId label='deepTau2018v2p5'
Exception Message:
edm::FileInPath unable to find file RecoTauTag/TrainingFiles/data/DeepTauId/deepTau_2018v2p5_core.pb anywhere in the search path.
The search path is defined by: CMSSW_SEARCH_PATH
${CMSSW_SEARCH_PATH} is: /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/poison:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/src:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24812/CMSSW_12_4_X_2022-05-17-2300/external/slc7_amd64_gcc10/data:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/poison:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/src:/cvmfs/cms-ib.cern.ch/nweek-02733/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-17-2300/external/slc7_amd64_gcc10/data
Current directory is: /data/cmsbld/jenkins/workspace/ib-run-pr-relvals/matrix-results/136.88811_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL
----- End Fatal Exception -------------------------------------------------
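
The exceptions above report that the model file was not found in any directory listed in CMSSW_SEARCH_PATH. For a quick local check, the lookup can be approximated with the Python sketch below. It assumes a simple first-match scan over the colon-separated entries, which may not match every detail of edm::FileInPath; the helper name `find_in_search_path` is hypothetical.

```python
import os


def find_in_search_path(relative_path, search_path=None):
    """Return the first existing file for `relative_path`, or None.

    `search_path` is a colon-separated list of directories; by default
    it is read from the CMSSW_SEARCH_PATH environment variable.
    """
    if search_path is None:
        search_path = os.environ.get("CMSSW_SEARCH_PATH", "")
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    model = "RecoTauTag/TrainingFiles/data/DeepTauId/deepTau_2018v2p5_core.pb"
    location = find_in_search_path(model)
    print(location if location else "not found in CMSSW_SEARCH_PATH")
```

Run inside a CMSSW environment, this prints the resolved path when the data package containing the file is available, and "not found in CMSSW_SEARCH_PATH" otherwise.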

RelVals-INPUT

  • 4.6: 4.6_MinimumBias2010A+MinimumBias2010A+RECOSKIMALCA+HARVESTDR1/step2_MinimumBias2010A+MinimumBias2010A+RECOSKIMALCA+HARVESTDR1.log
  • 136.72411: 136.72411_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMINIAOD_data2016UL_HIPM+HARVESTDR2_REMINIAOD_data2016UL_HIPM/step2_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMINIAOD_data2016UL_HIPM+HARVESTDR2_REMINIAOD_data2016UL_HIPM.log
  • 136.72412: 136.72412_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMININANO_data2016UL_HIPM+HARVESTDR2_REMININANO_data2016UL_HIPM/step2_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMININANO_data2016UL_HIPM+HARVESTDR2_REMININANO_data2016UL_HIPM.log

@perrotta

test parameters:

@perrotta

@cmsbuild please test

@perrotta

@smuzaffar could you please remind me how to backport the cmsdist PR cms-sw/cmsdist#7876 to 12_4? Then I'll take note of it and won't ask you again...

@smuzaffar

@perrotta, you can try running https://cmssdt.cern.ch/jenkins/job/backport-pull-request/lastCompletedBuild/rebuild/parameterized with

REPOSITORY=cms-sw/cmsdist
BRANCH=IB/CMSSW_12_4_X/master
PULL_REQUEST=7876

This sometimes fails because the bot cannot automatically merge the changes. If it does not work, you can just open a PR to update https://github.com/cms-sw/cmsdist/blob/IB/CMSSW_12_4_X/master/data/cmsswdata.txt#L34 (i.e. for the IB/CMSSW_12_4_X/master branch) with the correct tag.
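
The manual fallback amounts to changing one tag in cmsswdata.txt. The Python sketch below shows such an edit under the assumption that the file holds one `Package-Name=TAG` entry per line. That layout, the helper `set_data_tag`, the second package name, and the tag values are all illustrative assumptions; check the actual file before relying on them.

```python
def set_data_tag(text, package, tag):
    """Return `text` with the tag of `package` replaced.

    Lines that do not match `package=...` are left untouched.
    Raises KeyError if the package is not listed.
    """
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        name, sep, _old_tag = line.partition("=")
        if sep and name.strip() == package:
            lines[i] = f"{package}={tag}"
            found = True
    if not found:
        raise KeyError(f"{package} is not listed")
    return "\n".join(lines) + "\n"


# Placeholder content and tags; real tags must come from the cms-data release.
example = "RecoTauTag-TrainingFiles=V00-00-00\nOther-Package=V00-00-00\n"
updated = set_data_tag(example, "RecoTauTag-TrainingFiles", "V00-00-01")
print(updated, end="")
```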

@perrotta

Indeed...
Thank you @smuzaffar
A new cmsdist PR has been opened as cms-sw/cmsdist#7877, which is currently being tested together with this PR.

@cmsbuild

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b457b3/24817/summary.html
COMMIT: 71bf7af
CMSSW: CMSSW_12_4_X_2022-05-17-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37985/24817/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b457b3/24817/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b457b3/24817/git-merge-result

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-b457b3/11634.301_TTbar_14TeV+2021_Run3FS+TTbar_14TeV_TuneCP5_GenSim+HARVESTNano

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 123 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3663214
  • DQMHistoTests: Total failures: 92
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3663100
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 2024.779 KiB (49 files compared)
  • DQMHistoSizes: changed ( 1000.0,... ): 4.691 KiB RPC/Muon
  • DQMHistoSizes: changed ( 1000.0,... ): 2.046 KiB RPC/AllHits
  • DQMHistoSizes: changed ( 1000.0,... ): 0.958 KiB RPC/EventInfo
  • DQMHistoSizes: changed ( 10024.0,... ): 13.123 KiB RPC/Muon
  • DQMHistoSizes: changed ( 10024.0,... ): 10.091 KiB RPC/AllHits
  • DQMHistoSizes: changed ( 11634.0,... ): 61.076 KiB RPC/Muon
  • DQMHistoSizes: changed ( 11634.0,... ): 58.100 KiB RPC/AllHits
  • DQMHistoSizes: changed ( 23234.0,... ): 53.604 KiB RPC/Muon
  • DQMHistoSizes: changed ( 23234.0,... ): 50.428 KiB RPC/AllHits
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@jpata

jpata commented May 19, 2022

+reconstruction

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next CMSSW_12_4_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_5_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy

qliphy commented May 19, 2022

+1
tested also in cms-sw/cmsdist#7877

@cmsbuild cmsbuild merged commit 7ff2ef6 into cms-sw:CMSSW_12_4_X May 19, 2022
@mbluj mbluj deleted the CMSSW_12_4_X_tau-pog_deepTau-v2p5 branch October 10, 2023 10:02