Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding SONIC ParticleNet Producer to CMSSW #37964

Merged

Conversation

wpmccormack
Copy link
Contributor

PR description:

This PR introduces a producer for ParticleNet using SONIC for co-processor-enabled inferences-as-a-service. The main versions of ParticleNet are all included (AK4, AK8 general, MassDecorrelation, and MassRegression), as are configs for both ONNX versions and PyTorch versions of the models. For more information on the status of SONIC, please refer to this presentation to JetMET DPG: https://indico.cern.ch/event/1143469/contributions/4799423/attachments/2416540/4135308/March_28_JetMET_SONIC.pdf. Please note that the producer introduced here does not affect the main ParticleNet producer in any way (https://github.com/cms-sw/cmssw/blob/master/RecoBTag/ONNXRuntime/plugins/BoostedJetONNXJetTagsProducer.cc). This is an alternate way to run the standard ParticleNet model using inference as a service. As noted in the linked slides, this producer has been validated locally on Tier 2 resources and at scale with cloud computing, but with this PR, we hope to perform more "official" production tests. Again, this producer does not any workflow unless explicitly called. A helper-function macro is slightly modified, as are a few config files. There will be an accompanying PR into https://github.com/cms-data/RecoBTag-Combined, which adds needed model files.

Tagging @kpedro88 @yongbinfeng @violatingcp @nhanvtran @jmduarte

PR validation:

This PR was tested on LPC using a Triton server running on the GPU-enabled AILab machines at FermiLab. Tests have also been performed using the ParticleNet SONIC producer at Purdue T2 and in Google Cloud. When the producer is not called, there is no performance impact, and when it is called, the Jet Tagging results are the same as for the generic ParticleNet producer.

@wpmccormack
Copy link
Contributor Author

Accompanying model-file PR is here: cms-data/RecoBTag-Combined#48

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37964/30036

  • This PR adds an extra 32KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@jpata
Copy link
Contributor

jpata commented May 17, 2022

code checks should be addressed before we proceed further (it's worth to run scram b code-format every time before pushing).

@jpata
Copy link
Contributor

jpata commented May 17, 2022

@emilbols @cms-sw/btv-pog-l2 could you please review this from the BTV side?

@wpmccormack
Copy link
Contributor Author

Hi @jpata , I ran scram b code-format, and pushed the change to one of the files. Sorry, I think I had run that command after pushing the first time, but forgot to push the change that came out of it.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37964/30083

  • This PR adds an extra 32KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @wpmccormack (Patrick McCormack) for master.

It involves the following packages:

  • Configuration/ProcessModifiers (operations)
  • RecoBTag/FeatureTools (reconstruction)
  • RecoBTag/ONNXRuntime (reconstruction)

@perrotta, @clacaputo, @cmsbuild, @slava77, @jpata, @qliphy, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks.
@AlexDeMoor, @emilbols, @JyothsnaKomaragiri, @makortel, @Martin-Grunewald, @missirol, @hqucms, @andrzejnovak, @demuller, @fabiocos this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@kpedro88
Copy link
Contributor

test parameters:
pull_request = cms-data/RecoBTag-Combined#48
workflow = 11824.9001,23434.9001
relvals_opt= -w standard,highstats,pileup,generator,extendedgen,production,upgrade,cleanedupgrade,ged

@kpedro88
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-603a48/25508/summary.html
COMMIT: 7b1de53
CMSSW: CMSSW_12_5_X_2022-06-14-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37964/25508/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /pool/condor/dir_34533/jenkins/workspace/compare-root-files-short-matrix/data/PR-603a48/11824.9001_TTbar_13+2021PU_SonicTriton+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoNanoPU+HARVESTNanoPU
  • /pool/condor/dir_34533/jenkins/workspace/compare-root-files-short-matrix/data/PR-603a48/23434.9001_TTbar_14TeV+2026D49PU_SonicTriton+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTriggerPU+RecoGlobalPU+HARVESTGlobalPU

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 52
  • DQMHistoTests: Total histograms compared: 3929865
  • DQMHistoTests: Total failures: 13
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3929829
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 51 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 216 log files, 54 edm output root files, 52 DQM output files
  • TriggerResults: no differences found

@clacaputo
Copy link
Contributor

+reconstruction

@clacaputo
Copy link
Contributor

BTV is happy to sign off.

Hi @johnalison , to trigger the change from btv-pog-pending to btv-pog-approved, could you please comment +1? Thanks

@qliphy
Copy link
Contributor

qliphy commented Jun 28, 2022

ping @cms-sw/btv-pog-l2

@perrotta
Copy link
Contributor

please test
(to rfresh)

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-603a48/25856/summary.html
COMMIT: 7b1de53
CMSSW: CMSSW_12_5_X_2022-06-27-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37964/25856/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-603a48/25856/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-603a48/25856/git-merge-result

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-603a48/11824.9001_TTbar_13+2021PU_SonicTriton+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoNanoPU+HARVESTNanoPU
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-603a48/23434.9001_TTbar_14TeV+2026D49PU_SonicTriton+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTriggerPU+RecoGlobalPU+HARVESTGlobalPU

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 32 differences found in the comparisons
  • DQMHistoTests: Total files compared: 52
  • DQMHistoTests: Total histograms compared: 3924579
  • DQMHistoTests: Total failures: 26
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3924531
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -74151.789 KiB( 51 files compared)
  • DQMHistoSizes: changed ( 11634.0,... ): -2033.303 KiB GEM/EventInfo
  • DQMHistoSizes: changed ( 11634.0,... ): -199.106 KiB GEM/DAQStatus
  • DQMHistoSizes: changed ( 138.4,... ): -1797.617 KiB GEM/EventInfo
  • DQMHistoSizes: changed ( 138.4,... ): -191.931 KiB GEM/DAQStatus
  • DQMHistoSizes: changed ( 23234.0,... ): -2751.391 KiB GEM/EventInfo
  • DQMHistoSizes: changed ( 23234.0,... ): -364.509 KiB GEM/DAQStatus
  • DQMHistoSizes: changed ( 35034.0,... ): -5672.711 KiB GEM/EventInfo
  • DQMHistoSizes: changed ( 35034.0,... ): -1488.243 KiB GEM/DAQStatus
  • Checked 216 log files, 54 edm output root files, 52 DQM output files
  • TriggerResults: no differences found

@johnalison
Copy link
Contributor

+1

@perrotta
Copy link
Contributor

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants