Fix a race condition in splitVertices #45656

fwyzard · 2024-08-06T19:51:15Z

PR description:

Add

alpaka::syncBlockThreads(acc);

at the end of the loop on the vertices to ensure that all threads are properly synchronised before resetting the shared memory.

Clean up the kernel to use the SoA accessors and the cms::alpakatools utilities.

PR validation:

Running the HLT with compute-sanitizer --tool=racecheck without this PR warns about a potential race condition in splitVertices().
This PR fixes the warning: see #44923 (comment).

Running the current online HLT menu over 20k events on top of CMSSW 14.0.13-patch2 does not result in any changes to the HLT results (see #45656 (comment)) and performance on GPU (see #45656 (comment)).

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

To be backport to 14.0.x for data taking.

Add alpaka::syncBlockThreads(acc); at the end of the loop on the vertices to ensure that all threads are properly synchronised before resetting the shared memory. Clean up the kernel to use the SoA accessors and the cms::alpakatools utilities.

cmsbuild · 2024-08-06T19:51:37Z

cms-bot internal usage

fwyzard · 2024-08-06T19:51:57Z

type bugfix

fwyzard · 2024-08-06T19:52:00Z

enable gpu

fwyzard · 2024-08-06T19:52:03Z

please test

cmsbuild · 2024-08-06T19:53:10Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45656/41184

cmsbuild · 2024-08-06T19:53:32Z

A new Pull Request was created by @fwyzard for master.

It involves the following packages:

RecoTracker/PixelVertexFinding (reconstruction)

@jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @JanFSchulte, @VinInn, @VourMa, @dgulhan, @felicepantaleo, @gpetruc, @missirol, @mmusich, @mtosi, @rovere this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

fwyzard · 2024-08-06T19:55:52Z

RecoTracker/PixelVertexFinding/plugins/alpaka/splitVertices.h

-    auto& __restrict__ data = pdata;
-    auto& __restrict__ ws = pws;
-    auto nt = ws.ntrks();
-    float const* __restrict__ zt = ws.zt();
-    float const* __restrict__ ezt2 = ws.ezt2();
-    float* __restrict__ zv = data.zv();
-    float* __restrict__ wv = data.wv();
-    float const* __restrict__ chi2 = data.chi2();


These should no longer be needed, because the SoA accessors include the __restrict__ qualifier.

fwyzard · 2024-08-06T19:56:22Z

RecoTracker/PixelVertexFinding/plugins/alpaka/splitVertices.h

-    ALPAKA_ASSERT_ACC(zt);
-    ALPAKA_ASSERT_ACC(wv);
-    ALPAKA_ASSERT_ACC(chi2);
-    ALPAKA_ASSERT_ACC(nn);


These are no longer needed, because the SoA View should guarantee that these are non-null.

fwyzard · 2024-08-06T19:58:15Z

I think all changes should be fine.

Before merging this PR I would like to

check the impact on the HLT results to make sure there is none;
check the impact on the module's performance to make sure it's not affected (or slightly improved).

cmsbuild · 2024-08-06T23:22:56Z

+1

Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-86968d/40811/summary.html
COMMIT: e746fdf
CMSSW: CMSSW_14_1_X_2024-08-06-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/45656/40811/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially removed 1 lines from the logs
Reco comparison results: 8 differences found in the comparisons
DQMHistoTests: Total files compared: 46
DQMHistoTests: Total histograms compared: 3530500
DQMHistoTests: Total failures: 6
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3530474
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
Checked 200 log files, 170 edm output root files, 46 DQM output files
TriggerResults: no differences found

GPU Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 2 differences found in the comparisons
DQMHistoTests: Total files compared: 6
DQMHistoTests: Total histograms compared: 37022
DQMHistoTests: Total failures: 1310
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 35712
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 5 files compared)
Checked 20 log files, 25 edm output root files, 6 DQM output files
TriggerResults: no differences found

missirol · 2024-08-07T05:25:21Z

@fwyzard , thanks for the fix !

Based on the check below, the HLT throughput is basically unaffected (same settings as in #45631 (comment), but a different Hilton node [*]).

Pre-PR: 614.5 +/- 2.1 events/sec (json).
Post-PR: 614.7 +/- 1.5 events/sec (json).

[*]

Input data: run-383631, LSs 476-479, ~40k events (PU ~64).
HLT menu: /cdaq/physics/Run2024/2e34/v1.4.3/HLT/V2 (current pp online menu).
Release: CMSSW_14_0_13_patch1_MULTIARCHS (with and without this PR).
Node: hilton-c2b02-44-01 (same hardware as a 2022/23 HLT node) (CPUs: 2 AMD EPYC 7763 64-Core; GPUs: 2 NVIDIA Tesla T4).
8 jobs, 32 threads and 24 streams per job.
NVIDIA MPS enabled.
Each of the two measurements reported above is the average of 4 repetitions.

mmusich · 2024-08-07T11:18:54Z

check the impact on the HLT results to make sure there is none;

I tried to evaluate this with this script (which makes use of some of the error stream input files from #44923 (comment)) with and without this PR and I am seeing some (small) changes on CPU only:

$ hltDiff -o output_1.root -n ../../../new/CMSSW_14_0_X_2024-08-06-2300/src/output_1.root
Processed events: 0 out of 8200 (0%)
Processed events: 820 out of 8200 (10%)
Processed events: 1640 out of 8200 (20%)
Processed events: 2460 out of 8200 (30%)
Processed events: 3280 out of 8200 (40%)
Processed events: 4100 out of 8200 (50%)
Processed events: 4920 out of 8200 (60%)
Processed events: 5740 out of 8200 (70%)
Processed events: 6560 out of 8200 (80%)
Processed events: 7380 out of 8200 (90%)
Found 8200 matching events, out of which 82 have different HLT results

      Events    Accepted      Gained        Lost       Other  Trigger
        8200        4742           -          -1           -  AlCa_PFJet40_v31
        8200        4742           -          -1           -  AlCa_PFJet40_CPUOnly_v10
        8200        6902          +1           -           -  AlCa_AK8PFJet40_v26
        8200         397           -          -1           -  DST_PFScouting_DoubleMuon_v5
        8200        1489          +2           -           -  DST_PFScouting_DatasetMuon_v5
        8200           0           -           -          ~1  HLT_DoubleEle8_CaloIdM_TrackIdM_Mass8_DZ_PFHT350_v31
        8200           0           -           -          ~1  HLT_DoubleEle8_CaloIdM_TrackIdM_Mass8_PFHT350_v31
        8200         676          +1           -           -  HLT_Mu0_L1DoubleMu_v12
        8200         468          +1           -           -  HLT_Mu4_L1DoubleMu_v12
        8200         687          +2           -           -  HLT_Mu3_PFJet40_v27
        8200        2218          +2           -           -  HLT_Mu3_L1SingleMu5orSingleMu7_v12
        8200          22           -          -1          ~1  HLT_Ele30_WPTight_Gsf_v10
        8200          20           -          -1          ~1  HLT_Ele32_WPTight_Gsf_v24
        8200          17           -           -          ~1  HLT_Ele35_WPTight_Gsf_v18
        8200          12           -           -          ~1  HLT_Ele38_WPTight_Gsf_v18
        8200          11           -           -          ~1  HLT_Ele40_WPTight_Gsf_v18
        8200          20           -          -1          ~1  HLT_Ele32_WPTight_Gsf_L1DoubleEG_v18
        8200        6902          +1           -           -  HLT_AK8PFJet40_v27
        8200        4290          +2          -1           -  HLT_AK8PFJet60_v26
        8200        4742           -          -1           -  HLT_PFJet40_v32
        8200         916           -          -1           -  HLT_PFJetFwd40_v30
        8200        3852          +2          -1           -  HLT_AK8PFJetFwd40_v26
        8200         947          +1           -           -  HLT_AK8PFJetFwd60_v25
        8200         681          +1           -           -  HLT_PFHT250_v28
        8200           2           -           -          ~1  HLT_Mu12_DoublePFJets100_PNetBTag_0p11_v5
        8200           3           -           -          ~1  HLT_Tau3Mu_Mu7_Mu1_TkMu1_Tau15_v15
        8200           3           -           -          ~1  HLT_Tau3Mu_Mu7_Mu1_TkMu1_Tau15_Charge1_v15
        8200           3           -           -          ~1  HLT_Tau3Mu_Mu7_Mu1_TkMu1_IsoTau15_v15
        8200           3           -           -          ~1  HLT_Tau3Mu_Mu7_Mu1_TkMu1_IsoTau15_Charge1_v15
        8200           0           -           -          ~1  HLT_Mu43NoFiltersNoVtx_Photon43_CaloIdL_v14
        8200           0           -           -          ~1  HLT_Mu48NoFiltersNoVtx_Photon48_CaloIdL_v14
        8200           0           -           -          ~1  HLT_Mu38NoFiltersNoVtxDisplaced_Photon38_CaloIdL_v10
        8200           0           -           -          ~1  HLT_Mu43NoFiltersNoVtxDisplaced_Photon43_CaloIdL_v10
        8200           0           -           -          ~3  HLT_Ele28_HighEta_SC20_Mass55_v22
        8200           1           -          -1           -  HLT_TrkMu12_DoubleTrkMu5NoFiltersNoVtx_v16
        8200         221          +3          -3          ~7  HLT_Ele8_CaloIdL_TrackIdL_IsoVL_PFJet30_v27
        8200         177          +2          -2          ~7  HLT_Ele12_CaloIdL_TrackIdL_IsoVL_PFJet30_v29
        8200           4           -           -          ~4  HLT_Ele14_eta2p5_IsoVVVL_Gsf_PFHT200_PNetBTag0p53_v4
        8200         137          +1          -1          ~4  HLT_Ele23_CaloIdL_TrackIdL_IsoVL_PFJet30_v29
        8200         250          +4          -3          ~9  HLT_Ele8_CaloIdM_TrackIdM_PFJet30_v29
        8200         139          +1           -          ~3  HLT_Ele17_CaloIdM_TrackIdM_PFJet30_v27
        8200         114          +1          -1          ~3  HLT_Ele23_CaloIdM_TrackIdM_PFJet30_v29
        8200           2           -           -          ~1  HLT_Ele50_CaloIdVT_GsfTrkIdT_PFJet165_v29
        8200           1           -           -          ~1  HLT_Ele50_CaloIdVT_GsfTrkIdT_AK8PFJet220_SoftDropMass40_v11
        8200           1           -           -          ~1  HLT_Ele50_CaloIdVT_GsfTrkIdT_AK8PFJet220_SoftDropMass40_PNetBB0p06_v8
        8200           1           -           -          ~1  HLT_Ele50_CaloIdVT_GsfTrkIdT_AK8PFJet230_SoftDropMass40_v11
        8200           1           -           -          ~1  HLT_Ele50_CaloIdVT_GsfTrkIdT_AK8PFJet230_SoftDropMass40_PNetBB0p06_v8
        8200           0           -           -          ~1  HLT_Ele50_CaloIdVT_GsfTrkIdT_AK8PFJet230_SoftDropMass40_PNetBB0p10_v8
        8200         803          +2           -           -  HLT_L3Mu10NoVtx_v9
        8200         450          +2           -           -  HLT_L3Mu10NoVtx_DxyMin0p01cm_v9
        8200          23           -          -1          ~1  HLT_SingleEle8_v8
        8200         795          +5          -7          ~7  HLT_SingleEle8_SingleEGL1_v8
        8200        6911          +1           -           -  Dataset_AlCaLowPtJet
        8200        2386          +1          -1           -  Dataset_EGamma0
        8200        2386          +1          -1           -  Dataset_EGamma1
        8200           7          +4          -6           -  Dataset_EventDisplay
        8200          66          +6          -8           -  Dataset_HLTMonitor
        8200        7115          +1           -           -  Dataset_JetMET0
        8200        7115          +1           -           -  Dataset_JetMET1
        8200        1489          +2           -           -  Dataset_Muon0
        8200        1489          +2           -           -  Dataset_Muon1
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass0
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass1
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass2
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass3
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass4
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass5
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass6
        8200         503          +1           -           -  Dataset_ParkingDoubleMuonLowMass7

when running on GPU differences seem to be more contained:

$ hltDiff -o output_1.root -n ../../../new/CMSSW_14_0_X_2024-08-06-2300/src/output_1.root
Processed events: 0 out of 8200 (0%)
Processed events: 820 out of 8200 (10%)
Processed events: 1640 out of 8200 (20%)
Processed events: 2460 out of 8200 (30%)
Processed events: 3280 out of 8200 (40%)
Processed events: 4100 out of 8200 (50%)
Processed events: 4920 out of 8200 (60%)
Processed events: 5740 out of 8200 (70%)
Processed events: 6560 out of 8200 (80%)
Processed events: 7380 out of 8200 (90%)
Found 8200 matching events, out of which 44 have different HLT results

      Events    Accepted      Gained        Lost       Other  Trigger
        8200        6900          +1          -1           -  AlCa_AK8PFJet40_v26
        8200           6           -           -          ~1  HLT_DiPFJetAve180_PPSMatch_Xi0p3_QuadJet_Max2ProtPerRP_v5
        8200        6900          +1          -1           -  HLT_AK8PFJet40_v27
        8200        3856          +3          -3           -  HLT_AK8PFJetFwd40_v26
        8200         949           -          -1           -  HLT_AK8PFJetFwd60_v25
        8200         108           -          -1           -  HLT_AK8PFJetFwd140_v25
        8200          83          +1           -           -  HLT_PFHT430_v28
        8200         120           -           -          ~1  HLT_QuadPFJet103_88_75_15_v16
        8200           2           -           -          ~2  HLT_DoubleMediumDeepTauPFTauHPS35_L2NN_eta2p1_v11
        8200           3           -           -          ~1  HLT_QuadPFJet103_88_75_15_PNetBTag_0p4_VBF2_v5
        8200           1           -           -          ~1  HLT_QuadPFJet103_88_75_15_PNet2BTag_0p4_0p12_VBF1_v5
        8200          80           -           -          ~1  HLT_QuadPFJet105_88_75_30_v8
        8200           2           -           -          ~1  HLT_QuadPFJet105_88_75_30_PNet1CvsAll0p5_VBF3Tight_v9
        8200           0           -           -          ~1  HLT_DoubleMediumDeepTauPFTauHPS30_L2NN_eta2p1_PFJet60_v11
        8200           0           -           -          ~1  HLT_DoubleMediumDeepTauPFTauHPS30_L2NN_eta2p1_PFJet75_v11
        8200           7           -           -          ~2  HLT_DoublePNetTauhPFJet30_Medium_L2NN_eta2p3_v5
        8200           6           -           -          ~2  HLT_DoublePNetTauhPFJet30_Tight_L2NN_eta2p3_v5
        8200           1           -           -          ~1  HLT_DoublePNetTauhPFJet26_L2NN_eta2p3_PFJet60_v5
        8200           1           -           -          ~1  HLT_DoublePNetTauhPFJet26_L2NN_eta2p3_PFJet75_v5
        8200        6909          +1          -1           -  Dataset_AlCaLowPtJet
        8200          10          +8         -10           -  Dataset_EventDisplay
        8200          61          +7          -5           -  Dataset_HLTMonitor
        8200        7115          +1          -1           -  Dataset_JetMET0
        8200        7115          +1          -1           -  Dataset_JetMET1

fwyzard · 2024-08-07T11:31:22Z

Interesting... I did no expect any changes to the CPU results, since on CPU the code runs single threaded 🤔

fwyzard · 2024-08-07T20:57:09Z

I've run a comparison of the HLT results using

CMSSW_14_0_13_patch2_MULTIARCHS
/cdaq/physics/Run2024/2e34/v1.4.3/HLT/V2 (from run 384202), unprescaled
10k events from run 383631

and comparing

the base release
only adding the alpaka::syncBlockThreads at the end of the loop
all changes from Fix a race condition in splitVertices [14.0.x] #45655

In all cases, I did not observe any differences in the HLT results running on CPU, and only small discrepancies running on GPU consistent with the usual discrepancies.

fwyzard · 2024-08-07T20:58:37Z

`syncBlockThreads` vs reference:

$ hltDiff -o reference_cpu1.root -n syncBlockThreads_cpu1.root 
Found 10000 matching events, out of which 75 have different HLT results

      Events    Accepted      Gained        Lost       Other  Trigger
       10000          10         +11          -8           -  Dataset_EventDisplay
       10000         177         +22         -24           -  Dataset_ExpressPhysics
       10000          49         +10          -5           -  Dataset_HLTMonitor
       10000           5          +4          -5           -  Dataset_ScoutingPFMonitor

fwyzard · 2024-08-07T20:59:21Z

#45655 vs reference:

$ hltDiff -o reference_cpu1.root -n fix_splitVertices_140x_cpu1.root
Found 10000 matching events, out of which 82 have different HLT results

      Events    Accepted      Gained        Lost       Other  Trigger
       10000          10          +9          -8           -  Dataset_EventDisplay
       10000         177         +24         -26           -  Dataset_ExpressPhysics
       10000          49         +10          -8           -  Dataset_HLTMonitor
       10000           5          +2          -5           -  Dataset_ScoutingPFMonitor

fwyzard · 2024-08-07T21:02:44Z

@cms-sw/reconstruction-l2 could you review and sign this PR and its backport, #45655 ?

mandrenguyen · 2024-08-07T22:31:48Z

@cms-sw/reconstruction-l2 could you review and sign this PR and its backport, #45655 ?

Comparisons are clean except for this one:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisonsGPU/CMSSW_14_1_X_2024-08-06-1100+86968d/63913/validateJR/12434.402_TTbar_14TeV+2023_Patatrack_PixelOnlyAlpaka/all_RECO_step3/c_recoTracks_pixelTracks__RECO_obj_qualityMask.png

Is that expected/understood?

fwyzard · 2024-08-07T22:51:32Z

Mhm, I do not expect any differences from these changes (other than preventing a rare crash induced by the race condition being fixed), only the usual GPU vs GPU lack of reproducibility.

fwyzard · 2024-08-07T23:05:23Z

I also do not observe any impact on the HLT timing, using a subset of the HLT menu that runs the Pixel, ECAL and HCAL+PF reconstruction on GPUs on every event.

reference

Running 4 times over 20300 events with 16 jobs, each with 32 threads, 24 streams and 1 GPUs
  2129.5 ±   0.1 ev/s (20000 events, 99.0% overlap)
  2127.9 ±   0.1 ev/s (20000 events, 99.0% overlap)
  2129.3 ±   0.1 ev/s (20000 events, 98.8% overlap)
  2128.2 ±   0.1 ev/s (20000 events, 99.3% overlap)
 --------------------
  2128.7 ±   0.8 ev/s

#45655

Running 4 times over 20300 events with 16 jobs, each with 32 threads, 24 streams and 1 GPUs
  2129.5 ±   0.1 ev/s (20000 events, 99.0% overlap)
  2128.5 ±   0.1 ev/s (20000 events, 99.3% overlap)
  2129.6 ±   0.1 ev/s (20000 events, 98.7% overlap)
  2128.3 ±   0.1 ev/s (20000 events, 98.2% overlap)
 --------------------
  2129.0 ±   0.7 ev/s

mandrenguyen · 2024-08-08T00:22:43Z

+1

cmsbuild · 2024-08-08T00:23:05Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

Fix a race condition in splitVertices

e746fdf

Add alpaka::syncBlockThreads(acc); at the end of the loop on the vertices to ensure that all threads are properly synchronised before resetting the shared memory. Clean up the kernel to use the SoA accessors and the cms::alpakatools utilities.

fwyzard mentioned this pull request Aug 6, 2024

Fix a race condition in splitVertices [14.0.x] #45655

Merged

cmsbuild added this to the CMSSW_14_1_X milestone Aug 6, 2024

cmsbuild added reconstruction-pending pending-signatures tests-pending orp-pending code-checks-pending tracking labels Aug 6, 2024

cmsbuild added tests-started bug-fix and removed tests-pending labels Aug 6, 2024

cmsbuild added code-checks-approved and removed code-checks-pending labels Aug 6, 2024

fwyzard commented Aug 6, 2024

View reviewed changes

cmsbuild added tests-approved and removed tests-started labels Aug 6, 2024

mmusich added a commit to mmusich/hltScripts that referenced this pull request Aug 7, 2024

add script to test cms-sw/cmssw#45656

47d0cd3

cmsbuild added reconstruction-approved fully-signed orp-approved and removed reconstruction-pending pending-signatures orp-pending labels Aug 8, 2024

cmsbuild merged commit e0cb38f into cms-sw:master Aug 8, 2024
14 checks passed

fwyzard deleted the fix_splitVertices_141x branch August 12, 2024 07:35

mmusich mentioned this pull request Aug 13, 2024

HLT crashes in Run 380399 #44923

Closed

cmsbuild mentioned this pull request Aug 16, 2024

[TF] Update TF v2.16.1 cms-sw/cmsdist#9241

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix a race condition in splitVertices #45656

Fix a race condition in splitVertices #45656

fwyzard commented Aug 6, 2024 •

edited

Loading

cmsbuild commented Aug 6, 2024 •

edited

Loading

fwyzard commented Aug 6, 2024

fwyzard commented Aug 6, 2024

fwyzard commented Aug 6, 2024

cmsbuild commented Aug 6, 2024

cmsbuild commented Aug 6, 2024

fwyzard Aug 6, 2024

fwyzard Aug 6, 2024

fwyzard commented Aug 6, 2024 •

edited

Loading

cmsbuild commented Aug 6, 2024

missirol commented Aug 7, 2024

mmusich commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

mandrenguyen commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

mandrenguyen commented Aug 8, 2024

cmsbuild commented Aug 8, 2024

Fix a race condition in splitVertices #45656

Fix a race condition in splitVertices #45656

Conversation

fwyzard commented Aug 6, 2024 • edited Loading

PR description:

PR validation:

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

cmsbuild commented Aug 6, 2024 • edited Loading

fwyzard commented Aug 6, 2024

fwyzard commented Aug 6, 2024

fwyzard commented Aug 6, 2024

cmsbuild commented Aug 6, 2024

cmsbuild commented Aug 6, 2024

fwyzard Aug 6, 2024

Choose a reason for hiding this comment

fwyzard Aug 6, 2024

Choose a reason for hiding this comment

fwyzard commented Aug 6, 2024 • edited Loading

cmsbuild commented Aug 6, 2024

Comparison Summary

GPU Comparison Summary

missirol commented Aug 7, 2024

mmusich commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

syncBlockThreads vs reference:

fwyzard commented Aug 7, 2024

#45655 vs reference:

fwyzard commented Aug 7, 2024

mandrenguyen commented Aug 7, 2024

fwyzard commented Aug 7, 2024

fwyzard commented Aug 7, 2024

reference

#45655

mandrenguyen commented Aug 8, 2024

cmsbuild commented Aug 8, 2024

fwyzard commented Aug 6, 2024 •

edited

Loading

cmsbuild commented Aug 6, 2024 •

edited

Loading

fwyzard commented Aug 6, 2024 •

edited

Loading

`syncBlockThreads` vs reference: