Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move pat::XGBooster out of PhysicsTools/PatAlgos into its own package and use it for PhotonXGBoostEstimator #45085

Merged
merged 3 commits into from
Jun 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions PhysicsTools/PatAlgos/BuildFile.xml
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
<use name="PhysicsTools/PatUtils"/>
<use name="PhysicsTools/TensorFlow"/>
<use name="PhysicsTools/ONNXRuntime"/>
<use name="PhysicsTools/XGBoost"/>
<use name="TrackingTools/Records"/>
<use name="clhep"/>
<use name="xgboost"/>
Expand Down
2 changes: 1 addition & 1 deletion PhysicsTools/PatAlgos/plugins/PATMuonProducer.cc
Original file line number Diff line number Diff line change
Expand Up @@ -47,8 +47,8 @@
#include "PhysicsTools/PatAlgos/interface/PATUserDataHelper.h"
#include "PhysicsTools/PatAlgos/interface/SoftMuonMvaEstimator.h"
#include "PhysicsTools/PatAlgos/interface/SoftMuonMvaRun3Estimator.h"
#include "PhysicsTools/PatAlgos/interface/XGBooster.h"
#include "PhysicsTools/PatUtils/interface/MiniIsolation.h"
#include "PhysicsTools/XGBoost/interface/XGBooster.h"
#include "TrackingTools/IPTools/interface/IPTools.h"
#include "TrackingTools/Records/interface/TransientTrackRecord.h"
#include "TrackingTools/TransientTrack/interface/TransientTrack.h"
Expand Down
2 changes: 1 addition & 1 deletion PhysicsTools/PatAlgos/src/SoftMuonMvaRun3Estimator.cc
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#include "PhysicsTools/PatAlgos/interface/SoftMuonMvaRun3Estimator.h"
#include "DataFormats/PatCandidates/interface/Muon.h"
#include "PhysicsTools/PatAlgos/interface/XGBooster.h"
#include "PhysicsTools/XGBoost/interface/XGBooster.h"

typedef std::pair<const reco::MuonChamberMatch*, const reco::MuonSegmentMatch*> MatchPair;

Expand Down
4 changes: 4 additions & 0 deletions PhysicsTools/XGBoost/BuildFile.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
<use name="xgboost"/>
<export>
<lib name="1"/>
</export>
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
#ifndef PhysicsTools_PatAlgos_XGBooster_h
#define PhysicsTools_PatAlgos_XGBooster_h
#ifndef PhysicsTools_XGBoost_XGBooster_h
#define PhysicsTools_XGBoost_XGBooster_h

#include <memory>
#include <string>
Expand All @@ -22,7 +22,7 @@ namespace pat {

void set(std::string name, float value);

float predict();
float predict(const int iterationEnd = 0);

private:
std::vector<float> features_;
Expand Down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmusich,

  auto ret = XGBoosterPredictFromDMatrix(booster_, dvalues, json, &out_shape, &out_len, &score);
  XGDMatrixFree(dvalues);
  if (ret == 0) {
    assert(out_len == 1 && "Unexpected prediction format");
    result = score[0];
  }

I can't find particular details about the API, but I suppose that score, a pointer, points to something in dvalues handle (and we might see problems only with multi-threaded running), so it is safer to move XGDMatrixFree(dvalues); to the end of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @smorovic OK, I can implement that.
For the record, I am not the original author of this code (just moved it in order to get it accessible to other consumers), let me tag @drkovalskyi that introduced it in #43622 in case he has comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Also, for the record, XGB photon producer was freeing it only after accessing output score.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ interface to XGBoost is voodoo, so don't ask me how I made it work. All this was developed with trials and errors over years following changes in the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed at e6df909

Original file line number Diff line number Diff line change
@@ -1,14 +1,15 @@
#include "PhysicsTools/PatAlgos/interface/XGBooster.h"
#include <algorithm>
#include <cassert>
#include <cmath>
#include <stdexcept>

#include <iostream>
#include <cstdio> // For std::snprintf
#include <fstream>
#include <iostream>
#include <sstream>
#include <vector>
#include <stdexcept>
#include <stdexcept>
#include <vector>

#include "PhysicsTools/XGBoost/interface/XGBooster.h"

using namespace pat;

Expand Down Expand Up @@ -77,7 +78,7 @@ void XGBooster::addFeature(std::string name) {

void XGBooster::set(std::string name, float value) { features_.at(feature_name_to_index_[name]) = value; }

float XGBooster::predict() {
float XGBooster::predict(const int iterationEnd) {
float result(-999.);

// check if all feature values are set properly
Expand All @@ -99,27 +100,32 @@ float XGBooster::predict() {
bst_ulong out_len = 0;
const float* score = nullptr;

// config json
const char* json = R"({
char json[256]; // Make sure the buffer is large enough to hold the resulting JSON string

// Use snprintf to format the JSON string with the external value
std::snprintf(json,
sizeof(json),
R"({
"type": 0,
"training": false,
"iteration_begin": 0,
"iteration_end": 0,
"iteration_end": %d,
"strict_shape": false
})";
})",
iterationEnd);

// Shape of output prediction
bst_ulong const* out_shape = nullptr;

auto ret = XGBoosterPredictFromDMatrix(booster_, dvalues, json, &out_shape, &out_len, &score);

XGDMatrixFree(dvalues);

if (ret == 0) {
assert(out_len == 1 && "Unexpected prediction format");
result = score[0];
}

XGDMatrixFree(dvalues);

reset();

return result;
Expand Down
16 changes: 8 additions & 8 deletions RecoEgamma/PhotonIdentification/BuildFile.xml
Original file line number Diff line number Diff line change
@@ -1,9 +1,7 @@
<use name="FWCore/Framework"/>
<use name="FWCore/MessageLogger"/>
<use name="FWCore/ParameterSet"/>
<use name="CommonTools/MVAUtils" />
<use name="CondFormats/DataRecord"/>
<use name="CondFormats/EcalObjects"/>
<use name="CondFormats/HcalObjects"/>
<use name="CondFormats/DataRecord"/>
<use name="DataFormats/BeamSpot"/>
<use name="DataFormats/DetId"/>
<use name="DataFormats/EcalDetId"/>
Expand All @@ -12,17 +10,19 @@
<use name="DataFormats/EgammaReco"/>
<use name="DataFormats/HcalRecHit"/>
<use name="DataFormats/TrackReco"/>
<use name="FWCore/Framework"/>
<use name="FWCore/MessageLogger"/>
<use name="FWCore/ParameterSet"/>
<use name="Geometry/CaloGeometry"/>
<use name="Geometry/Records"/>
<use name="Geometry/CaloTopology"/>
<use name="Geometry/Records"/>
<use name="PhysicsTools/TensorFlow" />
<use name="PhysicsTools/XGBoost" />
<use name="RecoEcal/EgammaCoreTools"/>
<use name="RecoEgamma/EgammaIsolationAlgos"/>
<use name="RecoEgamma/EgammaTools"/>
<use name="RecoLocalCalo/EcalRecAlgos"/>
<use name="RecoLocalCalo/HcalRecAlgos"/>
<use name="PhysicsTools/TensorFlow" />
<use name="CommonTools/MVAUtils" />
<use name="xgboost"/>
<export>
<lib name="1"/>
</export>
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
#define ReciEgamma_PhotonIdentification_PhotonXGBoostEstimator_h

#include "FWCore/ParameterSet/interface/FileInPath.h"
#include "xgboost/c_api.h"
#include "PhysicsTools/XGBoost/interface/XGBooster.h"

class PhotonXGBoostEstimator {
public:
Expand All @@ -20,7 +20,7 @@ class PhotonXGBoostEstimator {
float ecalPFIsoIn) const;

private:
BoosterHandle booster_;
std::unique_ptr<pat::XGBooster> booster_;
int best_ntree_limit_ = -1;
std::string config_;
};
Expand Down
66 changes: 22 additions & 44 deletions RecoEgamma/PhotonIdentification/src/PhotonXGBoostEstimator.cc
Original file line number Diff line number Diff line change
Expand Up @@ -2,34 +2,21 @@
#include <sstream>

PhotonXGBoostEstimator::PhotonXGBoostEstimator(const edm::FileInPath& weightsFile, int best_ntree_limit) {
XGBoosterCreate(NULL, 0, &booster_);
// Set number of threads to 1, to avoid spawning hundreds of OpenMP threads
// See https://github.com/cms-sw/cmssw/issues/44923 for details
XGBoosterSetParam(booster_, "nthread", "1");
XGBoosterLoadModel(booster_, weightsFile.fullPath().c_str());
best_ntree_limit_ = best_ntree_limit;
booster_ = std::make_unique<pat::XGBooster>(weightsFile.fullPath());
booster_->addFeature("rawEnergy");
booster_->addFeature("r9");
booster_->addFeature("sigmaIEtaIEta");
booster_->addFeature("etaWidth");
booster_->addFeature("phiWidth");
booster_->addFeature("s4");
booster_->addFeature("eta");
booster_->addFeature("hOvrE");
booster_->addFeature("ecalPFIso");

std::stringstream config;
config << "{\"training\": false, \"type\": 0, \"iteration_begin\": 0, \"iteration_end\": " << best_ntree_limit_
<< ", \"strict_shape\": false}";
config_ = config.str();
best_ntree_limit_ = best_ntree_limit;
}

PhotonXGBoostEstimator::~PhotonXGBoostEstimator() { XGBoosterFree(booster_); }

namespace {
enum inputIndexes {
rawEnergy = 0, // 0
r9 = 1, // 1
sigmaIEtaIEta = 2, // 2
etaWidth = 3, // 3
phiWidth = 4, // 4
s4 = 5, // 5
eta = 6, // 6
hOvrE = 7, // 7
ecalPFIso = 8, // 8
};
} // namespace
PhotonXGBoostEstimator::~PhotonXGBoostEstimator() {}

float PhotonXGBoostEstimator::computeMva(float rawEnergyIn,
float r9In,
Expand All @@ -40,24 +27,15 @@ float PhotonXGBoostEstimator::computeMva(float rawEnergyIn,
float etaIn,
float hOvrEIn,
float ecalPFIsoIn) const {
float var[9];
var[rawEnergy] = rawEnergyIn;
var[r9] = r9In;
var[sigmaIEtaIEta] = sigmaIEtaIEtaIn;
var[etaWidth] = etaWidthIn;
var[phiWidth] = phiWidthIn;
var[s4] = s4In;
var[eta] = etaIn;
var[hOvrE] = hOvrEIn;
var[ecalPFIso] = ecalPFIsoIn;
booster_->set("rawEnergy", rawEnergyIn);
booster_->set("r9", r9In);
booster_->set("sigmaIEtaIEta", sigmaIEtaIEtaIn);
booster_->set("etaWidth", etaWidthIn);
booster_->set("phiWidth", phiWidthIn);
booster_->set("s4", s4In);
booster_->set("eta", etaIn);
booster_->set("hOvrE", hOvrEIn);
booster_->set("ecalPFIso", ecalPFIsoIn);

DMatrixHandle dmat;
XGDMatrixCreateFromMat(var, 1, 9, -999.9f, &dmat);
uint64_t const* out_shape;
uint64_t out_dim;
const float* out_result = NULL;
XGBoosterPredictFromDMatrix(booster_, dmat, config_.c_str(), &out_shape, &out_dim, &out_result);
float ret = out_result[0];
XGDMatrixFree(dmat);
return ret;
return booster_->predict(best_ntree_limit_);
}