Concatenation on axis=1 with ak.combinations introduces overtouching #526

Open
ikrommyd opened this issue Jul 19, 2024 · 2 comments

@ikrommyd
Contributor

ikrommyd commented Jul 19, 2024

To reproduce:

import dask_awkward as dak
from coffea.nanoevents import NanoEventsFactory

events = NanoEventsFactory.from_root({"https://github.com/CoffeaTeam/coffea/raw/master/tests/samples/nano_dy.root": "Events"}).events()

tnp = dak.combinations(events.Electron, 2, fields=["tag", "probe"])
pnt = dak.combinations(events.Electron, 2, fields=["probe", "tag"])
zcands = dak.concatenate([tnp, pnt], axis=1)
dak.necessary_columns(zcands.tag.pt)

This returns the following, although it should only need nElectron and Electron_pt:

{'from-uproot-df174c21f639e55bab46ab7dd4a720b9': frozenset({'Electron_charge',
            'Electron_cleanmask',
            'Electron_convVeto',
            'Electron_cutBased',
            'Electron_cutBased_Fall17_V1',
            'Electron_cutBased_HEEP',
            'Electron_deltaEtaSC',
            'Electron_dr03EcalRecHitSumEt',
            'Electron_dr03HcalDepth1TowerSumEt',
            'Electron_dr03TkSumPt',
            'Electron_dr03TkSumPtHEEP',
            'Electron_dxy',
            'Electron_dxyErr',
            'Electron_dz',
            'Electron_dzErr',
            'Electron_eCorr',
            'Electron_eInvMinusPInv',
            'Electron_energyErr',
            'Electron_eta',
            'Electron_genPartFlav',
            'Electron_genPartIdx',
            'Electron_hoe',
            'Electron_ip3d',
            'Electron_isPFcand',
            'Electron_jetIdx',
            'Electron_jetPtRelv2',
            'Electron_jetRelIso',
            'Electron_lostHits',
            'Electron_mass',
            'Electron_miniPFRelIso_all',
            'Electron_miniPFRelIso_chg',
            'Electron_mvaFall17V1Iso',
            'Electron_mvaFall17V1Iso_WP80',
            'Electron_mvaFall17V1Iso_WP90',
            'Electron_mvaFall17V1Iso_WPL',
            'Electron_mvaFall17V1noIso',
            'Electron_mvaFall17V1noIso_WP80',
            'Electron_mvaFall17V1noIso_WP90',
            'Electron_mvaFall17V1noIso_WPL',
            'Electron_mvaFall17V2Iso',
            'Electron_mvaFall17V2Iso_WP80',
            'Electron_mvaFall17V2Iso_WP90',
            'Electron_mvaFall17V2Iso_WPL',
            'Electron_mvaFall17V2noIso',
            'Electron_mvaFall17V2noIso_WP80',
            'Electron_mvaFall17V2noIso_WP90',
            'Electron_mvaFall17V2noIso_WPL',
            'Electron_mvaTTH',
            'Electron_pdgId',
            'Electron_pfRelIso03_all',
            'Electron_pfRelIso03_chg',
            'Electron_phi',
            'Electron_photonIdx',
            'Electron_pt',
            'Electron_r9',
            'Electron_seedGain',
            'Electron_sieie',
            'Electron_sip3d',
            'Electron_tightCharge',
            'Electron_vidNestedWPBitmap',
            'Electron_vidNestedWPBitmapHEEP',
            'nElectron',
            'nGenPart',
            'nJet',
            'nPhoton'})}
@martindurant
Collaborator

Are you saying that you need both combinations and concatenate to get this?

@ikrommyd
Contributor Author

ikrommyd commented Jul 19, 2024

You can also call combinations once and concatenate the result with itself, like this:

In [1]: import dask_awkward as dak
   ...: from coffea.nanoevents import NanoEventsFactory
   ...:
   ...: events = NanoEventsFactory.from_root({"https://github.com/CoffeaTeam/coffea/raw/master/tests/samples/nano_dy.root": "Events"}).events()
   ...:
   ...: tnp = dak.combinations(events.Electron, 2, fields=["tag", "probe"])
/Users/iason/miniforge3/envs/egamma_dev/lib/python3.10/site-packages/coffea/nanoevents/methods/candidate.py:11: FutureWarning: In version 2024.7.0 (target date: 2024-06-30 11:59:59-05:00), this will be an error.
To raise these warnings as errors (and get stack traces to find out where they're called), run
    import warnings
    warnings.filterwarnings("error", module="coffea.*")
after the first `import coffea` or use `@pytest.mark.filterwarnings("error:::coffea.*")` in pytest.
Issue: coffea.nanoevents.methods.vector will be removed and replaced with scikit-hep vector. Nanoevents schemas internal to coffea will be migrated. Otherwise please consider using that package!.
  from coffea.nanoevents.methods import vector
/Users/iason/miniforge3/envs/egamma_dev/lib/python3.10/site-packages/coffea/nanoevents/schemas/nanoaod.py:243: RuntimeWarning: Missing cross-reference index for FatJet_genJetAK8Idx => GenJetAK8
  warnings.warn(

In [2]: zcands = dak.concatenate([tnp, tnp], axis=1)
   ...: dak.necessary_columns(zcands.tag.pt)

But yes, you do need the concatenate to trigger it. tnp on its own is fine:

In [3]: dak.necessary_columns(tnp.tag.pt)
Out[3]:
{'from-uproot-b0c009586b4553e84b096eeaee2d1795': frozenset({'Electron_pt',
            'nElectron'})}
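
For readers unfamiliar with the tag-and-probe pattern, here is a plain-Python sketch of what the reproducer computes per event. It is not dask-awkward code: the function name `symmetric_pairs` and the pt values are made up for illustration, and each electron is reduced to its pt alone. It is only meant to show why reading `tag.pt` from the concatenated pairs should logically depend on nothing more than the electron count and pt.

```python
from itertools import combinations


def symmetric_pairs(electron_pts):
    """Build tag/probe pairs for one event, with both role assignments.

    Hypothetical helper for illustration; electrons are represented by pt only.
    """
    # Analogue of combinations(..., fields=["tag", "probe"])
    tnp = [{"tag": a, "probe": b} for a, b in combinations(electron_pts, 2)]
    # Analogue of combinations(..., fields=["probe", "tag"]): roles swapped
    pnt = [{"probe": a, "tag": b} for a, b in combinations(electron_pts, 2)]
    # Analogue of concatenating the two pair lists within each event (axis=1)
    return tnp + pnt


# Made-up events: three electrons, one electron, no electrons
events = [[50.0, 40.0, 30.0], [25.0], []]

# Analogue of zcands.tag.pt: only the pt values are ever touched
tag_pt = [[pair["tag"] for pair in symmetric_pairs(ev)] for ev in events]
print(tag_pt)
```

Events with fewer than two electrons produce no pairs, so their entries come out empty.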
