
Fixed target DIS DW datasets as variant #2176

Merged: 18 commits into master from fixed_target_dis_dw_variants, Oct 24, 2024

Conversation

giacomomagni
Contributor

Move the DW fixed target DIS to be a variant.

@giacomomagni giacomomagni linked an issue Oct 16, 2024 that may be closed by this pull request
@giacomomagni giacomomagni marked this pull request as draft October 16, 2024 15:53
@giacomomagni
Contributor Author

giacomomagni commented Oct 16, 2024

I noticed that:

  • BCDMS DW had different FKtables, split by Q2. Is that because we forgot to update
    the standard BCDMS sets?

These two fields also differ between the DW data and the standard one:

  • nnpdf_metadata.experiment
  • observable_name.process_type

Can we add them to the variant options?
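For illustration, the kind of override being asked about could be sketched as a shallow merge of variant fields over the base metadata. This is a hypothetical simplification: `apply_variant`, the field names, and the values below are illustrative, not the actual nnpdf_data schema.

```python
# Hypothetical sketch: apply a variant as a set of field overrides on top of
# the base metadata. Names and values are illustrative, not the real schema.
def apply_variant(metadata: dict, variants: dict, name: str) -> dict:
    patched = dict(metadata)  # shallow copy; the base metadata is untouched
    patched.update(variants.get(name, {}))
    return patched

base = {
    "experiment": "DEUTERON",
    "process_type": "DIS_NCE",
    "data_uncertainties": ["uncertainties.yaml"],
}
variants = {
    "legacy_dw": {
        "experiment": "NUCLEAR",
        "data_uncertainties": ["uncertainties_dw.yaml"],
    }
}

patched = apply_variant(base, variants, "legacy_dw")
print(patched["experiment"])  # NUCLEAR: the variant override wins
```

An unknown variant name simply leaves the base metadata unchanged, which is the behaviour one would want for datasets without a DW variant.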

@RoyStegeman
Member

I'm not sure if this is what you mean, but BCDMS has a different FKtable for each beam energy, because there is a degeneracy in (Q2, x) between them, so pineappl cannot distinguish measurements of a given kinematic point at different energies.
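As a toy illustration of that degeneracy (not code from the repo): in fixed-target DIS the inelasticity is approximately y = Q2 / (2 m_N E x), so it depends on the beam energy E, and runs at different energies can land on the same (x, Q2) point while being physically distinct measurements. The beam energies and kinematic point below are merely representative values.

```python
# Toy check: the same (x, Q2) point corresponds to a different inelasticity y
# at each beam energy, so measurements coinciding in the (x, Q2) plane used by
# the FKtable grids are still physically distinct.
M_N = 0.938  # nucleon mass in GeV (approximate)

def inelasticity(x: float, q2: float, e_beam: float) -> float:
    """Fixed-target approximation y = Q2 / (2 * m_N * E * x)."""
    return q2 / (2.0 * M_N * e_beam * x)

x, q2 = 0.35, 20.0  # one kinematic point shared by all beam energies
for e_beam in (100.0, 200.0, 280.0):  # representative muon beam energies in GeV
    print(f"E = {e_beam} GeV -> y = {inelasticity(x, q2, e_beam):.3f}")
```

Since y scales as 1/E, the three printouts differ, which is exactly the information lost once events are binned only in (x, Q2).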

@giacomomagni
Contributor Author

> I'm not sure if this is what you mean, but BCDMS has a different FKtable for each beam energy, because there is a degeneracy in (Q2, x) between them, so pineappl cannot distinguish measurements of a given kinematic point at different energies.

I've noticed that this has not been updated, because we usually use the DW version:

@RoyStegeman
Member

Ah, that's your point. Yes, you're right. I think you can just update the FKtables in the metadata, because the difference is just in the uncertainties yaml. I would assume that in new theories the old FKtables don't even exist.

@scarlehoff
Member

> nnpdf_metadata.experiment

Nope... this became a problem quite quickly @RoyStegeman

I'd say use the nuclear/deuteron type, which should be more inclusive in what correlations are allowed (right @enocera?)

> observable_name.process_type

Here, even if they used to be different, they can probably be made into the same one because the data is the same.

@giacomomagni
Contributor Author

giacomomagni commented Oct 17, 2024

The two remaining tests, test_pseudodata and test_overfit_metric, are failing because of the change in nnpdf_metadata.experiment, I suspect (i.e. reverting the metadata makes them pass), so we need a better solution there.

@scarlehoff
Member

> are failing because of the change in nnpdf_metadata.experiment

Because that's making the random numbers change? (The seed takes the experiment name.)

If so... I don't really see a way of fixing it without having two separate datasets. But if it is only the random numbers, we can accept that it changed (this tag will not be random-seed-equivalent with the previous one anyway), although maybe we will need to redo some of the fits in the tests.
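The mechanism under discussion can be sketched as follows. This is a hypothetical simplification, assuming the per-dataset pseudodata seed is derived from the experiment name; `seed_from_name` and `pseudodata_noise` are made-up helpers, not n3fit code.

```python
import hashlib
import random

def seed_from_name(name: str) -> int:
    # Stable name -> integer seed; hashlib is used instead of hash() so the
    # result is reproducible across interpreter runs.
    return int(hashlib.sha256(name.encode()).hexdigest(), 16) % (2**32)

def pseudodata_noise(experiment: str, n: int = 5) -> list:
    # Draw the Gaussian fluctuations added to the data, seeded by the name
    rng = random.Random(seed_from_name(experiment))
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# The same name reproduces the same noise stream...
print(pseudodata_noise("BCDMS") == pseudodata_noise("BCDMS"))  # True
# ...but renaming the experiment (almost certainly) shifts the whole stream,
# which is why tests comparing pseudodata start failing after a rename.
print(pseudodata_noise("BCDMS") == pseudodata_noise("BCDMS_DW"))
```

Under that assumption, keeping the seed fixed while still renaming the experiment would require decoupling the seed input from the displayed metadata, which is exactly the tension described above.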

@giacomomagni
Contributor Author

giacomomagni commented Oct 18, 2024

So I gave adding experiment to the variant options a try;
somehow PlottingOptions still needs a deepcopy somewhere, which I'm not sure how to fix
(I tried with object.__setattr__ but it didn't work).
If a7294bd is not going in the correct direction, then I suspect we have to give up on this PR.

@scarlehoff
Member

scarlehoff commented Oct 18, 2024

> somehow PlottingOptions still needs a deepcopy somewhere, which I'm not sure how to fix

Why do you care about PlottingOptions?

... I'm wondering whether it wouldn't be better just to have them as separate datasets. The fact that two "variants" of the same dataset can come from a different experiment is... silly. We are just trying to get around a mistake of times past.

However, send me the runcard and I'll give it a go.

Member

@scarlehoff scarlehoff left a comment


Seems fine? At least at the level of the tests. Let's see whether the fitbot works fine as well.

Review comment on nnpdf_data/nnpdf_data/__init__.py (outdated, resolved)
@scarlehoff scarlehoff added the run-fit-bot Starts fit bot from a PR. label Oct 22, 2024
@giacomomagni giacomomagni marked this pull request as ready for review October 22, 2024 09:03

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff
Member

I think the runcards with DW are not understood correctly by vp :/

@giacomomagni
Contributor Author

giacomomagni commented Oct 22, 2024

> I think the runcards with DW are not understood correctly by vp :/

The PDFs are identical, but yes, the chi2 tables are weird 😅
Something looks wrong when reading the old filters, no?

@scarlehoff
Member

No, the problem is here: https://github.com/NNPDF/nnpdf/blob/87daeb1b25cc913fe45a6d7d8be779f80c7ebffc/validphys2/src/validphys/config.py#L465, since we already had a variant. Not sure how to get around this without adding too much DW-specific code.

The easy fix is to add:

```python
# legacy_dw trumps everything
if variant is None or map_variant == "legacy_dw":
```
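The intent of that condition could be sketched as a small precedence rule in the variant lookup. This is a hypothetical simplification of the logic in validphys/config.py, not the actual implementation:

```python
# Hypothetical sketch: the user-requested variant normally wins, but a dataset
# whose name maps to "legacy_dw" forces that variant regardless, since the DW
# name itself already encodes the choice ("legacy_dw trumps everything").
def resolve_variant(requested, map_variant):
    if requested is None or map_variant == "legacy_dw":
        return map_variant
    return requested

print(resolve_variant(None, "legacy"))         # legacy
print(resolve_variant("legacy", "legacy_dw"))  # legacy_dw
print(resolve_variant("custom", "legacy"))     # custom
```

The one special case keeps the DW behaviour without scattering DW-specific branches through the rest of the config code.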

Btw, you also need to add to this PR the fixed-target DY datasets, which also have DW variants.

@giacomomagni
Contributor Author

> ```python
> # legacy_dw trumps everything
> if variant is None or map_variant == "legacy_dw":
> ```

I see. Shall I go for it?

> Btw, you also need to add to this PR the fixed-target DY datasets, which also have DW variants.

Yes, but these are trivial, as they only have a DW legacy (like EMC), so it's just a renaming with no duplication.

@scarlehoff
Member

Yes, let's try...

@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Oct 22, 2024

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff
Member

scarlehoff commented Oct 23, 2024

Ok, this works now in terms of vp. However, looking through the datafiles, they are actually not always the same, so you need to copy them over as well for some of the datasets.

For instance, these two in master are not equal:

https://github.com/NNPDF/nnpdf/blob/master/nnpdf_data/nnpdf_data/commondata/NUTEV_CC_NOTFIXED_FE/data_legacy_NB-SIGMARED.yaml
https://github.com/NNPDF/nnpdf/blob/master/nnpdf_data/nnpdf_data/commondata/NUTEV_CC_NOTFIXED_FE_DW/data_legacy_NB-SIGMARED.yaml

The differences left in the fitbot are coming from there, but the fitbot does not include all datasets, so you should check. These can be added to the variant, so it should just be a question of copying them over.

A quick check would be running this runcard for one replica for one epoch and then looking at the resulting .csv files for the generated pseudodata; if they are identical, it's all good: https://github.com/NNPDF/nnpdf/blob/master/n3fit/runcards/examples/nnpdf40-like.yml
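The comparison itself can be scripted. A minimal sketch, assuming the two pseudodata tables were dumped to CSV files; the file names and contents below are placeholders:

```python
import csv
import math
import tempfile
from pathlib import Path

def csv_rows(path):
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))

def tables_match(path_a, path_b, rtol=1e-9):
    """Compare two CSV dumps cell by cell: numbers with a tolerance, text exactly."""
    rows_a, rows_b = csv_rows(path_a), csv_rows(path_b)
    if len(rows_a) != len(rows_b):
        return False
    for row_a, row_b in zip(rows_a, rows_b):
        if len(row_a) != len(row_b):
            return False
        for cell_a, cell_b in zip(row_a, row_b):
            try:
                if not math.isclose(float(cell_a), float(cell_b), rel_tol=rtol):
                    return False
            except ValueError:
                if cell_a != cell_b:  # non-numeric cells must match exactly
                    return False
    return True

# Tiny self-contained demo with placeholder tables
tmp = Path(tempfile.mkdtemp())
(tmp / "before.csv").write_text("replica,data\n1,0.50000000\n")
(tmp / "after.csv").write_text("replica,data\n1,0.5\n")
print(tables_match(tmp / "before.csv", tmp / "after.csv"))  # True
```

For the real check, one would point tables_match at the pseudodata tables produced by the master run and the branch run.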

@giacomomagni
Contributor Author

> Ok, this works now in terms of vp. However, looking through the datafiles, they are actually not always the same, so you need to copy them over as well for some of the datasets.

okay let me double check this.

@giacomomagni
Contributor Author

giacomomagni commented Oct 23, 2024

Okay, so somehow only NuTeV had different central values.
Here is a small script to check:

```python
import pathlib

import numpy as np
import rich
import yaml

COMMONDATA_PATH = pathlib.Path(__file__).parent / "nnpdf_data/nnpdf_data/commondata"


def check_central(dataset, data_file):
    """Compare the central values of a dataset against its _DW counterpart."""
    with open(COMMONDATA_PATH / f"{dataset}_DW" / data_file, encoding="utf-8") as f:
        central_dw = yaml.safe_load(f)["data_central"]
    with open(COMMONDATA_PATH / dataset / data_file, encoding="utf-8") as f:
        central = yaml.safe_load(f)["data_central"]

    try:
        np.testing.assert_allclose(central_dw, central)
    except AssertionError:
        rich.print(f"[red] {dataset}_DW has a different central value")


if __name__ == "__main__":
    DATA_LIST = [
        "BCDMS_NC_NOTFIXED_D",
        "BCDMS_NC_NOTFIXED_P",
        "CHORUS_CC_NOTFIXED_PB",
        "NMC_NC_NOTFIXED",
        "NUTEV_CC_NOTFIXED_FE",
        "SLAC_NC_NOTFIXED_D",
        "SLAC_NC_NOTFIXED_P",
    ]
    for dataset in DATA_LIST:
        try:
            data_file = "data_legacy_EM-F2.yaml"
            check_central(dataset, data_file)
        except FileNotFoundError:
            data_file = "data_legacy_NB-SIGMARED.yaml"
            check_central(dataset, data_file)
            data_file = "data_legacy_NU-SIGMARED.yaml"
            check_central(dataset, data_file)
```

And here is the outcome of the test you suggested, with the NuTeV data fixed.
They look identical now.

Fit on master:
datacuts_theory_fitting_pseudodata_table.csv

Fit on this branch:
datacuts_theory_fitting_pseudodata_table.csv

@scarlehoff scarlehoff removed the run-fit-bot Starts fit bot from a PR. label Oct 23, 2024
@scarlehoff scarlehoff added the run-fit-bot Starts fit bot from a PR. label Oct 23, 2024

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff force-pushed the fixed_target_dis_dw_variants branch from 15174ca to d6ce694 Compare October 24, 2024 04:29
@giacomomagni giacomomagni merged commit 0fadd2d into master Oct 24, 2024
6 checks passed
@giacomomagni giacomomagni deleted the fixed_target_dis_dw_variants branch October 24, 2024 07:23
Labels: data toolchain, run-fit-bot (Starts fit bot from a PR)

Successfully merging this pull request may close these issues: Revisit implementation of all DIS.

3 participants