Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple staterror modifiers per sample and channel, round trips and duplicate modifiers #1830

Open
1 task done
alexander-held opened this issue Mar 31, 2022 · 3 comments · May be fixed by #2397
Open
1 task done
Labels
bug Something isn't working needs-triage Needs a maintainer to categorize and assign user request Request coming form a pyhf user

Comments

@alexander-held
Copy link
Member

Summary

In the context of #760 I was wondering what would happen in a setup that exploits the fact that pyhf allows customizing parameter names for staterror modifiers to assign multiple such modifiers to the same sample in the same channel.

Within just pyhf, this seems to work fine, and there will be multiple parameters in the fit. When converting to xml+ROOT, two StatError modifiers do show up in the xml. When converting back to JSON, the staterror modifier names get standardized, resulting in two fields with the same modifier name. Only one of them is used when building the model, so the roundtrip fails.

OS / Environment

n/a

Steps to Reproduce

Default workspace:

{
    "channels": [
        {
            "name": "ch",
            "samples": [
                {
                    "data": [1000.0],
                    "modifiers": [
                        {"data": null, "name": "mu_sig", "type": "normfactor"},
                        {"data": [10.0], "name": "staterror_ch_1", "type": "staterror"},
                        {"data": [20.0], "name": "staterror_ch_2", "type": "staterror"}
                    ],
                    "name": "signal"
                }
            ]
        }
    ],
    "measurements": [{"config": {"parameters": [], "poi": "mu_sig"}, "name": "meas"}],
    "observations": [{"data": [1000], "name": "ch"}],
    "version": "1.0.0"
}

This contains two parameters controlling the staterror modifiers (+ mu_sig, which is of no relevance here).

Convert to xml+ROOT via pyhf json2xml, resulting in the following config/FitConfig_ch.xml:

<!DOCTYPE Channel SYSTEM '../HistFactorySchema.dtd'>

<Channel Name="ch" InputFile="data/data.root">
  <Data HistoName="histch_data" InputFile="data/data.root" />
  <Sample Name="signal" HistoName="histch_signal" InputFile="data/data.root" NormalizeByTheory="False">
    <NormFactor Name="mu_sig" Val="1" Low="0" High="10" />
    <StatError Activate="True" HistoName="histch_signal_staterror_ch_1" />
    <StatError Activate="True" HistoName="histch_signal_staterror_ch_2" />
    </Sample>
  </Channel>

I do not know whether this is technically valid HistFactory, but it converts fine with hist2workspace. I did not check whether one of the modifiers gets dropped, or if they get merged.

Convert back to JSON with pyhf xml2json:

{
    "channels": [
        {
            "name": "ch",
            "samples": [
                {
                    "data": [1000.0],
                    "modifiers": [
                        {"data": null, "name": "mu_sig", "type": "normfactor"},
                        {"data": [10.0], "name": "staterror_ch", "type": "staterror"},
                        {"data": [20.0], "name": "staterror_ch", "type": "staterror"}
                    ],
                    "name": "signal"
                }
            ]
        }
    ],
    "measurements": [
        {
            "config": {
                "parameters": [
                    {
                        "auxdata": [1.0],
                        "bounds": [[1.0, 1.0]],
                        "inits": [1.0],
                        "name": "lumi",
                        "sigmas": [0.0]
                    },
                    {"bounds": [[0.0, 10.0]], "inits": [1.0], "name": "mu_sig"}
                ],
                "poi": "mu_sig"
            },
            "name": "meas"
        }
    ],
    "observations": [{"data": [1000.0], "name": "ch"}],
    "version": "1.0.0"
}

The modifiers are still both there, but since the name is now the same for both, the first one is dropped from the model and only the second modifier enters. See e.g. pyhf inspect, which will show a single staterror modifier for this version, and two for the original.

File Upload (optional)

No response

Expected Results

Unclear to me what the expected behavior should be:

  • not allowing this artificially multi-staterror setup to begin with (assuming it is not properly supported in ROOT HistFactory),
  • emitting a warning when converting to xml+ROOT (and another warning when converting back, in case the result will contain this problematic setup with multiple modifiers with the same name),
  • something else?

This was example was written as a targeted setup to test behavior, and may not be all that relevant in practice. shapesys modifiers can be used for this effect to achieve the same behavior (assuming customizable constraint term, as in #1829). It may make sense to only allow a single staterror modifier per sample and channel to avoid this issue altogether.

Actual Results

see above, roundtrip fails

pyhf Version

pyhf, version 0.6.4

Code of Conduct

  • I agree to follow the Code of Conduct
@kratsg
Copy link
Contributor

kratsg commented Dec 8, 2023

This will be changed as part of the HS3 migration, with documentation warning under staterror similar to what is currently done for shapesys. "Can only have one staterror per channel. If you want multiple staterrors, use shapefactor"

@matthewfeickert
Copy link
Member

It is @cburgard's recommendation (and I'm fine/agree here) that having the multiple staterror should cause an error. So #1830 (comment) is follow the idea that we should make this an error and then document this as we change towards using shapefactor.

@kratsg kratsg linked a pull request Dec 8, 2023 that will close this issue
13 tasks
@alexander-held
Copy link
Member Author

Raising an error sounds very reasonable, great that this can move forward as part of HS3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs-triage Needs a maintainer to categorize and assign user request Request coming form a pyhf user
Projects
Status: To do
Development

Successfully merging a pull request may close this issue.

3 participants