Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding dataset_inputs namespace to old style fits as_input #1197

Merged
merged 5 commits into from
Apr 27, 2021
Merged

Conversation

siranipour
Copy link
Contributor

This is so that you can do things like:

fit: NNPDF31_nnlo_as_0118_1000

dataset_inputs:
  from_: fit

even though the runcard has an experiments namespace instead of dataset_inputs

@Zaharid
Copy link
Contributor

Zaharid commented Apr 15, 2021

@wilsonmr this seems a bit of a hack but seems like it simplifies things quite a bit down the line.

@wilsonmr
Copy link
Contributor

At some point I did this in the data keyword PR, I can't remember why I reverted it but I agree it simlifies things. In particular you don't have to write data_input: {from_:fitinputcontext} in the case where you don't know if the fit is new or old (which can break grouping: NNPDF/reportengine#38 )

I'm happy for this to go ahead, shall we also update the compfits runcard? Also shall we update the docs (the warning at the bottom of https://docs.nnpdf.science/vp/dataspecification.html for example - I think that's the only place which talks about this but perhaps double checking that doc page wouldn't hurt)

Possibly the fit context production rule could also be updated to parse dataset_inputs ?

@Zaharid
Copy link
Contributor

Zaharid commented Apr 16, 2021

Yes, I think that could simplify a lot of this stuff and reduce exposure to "limitations" in reportengine.

@siranipour
Copy link
Contributor Author

Good spot on the docs Mikey, it's been fixed now. With regards to the comp fits runcard are you referring to the warning around the fitinputcontext part? I can't quite see how this modification promises to fix that bit

@wilsonmr
Copy link
Contributor

I mean you can now change the compfits input to just take dataset_inputs: {from_: fit} right? then you can remove the warning as well.

@siranipour
Copy link
Contributor Author

So I tried the following modification:

diff --git a/validphys2/src/validphys/comparefittemplates/comparecard.yaml b/validphys2/src/validphys/comparefittemplates/comparecard.yaml
index 3747240fd..f6595ad1e 100644
--- a/validphys2/src/validphys/comparefittemplates/comparecard.yaml
+++ b/validphys2/src/validphys/comparefittemplates/comparecard.yaml
@@ -110,8 +110,8 @@ description:
 dataspecs:
   # WARNING: do not blindly copy and paste this: it can overwrite the datasets
   # for any actions which rely on grouping datasets.
-  - data_input:
-      from_: fitinputcontext
+  - dataset_inputs:
+      from_: current
     theoryid:
       from_: current
     pdf:
@@ -123,8 +123,8 @@ dataspecs:
 
   # WARNING: do not blindly copy and paste this: it can overwrite the datasets
   # for any actions which rely on grouping datasets.
-  - data_input:
-      from_: fitinputcontext
+  - dataset_inputs:
+      from_: reference
     theoryid:
       from_: reference
     pdf:

but this greets me with:

Enter current fit: NNPDF40_nnlo_as_0118_ite0
Enter reference fit: NNPDF31_nnlo_as_0118_1000
Enter report title [default:
Comparison between NNPDF40_nnlo_as_0118_ite0 and NNPDF31_nnlo_as_0118_1000 ]:

Enter author name: Shayan Iranipour
Enter keywords:
Do you want to use the theory covariance matrix, if available,
to calculate the statistical estimators? [Y/n]y
Enter label for current fit [default:
Current Fit]:

Enter label for reference fit [default:
Reference Fit]:

[INFO]: Starting NNPDF fit comparison:
    Current fit: NNPDF40_nnlo_as_0118_ite0
    Reference fit: NNPDF31_nnlo_as_0118_1000
    Title: Comparison between NNPDF40_nnlo_as_0118_ite0 and NNPDF31_nnlo_as_0118_1000
    Author: Shayan Iranipour
    Keywords: []
    Current fit label: Current Fit
    Reference fit label: Reference Fit
    Thcovmat if present: True
[WARNING]: Output folder exists: /private/tmp/output Overwriting contents
[WARNING]: use_thcovmat_if_present was true but the flag `use_thcovmat_in_fitting` didn't exist in the runcard for NNPDF40_nnlo_as_0118_ite0. Theory covariance matrix will not be used in any statistical estimators.
[WARNING]: use_thcovmat_if_present was true but the flag `use_thcovmat_in_fitting` didn't exist in the runcard for NNPDF31_nnlo_as_0118_1000. Theory covariance matrix will not be used in any statistical estimators.
[INFO]: Cannot find postfit log at: /Users/shayaniranipour/miniconda3/envs/nnpdf-dev/share/NNPDF/results/NNPDF31_nnlo_as_0118_1000/postfit/postfit.log. Falling back to old location: /Users/shayaniranipour/miniconda3/envs/nnpdf-dev/share/NNPDF/results/NNPDF31_nnlo_as_0118_1000/nnfit/postfit.log
[ERROR]: Bad configuration encountered:
Could not retrieve element dataset_inputs from namespace. No such key

I'm probably doing something silly here...

@wilsonmr
Copy link
Contributor

yeah you need to take the dataset inputs from the fit, not from those custom namespaces (which doesn't contain them if you look above at its definition)

dataspecs:
-  # WARNING: do not blindly copy and paste this: it can overwrite the datasets
-  # for any actions which rely on grouping datasets.
-  - data_input:
-      from_: fitinputcontext
+  - dataset_inputs:
+      from_: fit

@wilsonmr
Copy link
Contributor

but also since dataset_inputs would be from fit in both namespaces you could as well take it outside I believe it might be clearer to keep it in for now...

@siranipour
Copy link
Contributor Author

We happy to merge this chaps?

@wilsonmr
Copy link
Contributor

could we just run a compfits report and double check the grouping isn't screwed up?

@wilsonmr
Copy link
Contributor

or if you did that, can you link it here. Then I'm happy to merge

@siranipour
Copy link
Contributor Author

Good idea, I'll replicate a random one off the vp server

@wilsonmr
Copy link
Contributor

Thanks!

@siranipour
Copy link
Contributor Author

@wilsonmr
Copy link
Contributor

LGTM!

@Zaharid
Copy link
Contributor

Zaharid commented Apr 27, 2021

@wilsonmr I take it the behaviour is well defined and correct if we have both experiments and dataset_inputs

@wilsonmr
Copy link
Contributor

I think it's ok because in the bridging function dataset_inputs simply takes precedence. In the case it was constructed from the experiments then it should be fine..

@siranipour
Copy link
Contributor Author

Happy to merge this then?

@siranipour siranipour merged commit fdf5586 into master Apr 27, 2021
@siranipour siranipour deleted the exp2di branch April 27, 2021 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants