
Pineappl integration with validphys #1529

Closed
wants to merge 52 commits into from

Conversation

scarlehoff
Member

@scarlehoff scarlehoff commented Feb 22, 2022

EDIT:
This branch is outdated with respect to master, as it points to a pre-thcovmat master in n3fit. #1578 is a rebase of this branch onto the 08/07/2022 master: 3098d0a

As promised, the pineappl integration with vp is in preparation.

In this first commit the pineappl tables are already compatible with vp and can be used, for instance, in predictions (as with the commondata reader; I put a Jupyter notebook at the root of the repository).

I will do the fit part tomorrow; it is a bit more complicated (if I want to keep both the old and new theories) since I will have to reconstruct the fktable from the pandas dataframe.

I will also need to modify FKTableData to hold more interesting information, but nothing too worrisome.

This will also close #1541


This PR has ended up being a tad bigger than one would like (and broader in scope). Introducing the pineappl tables meant it was easier to simplify some of the other stuff (such as the reader for the old fktables) rather than have both of them live together. As such, the review will be more painful than I hoped, so in order to help the reviewer (and myself in 5 years time when I have to remember what I did) I've written some notes on the changes.

TODO before merging

  • pineappl, eko in conda-forge
  • approved
  • Theory whatever in the server so it can be used by everyone
  • Remove jupyter notebook
  • Remove the oldmode flag from the loader

How to install the necessary packages

Since they are still not in conda-forge you will need to install pineappl and eko manually in order to use this branch; luckily they are available in the Python repositories:

pip install pineappl eko

or

conda install pineappl eko

(just in case, make sure that the pineappl version that gets installed is 0.5.2)
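A quick way to check which version actually got installed (a minimal sketch using only the standard library; the 0.5.2 pin is the one mentioned above):

```python
from importlib.metadata import version, PackageNotFoundError

def matches_version(package, expected):
    """Return True if `package` is installed at exactly version `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # Not installed at all
        return False

# e.g. matches_version("pineappl", "0.5.2") should hold on this branch
```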

Where to get pineappl grids

I've put in the root of the nnpdf server a tarball called pineappl_ingredients.tar which contains a yamldb folder (which goes into NNPDF/data) and a pineappls folder (which goes into NNPDF/data/theory_200):

cd ${CONDA_PREFIX}/share/NNPDF/data
scp vp.nnpdf.science:pineappl_ingredients.tar .
tar -xf pineappl_ingredients.tar yamldb/ 
tar -C theory_200/ -xf pineappl_ingredients.tar pineappls/
rm pineappl_ingredients.tar
cd -

Notes on the changes

  • Jupyter notebook: it's there just so we can play with it a bit, but I'll remove it once this is approved for merging.

  • all n3fit files: since vp now has a few objects containing exactly the same information that was passed to some of the dictionaries in n3fit, I've removed said dictionaries. This is the only reason these files have changed, so you might as well ignore them.

  • config.py: separated posdataset and integdataset

  • convolution.py: in order to keep using truediv for RATIO (which is nice because it means that tensorflow, numpy, pandas or any other library knows what to do with the operation) it is necessary to convert the pandas denominator into a numpy array. I think this is reasonable since in this case the denominator is usually a total cross section, so the indices are not supposed to match the numerator's.
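To illustrate that point (a toy sketch, not the actual convolution.py code): pandas aligns on the index under truediv, so a denominator with a different index yields NaNs, while a plain numpy denominator simply broadcasts:

```python
import numpy as np
import pandas as pd

# Two predictions with mismatched indexes, as happens when the RATIO
# denominator is a total cross section with its own data-point indexing.
num = pd.DataFrame(np.arange(6.0).reshape(3, 2), index=[0, 1, 2])
den = pd.DataFrame(np.ones((1, 2)), index=[10])  # index does not match

# Plain truediv aligns on the index and produces only NaNs here:
aligned = num / den

# Converting the denominator to a plain array sidesteps the alignment
# and broadcasts row-wise instead:
ratio = num / den.to_numpy()
```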

  • core.py:
    Added load_commondata to DataSetInput and DataGroupSpec so that one can get just the CommonData from libNNPDF instead of the whole DataSet.
    Modified FKTableSpec so that it can load both the new and old fktables.
    Added an IntegrabilitySetSpec which is equal to PositivitySetSpec.

  • coredata.py
    I've added to FKTableData methods that generate the information needed to fit an fktable: luminosity_mapping and get_np_fktable.
    I've moved the application of the cfactors to FKTableData to be equivalent to with_cuts (so that no funny games need to be played with the dataclass outside).
    Added also a _protected flag, since cfactors and cuts were designed with the old fktables in mind and, as explained for convolution.py, they will find a different number of points. When the repetition flag is found in apfelcomb, it gets saved into the yaml database and cuts or cfactors are applied accordingly.
    Note that FKTableData works the same no matter where the fktables came from (pineappl or old).
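As a rough illustration of the "cuts live on the dataclass" idea (a hypothetical, heavily simplified stand-in; the real coredata.FKTableData carries much more information):

```python
from dataclasses import dataclass, replace

import pandas as pd

@dataclass(frozen=True)
class MiniFKTable:
    """Hypothetical, heavily simplified stand-in for coredata.FKTableData."""

    sigma: pd.DataFrame  # rows indexed by data point

    def with_cuts(self, cuts):
        """Return a new table keeping only the data points surviving the cuts,
        so no games need to be played with the dataclass from outside."""
        return replace(self, sigma=self.sigma.loc[list(cuts)])

table = MiniFKTable(pd.DataFrame({"val": [1.0, 2.0, 3.0]}))
cut_table = table.with_cuts([0, 2])  # keeps 2 of the 3 points
```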

  • fkparser.py
    Moved the application of cfactors away (see above).

  • loader.py
    I've added check_fkyaml to load the new yaml files. Eventually the yaml information will come from the new commondata format so at the moment I've just hardcoded the path of the yamldb inside the data folder.
    I've separated the positivity and integrability loading. This was not strictly necessary, but it facilitated the n3fit_data.py changes below and it was something that had annoyed me for a long time.
    For testing I've added a flag to check_dataset such that if you use oldmode as a cfactor, the old fktables are used regardless. This is useful for debugging and will be removed alongside the Jupyter notebook before merging (or as soon as the new theory is prepared).
    At the moment whether to use pineappl or not does not depend on the theory since the pineappl tables are not a part of any theory at the moment.

  • n3fit_data.py
    This has been greatly simplified. Most notably, functions like mask_fk_tables have been removed. fitting_data_dict is no longer a dictionary with a list of dictionaries inside; it contains a list of FittableDatasets coming from the outside.
    The most interesting part is that this means issue n3fit memory usage #1541 is also solved. I also had to create a TupleComp class for the masks, which depend on the name and the seed, but that is a small price to pay.
    Other than that, most of the changes in this module just remove stuff.
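The TupleComp trick for the masks can be sketched as follows (my own minimal rendition, not the actual validphys implementation): objects compare and hash by a comparison tuple, so a mask keyed on (name, seed) can be reused wherever the same pair appears:

```python
class TupleComp:
    """Sketch of a validphys-style TupleComp: instances compare equal (and
    hash equal) when their comparison tuples match, which makes them usable
    as cache keys."""

    def __init__(self, *comp):
        self.comp = comp

    def __eq__(self, other):
        return type(other) is type(self) and self.comp == other.comp

    def __hash__(self):
        return hash(self.comp)

class MaskKey(TupleComp):
    """Hypothetical mask key depending only on dataset name and seed."""

    def __init__(self, name, seed):
        super().__init__(name, seed)
```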

  • n3fit_data_utils.py
    Here don't even look at the diff, this module has been basically rewritten from scratch. Some lines just happen to be similar.
    I've created the FittableDataSet which contains all the information necessary to fit a given dataset other than the central value.
    imho, once we have the new commondata format and pure python datasets this should just be an extension of those, but it's a bit tricky: in its current form this can be shared between replicas, and if we add the central value that won't be possible. Anyway, that's a problem for the coming weeks.

  • pineparser.py
    Contains all the logic for loading the new fktables into coredata.FKTableData objects completely equivalent to what vp creates when reading the old fktables with pandas. This will probably be moved to pineko, since we believe that building and reading the fktables should be a single thing. I'll do it when the Move permanent part of fkutils here pineko#12 PR is merged.

  • results.py
    This is the reason for the load_commondata needed in core.py.
    Currently, getting results required not only the central value of the data but also loading the libNNPDF fktable, even if it was not used anywhere. I could live with that (just a waste of memory) when they exist, but this is no longer the case if you only have the pineappl version.

@scarlehoff scarlehoff added destroyingc++ run-fit-bot Starts fit bot from a PR. labels Feb 24, 2022
@scarlehoff
Member Author

scarlehoff commented Feb 24, 2022

This is now "done" (let's see whether the fit bot is working) save for a few details that I will do directly here, since I have already changed n3fit_data_utils a lot.

It is not ready for review until I get to do a full fit with all the datasets we have instead of just a few selected ones, in case there are unforeseen problems (I'm sure there will be). @felixhekhorn is now running the ekos to prepare the full batch of datasets for the "baseline".

Then there is the (important) detail that I'm probably wasting a lot of memory and time with the spaghetti code I've put in get_np_fktable. If someone who knows pandas better than me could have a look and fix it, please do.

Other than that, I'm quite happy with the result. Some of the layers of the rightfully-hated dictionary-complexity of n3fit became useless since there is no longer a mix of C++ and Python (for starters, there is no need to parse the FKTableData object: it contains everything needed and can be passed down verbatim).
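For the get_np_fktable issue, the kind of pandas-to-numpy conversion involved can be sketched like this (the layout below is hypothetical and much simpler than the real fktable dataframe):

```python
import numpy as np
import pandas as pd

# Hypothetical fktable layout (not the real one): rows indexed by
# (data point, x-grid point), one column per luminosity channel.
idx = pd.MultiIndex.from_product([[0, 1], [0, 1, 2]], names=["data", "x"])
fk = pd.DataFrame(np.arange(12.0).reshape(6, 2), index=idx, columns=["ch0", "ch1"])

ndata, nx, nchannels = 2, 3, 2
# A single to_numpy + reshape replaces any loop over pandas groups and
# yields the dense (ndata, nchannels, nx) array a fit would consume.
dense = fk.to_numpy().reshape(ndata, nx, nchannels).transpose(0, 2, 1)
```

This avoids the per-datapoint groupby loops that tend to dominate both time and memory when the index is dense.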

@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Feb 24, 2022
@scarlehoff
Member Author

I've updated the secrets, chances are that I've broken things more than they were but we'll discover it together.

@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Feb 25, 2022
@scarlehoff
Member Author

Got hit by the indexing...

@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Mar 3, 2022
@scarlehoff
Member Author

If the fit bot works I'm happy with the current state. Still not ready for review because I need to review the changes myself with a fresh pair of eyes.

@scarlehoff scarlehoff removed the run-fit-bot Starts fit bot from a PR. label Mar 3, 2022
@scarlehoff scarlehoff added the run-fit-bot Starts fit bot from a PR. label Mar 3, 2022
@scarlehoff scarlehoff force-pushed the validphys_with_pineappl branch from 28a7605 to 82874b1 Compare March 4, 2022 08:01
@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Mar 4, 2022
@github-actions

github-actions bot commented Mar 4, 2022

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Mar 4, 2022
@github-actions

github-actions bot commented Mar 8, 2022

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff force-pushed the validphys_with_pineappl branch 2 times, most recently from 4956de9 to 8ee9005 Compare May 21, 2022 19:47
@scarlehoff scarlehoff force-pushed the validphys_with_pineappl branch from 993b2e1 to 7380d89 Compare May 21, 2022 21:03
@scarlehoff
Member Author

@felixhekhorn @alecandido is there a known problem with eko and python 3.9? The tests only fail when eko is included in the recipe (despite eko never being imported for any of the tests).

Asking before comparing the two logs side by side to see what has been installed differently...

@felixhekhorn
Contributor

@felixhekhorn @alecandido is there a known problem with eko and python 3.9? The tests only fail when eko is included in the recipe (despite eko never being imported for any of the tests).

Asking before comparing the two logs side by side to see what has been installed differently...

Mmm ... we're not aware of problems and indeed the unit tests and the isolated benchmarks work in 3.9

I'd say the most likely candidate would be numba and friends (e.g. llvm) ... they might cause problems outside python (meaning without import)

@scarlehoff
Member Author

scarlehoff commented May 23, 2022

The only big change that I can see is the libgfortran from 7 to 11... and when I do everything locally (with python 3.9) installing eko makes only the following change to the standard installation:

  numpy                               1.22.3-py39he7a7128_0 --> 1.21.5-py39he7a7128_2                         
  numpy-base                          1.22.3-py39hf524024_0 --> 1.21.5-py39hf524024_2

Which (locally) is not enough to break anything. Sigh. I really don't understand what's happening (also, if I run the seemingly failing test by itself in the GitHub CI it works, but not when it is run with the rest?)

@felixhekhorn
Contributor

The numpy downgrade is expected since the current numba (0.55.1) only works with numpy<1.22 - that constraint should be lifted in 0.56

@alecandido
Member

The numpy downgrade is expected since the current numba (0.55.1) only works with numpy<1.22 - that constraint should be lifted in 0.56

Actually, a backport for v0.55.2 is even scheduled: numba/numba#8067

@scarlehoff
Member Author

scarlehoff commented May 24, 2022

I'm sorry for the people subscribed to this PR...

For future reference: there is an action to open a terminal inside the CI. It does work, but as soon as I try to do something slightly complicated (and that includes getting the conda package with scp) it crashes :/ I thought it was my connection but changing it didn't fix it.

With respect to the actual problem: it only happens when eko is installed and only in the CI for python 3.9. I have not been able to reproduce it in any way on my computer... the only thing left is to try to get the same docker image that the GitHub CI is using...

@scarlehoff scarlehoff force-pushed the validphys_with_pineappl branch from d1b114c to 4306fb6 Compare May 24, 2022 17:22
@scarlehoff
Member Author

Since this branch is perfectly ok and the silly error is still unknown (we now believe it is one of the packages that banana-hep brings with it that breaks something; funnily enough, no package changes version...) I've bypassed the error so this can be merged.

I'll investigate what the exact error is since @andreab1997 will need eko and banana-hep for the python evolven3fit. Hopefully either by then the problem has been solved automagically or I've discovered what the problem is.

As @RoyStegeman asked I've passed black through all the files I've touched in a single commit and as @Zaharid asked I've limited the damage to the parts of said files that I've actually modified (you can see in the changes that I did revert most if not all the spurious changes). I hope I managed to make everyone happy at once this time.

Thanks @felixhekhorn for the help with debugging this issue.

@alecandido
Member

@felixhekhorn @alecandido is there a known problem with eko and python 3.9? The tests only fail when eko is included in the recipe (despite eko never being imported for any of the tests).

Asking before comparing the two logs side by side to see what has been installed differently...

Actually, looking back at the logs:

  • it fails on 3.9 only because it arrived first, and stopped the others before they could. I'm pretty confident it would also crash on those before or after (or both)
  • I really believe it is a problem of environment and dependencies, so you actually need to reproduce the exact environment of the CI...

However, I hope that NNPDF/eko#122 will solve it before.

@scarlehoff
Member Author

It passed in 3.8 a few times.

@alecandido
Member

It passed in 3.8 a few times.

Compatible, but I don't see where

@scarlehoff
Member Author

https://github.com/NNPDF/nnpdf/runs/6545874878?check_suite_focus=true

Here it passed the entire test, but in general for quite a few runs you can see that 3.8 reached a few steps further than 3.9 (before 3.9 crashed the entire thing)

@alecandido
Member

TODO before merging

  • pineappl, eko in conda-forge
  • approved
  • Theory whatever in the server so it can be used by everyone
  • Remove jupyter notebook
  • Remove the oldmode flag from the loader

@scarlehoff maybe a few of these points (if not all) can be ticked

@scarlehoff
Member Author

Closing in favour of #1578

@scarlehoff scarlehoff closed this Jul 9, 2022
Successfully merging this pull request may close these issues:

n3fit memory usage (#1541)