Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numerical FONLL #99

Merged
merged 142 commits into from
Oct 23, 2023
Merged

Numerical FONLL #99

merged 142 commits into from
Oct 23, 2023

Conversation

RoyStegeman
Copy link
Member

Based on our discussion added the functionality that if pineko is called to produce opcards for a theory of which the flavor number scheme is FONLL, pineko will produce the necessary new theory cards by appending 1,2,3 to the end of the "input" theory id.

This may create problems with storing the intermediate grids since scale variations uses the same numbering, though I'm not sure how important that really is? Based on https://github.com/NNPDF/theories/tree/main/data/grids you did save them for scale variations, but since this will only be for DIS maybe that's not recommended? Alternatively, I could of course just append different numbers but that requires some awareness/coordination of which are already in use.

@RoyStegeman RoyStegeman requested a review from andreab1997 June 9, 2023 10:21
src/pineko/check.py Outdated Show resolved Hide resolved
src/pineko/theory.py Outdated Show resolved Hide resolved
@alecandido
Copy link
Member

This may create problems with storing the intermediate grids since scale variations uses the same numbering, though I'm not sure how important that really is? Based on NNPDF/theories@main/data/grids you did save them for scale variations, but since this will only be for DIS maybe that's not recommended? Alternatively, I could of course just append different numbers but that requires some awareness/coordination of which are already in use.

Since these are only for internal use, you can also make a tuple: (theoryid, 1), (theoryid, 2), ...
They will be consumed before coming back to the NNPDF world ^^

@RoyStegeman
Copy link
Member Author

RoyStegeman commented Jun 9, 2023

Since these are only for internal use, you can also make a tuple: (theoryid, 1), (theoryid, 2), ...
They will be consumed before coming back to the NNPDF world ^^

If we just use numbering we still need to coordinate though, I don't see so much the benefit from doing (theoryid,1) instead of `f"{theoryid}1". Anyway, I'd rather use integers since most/all codes expect integer theoryIDs even if they are just used for internal steps and I don't want to risk unexpected behaviours by allowing non-integer inputs.

@RoyStegeman RoyStegeman marked this pull request as draft June 9, 2023 12:17
@RoyStegeman
Copy link
Member Author

I converted it to draft since I opened the PR do discuss about some details first, not to merge immediatly.

@alecandido
Copy link
Member

alecandido commented Jun 9, 2023

I don't see so much the benefit from doing (theoryid,1) instead of `f"{theoryid}1".

I see it: f"{theoryid}1" could actually be a different theory (Andrea and Giacomo are using 4-digits theories already, and maybe even 5). Moreover, id[1] on a pair is a bit cleaner than id[-1] on a string, since:

  • it will be already an integer (rather than a char)
  • in the (currently?) absurd case of >10 theories generated, it would still work

@RoyStegeman
Copy link
Member Author

Ah, I think I misunderstood your proposal. Where would this tuple be stored and read and what would be the ID key in the theory card?

All code assumes that theoryIDs are integers, so are you proposing to use id[1] as the integer passed to identify? Then we would get e.g. the folder NNPDF/theories/data/id[1] which is of course also not unique (I'm sure this is not what you have in mind, but I don't understand what you do ;) ).

@alecandido
Copy link
Member

I guess you understood correctly the first time:

id: (400, 1)

What I was arguing is that the only one who cares about IDs is Pineko itself. In the pineline, no one else is reading IDs (and in case Pinefarm does, it should just propagate them straight, but we could check).

IDs are an NNPDF thing, but it is completely irrelevant outside it. Pineko is preparing theories, so it's aware of IDs. But if you take EKO, they are not even any longer present in current theory layout. And PineAPPL never knew anything of them.

@RoyStegeman
Copy link
Member Author

What I was arguing is that the only one who cares about IDs is Pineko itself. In the pineline, no one else is reading IDs (and in case Pinefarm does, it should just propagate them straight, but we could check).

Ah okay this I didn't realise, but that might make things a lot simpler. The original reason for my question though, was that if it is useful to store not only the final FKtables but also the grids that are produced during the intermediate step I need a unique folder name to not clash with Andrea (and maybe Giacomo), which I don't think your suggestion solves? Since it's only pineko that deals with theoryid's I could even add a more descriptive string or subfolder structure without needing to worry.

Note that even with the current system as I suggested we can do more than 9 theories since there is nothing stopping us from creating a theorid of 40011 - we don't read the id from the string using id[-1] but 40011 would be the full id. Again, my worry was purely about preventing clashes if the intermediate grids should be saved for the long term.

@alecandido
Copy link
Member

Note that even with the current system as I suggested we can do more than 9 theories since there is nothing stopping us from creating a theorid of 40011 - we don't read the id from the string using id[-1] but 40011 would be the full id. Again, my worry was purely about preventing clashes if the intermediate grids should be saved for the long term.

In principle, you'd be right, we might want to store the grids we computed.

In practice, this is not yet a problem, since only FONLL requires multiple grids, and FONLL is only applied to DIS, for which recomputing is potentially easier than storing.

However, even with a more complex structure it would still work, because you could save (400, 1), (400, 2), ... all as part of 400 (in case we stored grids by theories, that does not work in the long term).
Moreover, if you wish you can distinguish with a different key, you don't even have to use ID:. Something like fragment: or similar. In the end, it will be wiped out when you construct the theory for fitting (since only one joint FK table will survive).

@RoyStegeman
Copy link
Member Author

In practice, this is not yet a problem, since only FONLL requires multiple grids, and FONLL is only applied to DIS, for which recomputing is potentially easier than storing.

There already exists grids for theoryids 4001 and 4005, not exactly sure why since sv should correspond to different theories for nnpdf as well but regardless, it's not just fonll.

Moreover, if you wish you can distinguish with a different key, you don't even have to use ID:. Something like fragment: or similar. In the end, it will be wiped out when you construct the theory for fitting (since only one joint FK table will survive).

If you agree (also @andreab1997) I'll only save intermediate results locally for now (also the ekos) since it's DIS that shouldn't cause too much trouble if we ever need to reproduce them. What we could do if we do want to save is just add subfolders for fonll-a/b/c with inside the intermediate theorycards and grids. In this way it doesn't matter so much if we use theoryid 4001 or (400,1) since for fonll we won't need more than 9, but I do think (400,1) is a bit cleaner and also immediately signals that it's not a theory that directly corresponds to an fktable or "nnpdf theory".

@felixhekhorn
Copy link
Contributor

felixhekhorn commented Jun 9, 2023

but 40011 would be the full id

I think, creating new "NNPDF theories" could work - of course the naming has to be more complicated than just adding a number for the reason @alecandido said ... so e.g. $10^7 + \mathrm{theoryid}*10^2 + 1$ might work. Those theories are viable physical theories after all (e.g. FFNS3) ... the complication then is how to rejoin them together since you need them when merging ...

which recomputing is potentially easier than storing.

mmm CHORUS ekos are (very) expensive ... so we want to save intermediate results

EDIT: lowered the theory power from 4 to 2 since they are already O(1e3)

@RoyStegeman
Copy link
Member Author

RoyStegeman commented Jun 9, 2023

I think, creating new "NNPDF theories" could work - of course the naming has to be more complicated than just adding a number for the reason @alecandido said ... so e.g. $10^7 + \mathrm{theoryid}*10^2 + 1$ might work.

Sure, but there would still need to be some sort of system to it. Keeping a database might be a bit excessive but we do need to ensure that I'm not occupying the same identifiers (instead of theoryid to remain abstract as to how exactly we'll do this) for fonll as Andrea and Giacomo do for their stuff.

Those theories are viable physical theories after all (e.g. FFNS3) ... the complication then is how to rejoin them together since you need them when merging ...

FFN0 is less viable, but I see your point. The rejoining and damping I plan to do at the level of fktables. I haven't thought out the exact details yet, but as long as we know which theory/grids is which I don't foresee any problems at this step, do you?

@RoyStegeman RoyStegeman changed the title produce new theory cards if fns=fonll Numerical FONLL Jun 12, 2023
src/pineko/fonll.py Outdated Show resolved Hide resolved
src/pineko/fonll.py Outdated Show resolved Hide resolved
src/pineko/fonll.py Outdated Show resolved Hide resolved
@@ -17,13 +17,24 @@
logger = logging.getLogger(__name__)

FNS_BASE_PTO = {"FONLL-A": 1, "FONLL-B": 1, "FONLL-C": 2, "FONLL-D": 2, "FONLL-E": 3}
"""Mapping between pertubative orders as in the theory card PTO and the FONLL scheme names.

The explict mapping is the following (evolution, massive parts, massless parts):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to distinguish evolution from massive parts? From what I understood the resummation (NLL in FONLL) is just labeled by massless parts, so the other two have to be the same...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to be even more clear and pay the price of redundancy here 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not so much redundancy on its own, it is actually about the meaning. We have both evolution for massive and massless, and they have two different orders.

Why do you need to specify it separately? Which evolution are you referring to? (I guess the massive one, but I wonder why...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have both evolution for massive and massless, and they have two different orders.

no - they have both the same order, which indeed coincides with the massless order

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I guess massless is asy (massless keeps being an ambiguous word, whether it is the massless limit or zero-mass).

But then they all should have the same order (as I argued back then), otherwise you see it's rather inconsistent - you're definitely taking a weird limit...
(you do not subtract the logs of the extra mass, because they are not in the zero-mass, but you subtract the evolution logs of all the other masses, that are also not in the zero-mass - that's quite arbitrary)

benchmarks/bench_evolve.py Outdated Show resolved Hide resolved
src/pineko/check.py Outdated Show resolved Hide resolved
src/pineko/check.py Outdated Show resolved Hide resolved
Copy link
Contributor

@felixhekhorn felixhekhorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please apply @RoyStegeman's comments (they are good)

andreab1997 and others added 3 commits October 23, 2023 12:34
@andreab1997 andreab1997 self-requested a review October 23, 2023 10:37
src/pineko/check.py Outdated Show resolved Hide resolved
@RoyStegeman RoyStegeman merged commit f8e2768 into main Oct 23, 2023
9 checks passed
@RoyStegeman RoyStegeman deleted the num_fonll branch October 23, 2023 11:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants