Rework ekobox.apply #415

felixhekhorn · 2024-10-15T12:01:48Z

Closes #408

As said in the issue, this basically just changes the einsum instructions, which are now "ajbk,rbk->raj".
@giacomomagni, @scarlehoff, @Radonirinaunimi can you please help me and check whether this is what you need on the other side? I.e. can one of you please provide the evolven3fit side? 😇
The current state only changes the "apply" methods: it provides a new low-level callable, apply.apply_grids, which accepts an EKO and a 3D tensor sorted as (replica, flavor, xgrid), which is hopefully what you wanted.
It does not provide new genpdf utilities, as the writing of replicas has to be done replica by replica. Actually, now I wonder whether the performance bottleneck is/was really the EKO contraction, or rather the writing of the LHAPDF files. As is well known, writing files is expensive, and I even expect a major difference in timing between Python (eko) and Fortran (apfel).
This is yet another breaking change with respect to master, as the return type of apply.apply_pdf has changed (evolven3fit is currently using this function, but as said above it should switch to apply.apply_grids): effectively, it is "transposed" on the first two "dimensions". However, I think I prefer this, as it then becomes easier to discard the errors.
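
For reference, the new contraction can be illustrated with plain NumPy. This is only a minimal sketch and not the ekobox code: the reading of the axes (a = output flavor, j = output x point, b = input flavor, k = input x point, r = replica) and the array sizes are assumptions made for illustration; only the einsum string is taken from the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumed, not taken from eko).
n_rep, n_flav, n_x = 3, 4, 5

# Stand-in for a single evolution operator with axes (a, j, b, k).
operator = rng.normal(size=(n_flav, n_x, n_flav, n_x))
# Input PDFs sorted as (replica, flavor, xgrid), i.e. axes (r, b, k).
pdfs = rng.normal(size=(n_rep, n_flav, n_x))

# Contract all replicas in a single call.
evolved = np.einsum("ajbk,rbk->raj", operator, pdfs)

# Reference: contract replica by replica and stack the results.
looped = np.stack([np.einsum("ajbk,bk->aj", operator, p) for p in pdfs])
```

Under these assumptions both routes give the same numbers; the batched call simply moves the loop over replicas inside einsum, and the replica index ends up on the first axis of the result.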

giacomomagni · 2024-10-15T12:29:15Z

> it does not provide new genpdf utilities, as the writing of replica has to be done replica by replica. Actually now I wonder, if the performance bottleneck is/was really the EKO contraction, or is it rather the writing of the LHAPDF files? as is well known writing files is expensive and I even expect a major increase in time between Python (eko) and Fortran (apfel)

This is a good point, because in the other PR the dumping was also parallelized,
but maybe the best solution would combine both options?

scarlehoff · 2024-10-15T14:15:23Z

If the bottleneck is writing the files, it should be easily solvable as well. 1000 files are not that many and I don't think Fortran should be much faster.

What do you need me to test @felixhekhorn ? Just try to use evolven3fit with this branch?

I guess in order to use this branch we need to regenerate ekos from scratch? I'm getting the following error:

error

    with eko.EKO.edit(eko_path) as eko_op:
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/struct.py", line 399, in edit
    return cls.read(*args, readonly=False, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/struct.py", line 376, in read
    loaded = cls.load(dir_)
             ^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/struct.py", line 345, in load
    metadata = Metadata.load(path)
               ^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/metadata.py", line 62, in load
    content = cls.from_dict(
              ^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/dictlike.py", line 87, in from_dict
    return cls._from_dict(dictionary)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/dictlike.py", line 74, in _from_dict
    dictionary[field.name] = load_field(field.type, dictionary[field.name])
                                                    ~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'xgrid'

Where eko_path is the eko.tar of theory 40_000_000 which was created with 0.13.5

felixhekhorn · 2024-10-16T07:24:35Z

> If the bottleneck is writing the files, it should be easily solvable as well. 1000 files are not that many and I don't think Fortran should be much faster.

Well, it was just a conjecture, so it still has to be proven. However, if it turns out to be true, I cannot see an easy solution ... also keep in mind that a single replica is 1.6 MB, which is not a small amount, I'd say ...
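
One way to test the conjecture is to time the two steps separately. The following is a rough sketch under loud assumptions: the sizes are made up, the operator is random, and np.savetxt on a plain-text file is only a stand-in for writing an LHAPDF member file, so the numbers say nothing about the real evolven3fit timings.

```python
import tempfile
import time
from pathlib import Path

import numpy as np

rng = np.random.default_rng(0)
n_rep, n_flav, n_x = 10, 14, 50  # illustrative sizes (assumed)

operator = rng.normal(size=(n_flav, n_x, n_flav, n_x))
pdfs = rng.normal(size=(n_rep, n_flav, n_x))

# Time the contraction of all replicas at once.
start = time.perf_counter()
evolved = np.einsum("ajbk,rbk->raj", operator, pdfs)
t_contract = time.perf_counter() - start

# Time the sequential writing, one file per replica.
with tempfile.TemporaryDirectory() as tmp:
    start = time.perf_counter()
    for r, replica in enumerate(evolved):
        # Plain-text dump as a stand-in for an LHAPDF member file.
        np.savetxt(Path(tmp) / f"replica_{r:04d}.dat", replica)
    t_write = time.perf_counter() - start
    n_files = len(list(Path(tmp).iterdir()))

print(f"contraction: {t_contract:.4f} s, writing: {t_write:.4f} s")
```

Replacing the stand-ins with the real operator and the real writer would show which of the two steps dominates.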

> What do you need me to test @felixhekhorn ? Just try to use evolven3fit with this branch?

Just plain running is not sufficient, as the program flow will need to be adjusted: use apply.apply_grids (if that is what you need) and iterate over the replicas afterwards for the writing.
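
The adjusted flow could look roughly like the sketch below. Everything in it is hypothetical and written with plain NumPy: the `operators` dictionary (one operator per target scale) stands in for an EKO, and `write_replica` is an invented helper, not an eko or evolven3fit function. It does not call apply.apply_grids itself, whose exact signature and return type are not spelled out in this thread.

```python
import io

import numpy as np

rng = np.random.default_rng(1)
n_rep, n_flav, n_x = 3, 4, 5  # illustrative sizes (assumed)

# Hypothetical stand-in for an EKO: one (a, j, b, k) operator per target scale.
operators = {
    mu2: rng.normal(size=(n_flav, n_x, n_flav, n_x)) for mu2 in (10.0, 100.0)
}
pdfs = rng.normal(size=(n_rep, n_flav, n_x))

# Step 1: evolve all replicas at once, scale by scale.
evolved = {
    mu2: np.einsum("ajbk,rbk->raj", op, pdfs) for mu2, op in operators.items()
}


def write_replica(stream, blocks):
    """Hypothetical writer: dump one replica, one block per scale."""
    for mu2, grid in blocks.items():
        stream.write(f"# mu2 = {mu2}\n")
        np.savetxt(stream, grid)


# Step 2: iterate over the replicas only for the writing.
outputs = []
for r in range(n_rep):
    blocks = {mu2: result[r] for mu2, result in evolved.items()}
    buffer = io.StringIO()  # in-memory stand-in for a file on disk
    write_replica(buffer, blocks)
    outputs.append(buffer.getvalue())
```

The point of the split is that the expensive contraction happens once per scale for all replicas, while the loop over replicas only slices the result and writes it.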

> I guess in order to use this branch we need to regenerate ekos from scratch? I'm getting the following error:
> error
>     with eko.EKO.edit(eko_path) as eko_op:
>          ^^^^^^^^^^^^^^^^^^^^^^
> KeyError: 'xgrid'

This specific case is a consequence of the breaking change in #292. The easy solution is to regenerate the ekos, but then the question is: do we need to provide an upgrade tool?

> Where eko_path is the eko.tar of theory 40_000_000 which was created with 0.13.5

For that matter: I'm surprised about the v0.13 here. Can you read that file with v0.14, which is our latest tag?

scarlehoff · 2024-10-16T07:37:01Z

> but then the question is do we need to provide an upgrade tool?

Yes, please. Otherwise we need to regenerate all ekos for all theories :___

> if that turns out true I can not see an easy solution

If it was quick in Fortran, I'm sure we can find a way. I see the by-block separation is also done per replica, and that could be vectorized as well (this can mostly be done in evolven3fit, although some features of eko are used).

scarlehoff · 2024-10-17T16:02:49Z

I created a new eko for theory 40_000_000 and tested this branch with it (with a few changes to evolven3fit).

I still need to test that the numbers have not changed and all that*, but it went from 20 minutes to about 2 minutes ^^ so this is a success! (The writing is still done sequentially and there are a few things that could still be improved there; I think the writing might now be taking a good chunk of the remaining 2 minutes.)

*And this is not trivial, because it turns out we never updated the ekos in the 4.0 theories with the few fixes since... most importantly, they are not starting at Q=1 GeV

felixhekhorn · 2024-10-22T08:26:40Z

we will need to fix the tutorial https://eko.readthedocs.io/en/latest/overview/tutorials/pdf.html#Method-1:-Using-apply_pdf

Rework ekobox.apply

b6efcd1

felixhekhorn added the enhancement and refactor labels Oct 15, 2024

felixhekhorn requested review from scarlehoff, giacomomagni and Radonirinaunimi October 15, 2024 12:01

Fix evol_pdf

e7d775c

Adjust apply for ekomark

e9aff79

Fix inv matching benchmark

e07a115

scarlehoff mentioned this pull request Oct 17, 2024

Use eko v0.15 NNPDF/nnpdf#2181

Draft

giacomomagni merged commit e07a115 into master Oct 21, 2024
6 of 7 checks passed

felixhekhorn deleted the apply-replica branch October 22, 2024 08:25

giacomomagni restored the apply-replica branch October 22, 2024 09:37

giacomomagni mentioned this pull request Oct 22, 2024

Rework ekobox.apply #421

Closed
