Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework ekobox.apply #415

Merged
merged 4 commits into from
Oct 21, 2024
Merged

Rework ekobox.apply #415

merged 4 commits into from
Oct 21, 2024

Conversation

felixhekhorn
Copy link
Contributor

Closes #408

  1. As said in the issue this is basically just changing the einsum instructions, which are now "ajbk,rbk->raj".
  2. @giacomomagni, @scarlehoff, @Radonirinaunimi can you please help me and check if that is what you need on the other side? i.e. can someone of you please provide the evolven3fit side? 😇
  3. the current state only changes the "apply" methods: it provides a new low-level callable apply.apply_grids, which accepts an EKO and a 3D tensor sorted (replica, flavor, xgrid), which is hopefully what you wanted.
  4. it does not provide new genpdf utilities, as the writing of replica has to be done replica by replica. Actually now I wonder, if the performance bottleneck is/was really the EKO contraction, or is it rather the writing of the LHAPDF files? as is well known writing files is expensive and I even expect a major increase in time between Python (eko) and Fortran (apfel)
  5. this is yet another breaking change with master as the return type of apply.apply_pdf has changed (evolven3fit is currently using this function, but as said above it should switch to apply.apply_grids) - effectively this is "transposed" on the first two "dimensions". However, I think I prefer this as then it is easier to discard errors.

@felixhekhorn felixhekhorn added enhancement New feature or request refactor Refactor code labels Oct 15, 2024
@giacomomagni
Copy link
Collaborator

  1. it does not provide new genpdf utilities, as the writing of replica has to be done replica by replica. Actually now I wonder, if the performance bottleneck is/was really the EKO contraction, or is it rather the writing of the LHAPDF files? as is well known writing files is expensive and I even expect a major increase in time between Python (eko) and Fortran (apfel)

this is a good point, because in the other PR also the dumping was parallelized,
but maybe the best solution can combine both options ?

@scarlehoff
Copy link
Member

scarlehoff commented Oct 15, 2024

If the bottleneck is writting the files it should be easily solvable as well. 1000 files are not that many and I don't think Fortran should be much faster.

What do you need me to test @felixhekhorn ? Just try to use evolven3fit with this branch?

I guess in order to use this branch we need to regenerate ekos from scratch? I'm getting the following error:

error
    with eko.EKO.edit(eko_path) as eko_op:
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/struct.py", line 399, in edit
    return cls.read(*args, readonly=False, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/struct.py", line 376, in read
    loaded = cls.load(dir_)
             ^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/struct.py", line 345, in load
    metadata = Metadata.load(path)
               ^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/metadata.py", line 62, in load
    content = cls.from_dict(
              ^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/dictlike.py", line 87, in from_dict
    return cls._from_dict(dictionary)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jumax9/Academic_Workspace/NNPDF/src/eko/src/eko/io/dictlike.py", line 74, in _from_dict
    dictionary[field.name] = load_field(field.type, dictionary[field.name])
                                                    ~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'xgrid'

Where eko_path is the eko.tar of theory 40_000_000 which was created with 0.13.5

@felixhekhorn
Copy link
Contributor Author

If the bottleneck is writting the files it should be easily solvable as well. 1000 files are not that many and I don't think Fortran should be much faster.

well it was just a conjecture, so to be proven. However, if that turns out true I can not see an easy solution ... also keep in mind that a single replica is 1.6MB, which is not a small amount I'd say ...

What do you need me to test @felixhekhorn ? Just try to use evolven3fit with this branch?

just plain running is not sufficient as the program flow will need to be adjusted: use apply.apply_grids (if that is what you need) and iterate on replicas afterwards for writing

I guess in order to use this branch we need to regenerate ekos from scratch? I'm getting the following error:
error

    with eko.EKO.edit(eko_path) as eko_op:
         ^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'xgrid'

this specific case is a consequence the breaking change in #292 ; The easy solution is to regenerate the ekos, but then the question is do we need to provide an upgrade tool?

Where eko_path is the eko.tar of theory 40_000_000 which was created with 0.13.5

for that matter: I'm surprised about the v0.13 here - can you read that with v0.14 which is our latest tag?

@scarlehoff
Copy link
Member

but then the question is do we need to provide an upgrade tool?

Yes, please. Otherwise we need to regenerate all ekos for all theories :___

if that turns out true I can not see an easy solution

If it was quick in Fortran I'm sure we can find a way. I see the by-block separation is also done per-replica and that could be vectorized as well (that can be done mostly in the evolven3fit, although some features of eko are used).

@scarlehoff
Copy link
Member

I created a new eko for theory 40_000_000 and tested this branch with it (with a few changes to evolven3fit).

I need to test that the numbers have not changed and all that* but it went from 20 minutes to about 2 minutes ^^ so this is a success! (the writting is still done sequentially and there's a few things that could be improved there still, I think the writting might now be taking a good chunck of those 2 minutes left).

*and this is not trivial because turns out we never updated the ekos in the 4.0 theories with the few fixes since... most importantly, they are not starting at Q=1 GeV

@giacomomagni giacomomagni merged commit e07a115 into master Oct 21, 2024
6 of 7 checks passed
@felixhekhorn felixhekhorn deleted the apply-replica branch October 22, 2024 08:25
@felixhekhorn
Copy link
Contributor Author

@giacomomagni giacomomagni restored the apply-replica branch October 22, 2024 09:37
@giacomomagni giacomomagni mentioned this pull request Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request refactor Refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow for parallelization in evolven3fit_new
3 participants