Add mapping tester repetitions #195

Open
wants to merge 4 commits into develop from add-mapping-tester-repetitions
Conversation

fsimonis (Member) commented Aug 8, 2024

Main changes of this PR

This PR:

  • adds repetitions to the mapping tester. They default to 1 and form the final folder level in the case tree, which makes gathering more samples for performance data easy (see the sketch after this list).
  • plots the average over all runs
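As an illustration of the resulting layout, with the repetition index as the trailing level (the other level names are hypothetical, not necessarily the exact tree the tester generates):

<mesh pair>/
  <mapping>/
    <constraint>/
      0/    (repetition 0, with its own profiling data)
      1/    (repetition 1)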

I have a further change coming up that makes writing the output mesh on B optional, which prevents excessive storage use. See #197.

Based on #210

Author's checklist

  • I used the pre-commit hook and used pre-commit run --all to apply all available hooks.
  • I added a test to cover the proposed changes in our test suite.
  • I updated the documentation in docs/README.md.
  • I updated potential breaking changes in the tutorial precice/tutorials/aste-turbine.

fsimonis (Member, Author) commented Aug 8, 2024

I am wondering if I should write a documentation page in docs/ and link to it from the README instead. Is this important/useful enough for a separate tooling page on the website?

fsimonis (Member, Author) commented
I split the gatherstats improvement (#210) and the documentation (#209) into separate PRs.

fsimonis force-pushed the add-mapping-tester-repetitions branch from eb1c09b to ef98748 on October 25, 2024
fsimonis changed the title from "Add mapping tester repetitions and docs" to "Add mapping tester repetitions" on Oct 25, 2024
davidscn (Member) left a comment

What I like about this approach:

  • the repetitions are part of the json file

What I don't like about this approach:

  • the hierarchy of the generated run tree becomes yet another level deeper
  • each run (with the exact same configuration) is duplicated in another directory. This is particularly problematic in terms of memory. In the current setup, having larger runs can easily lead to $\mathcal{O}(10)$ GB. Having 5 repetitions then would lead to excessive memory usage for no reason.

Now there is another suggestion of making the generation of output files (which occupy this memory) optional. However, the use case of the mapping-tester is to test mappings, which have error and runtime as their integral properties. One of the main features of all this is in fact that it measures both with one execution. Therefore, disabling the error measurement seems rather like a different use case.

In my previous setups, I simply repeated the whole pipeline "repetitions" times to generate the same statistical data for individual runs. This gets rid of the two downsides mentioned above, but comes at the disadvantage that the temporarily generated profiling data of individual runs is no longer accessible after all repetitions have completed.

I think a clean approach here would be to have the same directory hierarchy as we have right now, and store the profiling information of each run in individual precice-profiling-<N> directories. Then we would have (the same) memory problems as before, but avoid duplicating runs and directory structures while keeping the profiling information persistent.

fsimonis (Member, Author) commented

the hierarchy of the generated run tree becomes yet another level deeper

True. In practice this doesn't matter, though. The directory structure is consistent, and it is easy to use globbing expressions or tools like find or fd to locate all case files and directories.

Currently, we extract some job information from the path, but fundamentally this could also go into a case.json; then the path wouldn't matter much and could even be replaced by a hash.
The purpose of this path scheme was to make it easy to rerun entire mappings, which was useful for dialing in RBF mappings.
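As a minimal sketch of the kind of globbing meant here (the "cases" root and the tree depth are assumptions, not the tester's actual layout):

from pathlib import Path

# Hypothetical: collect all leaf case directories assuming a fixed-depth tree
# rooted at "cases"; adjust the pattern to the actual generated structure.
case_dirs = sorted(p for p in Path("cases").glob("*/*/*/*/*") if p.is_dir())
for d in case_dirs:
    print(d)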

each run (with the exact same configuration) is duplicated in another directory. This is particularly problematic in terms of memory. In the current setup, having larger runs can easily lead to O(10) GB . Having 5 repetitions then would lead to an excessive memory usage for no reason.

I assume that you mean storage, which depends on the goal of using the tool:

  • If mapping accuracy is a concern, then output is mandatory and repetitions are pointless. A single output is all it takes to get useful results.
  • If mapping performance is a concern, then repetitions are mandatory and the output is pointless. Optional output of mapped mesh #197 allows disabling outputs if only timings are of interest.

However, the use case of the mapping-tester is to test mappings, which have error and runtime as their integral properties. One of the main features of all this is in fact that it measures both with one execution.
...

The mapping-tester was built to test accuracy and runtime for the mappings section of the preCICE v2 paper. Back then, I thankfully didn't run into storage issues, but the process was nevertheless wasteful in terms of storage and runtime (pointless post-processing).
For runtime measurements, this repetition loop and the aggregation need to exist somewhere. It may be manual rerunning and saving of outputs, it may be a loop over multiple test case directories, or it may be baked directly into the tool.

I think a clean approach here would be to have the same directory hierarchy as we have right now, and store the profiling information of each run in individual precice-profiling-<N> directories.

Reusing the same case files for multiple runs comes with the severe downside that post-processing needs to be interleaved with the runs, which forces the entire job to wait. This is wasteful in terms of project CPU hours and needlessly blocks the partition, especially since the post-processing isn't even parallelized.
The login nodes or the micro partition exist for such post-processing tasks.

This feature has always been part of the purpose of this tool. Up to now, we only got away with not having it built-in.

davidscn (Member) commented

If mapping accuracy is a concern, then output is mandatory and repetitions are pointless.
If mapping performance is a concern, then repetitions are mandatory and the output is pointless

You successfully summarized the issue: both are a concern.

Reusing the same case files for multiple runs comes with the severe downside that post-processing needs to be interleaved with the runs, which forces the entire job to wait.

That's not the case, please read again. If I have a profiling directory for each repetition, then I can handle everything afterwards.

fsimonis (Member, Author) commented Oct 28, 2024

You successfully summarized the issue: both are a concern.

I am trying to make the point that both are not necessarily a concern at the same time.

If I am interested only in measuring runtime, then I shouldn't have to pay for measuring mapping accuracy.

That's not the case, please read again. If I have a profiling directory for each repetition, then I can handle everything afterwards.

I get it now. You essentially move the “repetitions” loop into the run script of a case itself like this:

import shutil

for run in range(repetitions):
    runCase()
    shutil.move("precice-profiling", f"precice-profiling-{run}")

So no interleaving of run and post-processing 👍

There are still some open points:

  • this still writes the output mesh n times, which is unnecessary I/O and storage usage. Outputting the mapped mesh only for the first repetition may be an option, even though it will lead to a different global time for the writing solver B.
  • the memory measurement also needs to be replicated (which is pretty questionable anyhow for anything running on multiple cores).
  • this complicates gathering data from cases, as a case now results in multiple data points. The solution proposed here is a drop-in.

Can you point to a branch which uses your current workflow? It would be interesting to see an implementation of this.

davidscn (Member) commented Oct 28, 2024

I get it now. You essentially move the “repetitions” loop into the run script of a case itself like this:

My first idea was to make use of the <profiling directory="" /> attribute, but honestly speaking, a simple renaming as suggested above seems to be much easier to apply and has the same effect.

this still writes the output mesh n times, which is unnecessary I/O and storage usage.

Yes, it is unnecessary I/O, but no additional storage usage: the output mesh files generated in the first run will simply be overwritten later on. Having the I/O multiple times shouldn't hurt the runtime, though.

the memory measurement also needs to be replicated

Each measurement should give the same result, right?

this complicates gathering data from cases, as a case now results in multiple data points. The solution proposed here is a drop-in.

I see what you have in mind. What I did last time was to generate a separate statistics file for each run and later average over all generated statistics files. Another option would be to load all frames immediately in each case and then aggregate, e.g., the mean timing. Either way, one would look at a single timing or at measurements derived from the timings; it's mostly a question of where to aggregate the data.

That's what I did last time, after storing each run as a separate statistics file:

https://github.com/davidscn/aste/blob/mapping-paper-setups/tools/mapping-tester/aggregate-timings.py
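For illustration only (this is not the linked script): a minimal sketch of averaging a timing column over per-run statistics files, where the file pattern and the "time" column name are assumptions:

import glob
import pandas as pd

# Assumed layout: one CSV per repetition, each with a "time" column of equal length.
frames = [pd.read_csv(f) for f in sorted(glob.glob("stats-run-*.csv"))]
mean_time = sum(df["time"] for df in frames) / len(frames)
print(mean_time)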

fsimonis force-pushed the add-mapping-tester-repetitions branch from ef98748 to 7adf960 on October 28, 2024