Usability: Allow users to relaunch a previously completed process with minimal required setup #9

Open
2 of 4 tasks
sphuber opened this issue Feb 23, 2023 · 0 comments
Labels: roadmap/proposed (a roadmap item that has been proposed but not yet processed)

Motivation

A cornerstone of AiiDA's design is the provenance graph. Its purpose is to improve the reproducibility of computational data. At the very minimum, the provenance graph should allow a user to retrace the origins of a particular piece of data stored within it. However, to make the data truly reproducible, AiiDA should enable the user to actually relaunch a previously completed workflow or calculation and reproduce its outputs. The former goal has been achieved, but the latter is still not trivial.

Desired Outcome

The roadmap item can be considered complete if the following is possible:

  • User A can run a calculation or workflow through AiiDA and export it to an archive.
  • User B can import the archive and relaunch the completed process with minimal setup. A rough sketch of the user interface would look something like:
    from aiida.engine import relaunch_completed_process  # proposed function, not an existing API
    from aiida.orm import load_node
    completed = load_node(PK)  # PK: identifier of the imported, completed process
    resubmitted = relaunch_completed_process(completed)
    Here relaunch_completed_process is a proposed utility function that would perform as much of the required setup as possible, such as setting up and configuring any computers and codes.

Impact

The potential impact of this functionality is significant, although it is somewhat attenuated because the approach will not be practical for every project run with AiiDA. For projects that can fulfill the requirements, though, the reproducibility of the produced data should be close to 100%, which has not yet been accomplished in computational science and would therefore be a powerful demonstrator.

Complexity

The problem can be decomposed into two subproblems:

  1. Ensure that all inputs of any process are perfectly conserved in the provenance graph, such that they can also be perfectly recreated for a relaunch.
  2. Ensure that the required computing environments can be automatically recreated and reproduced.

The first problem was essentially already solved by AiiDA's provenance graph, although it was incomplete for the metadata inputs. This has since been addressed in aiida-core and should be released with the next feature release v2.3.
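To illustrate what "perfectly conserved" means for the first subproblem, here is a minimal sketch in plain Python. It is not AiiDA code: `StoredProcess` and `rebuild_launch_inputs` are hypothetical names, and the input values are made up for illustration. The point is only that a relaunch can reproduce the original launch inputs if the metadata is recorded alongside the data inputs.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class StoredProcess:
    """Hypothetical stand-in for a completed process recorded in a provenance graph."""

    process_type: str
    inputs: dict[str, Any]  # data inputs
    metadata: dict[str, Any] = field(default_factory=dict)  # e.g. scheduler options


def rebuild_launch_inputs(stored: StoredProcess) -> dict[str, Any]:
    """Recreate the full input dictionary needed to launch the process again."""
    launch_inputs = dict(stored.inputs)
    # If the metadata had not been recorded, a relaunch would have to guess these values.
    launch_inputs["metadata"] = dict(stored.metadata)
    return launch_inputs


# Inputs as the original user passed them (illustrative values only).
original_inputs = {
    "x": 1,
    "y": 2,
    "metadata": {"options": {"max_wallclock_seconds": 3600}},
}

# What gets recorded once the process has completed.
stored = StoredProcess(
    process_type="ExampleCalculation",
    inputs={key: value for key, value in original_inputs.items() if key != "metadata"},
    metadata=original_inputs["metadata"],
)

rebuilt = rebuild_launch_inputs(stored)
```

If `metadata` were dropped from the record, `rebuilt` would no longer equal `original_inputs`, which is the kind of gap the paragraph above describes.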

The second problem is more challenging. Most completed processes will include at least one CalcJob, and these have a Code input. In most cases, this is an instance of InstalledCode, which represents an executable on a remote computing resource that is configured as a Computer node in the provenance graph. The problem is that the Computer used for the original calculation job is personal to the user who launched it, and the user wanting to reproduce the calculation most likely does not have access to the same computer. Therefore, the first challenge is to replace the original computer with one that the relaunching user has access to.

But even then, the original InstalledCode input only records the filepath to the executable on the remote computer. It is unlikely that the computer on which the job is to be relaunched has the code at the exact same path. Currently, the user therefore has to manually create an InstalledCode node to replace the original input. This problem could be significantly simplified if the code were containerized, for example using Docker. If the InstalledCode simply refers to an executable within a Docker container, it should be transferable to any computing architecture and AiiDA would be able to automatically reconfigure an InstalledCode on the appropriate Computer.
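The difference between the two cases can be sketched in plain Python. This is not AiiDA code: `CodeRecord`, `retarget_code`, the computer labels, the path and the image name are all hypothetical and only illustrate why a containerized code can be moved to another computer automatically while a bare filepath cannot.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CodeRecord:
    """Hypothetical record of the code input of a calculation job."""

    computer: str
    filepath_executable: str
    image_name: Optional[str] = None  # set only when the code runs inside a container


def retarget_code(code: CodeRecord, new_computer: str) -> CodeRecord:
    """Return an equivalent code on ``new_computer``, or raise if manual setup is needed."""
    if code.image_name is None:
        # The path only has meaning on the original computer.
        raise ValueError(
            f"{code.filepath_executable!r} is a path on {code.computer!r}; "
            f"it cannot be assumed to exist on {new_computer!r}"
        )
    # The path lives inside the image, so it is the same wherever the image runs.
    return replace(code, computer=new_computer)


plain = CodeRecord("computer-of-user-a", "/home/user_a/bin/simulate")
containerized = CodeRecord(
    "computer-of-user-a", "/usr/local/bin/simulate", image_name="example/simulate:1.0"
)

retargeted = retarget_code(containerized, "computer-of-user-b")
```

Calling `retarget_code(plain, "computer-of-user-b")` raises a `ValueError`, which corresponds to the manual step that is currently required of the user.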

This is exactly where the limitation of the entire proposal comes in: the functionality will only be available to processes that use containerized codes for calculation jobs. This puts a burden on users, since they will have to create a container before they can run calculation jobs. The approach is therefore less suitable for research projects where the remote code is a moving target and recreating a Docker container for each iteration would not be feasible.

Progress

The following steps need to be taken to complete the roadmap item:

@sphuber sphuber added the roadmap/proposed A roadmap item that has been proposed but not yet processed label Feb 23, 2023
@sphuber sphuber self-assigned this Feb 23, 2023