Usability: Make it easy to run Python-based codes #19
Comments
I think https://github.com/microsoft/aiida-dynamic-workflows is another great solution to this problem. Its approach is simply to treat Python code as data and execute it via the Python interpreter on the remote machine.
Good point, I really need to find some time to give If that is the case, then this would of course restrict the compute environments that are compatible. Most computing environments have Python installed, but don't necessarily have access to the same AiiDA environment. Of course one could then only submit code that doesn't use AiiDA's API, but that means you also have to pass non
The remote compute environment does not have access to the AiiDA profile.
It includes automatic type conversions for calculation/workflow inputs where this is possible. Inside the decorated Python functions, AiiDA objects are of course not allowed.
This seems quite interesting. I can see use cases for some software packages, e.g. thermocalc, where one can use a Python interface to submit calculations. Writing a full plugin for these might not be desirable, as the Python interface already exists. Making AiiDA more flexible in how it runs such codes would make it much easier to use for codes that are not DFT/MD, which has been the focus until now.
Motivation
AiiDA is mostly used in the chemistry and materials science domains, where Python is very popular. Many users, therefore, will have Python-based codes that they want to use through or integrate with AiiDA.
If the code can be used as a library, the `calcfunction` concept makes it easy to use the code in AiiDA while keeping the provenance. However, the `calcfunction` has limitations. It can only be run on the machine where AiiDA runs, so heavy operations can overload the daemon workers. In addition, `calcfunction` execution does not support checkpointing: the execution has to finish in one go and cannot be interrupted and restarted later. This poses a problem for code that has a long execution time.
The solution to both these problems is to run the code as a standalone script on another machine as a calculation job. However, this requires writing a dedicated `CalcJob` plugin, and potentially a `Parser` plugin to parse the results, which is complicated and takes a lot of time. In addition, these Python scripts are often "moving targets" whose interface changes fast. This means the plugins have to change with them, essentially doubling the development cost and severely slowing down development.
This limitation often makes AiiDA too restrictive for the many use cases where code is still in development and developing `CalcJob` and `Parser` plugins is simply not worth the time.
Desired Outcome
There should be a way for users to easily run Python-based codes without having to write dedicated plugins.
Impact
Since Python is very popular in the field where most AiiDA users are active, but also in general in computational science, making it easier to run these codes with AiiDA would greatly increase adoption and improve the efficiency of users.
Complexity
This section should comment on the complexity of resolving the item, and make a rough estimate as to the amount of work/time it would take to reach the desired outcome.
Background
This issue is very closely related to roadmap item #5 which discusses making it easy to run "codes" (or executables) without requiring dedicated plugins. In this case, the code is Python itself and the relevant script would be a command line argument.
Progress
As mentioned in the previous section, this use case is strongly related to #5, which already has a concrete solution in the form of `aiida-shell`. This plugin package also makes it easy to run arbitrary Python scripts on any computer configured in AiiDA, and it is already being used in production.
As an example, consider the following script:
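
A minimal sketch of such a script, reconstructed only from the description that follows (it reads `matrix.npy` and writes `inverse.npy` in the working directory). The function name `invert_matrix` and its two parameters are assumptions made for illustration, not part of the original issue:

```python
"""Invert the matrix stored in matrix.npy and write the result to inverse.npy."""
import numpy as np


def invert_matrix(source="matrix.npy", target="inverse.npy"):
    """Load a square matrix from `source` and save its inverse to `target`.

    Hypothetical helper: the name and the default file names are assumptions
    based on the prose description of the script.
    """
    matrix = np.load(source)
    # np.linalg.inv raises numpy.linalg.LinAlgError for singular or non-square input.
    inverse = np.linalg.inv(matrix)
    np.save(target, inverse)
    return inverse


if __name__ == "__main__":
    # When executed as a script, operate on the files in the current directory.
    invert_matrix()
```

Keeping the logic in a function behind the `__main__` guard means the same file can be executed as a standalone script or imported as a library.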
It loads a `numpy` matrix from the file `matrix.npy`, computes the inverse, and writes the result to the file `inverse.npy`. It is currently not possible to have AiiDA run this on another computer unless a dedicated `CalcJob` plugin is written first.
Now consider how `aiida-shell` makes this possible. Imagine that the numpy script is saved in the current working directory as `numpy_script.py`. The following is a complete script that works with `aiida-shell` v0.4.0:
With a few lines of Python, the script can be run (on any computer configured in AiiDA that has Python installed) without having to write a `CalcJob` plugin. The custom parser is fully optional: without it, `inverse.npy` would automatically be attached as a `SinglefileData`. But since it is more useful to have it as an `ArrayData`, it makes sense to use the on-the-fly parser to change the data type of the output.
Note that in this example the job was "run"; however, it can just as easily be submitted to the daemon by passing `submit=True` to the `launch_shell_job` function.