AiiDA release roadmap
Information on upcoming AiiDA releases and features being actively worked on:
- GitHub milestones: We use milestones to collect issues we plan to resolve in specific upcoming releases of aiida-core.
- GitHub projects: We use GitHub projects to group issues by topic where appropriate. Projects may be tackled in focussed AiiDA coding days, or they may extend over longer periods of time.
Suggested features & future plans:
- AiiDA Enhancement Proposals (AEPs): AEPs are the preferred vehicle for proposing substantial new features and design choices. Note in particular the open pull requests.
- Google Summer of Code projects: Projects listed here are considered to be suitable for newcomers. Help is welcome!
- GitHub issues: Open issues with type/feature request or type/enhancement
- Development roadmap (below)
Please note: All of these resources are working drafts and subject to change.
Tentative roadmap of upcoming releases with major features/changes that will be included:
This is a short overview of directions under discussion for future AiiDA development. Most of these items require significant development efforts, and will be achieved more quickly with dedicated efforts from external contributors.
In short, help is always welcome!
AiiDA communicates with HPC centres via SSH keys. The SSH agent already allows AiiDA to deal with password-protected SSH keys, but so far there is no support for HPC centres that require two-factor authentication (2FA). With the cyberattack on European HPC centres in May 2020, this is becoming a pressing issue.
There are multiple routes to explore: simpler installation on login nodes of HPC centers, as well as solutions for scoping SSH key access (e.g. restricting it to certain time periods).
For more details, see https://github.com/aiidateam/aiida-core/issues/3929
See https://github.com/sphuber/aiida-shell
AiiDA has two classes of processes: the locally running process functions, and the remotely executed CalcJobs.
AiiDA treats simulation codes written in Python just like any other executable with input files and output files.
While you can use Python packages inside work functions and calculation functions that run locally on your computer, for codes running on remote machines AiiDA supports only file-based interaction.
For Python codes, it seems wasteful to serialize/deserialize Python objects to/from files, and a direct reuse of Python objects would seem to be useful.
Related projects: pyOS, mincePy, aiida-dynamic-workflows
See the Google Summer of Code project
Note: Basic support for containerized codes is scheduled to be released with AiiDA 2.1. This is being dealt with in #5507
Contacts: @giovannipizzi, @jusong
See the Google Summer of Code project
Contacts: @chrisjsewell, @giovannipizzi
Creating a command-line interface (CLI) or a graphical user interface (GUI) to provide the information needed to create instances of the AiiDA ORM, e.g. a Computer, a Code or even an Int, currently involves boilerplate code.
It would be useful if the Computer/Code/... classes could be annotated in a way that makes the creation of a CLI/GUI automatic.
One open question in this respect is how to deal with validation: most of the validation currently occurs at the click level.
In order to avoid duplication of validation checks, these would need to move into the ORM classes themselves.
This has been added for creating codes in #5510. The ORM class provides a simple spec of its properties and the CLI is generated automatically from this. This is done using the aiida.cmdline.groups.dynamic.DynamicEntryPointCommandGroup.
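The idea of generating a CLI from a class-level spec can be illustrated with a minimal, self-contained sketch. It is not the implementation from #5510 and does not use click or DynamicEntryPointCommandGroup: it assumes a hypothetical CodeSpec dataclass and uses the standard-library argparse module, purely to show validation living in the class while the options are derived from the declared fields.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class CodeSpec:
    """Hypothetical stand-in for an ORM class that declares its own properties."""

    label: str
    executable: str
    num_procs: int = 1

    def __post_init__(self):
        # Validation lives in the class itself, so a CLI and a GUI would share it.
        if self.num_procs < 1:
            raise ValueError("num_procs must be at least 1")


def build_parser(cls):
    """Derive one command-line option per declared field of ``cls``.

    Only plain defaults are handled; ``default_factory`` is out of scope here.
    """
    parser = argparse.ArgumentParser(prog=cls.__name__)
    for spec in fields(cls):
        has_default = spec.default is not MISSING
        parser.add_argument(
            "--" + spec.name.replace("_", "-"),
            dest=spec.name,
            type=spec.type,  # the annotation doubles as the converter (str, int)
            required=not has_default,
            default=spec.default if has_default else None,
        )
    return parser


def create_from_cli(cls, argv):
    """Parse ``argv`` and build an instance; the class performs the validation."""
    options = build_parser(cls).parse_args(argv)
    return cls(**vars(options))


code = create_from_cli(CodeSpec, ["--label", "pw", "--executable", "/usr/bin/pw.x"])
print(code)
```

Because the check on num_procs sits in the class, a GUI built from the same fields would reject invalid input without duplicating the rule.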
AiiDA originated from the ab initio electronic structure community, where individual calculation jobs usually occupy a full compute node. As users from neighboring disciplines start using AiiDA (e.g. for force-field calculations), this is no longer the case, and one node may need to be shared between multiple AiiDA jobs.
Many HPC centers have queues that allow for node-sharing of multiple serial jobs, but there are centers that only allow one job per node. For such cases, it would be useful if AiiDA could pack multiple jobs into one.
See section "10. Task farming" in the report of the 2020 AiiDA hackathon for more details.
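As a rough illustration of what "packing multiple jobs into one" could mean, the sketch below groups small jobs into node-sized bundles with a first-fit-decreasing heuristic. The Job and Bundle classes and the pack_jobs function are hypothetical names chosen for this example; it is not how any AiiDA plugin does it, and it considers only core counts (no memory or walltime).

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cores: int


@dataclass
class Bundle:
    """One scheduler submission that occupies a full node."""

    capacity: int
    jobs: list = field(default_factory=list)

    @property
    def used(self):
        return sum(job.cores for job in self.jobs)


def pack_jobs(jobs, cores_per_node):
    """First-fit decreasing: put each job in the first bundle that has room."""
    bundles = []
    for job in sorted(jobs, key=lambda j: j.cores, reverse=True):
        if job.cores > cores_per_node:
            raise ValueError(f"{job.name} does not fit on a single node")
        for bundle in bundles:
            if bundle.used + job.cores <= bundle.capacity:
                bundle.jobs.append(job)
                break
        else:
            # No existing bundle has room, so open a new one.
            bundles.append(Bundle(capacity=cores_per_node, jobs=[job]))
    return bundles


jobs = [Job("a", 8), Job("b", 4), Job("c", 4), Job("d", 2), Job("e", 6)]
for bundle in pack_jobs(jobs, cores_per_node=12):
    print([job.name for job in bundle.jobs], bundle.used)
```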
We also need to make sure that the plugin is well documented and clearly advertised in the main AiiDA documentation, so that it is easy to find.
Contacts: @mbercx, @giovannipizzi, @pzarabadip
While AiiDA makes it easy to submit jobs to different computers, it is the responsibility of the user to do so. It would be useful if AiiDA included some basic cross-computer scheduling features, such as setting a maximum number of jobs running on one computer at any given time.
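A per-computer job limit of this kind could look roughly like the following sketch. The dispatch function, its arguments and the computer names are assumptions made for illustration; nothing here corresponds to an existing AiiDA API.

```python
from collections import Counter


def dispatch(pending, running, limits):
    """Split pending jobs into those that may start now and those that must wait.

    pending: list of (job_id, computer) in submission order
    running: mapping computer -> number of jobs currently running there
    limits:  mapping computer -> maximum concurrent jobs (absent means unlimited)
    """
    slots = Counter(running)
    started, waiting = [], []
    for job_id, computer in pending:
        limit = limits.get(computer)
        if limit is None or slots[computer] < limit:
            slots[computer] += 1  # reserve the slot for this job
            started.append((job_id, computer))
        else:
            waiting.append((job_id, computer))
    return started, waiting


started, waiting = dispatch(
    pending=[("j1", "cluster_a"), ("j2", "cluster_a"), ("j3", "localhost")],
    running={"cluster_a": 1},
    limits={"cluster_a": 2},
)
print(started, waiting)
```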
AiiDA provides export files as a means of exchanging AiiDA graphs.
It would be useful if AiiDA supported pushing/pulling changes in AiiDA graphs to/from collaborators, transferring only the "delta" between the two graphs.
Early implementation: https://github.com/szoupanos/aiida_core/tree/sharing_v1
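Conceptually, the "delta" is whatever one side has that the other lacks. The sketch below assumes that nodes are identified by a UUID and are not modified once stored, so the difference reduces to set operations; compute_delta and the data layout are hypothetical and unrelated to the early implementation linked above.

```python
def compute_delta(local_nodes, local_links, remote_uuids, remote_links):
    """Return the nodes and links that exist locally but not on the remote side.

    local_nodes:  dict mapping UUID -> node payload
    local_links:  set of (source_uuid, target_uuid, label) triples
    remote_uuids: set of UUIDs already present remotely
    remote_links: set of link triples already present remotely
    """
    new_nodes = {
        uuid: payload
        for uuid, payload in local_nodes.items()
        if uuid not in remote_uuids
    }
    new_links = local_links - remote_links
    return new_nodes, new_links


nodes, links = compute_delta(
    local_nodes={"u1": "structure", "u2": "calculation", "u3": "energy"},
    local_links={("u1", "u2", "input"), ("u2", "u3", "output")},
    remote_uuids={"u1", "u2"},
    remote_links={("u1", "u2", "input")},
)
print(nodes, links)
```

Only the new node and the link that attaches it would need to be transferred in this example.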
AiiDA provenance graphs can grow very complex. A hierarchical view of AiiDA graphs (e.g. by giving users the ability to "fold/unfold" workflow nodes) will be critical to make them useful for grasping the high-level structure of complex workflows.
This applies both to static visualization (verdi node graph generate) and to interactive visualization (e.g. the provenance browser).
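One way to think about "folding" is to replace every node that belongs to a workflow by the workflow node itself and to drop the edges that become internal. The sketch below does this on a plain edge set; the fold function and the node names are hypothetical, and it assumes that the folded workflows are not nested inside each other.

```python
def fold(edges, members, folded):
    """Collapse the members of each folded workflow into a single node.

    edges:   set of (source, target) pairs
    members: dict mapping workflow -> set of nodes it contains
    folded:  set of workflows to show collapsed
    """
    representative = {}
    for workflow in folded:
        for node in members[workflow]:
            representative[node] = workflow

    collapsed = set()
    for source, target in edges:
        source = representative.get(source, source)
        target = representative.get(target, target)
        if source != target:  # skip edges that are now internal to a workflow
            collapsed.add((source, target))
    return collapsed


edges = {
    ("structure", "calc1"),
    ("calc1", "remote1"),
    ("remote1", "calc2"),
    ("calc2", "energy"),
}
members = {"relax_wf": {"calc1", "remote1", "calc2"}}
print(sorted(fold(edges, members, folded={"relax_wf"})))
```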
Contact: @ltalirz
The verdi export command lets you export your AiiDA graph based on provenance rules, starting from a set of nodes specified by the user.
A common use case is to export the entire AiiDA graph, e.g. for backup purposes.
While we document how to create backups, it involves different tools, and there is currently no shorthand in verdi export to export the entire profile.
See https://github.com/aiidateam/aiida-core/issues/974 for more details
See mailing list and implementation candidate
Contacts: @giovannipizzi, @bonanzhu
Sometimes, data that is necessary for provenance is licensed and cannot be shared publicly (e.g. atomic structures from commercial databases, pseudopotential files, ...). While this is contrary to the open science credo, it is a relevant use case and AiiDA should offer a convenient way to support it.
AiiDA can read computer & code configurations from yaml files.
It would be useful to create an online registry of computer and code setups for major high-performance computing clusters.
An open question is how to best deal with customizations needed on an individual user level (e.g. which queues to use, etc.).
Do we need an additional templating language in the yaml files (like jinja2) or can we get away without it?
Repository: https://github.com/aiidateam/aiida-code-registry
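To make the templating question concrete, the sketch below fills per-user values into a shared setup using only the standard-library string.Template, i.e. one lightweight alternative to jinja2. The keys and values in the template are illustrative placeholders and do not reflect the schema used by the registry.

```python
from string import Template

# Hypothetical registry entry; the keys are illustrative, not an official schema.
SHARED_SETUP = Template(
    "label: example-cluster\n"
    "hostname: cluster.example.org\n"
    "queue_name: $queue\n"
    "account: $account\n"
)


def render_setup(template, user_values):
    """Fill in the per-user values; a missing value raises KeyError."""
    return template.substitute(user_values)


print(render_setup(SHARED_SETUP, {"queue": "debug", "account": "project42"}))
```

Plain substitution covers simple per-user values such as a queue name; conditionals or loops in the setup would be the point at which a full templating language becomes necessary.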
Most of AiiDA's API is already domain-agnostic but some materials science specific classes remain in aiida-core (e.g. StructureData, KpointsData, ...).
In order to make aiida-core more friendly to other disciplines, we would like to move these classes out to separate packages (e.g. aiida-pseudopotenials, aiida-atomic, ...).
One important open question is how to handle database schema migrations for data types defined by plugins.
Plugins can extend AiiDA in many ways, but one major extension point is still missing: plugins cannot yet add to the AiiDA REST API (e.g. for new data types).
This would require adding a new entry point group aiida.rest and using this group to announce both existing and new REST endpoints.
One important open question in this regard is whether a switch from REST to GraphQL would make this easier (see aiida-graphql prototype and presentation).
See also: https://github.com/aiidateam/aiida-restapi
While AiiDA has a web interface to query the AiiDA graph (the AiiDA REST API), it currently does not have a web interface for workflow management.
In order to integrate AiiDA into existing workflow management platforms, a web interface for workflow management (mimicking e.g. what can currently be achieved via the verdi command line interface) is necessary.
Prototype implementation: https://github.com/aiidateam/aiida-restapi
Comment [GP]: should we first focus on a generic REST API interface for workflows, or rather on a specific (but generalisable) one, e.g. for the common workflows?
Contacts: @ltalirz, @chrisjsewell
Interest: @csadorf
Calculations that pass (part of) their outputs on as a RemoteData input break once their scratch directory is cleaned.
This becomes problematic when caching is introduced: when trying to re-run the calculation (to regenerate the remote folder), caching will happily reuse the existing calculation (whose remote folder no longer exists).
As a result, subsequent calculations that use the remote folder will fail.
Moreover, the cached calculation node points to the same remote folder as the original calculation node, i.e. it is only a 'shallow copy'. Cleaning the remote working directory of the cached node therefore also affects the original node, which can then no longer serve as a parent calculation that passes its outputs on as RemoteData.
The caching policy needs to be redesigned to overcome these issues, without increasing the connection burden and the remote disk usage too much.
See https://github.com/aiidateam/aiida-core/issues/3735 and https://github.com/aiidateam/aiida-core/issues/5178 for more details.
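One conceivable policy, shown here only to make the problem tangible, is to reject a cache hit when the consumer needs the remote folder and that folder has already been cleaned. The CachedCalc class and find_cache_hit function are hypothetical and not part of AiiDA's caching mechanism; the linked issues discuss the actual design options.

```python
from dataclasses import dataclass


@dataclass
class CachedCalc:
    """Hypothetical record of a finished calculation."""

    input_hash: str
    remote_path: str
    remote_cleaned: bool = False


def find_cache_hit(input_hash, candidates, needs_remote_folder):
    """Return the first usable cached calculation, or None to force a re-run."""
    for calc in candidates:
        if calc.input_hash != input_hash:
            continue
        if needs_remote_folder and calc.remote_cleaned:
            # The outputs are known, but the scratch folder is gone.
            continue
        return calc
    return None


candidates = [
    CachedCalc("abc", "/scratch/run1", remote_cleaned=True),
    CachedCalc("abc", "/scratch/run2"),
]
print(find_cache_hit("abc", candidates, needs_remote_folder=True))
```

This addresses only the first issue; the shared remote folder between a cached node and its original would still need a separate solution.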
Contacts: @jusong
For plugins that consist of many new workflows and work chains, it is hard to write unit tests during development; instead, developers have to create profiles and actually run the workflows.
The GitHub CI test has to be configured to run the daemon in the background and run the workflow as a whole to check the outputs.
There are efforts to tackle this with caching and pre-installed/pre-configured codes, e.g. aiida-testing, but it is not easy to use and still tests the workflow as a whole in a single test case.
The idea is to have a test framework based on pytest that makes it easy to create mock tests for specific steps of a workflow.
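The kind of step-level test this aims for might look like the sketch below. RelaxWorkflow is a hypothetical plain-Python class, not an AiiDA WorkChain, and the mocking uses the standard-library unittest.mock; the point is only that each step is exercised in isolation, without a profile, a daemon or a real code.

```python
from unittest.mock import Mock


class RelaxWorkflow:
    """Hypothetical two-step workflow; ``submit`` launches a calculation."""

    def __init__(self, submit):
        self.submit = submit
        self.ctx = {}

    def run_relax(self):
        self.ctx["relax"] = self.submit("relax", {"structure": "Si"})

    def inspect_relax(self):
        if self.ctx["relax"]["exit_status"] != 0:
            return "ERROR_RELAX_FAILED"
        self.ctx["energy"] = self.ctx["relax"]["energy"]
        return None


def test_run_relax_submits_once():
    # The mock replaces the expensive submission and records how it was called.
    submit = Mock(return_value={"exit_status": 0, "energy": -1.5})
    workflow = RelaxWorkflow(submit)
    workflow.run_relax()
    submit.assert_called_once_with("relax", {"structure": "Si"})


def test_inspect_relax_reports_failure():
    # The step is tested directly by preparing the context it expects.
    workflow = RelaxWorkflow(submit=Mock())
    workflow.ctx["relax"] = {"exit_status": 400}
    assert workflow.inspect_relax() == "ERROR_RELAX_FAILED"
```

Both functions follow pytest's naming convention, so running pytest on a file containing them would collect and execute them.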
Contacts: @jusong