-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use partial functions in flows to increase user flexibility in setting job attributes (e.g. executor) #1322
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1322 +/- ##
==========================================
- Coverage 98.90% 97.73% -1.18%
==========================================
Files 76 77 +1
Lines 2932 2955 +23
==========================================
- Hits 2900 2888 -12
- Misses 32 67 +35 ☔ View full report in Codecov by Sentry. |
After looking at the changes I am not sure if this would allow to attach different executors to the various jobs? Running the first one locally and the second one with an HPC executor? About the first con: I am not sure if it is clunkier, but I would advocate that it is more intuitive. People who are not proficient in Python might find partial functions easier to understand than passing multiple 'kwargs' (they might not know about 'kwargs' as well). Schematically it also makes more sense in my opinion, as it would be easier to explain how flow works by showing that it is built from jobs you send as parameters. You can show a first example where partial is not needed and go on from there for example. |
The following works, although it is very clunky: import covalent as ct
from ase.build import bulk
from quacc.recipes.emt.slabs import bulk_to_slabs_flow
from quacc.recipes.emt.core import static_job
from quacc import job
# Define the Atoms object
atoms = bulk("Cu")
# Modify the static job to run "local" (instead of "Dask")
@job(executor="local")
def slab_static_job(*args, **kwargs):
return static_job(*args, **kwargs)
# Dispatch the workflow
dispatch_id = ct.dispatch(bulk_to_slabs_flow)(atoms, slab_static_job=slab_static_job)
# Print the results
result = ct.get_result(dispatch_id, wait=True) The following was supposed to work, but it does not update the executor. I think this may be a Covalent-specific thing. I'll have to revisit this because the approach below (or one similar to it) was what I was hoping to achieve. import covalent as ct
from ase.build import bulk
from quacc.recipes.emt.slabs import bulk_to_slabs_flow
from quacc.recipes.emt.core import static_job
# Define the Atoms object
atoms = bulk("Cu")
# Modify the static job to run "local" (instead of "Dask")
static_job.electron_object.executor = "local"
# Dispatch the workflow
dispatch_id = ct.dispatch(bulk_to_slabs_flow)(atoms, slab_static_job=static_job)
# Print the results
result = ct.get_result(dispatch_id, wait=True)
Thanks for your input! I greatly appreciate it. " Schematically it also makes more sense in my opinion, as it would be easier to explain how flow works by showing that it is built from jobs you send as parameters." This is a very good point that I had not thought about before, but I believe you're right. |
Reverted the merge. Performance issues were found afterwards. Need to continue debugging. Moved to #1359. |
…g job attributes (e.g. executor) (#1359) Continuation of #1322. Challenges: - Mainly Dask, which does not play nicely when `functools.partial()` is applied to a `Delayed` object. See dask/dask#10707. There is now a workaround. - Also there is a minor issue with dynamic switching of Covalent executors, but there is a workaround. See AgnostiqHQ/covalent#1889 Remaining issue: Need to be able to parallelize the following code on Dask. ```python from dask.distributed import Client from ase.build import bulk from quacc.recipes.emt.slabs import bulk_to_slabs_flow from functools import partial client = Client() atoms = bulk("Cu") delayed = bulk_to_slabs_flow(atoms) result = client.gather(client.compute(delayed)) ``` Hint: To monitor, set `"logfile": ""`. --------- Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>
Closes #1320, although I'm not 100% confident that this is the ideal approach yet. I'm not comfortable with merging this without further justification that this is the best mechanism.
Pros:
Cons:
Task
as a kwarg to aFlow
definition causes a JSON serialization error PrefectHQ/prefect#10135.