DOC #329 how to submit a workflow job
prjemian committed Nov 8, 2024
1 parent 7b26d8d commit 977b638
Showing 1 changed file with 88 additions and 13 deletions.
101 changes: 88 additions & 13 deletions docs/source/howto/_data_management.md
@@ -1,29 +1,33 @@
# Setup APS Data Management

This document describes how to set up and use the [APS Data
Management](https://git.aps.anl.gov/DM/dm-docs/-/wikis/home) (DM) Python
This document describes how to set up and submit a workflow job using the [APS
Data Management](https://git.aps.anl.gov/DM/dm-docs/-/wikis/home) (DM) Python
[API](https://git.aps.anl.gov/DM/dm-docs/-/wikis/DM/Beamline-Services/API-Reference)
(tools) in a Bluesky session.

This document provides guidance for workstations at the APS, where DM tools and
services are available.

## Background
See the DM API reference for more information about how to use the DM API and
tools. See the `apstools`
[documentation](https://bcda-aps.github.io/apstools/latest/api/_utils.html#apstools.utils.aps_data_management)
for a list of the support code available.

As stated in the DM [_Getting
Started_](https://git.aps.anl.gov/DM/dm-docs/-/wikis/DM/HowTos/Getting-Started)
guide:
## About APS Data Management (DM)

As stated in the DM _Getting Started_
[guide](https://git.aps.anl.gov/DM/dm-docs/-/wikis/DM/HowTos/Getting-Started):

> The APS Data Management System is a system for gathering together experimental
> data, metadata about the experiment and providing users access to the data
> based on a users role.
## DM is configured by Environment Variables

The [_Getting
Started_](https://git.aps.anl.gov/DM/dm-docs/-/wikis/DM/HowTos/Getting-Started#setting-up-the-environment)
guide explains how to set up a pre-configured conda environment to use the DM
tools from the command line directly. The setup procedure uses this shell command:
The DM _Getting Started_
[guide](https://git.aps.anl.gov/DM/dm-docs/-/wikis/DM/HowTos/Getting-Started)
explains how to activate a pre-configured conda environment to use the DM tools
directly from the command line. The setup procedure uses this shell command:

```bash
/home/DM_INSTALL_DIR/etc/dm.setup.sh
@@ -46,13 +50,13 @@ session.

The Bluesky conda environment has all the packages for both Bluesky and DM
already installed (for APS installations). One of those packages,
[`apstools`](https://bcda-aps.github.io/apstools/latest/api/_utils.html#aps-data-management),
[apstools](https://bcda-aps.github.io/apstools/latest/api/_utils.html#aps-data-management),
provides support for using DM in a Bluesky session.

<details>

Function
[`dm_source_environ()`](https://bcda-aps.github.io/apstools/latest/api/_utils.html#apstools.utils.aps_data_management.dm_source_environ)
The `dm_source_environ()`
[function](https://bcda-aps.github.io/apstools/latest/api/_utils.html#apstools.utils.aps_data_management.dm_source_environ)
is used internally to install the environment variables. It expects a global
variable `DM_SETUP_FILE` to be defined in the module.

@@ -82,3 +86,74 @@ In typical Bluesky installations at APS, this file name is defined in the
# Use bash shell, deactivate all conda environments, source this file:
DM_SETUP_FILE: "/home/dm/etc/dm.setup.sh"
```
### Example at APS XPCS station 8-ID-I
Show how many DM workflow jobs are processing now:
```py
In [1]: from apstools.utils import dm_setup
   ...:
   ...: dm_setup("/home/dm/etc/dm.setup.sh")
   ...:
Out[1]: '8idi'

In [2]: from dm.proc_web_service.api.procApiFactory import ProcApiFactory
   ...: api = ProcApiFactory.getWorkflowProcApi()
   ...: jobs = api.listProcessingJobs()
   ...: for j in jobs:
   ...:     if j["status"] not in ("done", "failed"):
   ...:         print(f"{j['id']=!r} {j.get('submissionTimestamp')=!r} {j['status']=!r}")
Out[2]: # lots of jobs, only showing a few of them
j['id']='6754e679-cedb-482b-bb4d-b58137f84001' j.get('submissionTimestamp')='2024/11/08 04:48:31 CST' j['status']='pending'
j['id']='ad7328ae-35ba-4418-a9fd-b3dcc873348f' j.get('submissionTimestamp')='2024/11/08 04:48:34 CST' j['status']='pending'
...
j['id']='72b6d1b7-b6e0-4eb8-87d5-5f52792a043b' j.get('submissionTimestamp')='2024/11/08 08:31:22 CST' j['status']='running'
j['id']='19252b7d-8961-4994-8977-86929811a988' j.get('submissionTimestamp')='2024/11/08 08:31:28 CST' j['status']='running'

```
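
When many jobs are listed, a count per status can be easier to read than one line per job. The following is a minimal sketch, not part of `apstools` or DM: the helper name `count_active_jobs()` is hypothetical, and the sample records (with shortened, made-up IDs) only imitate the `id` and `status` keys seen in the listing above.

```python
from collections import Counter

# Illustrative records shaped like the entries printed above (IDs are made up).
# With a live DM connection, use: jobs = api.listProcessingJobs()
jobs = [
    {"id": "6754e679", "status": "pending"},
    {"id": "ad7328ae", "status": "pending"},
    {"id": "72b6d1b7", "status": "running"},
    {"id": "00000000", "status": "done"},
]


def count_active_jobs(jobs, finished=("done", "failed")):
    """Count jobs by status, skipping those whose status is in ``finished``."""
    return Counter(j["status"] for j in jobs if j["status"] not in finished)


active = count_active_jobs(jobs)
print(dict(active))
```

The `finished` tuple reuses the two status values that the session above filters out; other status names may exist.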

## Submit a DM workflow job from a Bluesky session

Here, we demonstrate one way to start a DM workflow from a Bluesky session.

To submit a workflow job from a Bluesky session, first call `dm_setup()` as described above. Then,
get the "DM Processing API" as follows:

```py
from apstools.utils import dm_api_proc

api = dm_api_proc()
```

Choose the workflow by name:

```py
workflowOwner = api.username
workflowName = "xpcs8-02-gladier-boost"
```

Define the workflow arguments in a Python dictionary (these arguments are
specific to the XPCS workflow named above):

```py
argsDict = {
    "filePath": "H001_005_test_Feb_7-01000.h5",
    "qmap": "eiger4M_qmap_d36_s360.h5",
    "experimentName": "zhang202402",
    # any other keyword arguments required by the workflow come next ...
}
```
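
Since the arguments are workflow-specific, it can help to check the dictionary before submitting it. This is a minimal sketch under stated assumptions: the helper `missing_workflow_args()` and the `REQUIRED_KEYS` tuple are hypothetical, and the key names are simply the three shown in the example above, not a list confirmed by the workflow definition.

```python
# Hypothetical list of required argument names, copied from the example above.
REQUIRED_KEYS = ("filePath", "qmap", "experimentName")


def missing_workflow_args(args, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in ``args``."""
    return [key for key in required if not args.get(key)]


argsDict = {
    "filePath": "H001_005_test_Feb_7-01000.h5",
    "qmap": "eiger4M_qmap_d36_s360.h5",
    "experimentName": "zhang202402",
}

missing = missing_workflow_args(argsDict)
if missing:
    # Fail early, before any job is submitted to DM.
    raise ValueError(f"Missing workflow arguments: {missing}")
```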

Start the processing job:

```py
job = api.startProcessingJob(workflowOwner, workflowName, argsDict)
```

Show the processing job ID:

```py
print(f"{job['id']=!r}")
job['id']='c322e87c-ec43-4077-b074-eeef8522889c'
```
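
After submission, a session may want to wait until the job finishes. The sketch below shows one generic way to poll; it is an illustration, not the DM or `apstools` method. The names `wait_for_job()` and `fetch_job` are hypothetical: `fetch_job` is any callable you supply that returns the current job record (a dictionary with a `status` key) from DM. The `"done"` and `"failed"` status values are taken from the job listing example earlier in this document. A stand-in callable replaces the DM lookup so the example runs anywhere.

```python
import time


def wait_for_job(fetch_job, finished=("done", "failed"), poll_s=1.0, timeout_s=60.0):
    """Call ``fetch_job()`` repeatedly until its status is in ``finished``.

    Raises TimeoutError if the job has not finished within ``timeout_s`` seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        if job["status"] in finished:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout_s} s")
        time.sleep(poll_s)


# Stand-in for a DM lookup (assumption): reports "running" twice, then "done".
_statuses = iter(["running", "running", "done"])


def fake_fetch():
    return {"id": "c322e87c-ec43-4077-b074-eeef8522889c", "status": next(_statuses)}


final = wait_for_job(fake_fetch, poll_s=0.01, timeout_s=5.0)
print(final["status"])
```

Note that `time.sleep()` blocks the calling thread. Inside a Bluesky plan, a non-blocking wait such as `bluesky.plan_stubs.sleep()` would be the more appropriate choice.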
