Support CWL Prov with cwltool for OGC API - Processes IPT #673
Labels
feature/CWL
Issue related to CWL support
feature/job/provenance
Issue related to W3C PROV metadata applied to a Job.
process/OAP-Part4: Jobs
OGC API - Processes - Part 4: Job Management
process/workflow
Related to a Workflow process.
project/OGC-GDC
Developments related to OGC GeoDataCube
project/OGC-IPT
Developments related to OGC Integrity, Provenance, and Trust
triage/feature
New requested feature.
Description
Given the rising need for IPT (Integrity, Provenance, Trust) through OGC APIs and their workflow processing, the provenance capabilities of CWL should be leveraged to accomplish this goal. This would add metadata references within the CWL Application Packages themselves, allowing better open-science and IPT workflow tracking.
To Do
- GET /jobs/{jobId}/run to return the PROV-JSON produced by cwltool --provenance (edit: won't do)
- GET /jobs/{jobId}/prov as alternate endpoint
- consider additional PROV endpoints vs what cwlprov offers: https://gitlab.ogc.org/ogc/T20-GDC/-/wikis/GDC-Provenance-demonstration-GeoLabs#usage
  - GET /jobs/{jobId}/prov (contents of various metadata/provenance/primary.cwlprov.{ext})
  - GET /jobs/{jobId}/prov/info (as cwlprov info or metadata/manifest.json)
  - GET /jobs/{jobId}/prov/who (as cwlprov who)
  - GET /jobs/{jobId}/prov/inputs (as cwlprov inputs)
  - GET /jobs/{jobId}/prov/inputs/{id} (as cwlprov inputs [<run-id>])
  - GET /jobs/{jobId}/prov/outputs (as cwlprov outputs)
  - GET /jobs/{jobId}/prov/outputs/{id} (as cwlprov outputs [<run-id>])
  - GET /jobs/{jobId}/prov/run (as cwlprov run --inputs --outputs --labels --duration --steps) (use all flags to get all available metadata)
  - GET /jobs/{jobId}/prov/run/{id} (as cwlprov run [<run-id>])
- Alternate PROV-XML/RDF/etc. if Accept requests it (all variants should already be generated by cwltool as various manifest representations)
- When generating cwltool --provenance results, avoid duplicating results already found in WPS-outputs to save space (use their URI for cross-reference).
  (if File is saved only as the {"class":"File", "path": "..."} definition, this could be allowed to avoid extra code managing the references; cwltool's own content limit)
- Any additional metadata/links pointing at the specific job and process executed that should be embedded in the PROV contents
- Cross-walk with "Support POST /jobs for various workflow implementations #716" requirements
- Include ORCID and other relevant PROV metadata #783 (see cwltool --orcid --enable-user-provenance --enable-host-provenance)
- update CLI to provide a provenance operation
- ensure provenance-related requirements are added to conformance
- ensure provenance links are returned in job status response
- add more W3C PROV details about process I/O #780
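As a rough illustration of the endpoint items above, the sketch below shows one way the proposed /prov routes could be resolved, either to a file of the research object written by cwltool --provenance or to a cwlprov command line. It is not the project's implementation. The function names, the media-type table, and the use of cwlprov's --directory option are assumptions to verify against the installed cwltool and cwlprov versions. The Accept handling is deliberately simplistic and ignores q-values.

```python
from typing import List, Optional

# Assumed mapping from media types to the extensions of the
# metadata/provenance/primary.cwlprov.{ext} files (verify locally).
PROV_EXTENSIONS = {
    "application/json": "json",              # PROV-JSON
    "application/ld+json": "jsonld",         # JSON-LD
    "application/n-triples": "nt",           # N-Triples
    "text/provenance-notation": "provn",     # PROV-N
    "text/turtle": "ttl",                    # Turtle
    "application/provenance+xml": "xml",     # PROV-XML
}
DEFAULT_MEDIA_TYPE = "application/json"

# Sub-commands listed in the issue; "info" and "who" take no identifier.
CWLPROV_COMMANDS = {"info": False, "who": False, "inputs": True, "outputs": True, "run": True}
RUN_FLAGS = ["--inputs", "--outputs", "--labels", "--duration", "--steps"]


def select_prov_file(accept: Optional[str]) -> Optional[str]:
    """Pick the PROV file for GET /jobs/{jobId}/prov from an Accept header.

    Returns a path relative to the provenance directory, or None when no
    listed media type is supported (the caller could answer 406).
    Entries are tried in header order; q-values are ignored in this sketch.
    """
    for item in (accept or "*/*").split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in ("*/*", "application/*"):
            media_type = DEFAULT_MEDIA_TYPE
        ext = PROV_EXTENSIONS.get(media_type)
        if ext:
            return f"metadata/provenance/primary.cwlprov.{ext}"
    return None


def cwlprov_command(prov_dir: str, sub_path: str) -> List[str]:
    """Build the cwlprov arguments for a path below /jobs/{jobId}/prov/.

    Example: "inputs/<id>" maps to "cwlprov --directory <dir> inputs <id>".
    Raises ValueError for paths that match none of the proposed endpoints.
    """
    parts = [part for part in sub_path.strip("/").split("/") if part]
    if not parts or parts[0] not in CWLPROV_COMMANDS or len(parts) > 2:
        raise ValueError(f"unsupported provenance path: {sub_path!r}")
    if len(parts) == 2 and not CWLPROV_COMMANDS[parts[0]]:
        raise ValueError(f"'{parts[0]}' does not accept an identifier")
    command = ["cwlprov", "--directory", prov_dir, parts[0]]
    if parts[0] == "run":
        command += RUN_FLAGS  # request all available run metadata
    if len(parts) == 2:
        command.append(parts[1])
    return command
```

The returned argument list could be passed to subprocess.run by the job endpoint, while the file path from select_prov_file could be served directly from the job's stored provenance directory.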
References
cwltool --provenance
Prov data input output common-workflow-language/cwltool#1989
Implementation