Update Part 2 with the consistent use of process "definition" and process "description" #282

pvretano · 2022-03-01T20:23:27Z

Update part 2 to reflect that fact that you:
GET a process description from /processes/{processId}' (same terminology is used in part 1) and you POST a process **definition** to /processes` to deploy a process. In other words a process definition is an application package (i.e. description + execution unit + possibly other stuff).

Also, a new endpoint /processes/{processId}/package for get the process definition OR we use the same endpoint (i.e. /processes/{processId}) and use content negotiation to get the description versus the definition. There is already a MIME type for an application package so that should be straight forward.

The text was updated successfully, but these errors were encountered:

jerstlouis · 2022-03-01T20:33:24Z

@pvretano At the moment, the OGC API Application Package is only one potential language/format for the process definition (OGC Application Package is a separate conformance class).

If we keep it that way, neither of those options (calling it /package or apppkg media type) seem appropriate.

I feel that the definition and description are different enough that they should be separate resources, especially given the amount of confusion that mixing those up generate.

I would prefer GET /processes/{processId}/definition or GET /processes/{processId}/executionUnit.
I would have suggested GET /processes/{processId}/execution if we were not already planning execution via GET.

pvretano · 2022-03-01T20:36:53Z

@jerstlouis application pkg as defined in PART 2 is NOT MANDATORY. If it suited to your purpose it is recommended. Other language/formats are possible and if another one emerges (e.g. CWL or MOAW) we can add that as another recommendation.

jerstlouis · 2022-03-01T20:37:35Z

@pvretano Yes exactly, that's what I meant.

cportele · 2022-03-01T20:46:51Z

I am not able to follow the Processes discussion that is happening, but I saw the issue and think it is strange to POST a representation of a resource of type A (process definition) and when you GET the resource that has been created you get a representation of a different resource type B (process description). This doesn't sounds like HTTP to me. I am probably missing the necessary context, so feel free to ignore...

fmigneault · 2022-03-01T20:47:54Z

I'm fine with any of:

GET /processes/{processId}/package
GET /processes/{processId}/executionUnit
GET /processes/{processId} + Accept: application/ogcapppkg+json
GET /processes/{processId}?f=ogcapppkg

I will redirect all of these to whichever gets decided.
If possible, I prefer to have path components over headers/queries.

It is important to ensure Accept/f are required if /processes/{processId} is used for POST/PUT.

fmigneault · 2022-03-01T20:50:50Z

@cportele
It is mostly to preserve backward compatibility with Core. Deployed process or builtin are all obtained through GET /processes/{processId}.
Any extra information needed to define the process can be POSTed as well, but not mandatory.
It doesn't make sense to POST to another location, because the executionUnit has no reason to exist if not nested under the process it defines.

jerstlouis · 2022-03-01T20:58:36Z

Of those @fmigneault is happy with, I only like GET /processes/{processId}/executionUnit because deployment is not limited to application packages, so calling the definition 'package' or relying on the ogcapppkg media type to distinguish description vs. definition doesn't work for other definitions.

I agree with @cportele about the mismatched GET (description) vs. POST/PUT (definition), but I am not sure how we could solve this.
One possibility might be to only support PUT at /processes/{processId}/definition to deploy a process? (but that is also odd if /processes/{processId} does not exist yet)

jerstlouis · 2022-03-01T21:19:15Z

@fmigneault

Any extra information needed to define the process can be POSTed as well, but not mandatory.

I still argue that the description explaining how a process should be invoked is very different from instructions defining the process :)

pvretano · 2022-03-01T21:19:19Z

Lets step back for a minute.

In my mind the there is no difference between a process description and a process definition! They are simply two different representations of the same resource ... the process. One happens to include more information than the other. So, POST'ing to /processes/{processId} to add a new process seems consistent to me. Its just that a process description does not contain enough information for the server to deploy a working process.

Also, I am starting to wonder whether we actually need a new endpoint from which we get the process definition. If we agree that a process description and a process definition are simply two different representations of the same resource than content negotiation should be sufficient to get one or the other.

From my server you can do a GET /process/{processId}?f=application/ogcapppkg+json and you will get the application package of the process rather than the description. If you set f=json you get the description ... which is also the default as per the core.

jerstlouis · 2022-03-01T21:20:24Z

If we agree that a process description and a process definition are simply two different representations

I disagree, in my mind there is a fundamental difference, just like a C function prototype vs. a C function body.

pvretano · 2022-03-01T21:22:46Z

But a C prototype and a C function represent the same thing ... the function ... and, again, one includes more information about the function that the other.

pvretano · 2022-03-01T21:26:37Z

One other point ... since the very beginning of deploying processes to an OAProc deployment the process definition has been referred to as an "application package" and so I think we need to respect that terminology. So the endpoint name for getting the definition ... if in fact it needs to be a separate endpoint ... should have a name related to that legacy. So, /processes/{processId}/package or /processes/{processId}/apppkg or something like that ...

jerstlouis · 2022-03-01T21:33:00Z

@pvretano Wasn't that specifically in the context of the OGC Application Package though (which is the only example I am aware of implemented with process deployment so far)?
One possibility too is to merge the Application Package into Core, leaving the flexibility of how the executionUnit is defined (e.g. CWL, MOAW workflow, OpenEO). But I am worried that this would prevent the ability to upload a binary component directly as the deploy payload, e.g. a Jupyter notebook or docker inside a ZIP, as opposed to linking to something available from a URL.

jerstlouis · 2022-03-01T21:41:01Z

@pvretano

But a C prototype and a C function represent the same thing ... the function ... and, again, one includes more information about the function that the other.

True... (That is one way of seeing it) But for example, I may want to retrieve:

A process description using OGC JSON as described in Part 1 Core
A process description using OpenAPI
An OGC Application package with the executionUnit using CWL
An OGC Application package with the executionUnit in MOAW execution unit workflow
An OGC Application package with the executionUnit in OpenEO process graph
The CWL directly
The MOAW execution unit workflow directly
The OpenEO process graph directly

Each of these would need its own media type if it is a single resource.

pvretano · 2022-03-01T22:16:55Z

I don't see a problem here especially if it means the interface less cluttered.

Some of these, I believe, already have media types defined (CWL application/cwl+json, OpenAPI application/openapi+json).

I don't really see deploying a processes with a CWL execution unit and then asking the server to return that with a MOAW execution unit so as long the application package itself has a place to identify the type of execution unit (which it does) we only need one media type for an application packages ... perhaps parameterized to return the prototype, the body or both. There is a media type defined in Part 2 application/ogcappkg+json and we can add a parameter named content to get the proto, the body or the proto+body.

As for MOAW and OpenEO DAG, those should probably have media types defined ... no?

jerstlouis · 2022-03-01T22:25:59Z

@pvretano As you say the apppkg media type could still only support returning the original or native flavor of the execution unit, while you could use the execution unit media types to return only that part directly in different flavors.

Or the ogcappkg media type could also potentially have a parameter defined for the execution unit format, e.g. application/ogcappkg+json; executionType=application/cwl+json.

as for MOAW and OpenEO DAG, those should probably have media types defined ... no?

Agreed :) The MOAW media type could also be the same as for a generic execution request (those POSTed to /execution).

pvretano · 2022-03-01T22:31:11Z

@jerstlouis The form of the media type is open for discussion but do you really expect someone to deploy a process using a CWL execution unit and then asking the server to return "application/moaw+json"? If not then saying executionType=application/cwl+json is irrelevant ... no?

jerstlouis · 2022-03-01T22:33:32Z

@pvretano It does seem that for some of the scenarios in #279, some processing engines would be able to translate between MOAW / OpenEO / CWL.

e.g. deploying an apppkg containing MOAW executionUnit and generating CWL to run it.

pvretano · 2022-03-01T22:51:48Z

Yeah, I suppose that if the execution unit of a process is a scripted workflow then a server being able to translate from one to the other might be possible. If, however, the execution unit was a Docker container with a binary executable then that type of translation would be really, really hard if not impossible. Generally, however, I think that a process would be deployed with an execution unit of one type and that would be the type you get when you request the body or execution unit of the process. I would consider translating from one execution unit type to another as being an edge case and something that we could not necessarily document in the standard.

I keep pushing the use of the OPTIONS method without success but OPTIONS /processes would tell you which media type you can request from the endpoint and if your server supported translating between execution unit types you would see media type for CWL, MOAW, OpenEO DAG or whatever your server supports.

fmigneault · 2022-03-02T00:35:29Z

I also think process description and process definition are different entities. The first is like metadata about the process while the second details how to run it. But, even though they represent different portion of the process as a whole, they both don't make any sense if separated in the context of a deployed process. The distinction becomes clearer IMO because there are multiple ways to represent the definition, but the process description should always be the same schema, just like those 2 components are properly represented under 2 separate fields processDescription and executionUnit when deploying.

@jerstlouis
Please don't consider it is "the only example of implemented with process deployment so far". There are a large group of people that worked on this and based their implementation around this design. We must maintain backward compatibility as much as possible. You would not like it also if Part 3 was fully integrated and then some components where modified or dropped afterwards.

I don't want a major 2.0 just for moving stuff around in a deploy payload. We can work with processDescription and executionUnit as it already is defined for "OGC Application Package". That should remain the same. This goes as well for the extra PUT request. It breaks existing implementations and makes deployment even more convoluted then it already is. My implementation is able to patch missing parts in both processDescription and executionUnit when it can infer details from the other, so I need both at the same time. If an implementation wants to deploy some different content (eg: Jupyter notebook, zip, bin, ...), it can still do so using POST with a different Content-Type of the desired format.

I like the idea of returning different representations of /processes/{processId}/package using application/cwl+json, application/ogcappkg+json or application/moaw+json. That could allow interoperability between servers that have at least 1 overlapping representation they both support. My server would at least support the first 2, and potentially the 3rd.

OPTIONS /processes would tell you which media type you can request from the endpoint

That would be nice. Definitely the cleanest approach.

jerstlouis · 2022-03-02T01:09:07Z

@fmigneault

Please don't consider it is "the only example of implemented with process deployment so far"

I think you misunderstood what I meant by that, but I stand corrected because Peter clarified that the EO Application Package Best Practice actually does NOT use the OGC Application Package -- I thought it did. Rather it is an example of POSTing CWL directly as the Content-Type, if I understand correctly.

We must maintain backward compatibility as much as possible.

I never suggested otherwise :)

I don't want a major 2.0 just for moving stuff around in a deploy payload.

OGC API - Processes - Part 2 1.0 has not even been published yet...

/processes/{processId}/package

I don't feel strongly about this (so it's fine with me if that is the consensus), but it seems to me that the name is not intuitive to refer to the process definition.

I am also already confused with what an "application package" is given that the EO Application Package Best Practice is directly CWL, whereas the OGC Application Package contains a process description + an execution which can be different things (CWL among those).

As I mentioned previously, I would also recommend to define a new Application Package format (application/ogcappkg+zip?) as a compressed archive that can include a process description, execution unit, and potentially other files like a docker or Jupyter notebook or binary executable that can be directly embedded within the package itself -- that sounds more like an actual (self-contained) "package" to me.

NOTE: I am happy if we clarify that an "application package" is the same as the "process definition", and that "OGC JSON Application Package" (application/ogcapppkg+json) is one specific format of that. /package then makes more sense.

It might make sense to also include the EO Application Package (or could that just be CWL Application Package?) and maybe this new OGC Zipped Application Package as other conformance classes of Part 2?

fmigneault · 2022-03-02T02:16:48Z

@pvretano @jerstlouis

I am also already confused with what an "application package" is given that the EO Application Package Best Practice is directly CWL, whereas the OGC Application Package contains a process description + an execution which can be different things (CWL among those).

You are not the only one 😅 Best Practices mixes things in its own documentation.
It seems to refer to a similar format as "OGC Application Package":

https://docs.ogc.org/bp/20-089r1.html#toc16

The Application Package includes the following information:
- Reference to the executable block that implements the Application functionality
- Description of its input/output interface

i.e.: represented in the executionUnit and the processDescription respectively.

Even some sections somewhat contradicts themselves:

https://docs.ogc.org/bp/20-089r1.html#toc17

This best practice focuses on the scenario where an application is directly packaged as an Application Package, registered in a Platform and made available as an implementation of OGC API — Processes.
[...]
Validate and accept an application deployment request (OGC API — Processes Transaction Extension)

So, deploy directly ~~OGC~~ "Application Package" (ie: CWL), but validate with Transactions, a.k.a: OGC API - Processes - Part 2: Deploy, Replace, Undeploy before it was renamed.

I think adding application/ogcapppkg+json and application/cwl+json (and others of course like the zip) will resolve these ambiguities once and for all.

pvretano · 2022-03-02T02:21:55Z

I really hate to do this, having argued previously for the name /processes/{processId}/package but I think I now agree with @jerstlouis that the name is maybe less intuitive than it could be. The name /processes/{processId}/definition might be a better match for the way we currently view the relationship between the description and the definition of a process. Thoughts?

p3dr0 · 2022-03-02T09:17:15Z

Common Workflow Language (CWL) is a standard for describing computational data-analysis workflows
20-089r1 documents how CWL can be THE encoding of the OGC Application Package and how it can be used in Part 2 when deploying an application
From the CWL document we are able to obtain the process description (input/output interface) and execute the application (whatever might be)
CWL is a well-known format with a large community and open source implementations
ZIP files are archives that store multiple files

jerstlouis · 2022-03-02T10:31:02Z

@p3dr0

20-089r1 documents how CWL can be THE encoding of the OGC Application Package

This is where things get confusing, because currently Part 2 describes an encoding and conformance class called the "OGC Application Package" (application/ogcapppkg+json) specifically as a JSON encoding of a particular schema including processDescription and executionUnit properties, and this encoding can support different types of execution units (CWL being one of them), which is a part of the overall OGC Application Package.

From the CWL document we are able to obtain the process description (input/output interface) and execute the application (whatever might be)

Yes it makes sense to be able to infer the process description from the process definition execution unit (e.g. CWL).

ZIP files are archives that store multiple files

Yes, I suggest a ZIP-based application package type, with one or more files with pre-determined names inside whose content follow a particular schema, like several archive-based package formats e.g. .deb, .cab etc., because it would allow to deploy fully self-contained packages that do not require external references (all required resources can be embedded within the archive), avoiding the need to e.g. maintain a separate docker repository, or requiring a separate authentication mechanism when referencing private resources.

Would it make sense to refer to the 20-089r1 application packages simply as CWL process definition (or maybe CWL application package) and use the application/cwl+json Content-Type when deploying them directly via POST to /processes?

p3dr0 · 2022-03-02T12:48:35Z

please check Change request for the OGC Application Package CWL conformance class #258 (Oct 2021)

fmigneault · 2022-03-02T15:11:20Z

@pvretano
I can live with that modification (/processes/{processId}/definition) with support to return previously mentioned different content-types.

@mr-c
Is there an OpenAPI YAML schema location of the top-level CWL definition we could refer to in order to directly reference it to properly define what is proposed in #258?

mr-c · 2022-03-02T15:21:25Z

@mr-c
Is there an OpenAPI YAML schema location of the top-level CWL definition we could refer to in order to directly reference it to properly define what is proposed in #258?

Identifiers for particular versions of the CWL standards have the form https://w3id.org/cwl/v1.2/

https://w3id.org/cwl/v1.2/cwl.ttl is an RDF description of CWL concepts

There is also https://w3id.org/cwl/v1.2/cwl-context.json

An OpenAPI YAML Schema doesn't make sense to me because CWL is a serialization format, not an API. Maybe @tetron or @stain have other thoughts on the topic?

So I would recommend the use of https://w3id.org/cwl/v1.2/ to identify the expected content.

If it would be a better fit, then we can ask for a CWL identifier in the IANA media types database (previously know as MIME types).

tetron · 2022-03-03T03:51:05Z

There's a data definition part of openapi which I believe is quite similar to schema salad, it might be reasonable to translate the document schema of cwl to an open api object type definition. Then an open api library could perform at least partial validation of cwl objects.

fmigneault · 2022-03-03T17:54:51Z

That would greatly help promote the use of CWL as a fully-defined OGC Application Package and would help validate schema received by the API.

mr-c · 2022-03-03T18:59:23Z

Good to know @fmigneault! I've created common-workflow-language/schema_salad#517 to track the OpenAPI for CWL idea. We can flesh out a design there

jerstlouis · 2022-03-03T19:10:28Z

@mr-c @fmigneault Would this OpenAPI idea have anything to do with providing an OpenAPI process description at /processes/{processId} (negotiating OpenAPI representation)?
e.g. if the process was defined with CWL, whether it was deployed inside an OGC Application Package execution unit, or directly CWL as in the EO appkpkg BP, but also if the process was something other than CWL entirely?

mr-c · 2022-03-03T19:39:01Z

@jerstlouis My understanding is that the request is for generating an OpenAPI schema for an inline CWL description itself, and not for generating an OpenAPI schema for the valid inputs for a particular CWL description (though that should be doable as a separate effort, I think)

fmigneault · 2022-03-03T19:57:49Z

@mr-c @jerstlouis
I would like to be able to have something in the form of:

paths:
  /processes:
    post:
      requestBody:
        required: true
        content:
          application/cwl+json:
            schema:
              $ref: "https://www.commonwl.org/v1.2/schemas/cwl.yml"
          application/moaw+json:
            schema:
              $ref: "path/to/moaw.yml"
          application/ogcapppkg+json:
            schema:
              $ref: "https://raw.githubusercontent.com/opengeospatial/ogcapi-processes/master/extensions/deploy_replace_undeploy/standard/openapi/schemas/ogcapppkg.yaml"

With ogcapppkg.yml updated with something like:

type: object
required:
  - executionUnit
properties:
  processDescription:
    $ref: https://raw.githubusercontent.com/opengeospatial/wps-rest-binding/master/core/openapi/schemas/process.yaml
  executionUnit:
    type: array
    items:
      oneOf:
        - type: object
          properties:
            config: [...]
        - schema:
            $ref: "https://www.commonwl.org/v1.2/schemas/cwl.yml"
        - type: object
          additionalProperties: true

jerstlouis · 2022-03-03T20:05:54Z

@fmigneault Small fix: the /processes/{processId} should be PUT (replace), POST is to /processes (though I don't know if you would want to change the PUT (replace) to be at /processes/{processId}/definition to match the GET, if those become separate resources)

As per opengeospatial#282 (comment), remove CommonWorkflowLanguage.yml

bpross-52n · 2024-04-15T13:46:09Z

SWG meeting from 2024-04-15: The SWG decided to implement the /processes/{processId}/package endpoint to get the application package that was originally used to deploy the process.

Add /processes/{processId}/package path for accessing the formal description used to deploy a process Add specific recommendations and requirements for OGC Application Package and CWL conformance classes

Fix issue #282

gfenoy · 2024-09-30T15:36:51Z

Maybe this issue #410 was blocking this one?

Both can be closed once the PR #443 is merged.

pvretano mentioned this issue Mar 1, 2022

Workflow examples and use cases ... #279

Open

mr-c mentioned this issue Mar 3, 2022

codegen: OpenAPI export common-workflow-language/schema_salad#517

Open

bpross-52n added discussion Part 2 (DRU) labels Feb 2, 2023

gfenoy added a commit to GeoLabs/ogcapi-processes that referenced this issue May 31, 2023

Fix indentation and remove the CommonWorkflowLanguage.yml

b47410e

As per opengeospatial#282 (comment), remove CommonWorkflowLanguage.yml

bpross-52n assigned gfenoy Apr 15, 2024

bpross-52n added a commit that referenced this issue Jun 10, 2024

Merge pull request #404 from GeoLabs/features/fix-issue-282

1df9bb7

Fix issue #282

pvretano added this to OGC API - Processes - Part 2: Deploy, Replace, Undeploy Aug 5, 2024

pvretano moved this to To be drafted in OGC API - Processes - Part 2: Deploy, Replace, Undeploy Aug 5, 2024

bpross-52n closed this as completed Oct 28, 2024

gfenoy removed this from OGC API - Processes - Part 2: Deploy, Replace, Undeploy Dec 23, 2024

Update Part 2 with the consistent use of process "definition" and process "description" #282

Update Part 2 with the consistent use of process "definition" and process "description" #282

Comments

pvretano commented Mar 1, 2022

jerstlouis commented Mar 1, 2022 • edited Loading

pvretano commented Mar 1, 2022

jerstlouis commented Mar 1, 2022

cportele commented Mar 1, 2022

fmigneault commented Mar 1, 2022

fmigneault commented Mar 1, 2022

jerstlouis commented Mar 1, 2022 • edited Loading

jerstlouis commented Mar 1, 2022 • edited Loading

pvretano commented Mar 1, 2022

jerstlouis commented Mar 1, 2022

pvretano commented Mar 1, 2022

pvretano commented Mar 1, 2022

jerstlouis commented Mar 1, 2022 • edited Loading

jerstlouis commented Mar 1, 2022 • edited Loading

pvretano commented Mar 1, 2022

jerstlouis commented Mar 1, 2022 • edited Loading

pvretano commented Mar 1, 2022

jerstlouis commented Mar 1, 2022

pvretano commented Mar 1, 2022

fmigneault commented Mar 2, 2022

jerstlouis commented Mar 2, 2022 • edited Loading

fmigneault commented Mar 2, 2022

pvretano commented Mar 2, 2022

p3dr0 commented Mar 2, 2022 • edited Loading

jerstlouis commented Mar 2, 2022 • edited Loading

p3dr0 commented Mar 2, 2022

fmigneault commented Mar 2, 2022

mr-c commented Mar 2, 2022

tetron commented Mar 3, 2022

fmigneault commented Mar 3, 2022

mr-c commented Mar 3, 2022

jerstlouis commented Mar 3, 2022 • edited Loading

mr-c commented Mar 3, 2022

fmigneault commented Mar 3, 2022 • edited Loading

jerstlouis commented Mar 3, 2022 • edited Loading

bpross-52n commented Apr 15, 2024

gfenoy commented Sep 30, 2024

jerstlouis commented Mar 1, 2022 •

edited

Loading

jerstlouis commented Mar 1, 2022 •

edited

Loading

jerstlouis commented Mar 1, 2022 •

edited

Loading

jerstlouis commented Mar 1, 2022 •

edited

Loading

jerstlouis commented Mar 1, 2022 •

edited

Loading

jerstlouis commented Mar 1, 2022 •

edited

Loading

jerstlouis commented Mar 2, 2022 •

edited

Loading

p3dr0 commented Mar 2, 2022 •

edited

Loading

jerstlouis commented Mar 2, 2022 •

edited

Loading

jerstlouis commented Mar 3, 2022 •

edited

Loading

fmigneault commented Mar 3, 2022 •

edited

Loading

jerstlouis commented Mar 3, 2022 •

edited

Loading