Central job submission endpoint (also: POST /processes/{processID}/execution is not RESTful and is confusing for workflows) #419

Open
m-mohr opened this issue Jun 21, 2024 · 24 comments
Labels
Part 4 (Job Management) OGC API - Processes - Part 4: Job Management

Comments

@m-mohr

m-mohr commented Jun 21, 2024

The standard says:

The standard specifies a processing interface to communicate over a RESTful protocol

In REST, everything should be centered around resources. The endpoint POST /processes/{processID}/execution is not a resource, and POSTing to it should create e.g. /processes/{processID}/execution/{executionId}, but it doesn't. Instead, it may for example create /jobs/{jobId} (in async mode) or return a result directly (in sync mode).

To create a job, asynchronous processing requests should be sent to POST /jobs. This would also remove the oddity that, for workflows, you need to pick a "main process" to work underneath.

Singular processes could also be sent there with a workflow that just consists of a single processing node.
If you just send async requests to the endpoint, issues with the Prefer header would also be solved: #413

Synchronous processing for a single process could still be sent to POST /processes/{processID}/execution but it would be weird for a workflow to be sent to that endpoint, too. So maybe it should be a separate top-level endpoint?

PS: This came up in some discussions with @fmigneault and @aljacob recently so posting it here for discussion.

@m-mohr m-mohr changed the title POST /processes/{processID}/execution is not RESTful and is confusing for workflows Central job submission endpoint (also: POST /processes/{processID}/execution is not RESTful and is confusing for workflows) Jun 21, 2024
@fmigneault
Contributor

I will repeat my answer during the Testbed's meeting just for the sake of sharing with everyone openly.

POST /processes/{processID}/execution was introduced (to my understanding), because an OAP implementation is allowed to omit the creation of a job. This is relevant, notably, if the OAP decides to only support sync execution mode, where a job resource is not necessary (though it could still create one for later reference if desired), since the results are obtained directly.

Given that no job would be created in this case (which is technically considered the default/minimal requirement of OAP), the inverse non-RESTful argument arises if POST /processes/{processID}/jobs was used, since no job entity is created and 200 is returned. The way to avoid this ambiguity in REST is usually to replace the term by an action/verb, hence the execution (arguably, a better choice could have been execute?), to indicate that an operation is "created" rather than a resource.

Note that I agree with /processes/{processID}/jobs being better, since my implementation supports both sync/async, and creates a job for reference in both cases anyway, but I understand the reasoning of the execution counterpart. Since it is not much overhead, and for the sake of backward compatibility, my server handles both paths interchangeably.

I think POST /jobs makes sense as well (especially for alignment with openEO and potentially submitting an ad-hoc Workflow). It makes sense to add a POST definition for this path since it is already available, and would (in the case of async at least) deal with corresponding resources. However, I think this does not resolve the RESTful convention issue in the case of sync that would still not require a job resource to be created.

I think sync/async and job creation imply opposite ways of thinking about it, and neither will be fully satisfied by either approach. My preference is to reduce the number of endpoints doing the same thing, even if it might feel odd for one approach over the other. That being said, I would most probably support either way regardless for historical/compatibility reasons.
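
To make the sync/async distinction above concrete, here is a minimal sketch of how a server might choose its response to POST /processes/{processID}/execution. The function name `plan_execution_response`, the `supports_async` flag and the returned dictionary layout are hypothetical. The `respond-async` preference comes from RFC 7240, and the 200 versus 201 split follows the behaviour described in this comment.

```python
from typing import Optional


def plan_execution_response(prefer: Optional[str], supports_async: bool, job_id: str) -> dict:
    """Decide how to answer an execution request (illustrative only)."""
    # Split the Prefer header into lower-cased tokens, dropping values such as "wait=10".
    tokens = set()
    for part in (prefer or "").split(","):
        token = part.split(";")[0].split("=")[0].strip().lower()
        if token:
            tokens.add(token)

    if "respond-async" in tokens and supports_async:
        # Async: a job resource is created, so 201 plus Location fits REST conventions.
        return {"status": 201, "headers": {"Location": f"/jobs/{job_id}"}, "creates_job": True}

    # Sync: results are returned directly and no job resource is required.
    return {"status": 200, "headers": {}, "creates_job": False}
```

A sync-only server would always take the second branch, which is the case where naming the endpoint after a resource (`/jobs`) becomes awkward.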

@jerstlouis
Member

If a particular "root" process really does not make sense for some workflow definition (although there are workarounds for that, like a simple process gathering outputs from multiple processes, whether as separate outputs, as an array of outputs, or as an actual combining operation like merging multiple bands into a single GeoTIFF), then we could probably agree on some other end-point at which to execute a workflow in an ad-hoc manner. For pre-deployment of a workflow, Part 2 should still be used (POST a new workflow definition at /processes to create a new process).

Whether using /jobs for this purpose makes it easier or harder for openEO integration probably depends on #420 discussion in terms of whether it conflicts with existing capabilities or ends up working exactly the same as current functionality.

@gfenoy
Contributor

gfenoy commented Jul 9, 2024

During the SWG meeting on 2024-07-08, I introduced the idea of defining a conformance class in the OGC API - Processes - Part 3: Workflows to add POST on /jobs with an execute request that would define a "workflow." When I say "workflow" here, I mean a JSON object that would conform to the execute-workflows.yaml schema, so a processes chain (execute request with a root process).

With Part 1, it stays the same:

POST /processes/{processId}/execution
execute request conforming to execute.yaml

The response contains a JSON object conforming to statusInfo.yaml and a header with the location of the created job (/jobs/<jobid>).

With Part 3, you would be also able to use the following:

POST /jobs/
execute request conforming to execute.yaml (adding a "process" attribute pointing to the process to execute)

Here, there are options for what happens.

  1. The behavior can be the same as with the execute endpoint (POST /processes/{processId}/execution), and the execution starts right away. There is no real interest in adding such an end-point if it offers no capability other than the one defined in Part 1.
  2. Another option would be to return a JSON conforming to statusInfo.yaml containing a new jobid and status=accepted. A Location header can also be included to point to the created /jobs/{jobId}. But no execution occurs at that time; you only ask for a job instantiation.

Using the second option, we may then imagine using POST on /jobs/{jobId}/execution to effectively start the execution of the "prepared job" (I initially wanted to use /jobs/{jobId}/results, but it conflicts with the currently available end-point). Then, the behavior remains the same as for a standard execution.
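
As a rough illustration of the two submission paths described above, the sketch below builds the request for each. The helper names and the example process URL are hypothetical; the only element taken from the proposal is the extra "process" attribute added to an execute.yaml-style body when posting to /jobs.

```python
import json


def part1_request(process_id: str, inputs: dict) -> tuple:
    """Part 1 style: the process is identified by the request path."""
    return ("POST", f"/processes/{process_id}/execution", {"inputs": inputs})


def jobs_request(process_url: str, inputs: dict) -> tuple:
    """Proposed style: the process is identified inside the request body."""
    return ("POST", "/jobs", {"process": process_url, "inputs": inputs})


if __name__ == "__main__":
    # Hypothetical process URL and input, for demonstration only.
    method, path, body = jobs_request("https://example.com/processes/buffer", {"distance": 10})
    print(method, path, json.dumps(body))
```

The two bodies differ only by the "process" key, which is why the same schema could plausibly serve both endpoints.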

I think adding this modification in the Part 3 draft specification would help align OGC API - Processes with OpenEO.

If there is interest in adding this to the Part 3 draft, I volunteer to start the writing and work on PR for this addition for further discussion.

@fmigneault
Contributor

@gfenoy

With Part 3, you would be also able to use the following:
POST /jobs/

Ideally, this should be handled in Part 1 as well.
There is no reason to limit this capability to workflows only.

[...] the execution starts right away. There is no real interest in adding such an end-point if it offers no capability other than the one defined in Part 1.

Even if the execution starts right away, there is an advantage. It allows the definition of a graph (rather than a chain) that does not have a single root process, which is a requirement with POST /processes/{processId}/execution from the reference processId.

The workaround is to define the process corresponding to that graph to invoke it as a root process, but this involves support of Part 2 to deploy it.

Another option would be to return a JSON conforming to statusInfo.yaml containing a new jobid and status=accepted. A Location header can also be included to point to the created /jobs/{jobId}. But no execution occurs at that time; you only ask for a job instantiation.

Note that status=accepted will still not behave like openEO. Although status=accepted is available, OAP does not require to POST again to "start" the job. Accepted only means that it was received. At the time the next GET status request is done, it could already be started, completed, or still in queue depending on server availability. In openEO's case, it will remain pending until the "start" request is sent.

I believe that a POST /jobs returning similar statuses to what POST /processes/{processId}/execution returns, but with different meaning regarding how the job is actually started will be confusing and cause inconsistent implementations. Maybe a distinct status=pending (or created, ...) should be considered to allow POST /jobs/{jobId}/execution strategy.

If this is the strategy that is taken, I would also like to have a way (query or body parameter?) to indicate whether the job should wait (status=pending) or is allowed to run immediately (status=accepted) to avoid having to do 2 requests each time if we want to submit the execution right away.
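
The distinction drawn above between a job that waits and a job that may run immediately could be modelled as follows. This is a sketch only: the `created` status name, the `start_immediately` flag and the transition table are assumptions made for illustration, not something defined by OGC API - Processes.

```python
# Hypothetical transition table: "created" waits for an explicit start request,
# while "accepted" may move on whenever the server has capacity.
TRANSITIONS = {
    "created": {"accepted", "dismissed"},
    "accepted": {"running", "dismissed"},
    "running": {"successful", "failed", "dismissed"},
}


def initial_status(start_immediately: bool) -> str:
    """Status reported in the response to the job submission."""
    return "accepted" if start_immediately else "created"


def advance(current: str, target: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot go from {current!r} to {target!r}")
    return target
```

With a separate initial status, a client that sees `accepted` knows no second request is needed, whereas `created` signals that the job will not progress on its own.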

@gfenoy
Contributor

gfenoy commented Jul 12, 2024

Ideally, this should be handled in Part 1 as well. There is no reason to limit this capability to workflows only.

I agree with that point, but the schema that defines the process key is currently only available in the Part 3 draft specification. If we move it to the core, I don't see any objection.

[...] the execution starts right away. There is no real interest in adding such an end-point if it offers no capability other than the one defined in Part 1.

Even if the execution starts right away, there is an advantage. It allows the definition of a graph (rather than a chain) that does not have a single root process, which is a requirement with POST /processes/{processId}/execution from the reference processId.

The workaround is to define the process corresponding to that graph to invoke it as a root process, but this involves support of Part 2 to deploy it.

Another option would be to return a JSON conforming to statusInfo.yaml containing a new jobid and status=accepted. A Location header can also be included to point to the created /jobs/{jobId}. But no execution occurs at that time; you only ask for a job instantiation.

Note that status=accepted will still not behave like openEO. Although status=accepted is available, OAP does not require to POST again to "start" the job. Accepted only means that it was received. At the time the next GET status request is done, it could already be started, completed, or still in queue depending on server availability. In openEO's case, it will remain pending until the "start" request is sent.

I believe that a POST /jobs returning similar statuses to what POST /processes/{processId}/execution returns, but with different meaning regarding how the job is actually started will be confusing and cause inconsistent implementations. Maybe a distinct status=pending (or created, ...) should be considered to allow POST /jobs/{jobId}/execution strategy.

If this is the strategy that is taken, I would also like to have a way (query or body parameter?) to indicate whether the job should wait (status=pending) or is allowed to run immediately (status=accepted) to avoid having to do 2 requests each time if we want to submit the execution right away.

I support the new pending (prepared or created) and queued status ideas, as well as the additional parameter that lets the client application choose between the execution modes (wait or run). I think that openEO uses different endpoints for the two cases: /results for executing immediately and /jobs/{jobId}/results for executing asynchronously. If we already define GET on /jobs/{jobId}/results to access the execution results, I have no objection to adding POST on this same path to execute prepared tasks, or to adding POST on /results with the same request body for synchronous execution. Nothing is defined for /results yet, and the execute-workflow.yaml schema was not supported before Part 3 made it available, so there was no way to send a process chain. Still, I don't understand why the word "results" is used in this path, since we are discussing execution here.

I thought returning a statusInfo makes sense as it corresponds to how we get information regarding a job, and we POST on /jobs to create it. In this statusInfo, one link with rel=http://www.opengis.net/def/rel/ogc/1.0/execute would be pointing to /jobs/{jobId}/execution (in openeo, it is /jobs/{jobId}/results), and using POST on this path would create an entity like: /jobs/{jobId}/execution/{runId} (that can be reduced to /jobs/{jobId}). I don't think this runId is required. A job is traditionally a mutable entity, so I don't see any issue with reusing the same jobId to follow the execution progress.

To summarize, this would result in the addition of the following endpoints:

  • POST on /results => synchronous execution of a process chain (body: execute request execute-workflow.yaml)
  • POST on /jobs => statusInfo with status=created and Location header set to /jobs/{jobId} (same body as before)
  • GET on /jobs/{jobId} => statusInfo (up-to-date)
  • POST on /jobs/{jobId}/results => statusInfo with status=queued or status=running (empty body request, I think)

I would prefer to change from /results to /execution.
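
A minimal in-memory sketch of the three job endpoints summarised above may help when comparing the options. The class name `JobStore`, the use of UUIDs as job identifiers, the returned tuple layout and the specific HTTP status codes (201, 202, 404, 409) are assumptions for illustration; the paths and the `created`/`queued` statuses follow the summary.

```python
import uuid


class JobStore:
    """Toy stand-in for a server; each method returns (http_status, headers, body)."""

    def __init__(self):
        self.jobs = {}

    def post_jobs(self, execute_request: dict):
        # POST /jobs: register the job without running it.
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"jobID": job_id, "status": "created", "request": execute_request}
        return 201, {"Location": f"/jobs/{job_id}"}, self._status_info(job_id)

    def get_job(self, job_id: str):
        # GET /jobs/{jobId}: up-to-date statusInfo.
        if job_id not in self.jobs:
            return 404, {}, {}
        return 200, {}, self._status_info(job_id)

    def post_job_results(self, job_id: str):
        # POST /jobs/{jobId}/results: start a job that is still in "created".
        job = self.jobs.get(job_id)
        if job is None:
            return 404, {}, {}
        if job["status"] != "created":
            return 409, {}, self._status_info(job_id)
        job["status"] = "queued"
        return 202, {}, self._status_info(job_id)

    def _status_info(self, job_id: str) -> dict:
        job = self.jobs[job_id]
        return {"jobID": job["jobID"], "status": job["status"]}
```

Renaming `post_job_results` to target /jobs/{jobId}/execution would not change the logic, only the path.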

If there is interest in moving in that direction, I would be happy to volunteer to start a PR with the required updates in Part 1 or Part 3, depending on what we decide about the execute-workflow.yaml schema (moving it to core).

@pvretano
Contributor

@gfenoy @fmigneault STOP! We were almost ready to submit Part 1 and Part 2 to the OAB for review and then RFC. Now we are proposing to change or add a completely NEW way to execute processes. Sorry but this is not something we can just throw into Part 1 and Part 3 without (a) a lot of discussion in the SWG and (b) at least some testing in a code sprint or two.

@jerstlouis
Member

jerstlouis commented Jul 12, 2024

My preference would be to leave Part 1 as-is except for bug fixes / clarifications of ambiguity, maintaining what so far was a mostly backward compatible path from 1.0.

We can have the discussion about new end-points and/or new methods at existing end-points for Part 3: Workflows, since it's still at a relatively early stage compared to the Part 1 revision.

@sptillma
Contributor

I would second the sentiment of @pvretano and @jerstlouis. We should continue with Part 1 as-is and address these suggestions in Part 3 or as a future item. I'm happy to have a conversation about it at the next SWG meeting, but I will need a lot of convincing at this point.

@fmigneault
Contributor

fmigneault commented Jul 12, 2024

@pvretano
I want to highlight that I am NOT proposing those changes to be integrated in Part 1 and Part 2 before their current states are accepted on their own. I want a prior Part 1/2 release as well!

I would even be fine to define an entirely separate "Part 4" for POST /jobs/{jobId} on its own, which could be reused by Part 1 Core Processes, Part 2 Deployed Processes and Part 3 Workflow Processes. I do see this as a new alternate capability that can build on top of existing items without breaking existing Part 1/2/3. It SHOULD NOT replace the Part 1 Job execution strategy. Users should be allowed to opt-in or not with this alternate execution method.

That being said, I think it is worthwhile to have these discussions early on since TB20 is working on it. It does not mean it has to be integrated yet at all. However, not discussing these issues now will lead to bifurcating implementations, and more problems down the road when we try to "realign" them.

@gfenoy
Contributor

gfenoy commented Jul 12, 2024

@pvretano, @sptillma, @jerstlouis, @fmigneault, I agree with all of you. I'm as eager as you are to see the new releases from Part 1 and Part 2.

I want to clarify that initially, I only mentioned the integration within the Part 3 draft.

I am also open to @fmigneault's proposal to start a new Part 4 for OGC API - Processes.

@fmigneault
Contributor

fmigneault commented Aug 29, 2024

Proposal

Shared by @gfenoy
https://geolabs.fr/dl/job-management/ogcapi-processes/extensions/job_management/standard/20-044.html#_requirements_class_job_management

  • Adding POST /jobs
  • Adding extra Job status to allow better cross-walk with non-OAP statuses
    (ie: openEO's additional statuses for pending job, etc.).
  • Adding additional endpoints for retrieval of (inputs, outputs, run) submitted to the Job

Note

PR: #437


Review

Note

The following points are updated according to subsequent discussions to keep items grouped in a single comment.

  1. I think Content-Type: application/json for POST /jobs could have some additional types to support OGC API — Processes — Workflow Execute Request body and OpenEO Process Graph body. Alternatively, a Content-Schema could be used to distinguish between Content-Type: application/json contents, or allow auto-detection (in which case 'how' to do it should be detailed explicitly, since contents can overlap).

  2. Job Run Requirement must describe what is expected in the response. I'm guessing this relies heavily on contents generated by cwltool --provenance, but non-CWL variants need to have a structure provided as well. A schema should be provided (or at least details of what is expected in it, so we can define this schema).

  3. Since additional endpoints are being defined, I propose we add /jobs/{jobID}/logs as well, since many implementations already support it.

    • TBD: recommended formats to support (JSON, XML, TEXT, YAML, etc.)
    • TBD: recommended structure (directly log lines or with nested logs/links?)
  4. As highlighted by @pvretano, the Quotation Requirement should be removed from Part 4, as it could be used on its own with Part 1: Core without the additional job requirements. Consider a Part 5 for https://github.com/opengeospatial/ogcapi-processes/tree/master/extensions/quotation and https://github.com/opengeospatial/ogcapi-processes/tree/master/extensions/billing.
    removed

  5. Conformance Class Deploy, Replace, Undeploy seems incorrectly named. It describes only the POST (job creation aspect). I think it should be renamed to "Job Creation" to avoid confusion with Part 2: DRU.
    updated

  6. To help in the cross-walk with openEO, a PATCH /jobs/{jobId} might need to be considered. More specifically, that could be used to initialize a (TBD: pending, created) job, requesting it to start. Following this, it should be expected that the job switches to accepted or directly to running if the server decides it can start processing it right away.

  7. Job "Replace" (PUT) should be distinguished from "Update" (PATCH) in https://geolabs.fr/dl/job-management/ogcapi-processes/extensions/job_management/standard/20-044.html#update (considering that Part 2: DRU uses R for "Replace"/PUT).
    updated

    In either case, a pre-requirement for Job Modification should be that the Job MUST be in (TBD: pending, created) state (i.e.: awaiting submission/execution). Otherwise, weird side effects could occur for a job already in use by an accepted/running process.

  8. Considering (6) and (7) above, Starting a job should be renamed to "Job Creation", and should allow statuses beyond those permitted by statusCode.yml. Otherwise, any subsequent PUT/PATCH do not make sense.
    updated

  9. Building upon (6)-(8), the POST /jobs should include some kind of indication that the (TBD: pending, created) status is desired (in contrast to default: started).

    • Since sync(wait)/async(respond-async) execution is already controlled via Prefer on other endpoints, I propose to reuse some other term [TBD] in Prefer. This example (https://www.rfc-editor.org/rfc/rfc7240.html#section-2.1) shows a priority parameter. Maybe priority=0 could be used to indicate that a job must not start running/accepted immediately? Maybe another parameter entirely could be used? This seems possible since priority itself is not described, although given as example.

    • Given that openEO BatchJob uses statuses (created, queued, running, canceled, finished and error) (see also Crosswalk between openEO API and OGC API - Processes), Part 4: Job Management should preferably reuse created. A job modification (PATCH|PUT /jobs/{jobId}) would be allowed only in created status, as in openEO.

    • Using PATCH /jobs/{jobId} to start a job (ie: switching from created to accepted/running) could allow openEO to do necessary "under-the-hood" adjustments (eg: subrequest to their POST /jobs/{job_id}/results BatchJob Start operation with mapping of accepted = queued), which would make the transition transparent for a client interacting with the server as an OGC API - Processes. Note: an openEO client would still be allowed to do POST /jobs/{job_id}/results directly if preferred, the same result would happen.

    • Alternatively, use POST /jobs/{jobId}/results like openEO to avoid potentially confusing PATCH endpoint (see reasoning in comment), but consider handling of different Content-Schema for OAP/openEO content-negotiation of the submitted body and possibly different response structures/behaviors (negotiate via Accept-Schema or default to the same "API-style" as the submitted Content-Schema?).

  10. Section Requirement OGC API — Process — Workflow Execute Request should be more explicit about using Part 3: Deployable Workflows. A subsection should also be added for the Part 2: CWL case, since Part 4: openEO Process Graph is specified. Alternatively (and maybe simpler), a single class could be defined (eg: Job Content Requirement), which would list all of those (and maybe more) combinations, and their corresponding media-type, in a table.

  11. For any applicable point above mentioning Content-Schema,
    a Content-Type: application/json; profile={openeo|ogcapi-processes} could alternatively be considered.
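
For point 11, a server could read the profile parameter with the Python standard library alone. The sketch below is illustrative: the function name `detect_dialect` and the choice of `ogcapi-processes` as the default when no profile is given are assumptions, and the two profile values are simply the ones floated above.

```python
from email.message import Message


def detect_dialect(content_type: str, default: str = "ogcapi-processes") -> str:
    """Return the value of the 'profile' media-type parameter, or a default."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "application/json":
        raise ValueError(f"unsupported media type: {msg.get_content_type()}")
    # get_param returns None when the parameter is absent.
    profile = msg.get_param("profile")
    return str(profile).lower() if profile else default
```

The same function would work for a Content-Schema based approach if the header value were passed in place of the profile lookup, at the cost of comparing full schema URIs.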


Suggestions (meeting 2024-09-09)

  1. Rename the jobID to simply id, to align with openEO and other OGC APIs for describing the "current" object (i.e.: the Job).

  2. Ensure type is defined, with type: process referring to OGC API - Processes, coverage for OGC API - Coverage, etc. and type: openeo for openEO. The additional metadata (and applicable schemas) to validate the Job contents should then rely on this type value.

  3. Modify status to accommodate any string value (not a specific enum). Using type, specific values can be resolved. The current OAP values (accepted, running, failed, etc.) can be considered deprecated, but still allowed for v1.0 compatibility, and gradually phased toward openEO status values (queued, running, error, etc.) (see mapping: Crosswalk between openEO API and OGC API - Processes) for v2.0 alignment.

  4. Use "specific Job type" references to links when possible:

    • processID to become a rel: process pointing at that "process" definition, whatever it might be (OAP, openEO, ad-hoc workflow, etc.)

Extra

  1. Consider rel: profile IANA link-relation with OGC Naming Authority to describe the Job (similar to what type would do)?
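
The type-dependent status resolution in suggestions 2 and 3 could look roughly like the sketch below. Only the accepted = queued pairing is stated in this thread; the other pairs in the table, the function name `normalize_status` and the pass-through of unknown values are assumptions that should be checked against the crosswalk document.

```python
# Only "accepted" <-> "queued" is stated in this thread; the other pairs are
# assumptions made for this sketch and should be checked against the crosswalk.
OAP_TO_OPENEO = {
    "accepted": "queued",
    "running": "running",
    "successful": "finished",
    "failed": "error",
    "dismissed": "canceled",
}


def normalize_status(status: str, job_type: str) -> str:
    """Express a job status in openEO terms, based on the job's 'type'."""
    if job_type == "openeo":
        return status
    if job_type == "process":
        # "created" has no Part 1 equivalent, so unknown values pass through unchanged.
        return OAP_TO_OPENEO.get(status, status)
    raise ValueError(f"unknown job type: {job_type!r}")
```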

@pvretano
Contributor

Quotation should not be part of job management. It should be its own part.

@fmigneault
Contributor

Quotation should not be part of job management.

I agree. A server could handle quotation/billing on its own, with only OGC API - Processes considerations and no intention to cross-walk with openEO and alternate job management/submissions.

gfenoy added a commit to GeoLabs/ogcapi-processes that referenced this issue Aug 29, 2024
@gfenoy
Contributor

gfenoy commented Sep 5, 2024

Thanks a lot for your feedback.

Proposal

Shared by @gfenoy https://geolabs.fr/dl/job-management/ogcapi-processes/extensions/job_management/standard/20-044.html#_requirements_class_job_management

  • Adding POST /jobs
  • Adding extra Job status to allow better cross-walk with non-OAP statuses
    (ie: openEO's additional statuses for pending job, etc.).
  • Adding additional endpoints for retrieval of (inputs, outputs, run) submitted to the Job

Review

(updating with following discussions to keep items grouped in a single comment)

  1. I think Content-Type: application/json for POST /jobs could have some additional types to support OGC API — Processes — Workflow Execute Request body and OpenEO Process Graph body. Alternatively, a Content-Schema could be used to distinguish between Content-Type: application/json contents, or allow auto-detection (in which case 'how' to do it should be detailed explicitly, since contents can overlap).

Initially, I was thinking of using application/openeo+json and application/processes+json, but in the end, I chose your proposed Content-Schema option and added the following permission (I wonder if it should not be a recommendation rather than a permission).

  2. Job Run Requirement must describe what is expected in the response. I'm guessing this relies heavily on contents generated by cwltool --provenance, but non-CWL variants need to have a structure provided as well. A schema should be provided (or at least details of what is expected in it, so we can define this schema).

I fully agree with that point, and we may investigate the RO-Crate Specification; it seems to be linked to the Research Object manifest produced by cwlprov. What do you think?

  3. Since additional endpoints are being defined, I propose we add /jobs/{jobID}/logs as well, since many implementations already support it.

I support this idea of a /jobs/{jobId}/logs endpoint and would like to discuss this more specifically. What should the content schema of the response be?

I was initially thinking of something like this:

type: object
required:
  - logs
properties:
  logs:
    type: array
    items:
      $ref: "../common-core/link.yaml"

It is an array of links pointing to individual (potentially multiple) log files.

Do you have something else in mind?

For example, text/plain content is directly accessible as the content of a JSON object.

type: object
required:
  - logs
properties:
  logs:
    type: array
    items:
      type: object
      required:
        - content
      properties:
        id:
          type: string
        description:
          type: string
        content:
          type: string

Or we can have the main log file accessible in a logs attribute with the associated log files accessible through the provided links.

type: object
required:
  - logs
  - links
properties:
  logs:
    type: string
  links:
    type: array
    items:
      $ref: "../common-core/link.yaml"

Or it can also be the text/plain response; in such a case, I would propose defining the following endpoint: /jobs/{jobId}/logs/links that would return the links to the log files associated with the job execution.

We may also rely on the logs from OpenEO directly: https://api.openeo.org/#tag/Batch-Jobs/operation/debug-job.
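
To compare the candidate shapes, the third option above (main log text plus links to the other log files) could be produced by something like the sketch below. The function name, the `related` link relation and the `text/plain` media type are hypothetical choices, and `href` is assumed to be the only mandatory member of a link object.

```python
def build_logs_response(main_log: str, log_files: dict) -> dict:
    """Combine the main log text with links to additional log files.

    log_files maps a title to an href. "href" is assumed to be the only
    mandatory member of a link object, and the rel value is hypothetical.
    """
    links = [
        {"href": href, "rel": "related", "type": "text/plain", "title": title}
        for title, href in sorted(log_files.items())
    ]
    return {"logs": main_log, "links": links}
```

Dropping the "logs" key from the result yields a links-only document similar to the first option, so the two shapes are close to interchangeable for a client.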

  4. As highlighted by @pvretano, the Quotation Requirement should be removed from Part 4, as it could be used on its own with Part 1: Core without the additional job requirements. Consider a Part 5 for https://github.com/opengeospatial/ogcapi-processes/tree/master/extensions/quotation and https://github.com/opengeospatial/ogcapi-processes/tree/master/extensions/billing.

I agree with both of you and removed this part from the draft. Nevertheless, I would like to point out this was related to alignment with OpenEO and the /jobs/{jobId}/estimate endpoint.

  5. Conformance Class Deploy, Replace, Undeploy seems incorrectly named. It describes only the POST (job creation aspect). I think it should be renamed to "Job Creation" to avoid confusion with Part 2: DRU.

It is an issue with the copy/paste from DRU, which I started this draft with. It is now fixed.

  6. To help in the cross-walk with openEO, a PATCH /jobs/{jobId} might need to be considered. More specifically, that could be used to initialize a (TBD: pending, created) job, requesting it to start. Following this, it should be expected that the job switches to accepted or directly to running if the server decides it can start processing it right away.

Please take a look at the previous comment.

  7. Job "Replace" (PUT) should be distinguished from "Update" (PATCH) in https://geolabs.fr/dl/job-management/ogcapi-processes/extensions/job_management/standard/20-044.html#update (considering that Part 2: DRU uses R for "Replace"/PUT).

Right, the draft should now use PATCH to update the job definition in place of PUT.

In either case, a pre-requirement should be that the Job MUST be in (TBD: pending, created) state. Otherwise, weird side effects could occur for a job already in use by a running process.

In the current draft, the created status is used.

  8. Considering (6) and (7) above, Starting a job should be renamed to "Job Creation", and should allow statuses beyond those permitted by statusCode.yml. Otherwise, any subsequent PUT/PATCH do not make sense.

The following schema was added: https://github.com/GeoLabs/ogcapi-processes/blob/proposal-part4-initial/openapi/schemas/processes-job-management/statusCode.yaml. I would have preferred extending the original statusCode.yaml rather than redefining it from scratch.

  9. Building upon (8), the POST /jobs should include some kind of indication that the (TBD: pending, created) status is desired (in contrast to default: started).

I would say that it would be possible to start the job except if the status is accepted or running.

The same applies to the update operation.

  • Since sync(wait)/async(respond-async) execution is already controlled via Prefer on other endpoints, I propose to reuse some other term [TBD] in Prefer. This example (https://www.rfc-editor.org/rfc/rfc7240.html#section-2.1) shows a priority parameter. Maybe priority=0 could be used to indicate that a job must not start running/accepted immediately? Maybe another parameter entirely could be used? This seems possible since priority itself is not described, although given as example.
  • Given that openEO BatchJob uses statuses (“created”, “queued”, “running”, “canceled”, “finished” and “error”), Part 4: Job Management should preferably reuse created. A job modification (PATCH|PUT /jobs/{jobId}) would be allowed only in created status, as in openEO.
  • Using PATCH /jobs/{jobId} to start a job (ie: switching from created to accepted/running) could allow openEO to do necessary "under-the-hood" adjustments (eg: subrequest to their POST /jobs/{job_id}/results BatchJob Start operation with mapping of accepted = queued), which would make the transition transparent for a client interacting with the server as an OGC API - Processes. Note: an openEO client would still be allowed to do POST /jobs/{job_id}/results directly if preferred, the same result would happen.

Doing so would lead to having two operations with the same PATCH method, updating the job definition, and starting its execution. I prefer using the POST method directly to the /jobs/{jobId}/results endpoint. I think that it is handled the same way in OpenEO.

  10. Section Requirement OGC API — Process — Workflow Execute Request should be more explicit about using Part 3: Deployable Workflows. A subsection should also be added for the Part 2: CWL case, since Part 4: openEO Process Graph is specified. Alternatively (and maybe simpler), a single class could be defined (eg: Job Content Requirement), which would list all of those (and maybe more) combinations, and their corresponding media-type, in a table.

I am not sure I follow here, but I can read the following in the current draft:

"The Deployable Workflows conformance class specifies how a workflow execution request as defined in OGC API — Processes — Part 1: Core, with or without the capabilities defined in other conformance classes of this Part 3 extension, can be used as an application package payload to deploy new processes using OGC API — Processes — Part 2: Deploy, Replace, Undeploy."

Here, the idea was to POST an execute request body following the execute-workflow.yaml schema. In other words, a chain of processes would be sent to this endpoint to create a job (with created status).

As currently defined, only the nested-processes requirement class should be a prerequisite.

I agree that the CWL can fit with the definition of a job.

To be continued...

@fmigneault
Contributor

@gfenoy

I wonder if it should not be a recommendation rather than a permission

I think what would make it a recommendation is to provide specific Content-Schema values to use for representing OGC API - Processes and openEO respectively. This will also increase chances of users reusing the same values, for better interoperability.

we may investigate the RO-Crate Specification, it seems to be linked to the Research Object manifest produced by cwlprov. What do you think?

Since CWL-Prov uses a Research Object under the hood (see: https://github.com/common-workflow-language/cwlprov#overview), it seems to be a logical choice. However, I don't quite understand yet what "more" the Research Object provides. Looking for example at https://www.researchobject.org/ro-crate/tlcmap, it seems all the metadata would be referenced one way or another by the PROV-JSON/XML/RDF definition. The RO-Crate seems relevant only if we want to export everything with metadata in a single ZIP. However, do we really want to (or can we even feasibly) export all source data, considering it might be massive? I'm not sure of all the implications regarding the RO, but the PROV definition seems relatively simple to implement and can be represented by any equivalent JSON/XML/RDF structure without depending explicitly on CWL.

I support this idea of a /jobs/{jobId}/logs endpoint and would like to discuss this more specifically. What should the content schema of the response be?

Currently, CRIM's implementation supports text/plain, application/x-yaml, application/json and application/xml. In text, it is returned line by line directly. For JSON/YAML, it is directly an array of strings. For XML, it is a <logs> object with nested <item type="str"> for each line. I believe CubeWerx (@pvretano confirm?) also has similar formats for at least text and JSON. I guess we could add a logs section to contain the log-lines array, but we preferred to keep it simple for the moment without links to simplify parsing the log contents directly. I find the links somewhat redundant in that case since they are usually obtained from the job endpoint in the first place.
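To make the formats above concrete, here is a minimal sketch of how the same log lines could be serialized for text, JSON and XML. The function name `render_logs` is hypothetical, and YAML is left out only to avoid a third-party dependency; this is not CRIM's actual implementation.

```python
import json
from xml.etree import ElementTree as ET


def render_logs(lines, media_type="text/plain"):
    """Serialize job log lines for one of the media types mentioned above."""
    if media_type == "text/plain":
        # Plain text: one log entry per line.
        return "\n".join(lines)
    if media_type == "application/json":
        # JSON: directly an array of strings, no wrapping object or links.
        return json.dumps(lines)
    if media_type == "application/xml":
        # XML: a <logs> element with one <item type="str"> per line.
        root = ET.Element("logs")
        for line in lines:
            item = ET.SubElement(root, "item", {"type": "str"})
            item.text = line
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported media type: {media_type}")
```

Keeping the JSON form a bare array means a client can iterate over the response without knowing any wrapper schema, which is the simplicity argument made above.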

I would say that it would be possible to start the job except if the status is accepted or running.

I think it is better to mention it can be started only if in created state. Any other status implies it is already in queue/running, or has completed/was dismissed, which means the corresponding job resources might not be usable anymore.

Doing so would lead to having two operations with the same PATCH method, updating the job definition, and starting its execution. I prefer using the POST method directly to the /jobs/{jobId}/results endpoint. I think that it is handled the same way in OpenEO.

I'm slightly torn about this. On one hand, POST /jobs/{jobId}/results makes sense for openEO alignment, and the generated results (either right away: sync, or at some point: async) could be found again on that same path with GET /jobs/{jobId}/results. However, it is somewhat confusing that the job status would be modified as a side effect, when POST is not applied on a "Job resource" according to REST. I agree though that PATCH on the same endpoint, sometimes starting the job and sometimes not depending on whether status: started was specified, can be just as confusing. I do not have a strong preference for either approach. It probably needs more discussion to see if POST /jobs/{jobId}/results can work without conflict with openEO using a similar Content-Schema for the submitted body.

I am not sure I follow here [...] (The Deployable Workflows)

You're right. Workflow Execute Request is enough with Nested Processing. I had assumed that, since a created job would wait for an execution trigger, its definition would be "deployed" (or stored), but it doesn't necessarily need to result in a deployed process. You can ignore this point.

Thanks for the other draft fixes.

@gfenoy
Contributor

gfenoy commented Sep 25, 2024

@gfenoy

I wonder if it should not be a recommendation rather than a permission

I think what would make it a recommendation is to provide specific Content-Schema values to use for representing OGC API - Processes and openEO respectively. This will also increase chances of users reusing the same values, for better interoperability.

we may investigate the RO-Crate Specification, it seems to be linked to the Research Object manifest produced by cwlprov. What do you think?

Since CWL-Prov uses a Research Object under the hood (see: https://github.com/common-workflow-language/cwlprov#overview), it seems to be a logical choice. However, I don't quite understand yet what "more" the Research Object provides. Looking for example at https://www.researchobject.org/ro-crate/tlcmap, it seems all the metadata would be referenced one way or another by the PROV-JSON/XML/RDF definition. The RO-Crate seems relevant only if we want to export everything with metadata in a single ZIP. However, do we really want to (or can we even feasibly) export all source data, considering it might be massive? I'm not sure of all the implications regarding the RO, but the PROV definition seems relatively simple to implement and can be represented by any equivalent JSON/XML/RDF structure without depending explicitly on CWL.

I agree about supporting PROV. For reference, here is the RELIANCE ROcrate profile that may be of interest.

Do we agree that this means we would then access the PROV content from the GET /jobs/{jobId}/run endpoint in the different supported formats (ie. PROV-JSON, PROV-O as JSON-LD / as RDF N-Triples / Turtle, PROV-XML, PROV-N)?

We can use the Accept header to select the expected encoding.
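As a rough illustration of that negotiation, the sketch below picks a PROV encoding from an Accept header. The media types in the table are my assumption of the commonly used ones for each serialization and are not taken from the draft; `select_prov_media_type` is a hypothetical helper, and the parsing is deliberately simplified (q-values only, no wildcard subtypes).

```python
# Assumed media types for the PROV serializations listed above (not from the draft).
PROV_FORMATS = {
    "application/json": "PROV-JSON",
    "application/ld+json": "PROV-O as JSON-LD",
    "application/n-triples": "PROV-O as RDF N-Triples",
    "text/turtle": "PROV-O as Turtle",
    "application/provenance+xml": "PROV-XML",
    "text/provenance-notation": "PROV-N",
}
DEFAULT_MEDIA_TYPE = "application/json"  # PROV-JSON when nothing is negotiated


def select_prov_media_type(accept):
    """Return the best supported media type, or None if nothing matches."""
    if not accept:
        return DEFAULT_MEDIA_TYPE
    candidates = []
    for index, item in enumerate(accept.split(",")):
        media, *params = [part.strip() for part in item.split(";")]
        quality = 1.0
        for param in params:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by position in the header.
        candidates.append((-quality, index, media.lower()))
    for negative_quality, _, media in sorted(candidates):
        if negative_quality == 0:
            continue  # q=0 means "not acceptable"
        if media == "*/*":
            return DEFAULT_MEDIA_TYPE
        if media in PROV_FORMATS:
            return media
    return None
```

A server would typically answer with HTTP 406 when the function returns None, but that choice is also an assumption here.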

I support this idea of a /jobs/{jobId}/logs endpoint and would like to discuss this more specifically. What should the content schema of the response be?

Currently, CRIM's implementation supports text/plain, application/x-yaml, application/json and application/xml. In text, it is returned line by line directly. For JSON/YAML, it is directly an array of strings. For XML, it is a <logs> object with nested <item type="str"> for each line. I believe CubeWerx (@pvretano confirm?) also has similar formats for at least text and JSON. I guess we could add a logs section to contain the log-lines array, but we preferred to keep it simple for the moment without links to simplify parsing the log contents directly. I find the links somewhat redundant in that case since they are usually obtained from the job endpoint in the first place.

I agree, so I propose adding a new endpoint, GET /jobs/{jobId}/logs, to access the various formats you described (if supported by the server).

I don't think any schema should be defined in the draft for such a log endpoint (each server implementation can then provide its own schema in the API definition). Do you agree?

I would say that it would be possible to start the job except if the status is accepted or running.

I think it is better to mention it can be started only if in created state. Any other status implies it is already in queue/running, or has completed/was dismissed, which means the corresponding job resources might not be usable anymore.

As described here:

[...]if the job status is canceled, finished, or error, which restarts the job and discards previous results[...]

I think it can be handled the same way in Part 4.

Doing so would lead to having two operations with the same PATCH method, updating the job definition, and starting its execution. I prefer using the POST method directly to the /jobs/{jobId}/results endpoint. I think that it is handled the same way in OpenEO.

I'm slightly torn about this. On one hand, POST /jobs/{jobId}/results makes sense for openEO alignment, and the generated results (either right away: sync, or at some point: async) could be found again on that same path with GET /jobs/{jobId}/results. However, it is somewhat confusing that the job status would be modified as a side effect, when POST is not applied on a "Job resource" according to REST. I agree though that PATCH on the same endpoint, sometimes starting the job and sometimes not depending on whether status: started was specified, can be just as confusing. I do not have a strong preference for either approach. It probably needs more discussion to see if POST /jobs/{jobId}/results can work without conflict with openEO using a similar Content-Schema for the submitted body.

Maybe we would need feedback from @m-mohr here.

As said previously, I would like to add a POST /result endpoint for synchronous execution, as described here. It would take the same content as the POST /jobs for creating a job but would start the execution right away.

@fmigneault
Contributor

Do we agree that this means we would then access the PROV content from the GET /jobs/{jobId}/run endpoint in the different supported formats (ie. PROV-JSON, PROV-O as JSON-LD / as RDF N-Triples / Turtle, PROV-XML, PROV-N)?

We can use the Accept header to select the expected encoding.

Yes (2x)

I don't think any schema should be defined in the draft for such a log endpoint (each server implementation can then provide its own schema in the API definition). Do you agree?

OK.
The requirement should mention that content-negotiation MAY be supported to provide alternate representations, but the standard does not dictate any preferred content type or content schema.

[...]if the job status is canceled, finished, or error, which restarts the job and discards previous results[...]

I think it can be handled the same way in Part 4.

I personally prefer to create a separate job to avoid conflicting definitions and keep the logs of the failing case. This can be used for monitoring, error reporting and statistics. However, I am not against implementations using a job-replacement approach. Part 4 should most probably allow this flexibility, but describe what is expected in each case. For example, if a server decides to disallow it, which code/error should be returned? Otherwise, if allowed, what is the expected response (StatusInfo and status = ...?).

I would like to add a POST /result endpoint for synchronous execution [...]

I find this goes against the "Job Management" idea of Part 4. It would add an endpoint with no job reference and no retrievable logs, provenance, etc., and it would not give the option to easily switch between sync and async just by modifying the Prefer header on the same POST /jobs/{jobId}/results. Using a common endpoint makes it easy to reuse the Prefer resolution logic used by POST /processes/{processId}/execution. I think POST /result must remain an openEO-only thing.
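For illustration, the kind of Prefer resolution logic that a common endpoint could share might look like the sketch below. The helper names and the `default="sync"` fallback are assumptions (a real server would derive the default from the process's job control options), and the parser is simplified: it does not handle commas inside quoted values.

```python
def parse_prefer(header):
    """Parse a Prefer header (RFC 7240) into a {preference: value} dict."""
    preferences = {}
    for part in header.split(","):
        token = part.split(";")[0].strip()  # ignore preference parameters
        if not token:
            continue
        name, _, value = token.partition("=")
        # RFC 7240: only the first instance of a repeated preference counts.
        preferences.setdefault(name.strip().lower(), value.strip().strip('"'))
    return preferences


def execution_mode(header, default="sync"):
    """Resolve sync/async execution from the Prefer header value."""
    preferences = parse_prefer(header or "")
    if "respond-async" in preferences:
        return "async"
    if "wait" in preferences:
        return "sync"
    return default
```

With this, `POST /processes/{processId}/execution` and `POST /jobs/{jobId}/results` could both call `execution_mode(request_headers.get("Prefer"))`, so switching between sync and async stays a header-only change for the client.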

@gfenoy
Contributor

gfenoy commented Sep 30, 2024

Yes (2x)

We added the following requirement /req/provenance/run/response.

The current requirement states that the PROV-JSON is expected by default if no content negotiation is used.

Also, the following permission was added: /per/provenance/run/content-negotiation to add optional content type.

Maybe application/json should also be referenced from this permission.

The requirement should mention that content-negotiation MAY be supported to provide alternate representations, but the standard does not dictate any preferred content type or content schema.

We still need to add an endpoint to access logs. I noted this issue, in which you concluded that there may not be a requirement for such a log(s) endpoint.

Adding this endpoint would make sense for the job-management extension.

[...]if the job status is canceled, finished, or error, which restarts the job and discards previous results[...]
I think it can be handled the same way in Part 4.

I personally prefer to create a separate job to avoid conflicting definitions and keep the logs of the failing case. This can be used for monitoring, error reporting and statistics. However, I am not against implementations using a job-replacement approach. Part 4 should most probably allow this flexibility, but describe what is expected in each case. For example, if a server decides to disallow it, which code/error should be returned? Otherwise, if allowed, what is the expected response (StatusInfo and status = ...?).

I am okay with that.

Maybe the standard can add a requirement class /req/job-management/replacement, meaning that the server supports the job-replacement approach (meaning that a job can be run for created, succeeded, and failed status). In other cases, it does not support the operation, and a type such as /job-replacement-not-supported may be set in the exception returned.
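A minimal sketch of how a server could apply this, assuming the status names used in this comment and a hypothetical `start_job` helper. The `/job-not-startable` type is invented for illustration only; only `/job-replacement-not-supported` comes from the proposal above.

```python
# Statuses named in the comment above; a real server may use other names.
REPLACEABLE_STATUSES = {"succeeded", "failed"}


def start_job(job, replacement_supported=False):
    """Try to start a job.

    Returns (True, job) on success or (False, exception_body) on refusal.
    """
    status = job["status"]
    if status == "created":
        job["status"] = "accepted"
        return True, job
    if status in REPLACEABLE_STATUSES:
        if replacement_supported:
            # Job-replacement approach: restart and discard previous results.
            job["status"] = "accepted"
            job["results"] = None
            return True, job
        return False, {
            "type": "/job-replacement-not-supported",
            "detail": f"job in status '{status}' cannot be restarted",
        }
    # Hypothetical type for jobs that are already queued or running.
    return False, {
        "type": "/job-not-startable",
        "detail": f"job in status '{status}' cannot be started",
    }
```

A server that prefers creating a separate job instead would simply keep `replacement_supported=False` and let the client submit a new job.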

I would like to add a POST /result endpoint for synchronous execution [...]

I find this goes against the "Job Management" idea of Part 4. It would add an endpoint with no job reference and no retrievable logs, provenance, etc., and it would not give the option to easily switch between sync and async just by modifying the Prefer header on the same POST /jobs/{jobId}/results. Using a common endpoint makes it easy to reuse the Prefer resolution logic used by POST /processes/{processId}/execution. I think POST /result must remain an openEO-only thing.

I agree on that point. But can we imagine adding a new requirement class called openeo-results-endpoint to permit the POST /results endpoint addition? This way, servers that would like to align with openeo and add this specific endpoint would be able to do so.

@ghobona
Contributor

ghobona commented Sep 30, 2024

@gfenoy The URIs of the specification elements that you have listed above break OGC-NA policy. Please see chapters 6 and 8 of https://docs.ogc.org/pol/10-103r1.html

For requirements classes, there should only be one segment after the /req segment.

Similarly, for conformance classes, there should only be one segment after the /conf segment.

For requirements, there should only be two segments after the /req segment. The first of the segments should match the segment of the requirements class that the requirement belongs to.

Similarly, for abstract tests, there should only be two segments after the /conf segment. The first of the segments should match the segment of the conformance class that the requirement belongs to.

Please fix the segments.
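The segment rules above can be checked mechanically. The sketch below is only an illustration of the counting rule as stated in this comment, not a full implementation of the OGC-NA policy; `classify_spec_path` is a hypothetical helper that looks only at the part of the URI starting at /req or /conf, and the hyphenated path used in the usage note is a made-up example.

```python
KINDS = {
    ("req", 1): "requirements class",
    ("req", 2): "requirement",
    ("conf", 1): "conformance class",
    ("conf", 2): "abstract test",
}


def classify_spec_path(path):
    """Classify a path such as '/req/<class>/<name>' by its segment count.

    Returns None when the number of segments breaks the rules above.
    """
    parts = [segment for segment in path.split("/") if segment]
    if not parts or parts[0] not in ("req", "conf"):
        raise ValueError(f"path must start with /req or /conf: {path}")
    return KINDS.get((parts[0], len(parts) - 1))
```

For example, `classify_spec_path("/req/provenance/run/response")` returns None because three segments follow /req, while a flattened form such as `/req/provenance/run-response` is classified as a requirement.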

@fmigneault
Contributor

@gfenoy

We still need to add an endpoint to access logs. I noted this #273 (comment), in which you concluded that there may not be a requirement for such a log(s) endpoint.

Good point.
I think the requirement would be that a rel: logs link must be provided in the /jobs/{jobId} response, but its reference can be hosted anywhere. The /jobs/{jobId}/logs endpoint would be a recommendation for local job logs. Another possibility could be that /jobs/{jobId}/logs is required, but allowed to redirect. Therefore, the requirement would be that, if logs are found elsewhere, an HTTP 303 with the Location header should be returned.
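To sketch the second possibility (a required /jobs/{jobId}/logs endpoint that is allowed to redirect), something like the following could work. The function names, the 404 fallback and the text/plain default are assumptions for illustration, not part of any draft.

```python
def job_links(job_id, logs_href=None):
    """Build the rel: logs link for the /jobs/{jobId} response."""
    href = logs_href or f"/jobs/{job_id}/logs"
    return [{"rel": "logs", "href": href}]


def logs_response(local_lines=None, external_url=None):
    """Return (status, headers, body) for GET /jobs/{jobId}/logs."""
    if external_url is not None:
        # Logs are hosted elsewhere: redirect with 303 and a Location header.
        return 303, {"Location": external_url}, ""
    if local_lines is not None:
        return 200, {"Content-Type": "text/plain"}, "\n".join(local_lines)
    # Assumed behavior when no logs exist for the job.
    return 404, {}, ""
```

Either way the client only needs to follow the rel: logs link from the job status document, so both variants look the same from its side.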

@gfenoy
Contributor

gfenoy commented Sep 30, 2024

@gfenoy The URIs of the specification elements that you have listed above break OGC-NA policy. Please see chapters 6 and 8 of https://docs.ogc.org/pol/10-103r1.html

For requirements classes, there should only be one segment after the /req segment.

Similarly, for conformance classes, there should only be one segment after the /conf segment.

For requirements, there should only be two segments after the /req segment. The first of the segments should match the segment of the requirements class that the requirement belongs to.

Similarly, for abstract tests, there should only be two segments after the /conf segment. The first of the segments should match the segment of the conformance class that the requirement belongs to.

Please fix the segments.

Thanks a lot, @ghobona, for pointing out this mistake.

It should now be solved for the WIP Part 4 06656a8, part of #437.

Also, for Part 2 it should be solved here: #444.

@m-mohr
Author

m-mohr commented Oct 7, 2024

Good point. I think the requirement would be that a rel: logs link must be provided in the /jobs/{jobId} response

In openEO we use the relation type monitor for logs...

@fmigneault
Contributor

In openEO we use the relation type monitor for logs...

This seems like another incompatibility with OGC API - Processes that uses rel=monitor for the job itself: https://docs.ogc.org/is/18-062r2/18-062r2.html#req_core_job-results-success-sync

@gfenoy
Contributor

gfenoy commented Oct 18, 2024

@gfenoy The URIs of the specification elements that you have listed above break OGC-NA policy. Please see chapters 6 and 8 of https://docs.ogc.org/pol/10-103r1.html

For requirements classes, there should only be one segment after the /req segment.

Similarly, for conformance classes, there should only be one segment after the /conf segment.

For requirements, there should only be two segments after the /req segment. The first of the segments should match the segment of the requirements class that the requirement belongs to.

Similarly, for abstract tests, there should only be two segments after the /conf segment. The first of the segments should match the segment of the conformance class that the requirement belongs to.

Please fix the segments.

This should now be fixed in Part 4. The fix was also applied to Part 2, which contained the same issues.
