Write a Rodan job package

The job packages reside in the folder /rodan/jobs/. A job vendor provides its directory under this folder, where multiple Rodan jobs are defined. A job vendor can define the resource types that are required for its jobs as well.

1. Describe a Rodan Job

A Rodan job is defined by a class that inherits rodan.jobs.base.RodanTask. The class should define the following attributes as its description:

attribute	description
`name`	`string` a unique name within all the jobs provided by the vendor.
`author`	`string` the author of the job.
`description`	`string`
`settings`	`[JSON Schema](http://json-schema.org/)`¹ the validation schema that describes the requirements of the job settings.
`enabled`	`boolean`
`category`	`string`
`interactive`	`boolean` indicates whether the job will pause at some point and wait for manual input.²
`input_port_types`	`list of Python dictionary`
`output_port_types`	`list of Python dictionary`

1 - At present, Rodan only supports a JSON object as the topmost structure of settings.

2 - It is only informative for the users. It does not affect whether the job will pause. The behaviour of the job is determined by the return value of its execution code.

For input_port_types and output_port_types, the following keys should be defined:

key	description
`name`	`string`
`resource_types`	`list of string OR lambda: string -> boolean` describes all possible resource MIME-types. If provided with a lambda function, Rodan will automatically filter the matched resource types in its registry.
`minimum`	`number` minimum requirement of the job. 0 indicates no minimum requirement.
`maximum`	`number` maximum requirement of the job. 0 indicates no maximum requirement.

2. Implement the Job

The execution of a job can have two possible phases: automatic phase and manual phase. In automatic phase, the job is sent to background workers that are distributed on the network; in manual phase, the job communicates with human through a web interface via HTTP protocol.

A job always starts and ends with an automatic phase. It is allowed to go back and fro between automatic phases and manual phases:

The automatic phases are implemented in the method run_my_task (and my_error_information). The manual phases are implemented in the methods get_my_interface and validate_my_user_input.

2.1 Implement Automatic Phases

The signature of method run_my_task should be:

run_my_task(self, inputs, settings, outputs)

This method is expected to read the resource files as described in inputs, process them according to the configuration in settings, and produce the result files at the paths as described in outputs.

The parameter inputs is a Python dictionary. Every key-value pair maps a type of input ports to the list of details of the input resources. The details are Python dictionaries that include:

key	value
`resource_path`	`string` the path to the input resource file
`resource_type`	`string` the MIME-type of the input resource

Therefore, inputs is a dictionary, which maps a string (the name of the input port type), to a list of dictionaries (the details of the input resources). For example, if a job is executed with 2 inputs typed "image" and 1 input typed "mask", the inputs will be structured like:

{
    "image": [{
        "resource_path": "/some/path/file1",
        "resource_type": "image/jpeg"
    }, {
        "resource_path": "/some/path/file2",
        "resource_type": "image/png"
    }],
    "mask": [{
        "resource_path": "/some/path/file3",
        "resource_type": "image/bmp"
    }]
}

The parameter outputs is alike the parameter inputs, but the detail of resource is a little bit different:

key	value
`resource_path`	`string` the path that is supposed to been written into
`resource_type`	`string` the MIME-type of the output resource

The parameter settings is a Python dictionary that is validated against the JSON schema that the job has defined.

The job can raise any exceptions in automatic phases. By default, the exception message and traceback are as the error summary and details, respectively. This behaviour can be changed by defining the method my_error_information(self, exc, traceback), where exc is the exception object and traceback is a traceback object. The method should return a Python dictionary that includes error_summary and error_details.

If the job needs a temporary directory to work with, the recommended way is:

with self.tempdir() as tempdir:
    # do things inside tempdir

... to avoid producing filesystem garbage upon an exception.

run_my_task method can return an instance of self.WAITING_FOR_INPUT to indicate its requirement of a manual phase (see section 2.3). Other types of return value will be ignored and treated as a signal of job completion.

2.2 Implement Manual Phases

In manual phases, the job is put forward to receive and response HTTP requests. Upon a GET request, the job needs to provide its web interface; upon a POST request, the job validates the input data and updates its settings accordingly.

2.2.1 `get_my_interface`

get_my_interface method returns the web interface. Its signature is:

get_my_interface(self, inputs, settings)

The data structure of argument inputs is alike the counterpart in automatic phases. But in manual phases, inputs provides more details for the interface to locate resource URLs remotely:

key	value
`resource_path`	`string` the path to the input resource file
`resource_type`	`string` the MIME-type of the input resource
`resource_url`	`string` the URL to the original resource file
`small_thumb_url`	`string` the URL to the small thumbnail
`medium_thumb_url`	`string` the URL to the medium thumbnail
`large_thumb_url`	`string` the URL to the large thumbnail

The argument settings is structured the same as its automatic counterpart.

get_my_interface method is expected to return a tuple (t, c), where t is the relative path to the template HTML file. The path should be relative to the vendor's package, and the template HTML file should be written in Django template language.

c is a Python dictionary that defines the variables and their values to be rendered in the HTML template.

2.2.2 `validate_my_user_input`

Signature:

validate_my_user_input(self, inputs, settings, user_input)

This method validates the user input through HTTP POST request. The user input is provided as JSON data in user_input. If validation fails, it is expected to raise an instance of self.ManualPhaseException that incurs an HTTP 400 response (with error message) back to the interface.

If validation passes, the method should return a Python dictionary of the update of the settings. All updated keys should start with '@' or they will be discarded (reason see section 2.3).

The inputs and settings arguments are structured in the same way as in automatic phases.

2.3 State Transform between Automatic Phases and Manual Phases

3. Test the Job

test_my_task(self, testcase)

This method is called during the unit test of Rodan.

This method should call run_my_task() and/or get_my_interface() and/or validate_my_user_input. Before calling the job code, this method needs to construct inputs, settings, and outputs objects as parameters to feed the methods.

Its own parameter testcase refers to the Python TestCase object. Aside from assertion methods like assertEqual() and assertRaises(), it provides new_available_path() which returns a path to a nonexist file in the temporary filesystem. test_my_task method can thus create input files in these paths and feed them into the job methods.

4. Describe Resource Types

The resource MIME-types should be defined for Rodan to recognize them. A vendor can describe the required resource MIME-types through a file resource_types.yaml in the vendor directory. It is a list of mappings, which include:

name	description
`mimetype`	`string`
`description`	(optional) `string`
`extension`	(optional) `string` the suggested extension of this resource type.

5. Import the Job

Rodan imports the vendor module. Therefore, it is the vendor's responsibility to import the jobs in outermost __init__.py. It is not necessary to import every class, though -- import the Python file that contains the job classes, and Rodan will find the job classes and register them.

It is safer to use rodan.jobs.module_loader function to import the job modules. module_loader will catch the ImportError and write it into the log file instead of throwing an exception that terminates Rodan.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Write a Rodan job package

1. Describe a Rodan Job

2. Implement the Job

2.1 Implement Automatic Phases

2.2 Implement Manual Phases

2.2.1 `get_my_interface`

2.2.2 `validate_my_user_input`

2.3 State Transform between Automatic Phases and Manual Phases

3. Test the Job

4. Describe Resource Types

5. Import the Job

Home

Rodan Docker Setup and Deployment

Rodan Roadmap

Installation

Developing

Deployment

Miscellaneous

Rodan Core

Rodan Job Package System

Rodan Workflow Model

Rodan Server API Overview

Rodan Permission System

Unit Test

Run Unit Tests Locally

Installation without Docker

Miscellaneous

Clone this wiki locally

Write a Rodan job package

1. Describe a Rodan Job

2. Implement the Job

2.1 Implement Automatic Phases

2.2 Implement Manual Phases

2.2.1 get_my_interface

2.2.2 validate_my_user_input

2.3 State Transform between Automatic Phases and Manual Phases

3. Test the Job

4. Describe Resource Types

5. Import the Job

Home

Rodan Docker Setup and Deployment

Rodan Roadmap

Installation

Developing

Deployment

Miscellaneous

Rodan Core

Rodan Job Package System

Rodan Workflow Model

Rodan Server API Overview

Rodan Permission System

Unit Test

Run Unit Tests Locally

Installation without Docker

Miscellaneous

Clone this wiki locally

2.2.1 `get_my_interface`

2.2.2 `validate_my_user_input`