Write a Rodan job package
The job packages reside in the folder `/rodan/jobs/`. A job vendor provides its own directory under this folder, in which multiple Rodan jobs are defined. A job vendor can also define the resource types that its jobs require.
A Rodan job is defined by a class that inherits `rodan.jobs.base.RodanTask`. The class should define the following attributes as its description:
attribute | description |
---|---|
`name` | string. A unique name within all the jobs provided by the vendor. |
`author` | string. The author of the job. |
`description` | string |
`settings` | [JSON Schema](http://json-schema.org/) (note 1). The validation schema that describes the requirements of the job settings. |
`enabled` | boolean |
`category` | string |
`interactive` | boolean. Indicates whether the job will pause at some point and wait for manual input (note 2). |
`input_port_types` | list of Python dictionaries |
`output_port_types` | list of Python dictionaries |
1 - At present, Rodan only supports a JSON object as the topmost structure of settings.
2 - It is only informative for the users. It does not affect whether the job will pause. The behaviour of the job is determined by the return value of its execution code.
For `input_port_types` and `output_port_types`, the following keys should be defined:
key | description |
---|---|
`name` | string |
`resource_types` | list of strings OR lambda: string -> boolean. Describes all possible resource MIME-types. If a lambda function is provided, Rodan automatically selects the matching resource types from its registry. |
`minimum` | number. The minimum requirement of the job. 0 indicates no minimum requirement. |
`maximum` | number. The maximum requirement of the job. 0 indicates no maximum requirement. |
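As an illustration of the attributes and port keys above, a job class might be declared roughly as follows. This is a minimal sketch: the `RodanTask` class here is a local stand-in so that the snippet runs outside Rodan (a real job would inherit `rodan.jobs.base.RodanTask`), and the job name, author, category, port names, and settings schema are all hypothetical.

```python
class RodanTask(object):
    """Local stand-in for rodan.jobs.base.RodanTask so the sketch runs on its own."""


class BinarizeImage(RodanTask):
    # All values below are hypothetical examples.
    name = "Binarize image"
    author = "Jane Doe"
    description = "Converts a greyscale image to black and white."
    # JSON Schema; the topmost structure is a JSON object (see note 1).
    settings = {
        "type": "object",
        "properties": {
            "threshold": {"type": "integer", "minimum": 0, "maximum": 255},
        },
    }
    enabled = True
    category = "Image processing"
    interactive = False

    input_port_types = [{
        "name": "image",
        # A lambda (string -> boolean) that accepts any image MIME-type.
        "resource_types": lambda mime: mime.startswith("image/"),
        "minimum": 1,
        "maximum": 1,
    }]
    output_port_types = [{
        "name": "binarized image",
        # An explicit list of accepted MIME-types.
        "resource_types": ["image/png"],
        "minimum": 1,
        "maximum": 1,
    }]
```

The input port uses the lambda form of `resource_types`, while the output port uses the list form; either form can be used for either kind of port.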
The execution of a job can have two kinds of phases: automatic and manual. In an automatic phase, the job is sent to background workers that are distributed on the network; in a manual phase, the job communicates with a human through a web interface over HTTP.
A job always starts and ends with an automatic phase. In between, it may go back and forth between automatic phases and manual phases.
The automatic phases are implemented in the method `run_my_task` (and `my_error_information`). The manual phases are implemented in the methods `get_my_interface` and `validate_my_user_input`.
The signature of the `run_my_task` method should be:
run_my_task(self, inputs, settings, outputs)
This method is expected to read the resource files described in `inputs`, process them according to the configuration in `settings`, and produce the result files at the paths described in `outputs`.
The parameter `inputs` is a Python dictionary. Every key-value pair maps the name of an input port type to a list of details of the input resources. Each detail is a Python dictionary that includes:
key | value |
---|---|
`resource_path` | string. The path to the input resource file. |
`resource_type` | string. The MIME-type of the input resource. |

In other words, `inputs` maps a string (the name of the input port type) to a list of dictionaries (the details of the input resources). For example, if a job is executed with 2 inputs typed "image" and 1 input typed "mask", `inputs` will be structured like:
{
  "image": [{
    "resource_path": "/some/path/file1",
    "resource_type": "image/jpeg"
  }, {
    "resource_path": "/some/path/file2",
    "resource_type": "image/png"
  }],
  "mask": [{
    "resource_path": "/some/path/file3",
    "resource_type": "image/bmp"
  }]
}
The parameter `outputs` is structured like the parameter `inputs`, but the details of each resource are slightly different:
key | value |
---|---|
`resource_path` | string. The path that the output resource should be written to. |
`resource_type` | string. The MIME-type of the output resource. |
The parameter `settings` is a Python dictionary that is validated against the JSON schema that the job has defined.
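To make the three parameters concrete, here is a minimal sketch of a `run_my_task` implementation that reads every input resource and writes a single output file. The port names `"text"` and `"joined text"`, the `separator` setting, and the job class itself are hypothetical, and `RodanTask` is a local stand-in so that the snippet runs outside Rodan.

```python
import os
import tempfile


class RodanTask(object):
    """Local stand-in for rodan.jobs.base.RodanTask so the sketch runs on its own."""


class ConcatenateText(RodanTask):
    def run_my_task(self, inputs, settings, outputs):
        # Hypothetical setting; fall back to a newline if it is absent.
        separator = settings.get("separator", "\n")
        chunks = []
        for resource in inputs["text"]:
            with open(resource["resource_path"], "r") as f:
                chunks.append(f.read())
        # Write the result to the path assigned to the output port.
        with open(outputs["joined text"][0]["resource_path"], "w") as f:
            f.write(separator.join(chunks))


# Usage outside Rodan, building the dictionaries by hand.
workdir = tempfile.mkdtemp()
paths = [os.path.join(workdir, n) for n in ("a.txt", "b.txt", "out.txt")]
for path, text in zip(paths[:2], ("first", "second")):
    with open(path, "w") as f:
        f.write(text)

inputs = {"text": [{"resource_path": p, "resource_type": "text/plain"}
                   for p in paths[:2]]}
outputs = {"joined text": [{"resource_path": paths[2],
                            "resource_type": "text/plain"}]}
ConcatenateText().run_my_task(inputs, {"separator": ", "}, outputs)
```

Inside Rodan the dictionaries are built by the framework; the hand-built ones here only mirror the structure described above.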
The job can raise any exception in automatic phases. By default, the exception message and the traceback are used as the error summary and the error details, respectively. This behaviour can be changed by defining the method `my_error_information(self, exc, traceback)`, where `exc` is the exception object and `traceback` is a traceback object. The method should return a Python dictionary that includes `error_summary` and `error_details`.
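A customized `my_error_information` could look like the following sketch. It assumes, as stated above, that `traceback` is a standard Python traceback object, and formats it with the standard library. The job class and the wording of the summaries are hypothetical, and `RodanTask` is a local stand-in so that the snippet runs outside Rodan.

```python
import sys
import traceback as traceback_module


class RodanTask(object):
    """Local stand-in for rodan.jobs.base.RodanTask so the sketch runs on its own."""


class MyJob(RodanTask):
    def my_error_information(self, exc, traceback):
        # Give file-related errors a friendlier summary (hypothetical wording).
        if isinstance(exc, IOError):
            summary = "Could not read or write a resource file"
        else:
            summary = "{0}: {1}".format(type(exc).__name__, exc)
        return {
            "error_summary": summary,
            # format_tb() turns a traceback object into a list of strings.
            "error_details": "".join(traceback_module.format_tb(traceback)),
        }


# Simulate what happens when an automatic phase raises an exception.
try:
    raise ValueError("threshold out of range")
except ValueError as e:
    info = MyJob().my_error_information(e, sys.exc_info()[2])
```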
If the job needs a temporary directory to work with, the recommended way is:
with self.tempdir() as tempdir:
    pass  # do things inside tempdir
This avoids leaving filesystem garbage behind when an exception occurs.
The `run_my_task` method can return an instance of `self.WAITING_FOR_INPUT` to indicate that it requires a manual phase (see section 2.3). Any other return value is ignored and treated as a signal of job completion.
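The sketch below shows one way a job might decide between requesting a manual phase and finishing. It relies on several assumptions: `RodanTask` and its nested `WAITING_FOR_INPUT` class are local stand-ins, the no-argument constructor call is a guess at how an instance is created, and the `@crop_box` settings key is hypothetical (keys stored by a manual phase start with '@').

```python
class RodanTask(object):
    """Local stand-in for rodan.jobs.base.RodanTask; only what this sketch needs."""

    class WAITING_FOR_INPUT(object):
        """Stand-in marker; the real class may accept constructor arguments."""


class CropImage(RodanTask):
    def run_my_task(self, inputs, settings, outputs):
        # '@crop_box' is a hypothetical key that a manual phase would store.
        if "@crop_box" not in settings:
            # No user input yet: ask for a manual phase.
            return self.WAITING_FOR_INPUT()
        # ... crop the image here using settings["@crop_box"] ...
        return True  # any other return value signals job completion


job = CropImage()
first_result = job.run_my_task({}, {}, {})
second_result = job.run_my_task({}, {"@crop_box": [0, 0, 10, 10]}, {})
```

The first call requests a manual phase because no user input has been stored yet; the second call runs to completion.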
In manual phases, the job receives and responds to HTTP requests. Upon a GET request, the job needs to provide its web interface; upon a POST request, the job validates the input data and updates its settings accordingly.
The `get_my_interface` method returns the web interface. Its signature is:
get_my_interface(self, inputs, settings)
The argument `inputs` has the same structure as its counterpart in automatic phases. In manual phases, however, `inputs` provides additional details that allow the interface to locate resources by URL:
key | value |
---|---|
`resource_path` | string. The path to the input resource file. |
`resource_type` | string. The MIME-type of the input resource. |
`resource_url` | string. The URL to the original resource file. |
`small_thumb_url` | string. The URL to the small thumbnail. |
`medium_thumb_url` | string. The URL to the medium thumbnail. |
`large_thumb_url` | string. The URL to the large thumbnail. |
The argument `settings` is structured the same as its automatic counterpart.
The `get_my_interface` method is expected to return a tuple `(t, c)`, where `t` is the relative path to the template HTML file and `c` is a Python dictionary that defines the variables and their values to be rendered in the HTML template. The path should be relative to the vendor's package, and the template HTML file should be written in the Django template language.
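A `get_my_interface` implementation can be as small as the sketch below. The template path `templates/crop.html`, the port name `"image"`, and the context variable names are hypothetical, and `RodanTask` is a local stand-in so that the snippet runs outside Rodan.

```python
class RodanTask(object):
    """Local stand-in for rodan.jobs.base.RodanTask so the sketch runs on its own."""


class CropImage(RodanTask):
    def get_my_interface(self, inputs, settings):
        image = inputs["image"][0]
        # Hypothetical template path, relative to the vendor's package.
        template = "templates/crop.html"
        # Variables made available to the Django template.
        context = {
            "image_url": image["resource_url"],
            "thumbnail_url": image["medium_thumb_url"],
        }
        return (template, context)


# Hand-built inputs that mirror the manual-phase structure described above.
sample_inputs = {"image": [{
    "resource_path": "/some/path/file1",
    "resource_type": "image/png",
    "resource_url": "http://example.org/resource/1",
    "small_thumb_url": "http://example.org/resource/1/small",
    "medium_thumb_url": "http://example.org/resource/1/medium",
    "large_thumb_url": "http://example.org/resource/1/large",
}]}
t, c = CropImage().get_my_interface(sample_inputs, {})
```

The template would then refer to the context values with Django syntax such as `{{ image_url }}`.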
Signature:
validate_my_user_input(self, inputs, settings, user_input)
This method validates the user input received through an HTTP POST request. The user input is provided as JSON data in `user_input`. If validation fails, the method is expected to raise an instance of `self.ManualPhaseException`, which results in an HTTP 400 response (with the error message) being sent back to the interface.
If validation passes, the method should return a Python dictionary containing the updates to the settings. All updated keys should start with '@'; otherwise they will be discarded (for the reason, see section 2.3).
The `inputs` and `settings` arguments are structured in the same way as in automatic phases.
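The validation step might be sketched as follows. The assumptions here are that `user_input` arrives as an already-parsed dictionary, that `ManualPhaseException` accepts a message like an ordinary exception, and that the job wants a hypothetical `crop_box` value; `RodanTask` and its nested exception class are local stand-ins so that the snippet runs outside Rodan.

```python
class RodanTask(object):
    """Local stand-in for rodan.jobs.base.RodanTask; only what this sketch needs."""

    class ManualPhaseException(Exception):
        """Stand-in for the exception that leads to an HTTP 400 response."""


class CropImage(RodanTask):
    def validate_my_user_input(self, inputs, settings, user_input):
        box = user_input.get("crop_box")
        is_valid = (
            isinstance(box, list)
            and len(box) == 4
            and all(isinstance(v, int) and v >= 0 for v in box)
        )
        if not is_valid:
            raise self.ManualPhaseException(
                "crop_box must be a list of four non-negative integers")
        # Keys must start with '@', otherwise the update is discarded.
        return {"@crop_box": box}


job = CropImage()
update = job.validate_my_user_input({}, {}, {"crop_box": [0, 0, 10, 10]})

try:
    job.validate_my_user_input({}, {}, {"crop_box": "not a box"})
    rejected = False
except RodanTask.ManualPhaseException:
    rejected = True
```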
test_my_task(self, testcase)
This method is called during the unit tests of Rodan.
It should call `run_my_task()`, `get_my_interface()`, and/or `validate_my_user_input()`. Before calling the job code, it needs to construct the `inputs`, `settings`, and `outputs` objects to pass to those methods.
Its own parameter `testcase` refers to the Python TestCase object. Aside from assertion methods like `assertEqual()` and `assertRaises()`, it provides `new_available_path()`, which returns a path to a nonexistent file in the temporary filesystem. The `test_my_task` method can thus create input files at these paths and feed them into the job methods.
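Put together, a `test_my_task` method might look like the sketch below. The `FakeTestCase` class only emulates `new_available_path()` with the standard library so that the snippet runs outside Rodan; it is not Rodan's TestCase. The job, its port names, and its behaviour are hypothetical.

```python
import os
import tempfile
import unittest


class FakeTestCase(unittest.TestCase):
    """Stand-in for the TestCase passed in by Rodan (an assumption of this sketch)."""

    def runTest(self):
        pass

    def new_available_path(self):
        # Return a path to a file that does not exist yet.
        return os.path.join(tempfile.mkdtemp(), "resource")


class UppercaseText(object):  # a real job would inherit rodan.jobs.base.RodanTask
    def run_my_task(self, inputs, settings, outputs):
        with open(inputs["text"][0]["resource_path"]) as f:
            content = f.read()
        with open(outputs["uppercase text"][0]["resource_path"], "w") as f:
            f.write(content.upper())

    def test_my_task(self, testcase):
        # Create an input file and reserve a path for the output.
        in_path = testcase.new_available_path()
        out_path = testcase.new_available_path()
        with open(in_path, "w") as f:
            f.write("hello")
        inputs = {"text": [{"resource_path": in_path,
                            "resource_type": "text/plain"}]}
        outputs = {"uppercase text": [{"resource_path": out_path,
                                       "resource_type": "text/plain"}]}
        self.run_my_task(inputs, {}, outputs)
        with open(out_path) as f:
            testcase.assertEqual(f.read(), "HELLO")


# Outside Rodan the method can be exercised by hand.
result = UppercaseText().test_my_task(FakeTestCase())
```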
The resource MIME-types should be defined for Rodan to recognize them. A vendor can describe the required resource MIME-types in a file `resource_types.yaml` in the vendor directory. The file contains a list of mappings, each of which includes:
name | description |
---|---|
`mimetype` | string |
`description` | (optional) string |
`extension` | (optional) string. The suggested extension of this resource type. |
Rodan imports the vendor module. Therefore, it is the vendor's responsibility to import the jobs in the outermost `__init__.py`. It is not necessary to import every class, though: import the Python file that contains the job classes, and Rodan will find the job classes and register them.
It is safer to use the `rodan.jobs.module_loader` function to import the job modules. `module_loader` catches the `ImportError` and writes it to the log file instead of throwing an exception that terminates Rodan.
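The behaviour described for `module_loader` can be illustrated with a small stand-in. The function below is not Rodan's implementation; its signature (a dotted module name), its return value, and the logger name are assumptions made for this sketch, which only shows the pattern of logging an `ImportError` instead of letting it propagate.

```python
import importlib
import logging

logger = logging.getLogger("rodan")  # hypothetical logger name


def module_loader(name):
    """Stand-in that mimics the described behaviour; not Rodan's actual code."""
    try:
        return importlib.import_module(name)
    except ImportError as e:
        # Log the failure and carry on instead of terminating.
        logger.warning("Failed to import %s: %s", name, e)
        return None


# In a vendor's outermost __init__.py, each job module would be loaded this way.
loaded = module_loader("json")                       # an importable module
missing = module_loader("my_vendor_missing_module")  # hypothetical, does not exist
```

With this pattern, a job module with a missing dependency is skipped and reported in the log, while the remaining job modules are still imported.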