Framework Configuration
In the previous section you set up a Maven project capable of compiling a StreamFlow Framework JAR. In order to identify the Spouts and Bolts that are available in your Framework JAR, StreamFlow utilizes a single Framework configuration file. The framework configuration is integral to building Components that can be used within the StreamFlow UI to dynamically build topologies. Although the project you created in the last section will compile, the lack of a framework.yml configuration file will prevent StreamFlow from registering any Spouts or Bolts within your JAR.
Important: Even though you provide the source code in your framework project, Spouts and Bolts which have not been registered in the framework.yml will not be visible in the StreamFlow UI.
The framework configuration file is used to register all of the Spouts, Bolts, Resources, and Serializations that are present within the framework jar project. This configuration can be defined using a YAML format or a JSON format. Selection of the configuration format (YAML/JSON) is solely a personal preference as all settings are available in each format. Although either format can be used, YAML is the recommended format as it is less verbose than JSON when formatting the configuration file.
The framework.yml or framework.json configuration file must be located in a STREAMFLOW-INF folder at the root of the class path (e.g. src/main/resources/STREAMFLOW-INF/framework.yml or src/main/resources/STREAMFLOW-INF/framework.json). The following sample framework.yml and framework.json files outline the format of these configuration files. Following the examples, each major section of the configuration will be covered in detail.
Note: The following YAML and JSON configurations are equivalent, and you should define only one of framework.yml OR framework.json in your project.
# Framework Properties
name: sample-framework
label: Sample Framework
version: 1.0.0-SNAPSHOT
description: Spouts and Bolts implemented for demonstration purposes
# Framework Components
components:
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    properties:
      - name: field-one
        label: Field One
        description: Description of field one
        defaultValue: 10
        required: true
        type: number
      - name: field-two
        label: Field Two
        description: Description of field two
        defaultValue: second
        required: false
        type: select
        options:
          listItems:
            - first
            - second
            - third
    inputs:
      - key: default
        description: Takes any tuple as input
    outputs:
      - key: default
        description: Processed activity content
# Framework Resources
resources:
  - name: file-resource
    label: File Resource
    description: File Resource for Testing
    resourceClass: streamflow.resource.FileResource
    properties:
      - name: file-resource
        label: File Resource
        description: Use this resource to upload and save files
        defaultValue:
        required: true
        type: file
# Framework Serializations
serializations:
  - typeClass: streamflow.serializer.SampleType
    serializerClass: streamflow.serializer.SampleTypeSerializer
  - typeClass: streamflow.serializer.AnotherSampleType
{
  "name": "sample-framework",
  "label": "Sample Framework",
  "version": "1.0.0-SNAPSHOT",
  "description": "Spouts and Bolts implemented for demonstration purposes",
  "components": [
    {
      "name": "sample-bolt",
      "label": "Sample Bolt",
      "type": "storm-bolt",
      "description": "Sample bolt used for demonstration purposes",
      "mainClass": "streamflow.bolt.SampleBolt",
      "properties": [
        {
          "name": "field-one",
          "label": "Field One",
          "description": "Description of field one",
          "defaultValue": "10",
          "required": true,
          "type": "number"
        },
        {
          "name": "field-two",
          "label": "Field Two",
          "description": "Description of field two",
          "defaultValue": "second",
          "required": false,
          "type": "select",
          "options": {
            "listItems": [
              "first",
              "second",
              "third"
            ]
          }
        }
      ],
      "inputs": [
        {
          "key": "default",
          "description": "Takes any tuple as input"
        }
      ],
      "outputs": [
        {
          "key": "default",
          "description": "Processed activity content"
        }
      ]
    }
  ],
  "resources": [
    {
      "name": "file-resource",
      "label": "File Resource",
      "description": "File Resource for Testing",
      "resourceClass": "streamflow.resource.FileResource",
      "properties": [
        {
          "name": "file-resource",
          "label": "File Resource",
          "description": "Use this resource to upload and save files",
          "defaultValue": "",
          "required": true,
          "type": "file"
        }
      ]
    }
  ],
  "serializations": [
    {
      "typeClass": "streamflow.serializer.SampleType",
      "serializerClass": "streamflow.serializer.SampleTypeSerializer"
    },
    {
      "typeClass": "streamflow.serializer.AnotherSampleType"
    }
  ]
}
Framework Properties
Framework properties define general information that is used to identify the framework. These properties are typically listed at the top of the framework configuration for clarity, although they can be located anywhere in the configuration.
Let's look at a snippet of the framework properties from the above example and walk through each property in detail.
name: sample-framework
label: Sample Framework
version: 1.0.0-SNAPSHOT
description: Spouts and Bolts implemented for demonstration purposes
name
- Description: Globally unique identifier for the framework. You must ensure that no other framework uses the same name, otherwise the two frameworks will collide during framework upload. To help protect against this, it is common to prefix your framework name with namespace-style information (e.g. streamflow.core.sample.framework)
- Default: None
label
- Description: User friendly name that users will see in the UI to identify your framework.
- Default: None
version
- Description: Identifies the version number of your framework to users. Although you are free to enter any value for this field, we recommend incrementing your version using Semantic Versioning.
- Default: None
description
- Description: Helps users understand the purpose of your framework.
- Default: None
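As a sketch of the naming and versioning advice above, the framework properties for an in-house framework might look like the following. The com.example prefix, label, and description are made-up values for illustration only; the version simply follows the MAJOR.MINOR.PATCH pattern from Semantic Versioning.

```yaml
# Hypothetical values; only the four keys come from the framework configuration format
name: com.example.analytics.framework   # namespace-style prefix to reduce the chance of a name collision
label: Example Analytics Framework      # shown to users in the UI
version: 2.1.0                          # MAJOR.MINOR.PATCH
description: Spouts and Bolts for the example analytics pipeline
```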
Components
The components section of the configuration is used to register Storm Spouts and Bolts in the framework. The components property is an array, so multiple components can be registered in the configuration. The components that are registered in this section will appear in the palette of the topology editor when the framework is uploaded to the StreamFlow server.
Let's look at a snippet of the components section from the above example and walk through each property in detail.
components:
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    properties: []
    inputs:
      - key: default
        description: Takes any tuple as input
    outputs:
      - key: default
        description: Processed activity content
name
- Description: Key used to uniquely identify the component within the framework. The name does not have to be globally unique across different frameworks, only within the same framework.
- Default: None
label
- Description: User friendly name that users will see in the UI to identify the component.
- Default: None
type
- Description: Special value to indicate what type of component is being implemented. It is used to help organize your components by type in the component palette of the topology builder. Currently, the only valid type values are as follows:
  - storm-spout
  - storm-bolt
  - trident-spout
  - trident-batch-spout
  - trident-partitioned-spout
  - trident-opaque-partitioned-spout
  - trident-function
  - trident-filter
  - trident-aggregator
  - trident-combiner-aggregator
  - trident-reducer-aggregator
- Default: None
description
- Description: Short description of the purpose of the component. This value is visible in the framework details page and as a link in the component palette.
- Default: None
mainClass
- Description: Fully qualified class name of the component implementation (e.g. streamflow.bolt.SampleBolt)
- Default: None
properties
- Description: Array of configurable properties to associate with the component. When adding the component to build a topology, these properties will be available for modification and will be injected into the component implementation. Properties are a shared concept between Components and Resources and are covered in detail in the Property Configuration section, which is why the example properties section above is intentionally left empty.
- Default: Empty Array
inputs
- Description: Array of objects which define the input settings for the component. When used for Storm, only one input is allowed as Storm only accepts a single input for each component.
- Default: Empty Array
key (inputs)
- Description: Unique key to identify the input. This value should map to the Storm streamId for your component implementation. Storm uses a streamId of "default" if none is specified in the implementation, so if you did not configure a custom streamId, it is best to use the value "default".
- Default: default
description (inputs)
- Description: Short description of the purpose of the input. It is often helpful to indicate any limitations on data types or formats in the description.
- Default: None
outputs
- Description: Array of objects which define the output settings for the component. You can specify multiple output objects if your implementation supports this.
- Default: Empty Array
key (outputs)
- Description: Unique key to identify the output. This value should map to the Storm streamId for your component implementation. Storm uses a streamId of "default" if none is specified in the implementation, so if you did not configure a custom streamId, it is best to use the value "default".
- Default: default
description (outputs)
- Description: Short description of the purpose of the output. It is often helpful to indicate any limitations on data types or formats in the description.
- Default: None
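To illustrate the multiple outputs mentioned above, here is a sketch of a spout registration with two output streams. The component name, class name, and the errors stream are hypothetical; the sketch assumes the spout implementation declares a Storm stream with the ID errors in addition to the default stream.

```yaml
components:
  - name: sample-spout
    label: Sample Spout
    type: storm-spout
    description: Hypothetical spout that emits parsed records and parse failures
    mainClass: com.example.spout.SampleSpout   # hypothetical class, not part of StreamFlow
    properties: []
    # inputs is omitted here, so it falls back to its default of an empty array
    outputs:
      - key: default
        description: Successfully parsed records
      - key: errors                            # assumed custom streamId declared by the spout
        description: Records that failed parsing, emitted as raw strings
```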
Resources
The resources section is used to register new Resources with StreamFlow. Resources encapsulate frequently used logic such as establishing a database connection. StreamFlow Resources are implemented as Guice modules which can accept user configurable properties, similar to StreamFlow Components. Because the property feature is shared, you will find many similarities between the Resource and Component configuration formats.
Let's look at a snippet of the resources section from the above example and walk through each property in detail.
resources:
  - name: file-resource
    label: File Resource
    description: File Resource for Testing
    resourceClass: streamflow.resource.FileResource
    properties: []
name
- Description: Key used to uniquely identify the resource within the framework. The name does not have to be globally unique across different frameworks, only within the same framework.
- Default: None
label
- Description: User friendly name that users will see in the UI to identify the resource.
- Default: None
description
- Description: Short description of the purpose of the resource.
- Default: None
resourceClass
- Description: Fully qualified class name of the Guice module class that supports the resource (e.g. streamflow.resource.SampleResource)
- Default: None
properties
- Description: Array of configurable properties to associate with the resource. Properties are a shared concept between Components and Resources and are covered in detail in the Property Configuration section, which is why the example properties section above is intentionally left empty.
- Default: Empty Array
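The text above mentions a database connection as a typical use for a resource. A registration for such a resource could be sketched as follows. The resource name, module class, and property names are hypothetical; only the keys and the string, number, and password property types come from this page.

```yaml
resources:
  - name: database-resource
    label: Database Resource
    description: Hypothetical resource that provides a shared database connection
    resourceClass: com.example.resource.DatabaseResource   # hypothetical Guice module class
    properties:
      - name: database-host
        label: Host
        description: Hostname of the database server
        defaultValue: localhost
        required: true
        type: string
      - name: database-port
        label: Port
        description: TCP port of the database server
        defaultValue: 5432
        required: true
        type: number
      - name: database-password
        label: Password
        description: Password used when opening the connection
        required: false
        type: password
```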
Serializations
The serializations section is used to register classes within your framework that are required for serialization. The Storm implementation uses the classes configured in the serializations section to initialize the Kryo implementation used to serialize Tuple data between Spouts and Bolts. Any top level serialization and serializer classes which would typically be registered in Storm using the topology.kryo.register property should be registered in this section.
Let's look at a snippet of the serializations section from the above example and walk through each property in detail.
serializations:
  - typeClass: streamflow.serializer.SampleType
    serializerClass: streamflow.serializer.SampleTypeSerializer
  - typeClass: streamflow.serializer.AnotherSampleType
typeClass
- Description: The fully qualified class name of a class to register with Kryo.
- Default: None
serializerClass
- Description: The fully qualified class name of the com.esotericsoftware.kryo.Serializer implementation to use for the registered typeClass. As the second entry in the example shows, it can be omitted.
- Default: None
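For readers coming from plain Storm, the following comparison may help. The first YAML document shows how the two sample types would typically be listed under topology.kryo.register in a Storm configuration, and the second repeats the StreamFlow form from the example above. This is an illustrative mapping only, not a statement about how StreamFlow builds the Storm configuration internally.

```yaml
# Plain Storm configuration: a list whose entries are either a class name
# or a single-entry map from class name to Kryo serializer class
topology.kryo.register:
  - streamflow.serializer.SampleType: streamflow.serializer.SampleTypeSerializer
  - streamflow.serializer.AnotherSampleType
---
# StreamFlow framework.yml: the same two registrations
serializations:
  - typeClass: streamflow.serializer.SampleType
    serializerClass: streamflow.serializer.SampleTypeSerializer
  - typeClass: streamflow.serializer.AnotherSampleType
```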
Property Configuration
Properties are an important part of the framework configuration as they provide a mechanism for Components and Resources to accept dynamic configuration in the StreamFlow topology builder. Properties are defined using a specific configuration format which is used to dynamically generate HTML form elements in the topology builder. Most HTML5 input element types are supported, with some additional options included to provide validation of input types. Any properties which are registered in the configuration will be injected into the Spout or Bolt implementation using the javax.inject.Named annotation: at runtime, instance variables are initialized with the actual property values through setter-based injection combined with @Named.
Let's look at a snippet of the properties section and walk through each option in detail.
properties:
  - name: field-one
    label: Field One
    description: Description of field one
    defaultValue: 10
    required: true
    type: number
  - name: field-two
    label: Field Two
    description: Description of field two
    defaultValue: second
    required: false
    type: select
    options:
      listItems:
        - first
        - second
        - third
name
- Description: Unique identifier for the property. Names must be unique within the same component or resource, as the name is used by the Guice injection component as the @Named value.
- Default: None
label
- Description: User friendly name that users will see when viewing the component or resource properties.
- Default: None
description
- Description: Short description of the purpose and usage of the property in the context of the component or resource.
- Default: None
defaultValue
- Description: A value to use as default if the property is left unchanged.
- Default: None
required
- Description: Flag to determine if the property is required or optional. The property is required if "true", otherwise optional.
- Default: false
type
- Description: Input type to use for the property. The type dictates which input element is generated by the StreamFlow UI when editing properties. Additional implementation details about each type can be found in the Property Types section.
- Default: string
options
- Description: Options are used in the property configuration to provide additional validation or rendering information to accompany the type. Each type has a specific set of options that can be added to this section to aid in rendering HTML elements.
- Default: None
Property Types
As seen in the previous Property Configuration section, properties can specify a data type and some additional options. This section covers all of the property types that StreamFlow supports and provides an example configuration for each one.
Let's start with a table outlining each of the property types, followed by a sample YAML configuration for each type.
Type | HTML Form Element | Description | Options |
---|---|---|---|
Default | text box | Standard input text box with no format limitations | None |
string | text box | Standard input text box with no format limitations | None |
e-mail or email | HTML5 text box w/ email validation | Input text box that validates user input | None |
url | HTML5 text box w/ URL validation | Input text box that validates user input | None |
password | password text box | Standard password input text box | None |
number | HTML5 numeric integer spinner | Numeric spinner with customization options | minNumber, maxNumber, numericStep |
float | HTML5 floating point spinner | Numeric spinner with fractional intervals | minNumber, maxNumber, floatStep |
boolean | check box | Standard check box for true/false values | None |
textarea | text area | Standard multi-line text area input | None |
select | select box | Standard select with custom values | listItems |
file | custom file uploader | Custom built file upload input with metadata feedback | None |
date | HTML5 date input | HTML5 date input which simplifies date entry | dateFormat |
time | HTML5 time input | HTML5 time input which simplifies time entry | minuteStep |
serialization | select box | Select box populated with list of registered serializations | None |
resource | select box | Select box populated with resource entries for the specified resource | resourceFramework, resourceName |
The following code shows an example configuration for each of the above input types. Please use this as a reference when specifying the property type.
- name: string-test
  label: String Test
  type: string
  description: Creates a simple Input box with no validation
  defaultValue:
- name: email-test
  label: E-mail
  type: email
  description: Creates an Input Box that Validates e-mails on HTML5 compatible Browsers
  defaultValue:
- name: url-test
  label: URL
  type: url
  description: Creates an Input Box that Validates Web Urls on HTML5 compatible Browsers
  defaultValue:
- name: password-test
  label: Password
  type: password
  description: Creates an input with the value hidden for passwords.
  defaultValue:
- name: number-test
  label: Number
  type: number
  description: Creates a numeric stepper to allow for easier integer entry
  defaultValue:
  options:
    numericStep: 5
    minNumber: 0
    maxNumber: 100
- name: float-test
  label: Float
  type: float
  description: Creates a JavaScript-based numeric stepper for floats.
  defaultValue:
  options:
    floatStep: 0.01
    minNumber: 0
    maxNumber: 100
- name: boolean-test
  label: Boolean
  type: boolean
  description: Creates a checkbox for inputting Boolean values
  defaultValue: true
- name: area-test
  label: Area
  type: text-area
  description: Creates an HTML text area for entering large blocks of text
  defaultValue:
- name: select-test
  label: Select
  type: select
  description: Creates a select input with items specified in the options
  defaultValue:
  options:
    listItems:
      - one
      - two
      - three
      - four
- name: file-test
  label: File
  type: file
  description: Creates a custom file upload and file metadata display input
  defaultValue:
- name: date-test
  label: Date
  type: date
  description: Creates a Date Picker for entering Month / Day / Year Dates
  defaultValue:
  options:
    dateFormat: mm/dd/yy
- name: time-test
  label: Time
  type: time
  description: Creates a Time input for entering a time of day (no date)
  defaultValue:
  options:
    minuteStep: 5
- name: serialization-test
  label: Serialization
  type: serialization
  description: Creates a select input with the type classes from serializations
  defaultValue:
- name: resource-test
  label: Resource
  type: resource
  description: Creates a select input with resource entries of the specific type
  defaultValue:
  options:
    resourceFramework: resource-framework-name
    resourceName: resource-name
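The resource type is the one property type that points at another part of the configuration. As a sketch, a component could let users choose an entry of the file-resource from the sample framework like this. It assumes that resourceFramework takes the name of the framework that registers the resource and that resourceName takes the name of the resource itself; the bolt, its class, and the property name are hypothetical.

```yaml
components:
  - name: lookup-bolt
    label: Lookup Bolt
    type: storm-bolt
    description: Hypothetical bolt that enriches tuples from an uploaded lookup file
    mainClass: com.example.bolt.LookupBolt     # hypothetical class
    properties:
      - name: lookup-file
        label: Lookup File
        description: Resource entry that holds the uploaded lookup file
        required: true
        type: resource
        options:
          resourceFramework: sample-framework  # assumed to match the framework name property
          resourceName: file-resource          # assumed to match the resource name property
    inputs:
      - key: default
        description: Takes any tuple as input
    outputs:
      - key: default
        description: Input tuple with lookup fields appended
```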