Framework Configuration

Julien Cruz edited this page Jan 18, 2015 · 17 revisions

In the previous section you set up a Maven project capable of compiling a StreamFlow Framework JAR. In order to identify the Spouts and Bolts that are available in your Framework JAR, StreamFlow utilizes a single Framework configuration file. The framework configuration is integral to building Components that can be used within the StreamFlow UI to dynamically build topologies. Although the project you created in the last section will compile, the lack of a framework.yml configuration file will prevent StreamFlow from registering any Spouts or Bolts within your JAR.

Important: Even though you provide the source code in your framework project, Spouts and Bolts that have not been registered in framework.yml will not be visible in the StreamFlow UI.

The framework configuration file is used to register all of the Spouts, Bolts, Resources, and Serializations that are present within the framework JAR project. This configuration can be defined in either YAML or JSON. The choice of format is a matter of personal preference, since every setting is available in both. YAML is the recommended format because it is less verbose than JSON.

The framework.yml and framework.json configuration files must be located in a STREAMFLOW-INF folder at the root of the class path (e.g. src/main/resources/STREAMFLOW-INF/framework.yml or src/main/resources/STREAMFLOW-INF/framework.json). The following sample framework.yml and framework.json files outline the format of these configuration files. Following the examples, each major section of the configuration will be covered in detail.

Note: The following YAML and JSON configurations are equivalent and you should only define either framework.yml OR framework.json in your project.

Sample framework.yml

# Framework Properties
name: sample-framework
label: Sample Framework
version: 1.0.0-SNAPSHOT
description: Spouts and Bolts implemented for demonstration purposes

# Framework Components
components: 
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    properties: 
      - name: field-one
        label: Field One
        description: Description of field one
        defaultValue: 10
        required: true
        type: number
      - name: field-two
        label: Field Two
        description: Description of field two
        defaultValue: second
        required: false
        type: select
        options:
            listItems:
              - first
              - second
              - third   
    inputs: 
      - key: default
        description: Takes any tuple as input
    outputs:
      - key: default
        description: Processed activity content

# Framework Resources      
resources:
  - name: file-resource
    label: File Resource
    description: File Resource for Testing
    resourceClass: streamflow.resource.FileResource
    properties:
      - name: file-resource
        label: File Resource
        description: Use this resource to upload and save files
        defaultValue: 
        required: true
        type: file

# Framework Serializations
serializations:
  - typeClass: streamflow.serializer.SampleType
    serializerClass: streamflow.serializer.SampleTypeSerializer
  - typeClass: streamflow.serializer.AnotherSampleType

Sample framework.json

{
    "name": "sample-framework",
    "label": "Sample Framework",
    "version": "1.0.0-SNAPSHOT",
    "description": "Spouts and Bolts implemented for demonstration purposes",
    "components": [
        {
            "name": "sample-bolt",
            "label": "Sample Bolt",
            "type": "storm-bolt",
            "description": "Sample bolt used for demonstration purposes",
            "mainClass": "streamflow.bolt.SampleBolt",
            "properties": [
                {
                    "name": "field-one",
                    "label": "Field One",
                    "description": "Description of field one",
                    "defaultValue": "10",
                    "required": true,
                    "type": "number"
                },
                {
                    "name": "field-two",
                    "label": "Field Two",
                    "description": "Description of field two",
                    "defaultValue": "second",
                    "required": false,
                    "type": "select",
                    "options": {
                        "listItems": [
                            "first",
                            "second",
                            "third"
                        ]
                    }
                }
            ],
            "inputs": [
                {
                    "key": "default",
                    "description": "Takes any tuple as input"
                }
            ],
            "outputs": [
                {
                    "key": "default",
                    "description": "Processed activity content"
                }
            ]
        }
    ],    
    "resources": [
        {
            "name": "file-resource",
            "label": "File Resource",
            "description": "File Resource for Testing",
            "resourceClass": "streamflow.resource.FileResource",
            "properties": [
                {
                    "name": "file-resource",
                    "label": "File Resource",
                    "description": "Use this resource to upload and save files",
                    "defaultValue": "",
                    "required": true,
                    "type": "file"
                }
            ]
        }
    ],
    "serializations": [
        {
            "typeClass": "streamflow.serializer.SampleType",
            "serializerClass": "streamflow.serializer.SampleTypeSerializer"
        },
        {
            "typeClass": "streamflow.serializer.AnotherSampleType"
        }
    ]
}

General Configuration

Framework properties define general information that is used to identify the framework. These properties are typically listed at the top of the framework configuration for clarity, although they can be located anywhere in the configuration.

Let's look at a snippet of the framework properties from the above example and walk through each property in detail.

name: sample-framework
label: Sample Framework
version: 1.0.0-SNAPSHOT
description: Spouts and Bolts implemented for demonstration purposes

name

  • Description: Globally unique identifier for the framework. You must ensure that no other framework uses the same name; otherwise the two frameworks will collide during framework upload. To help protect against this, it is common to prefix your framework name with namespace-style information (e.g. streamflow.core.sample.framework).
  • Default: None

label

  • Description: User friendly name that users will see in the UI to identify your framework.
  • Default: None

version

  • Description: Identifies the version number of your framework to users. Although you are free to enter any value for this field, we recommend incrementing your version using Semantic Versioning.
  • Default: None

description

  • Description: Helps users understand the purpose of your framework
  • Default: None

Component Configuration

The components section of the configuration is used to register Storm Spouts and Bolts in the framework. The components property is an array which can be used to register multiple components in the configuration. The components that are registered in this section will appear in the palette of the topology editor when the framework is uploaded to the StreamFlow server.

Let's look at a snippet of the components section from the above example and walk through each property in detail.

components: 
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    properties: [] 
    inputs: 
      - key: default
        description: Takes any tuple as input
    outputs:
      - key: default
        description: Processed activity content

components.name

  • Description: Key used to uniquely identify the component within the framework. The name does not have to be globally unique across different frameworks, only within the same framework.
  • Default: None

components.label

  • Description: User friendly name that users will see in the UI to identify the component.
  • Default: None

components.type

  • Description: Special value to indicate what type of component is being implemented. It is used to help organize your components by type in the component palette of the topology builder. Currently, the only valid type values are as follows:

    • storm-spout
    • storm-bolt
    • trident-spout
    • trident-batch-spout
    • trident-partitioned-spout
    • trident-opaque-partitioned-spout
    • trident-function
    • trident-filter
    • trident-aggregator
    • trident-combiner-aggregator
    • trident-reducer-aggregator
  • Default: None
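
As a rough sketch of how the type value is used, the fragment below registers a spout next to the sample bolt. The spout's name, label, descriptions, and the class streamflow.spout.SampleSpout are hypothetical; only the keys and the type values come from this page. The spout omits inputs and properties, relying on the documented empty-array defaults.

```yaml
components:
  # Hypothetical spout: it only emits tuples, so it declares an output and no inputs
  - name: sample-spout
    label: Sample Spout
    type: storm-spout
    description: Sample spout used for demonstration purposes
    mainClass: streamflow.spout.SampleSpout   # hypothetical class name
    outputs:
      - key: default
        description: Raw activity content

  # Bolt taken from the sample configuration
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    inputs:
      - key: default
        description: Takes any tuple as input
    outputs:
      - key: default
        description: Processed activity content
```

Both entries sit in the same components array, so each would appear under its own type in the component palette.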

components.description

  • Description: Short description to describe the purpose of the component. This value is visible in the framework details page and as a link in the component palette.
  • Default: None

components.mainClass

  • Description: Fully qualified class name of the component implementation (e.g. streamflow.bolt.SampleBolt)
  • Default: None

components.properties

  • Description: Array of configurable properties to associate with the component. When the component is added to a topology, these properties will be available for modification and will be injected into the component implementation. Properties are a concept shared between Components and Resources and are covered in detail in the Property Configuration section, which is why the properties entry in the example above is intentionally left empty.
  • Default: Empty Array

components.inputs

  • Description: Array of objects which define the input settings for the component. When used for Storm, only one input is allowed as Storm only accepts a single input for each component.
  • Default: Empty Array

components.inputs.key

  • Description: Unique key to identify the input. This value should map to the Storm streamId for your component implementation. By default, Storm uses a streamId of "default" for simplicity if none is specified in the implementation. If you did not configure a custom streamId, it is best to use the default value of "default"
  • Default: default

components.inputs.description

  • Description: Short description of the purpose of the input. It is often helpful to indicate any limitations on data types or formats in the description.
  • Default: None

components.outputs

  • Description: Array of objects which define the output settings for the component. You can specify multiple output objects if your implementation supports this.
  • Default: Empty Array

components.outputs.key

  • Description: Unique key to identify the output. This value should map to the Storm streamId for your component implementation. By default, Storm uses a streamId of "default" for simplicity if none is specified in the implementation. If you did not configure a custom streamId, it is best to use the default value of "default"
  • Default: default

components.outputs.description

  • Description: Short description of the purpose of the output. It is often helpful to indicate any limitations on data types or formats in the description.
  • Default: None
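
Because each output key maps to a Storm streamId, a component that emits on more than one stream would list one output object per stream. The following is a minimal sketch, assuming the bolt implementation declares a second stream named errors; that stream name and its description are hypothetical and not part of the sample framework.

```yaml
components:
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    inputs:
      - key: default
        description: Takes any tuple as input
    outputs:
      # Matches the "default" streamId that Storm uses when none is specified
      - key: default
        description: Processed activity content
      # Hypothetical second stream; the implementation must declare a stream with this id
      - key: errors
        description: Tuples that could not be processed
```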

Resource Configuration

The resources section is used to register new Resources with StreamFlow. Resources encapsulate frequently used logic such as establishing a database connection. StreamFlow Resources are implemented as Guice modules which can accept user-configurable properties, similar to StreamFlow Components. Because the property feature is reused, you will find many similarities between the Resource and Component configuration formats.

Let's look at a snippet of the resources section from the above example and walk through each property in detail.

resources:
  - name: file-resource
    label: File Resource
    description: File Resource for Testing
    resourceClass: streamflow.resource.FileResource
    properties: []

resources.name

  • Description: Key used to uniquely identify the resource within the framework. The name does not have to be globally unique across different frameworks, only within the same framework.
  • Default: None

resources.label

  • Description: User friendly name that users will see in the UI to identify the resource.
  • Default: None

resources.description

  • Description: Short description to describe the purpose of the resource.
  • Default: None

resources.resourceClass

  • Description: Fully qualified class name of the Guice module class that supports the resource (e.g. streamflow.resource.SampleResource)
  • Default: None

resources.properties

  • Description: Array of configurable properties to associate with the resource. Properties are a concept shared between Components and Resources and are covered in detail in the Property Configuration section, which is why the properties entry in the example above is intentionally left empty.
  • Default: Empty Array
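
To illustrate how a Resource and a Component might be tied together, the sketch below registers the sample file resource and gives the sample bolt a property of type resource that points at it. It assumes that resourceFramework takes the framework name and resourceName takes the resources.name value; this page lists both options but does not spell out that mapping. The property name input-file and its label are hypothetical.

```yaml
name: sample-framework

resources:
  - name: file-resource
    label: File Resource
    description: File Resource for Testing
    resourceClass: streamflow.resource.FileResource
    properties:
      - name: file-resource
        label: File Resource
        description: Use this resource to upload and save files
        required: true
        type: file

components:
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    properties:
      - name: input-file                        # hypothetical property name
        label: Input File
        description: File resource entry the bolt should read from
        required: true
        type: resource
        options:
          resourceFramework: sample-framework   # assumed to be the framework name
          resourceName: file-resource           # assumed to be the resources.name value
```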

Serialization Configuration

The serializations section is used to register classes within your framework that are required for serialization. The Storm implementation uses the classes configured in the serializations section to initialize the Kryo implementation used to serialize Tuple data between Spouts and Bolts. Any top-level type and serializer classes which would typically be registered in Storm using the topology.kryo.register property should be registered in this section.

Let's look at a snippet of the serializations section from the above example and walk through each property in detail.

serializations:
  - typeClass: streamflow.serializer.SampleType
    serializerClass: streamflow.serializer.SampleTypeSerializer

  - typeClass: streamflow.serializer.AnotherSampleType

serializations.typeClass

  • Description: The fully qualified class name of a class to register with Kryo.
  • Default: None

serializations.serializerClass

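  • Description: Fully qualified class name of the com.esotericsoftware.kryo.Serializer implementation to use for the class named in typeClass. This property can be omitted, as the AnotherSampleType entry above shows; in that case only the type class is registered, which in Storm's topology.kryo.register means the class is handled by Kryo's default field-based serializer.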
  • Default: None
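
Registered type classes can also be offered to users through the serialization property type listed in the Property Types section. The fragment below is a sketch of that pairing; the property name output-type, its label, and its description are hypothetical, while the keys and class names are taken from the sample configuration.

```yaml
serializations:
  # Type with a custom Kryo serializer
  - typeClass: streamflow.serializer.SampleType
    serializerClass: streamflow.serializer.SampleTypeSerializer
  # Type registered without a custom serializer
  - typeClass: streamflow.serializer.AnotherSampleType

components:
  - name: sample-bolt
    label: Sample Bolt
    type: storm-bolt
    description: Sample bolt used for demonstration purposes
    mainClass: streamflow.bolt.SampleBolt
    properties:
      - name: output-type          # hypothetical property name
        label: Output Type
        description: Registered type class the bolt should emit
        required: true
        type: serialization        # rendered as a select box of registered serializations
```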

Property Configuration

Properties are an important part of the framework configuration as they provide a mechanism for Components and Resources to accept dynamic configuration in the StreamFlow topology builder. Properties are defined using a specific configuration format which is used to dynamically generate HTML form elements in the topology builder. Most HTML5 input element types are supported, with some additional options included to provide validation of input types. Any properties which are registered in the configuration will be injected into the Spout or Bolt implementation using a javax.inject.Named annotation. Instance variables are initialized with the actual values of the properties at runtime through setter-based injection combined with the @Named annotation.

Let's look at a snippet of the properties section and walk through each option in detail.

properties: 
  - name: field-one
    label: Field One
    description: Description of field one
    defaultValue: 10
    required: true
    type: number
  - name: field-two
    label: Field Two
    description: Description of field two
    defaultValue: second
    required: false
    type: select
    options:
        listItems:
          - first
          - second
          - third

properties.name

  • Description: Unique identifier for the property in the context. Names must be unique within the same component or resource as the name is used by the Guice injection component as the @Named value.
  • Default: None

properties.label

  • Description: User friendly name that users will see when viewing the component or resource properties
  • Default: None

properties.description

  • Description: Short description which describes the purpose and usage of the property in the context of the component or resource.
  • Default: None

properties.defaultValue

  • Description: A value to use as default if the property is left unchanged.
  • Default: None

properties.required

  • Description: Flag to determine if the property is required or optional. The property is required if "true", otherwise optional
  • Default: false

properties.type

  • Description: Input type to use for the property. The type dictates which input element is generated by the StreamFlow UI when editing properties. Additional implementation details about each type can be found in the Property Types section.
  • Default: string

properties.options

  • Description: Options are used in the property configuration to provide additional validation or rendering information to accompany the type. Each type has a specific set of options that can be added to this section to aid in rendering HTML elements.
  • Default: None

Property Types

As seen in the previous Property Configuration section, properties can specify a data type and some additional options. This section will cover all of the property types that StreamFlow supports and provide example configurations for each property type.

Let's start with a table outlining each of the property types and we will follow that with a sample YAML configuration for each type.

| Type | HTML Form Element | Description | Options |
|---|---|---|---|
| Default | text box | Standard input text box with no format limitations | None |
| string | text box | Standard input text box with no format limitations | None |
| e-mail or email | HTML5 text box w/ email validation | Input text box that validates user input | None |
| url | HTML5 text box w/ URL validation | Input text box that validates user input | None |
| password | password text box | Standard password input text box | None |
| number | HTML5 numeric integer spinner | Numeric spinner with customization options | minNumber, maxNumber, step |
| float | HTML5 floating point spinner | Numeric spinner with fractional intervals | minNumber, maxNumber, step |
| boolean | check box | Standard check box | None |
| textarea | text area | Standard multi-line text area input | None |
| select | select box | Standard select with custom values | listItems |
| file | custom file uploader | Custom built file upload input with metadata feedback | None |
| date | HTML5 date input | HTML5 date input which simplifies date entry | dateFormat (mm, dd, yy tokens) |
| time | HTML5 time input | HTML5 time input which simplifies time entry | minuteStep |
| serialization | select box | Select box populated with list of registered serializations | None |
| resource | select box | Select box populated with resource entries for the specified resource | resourceFramework, resourceName |

The following code shows an example configuration for each of the above input types. Please use this as a reference when specifying the property type.

- name: string-test
  label: String Test
  type: string
  description: Creates a simple Input box with no validation
  defaultValue:

- name: email-test
  label: E-mail
  type: email        
  description: Creates an Input Box that Validates e-mails on HTML5 compatible Browsers
  defaultValue:

- name: url-test
  label: URL
  type: url
  description: Creates an Input Box that Validates Web Urls on HTML5 compatible Browsers
  defaultValue:

- name: password-test
  label: Password
  type: password          
  description: Creates an input with the value hidden for passwords.
  defaultValue:

- name: number-test
  label: Number
  type: number         
  description: Creates a numeric stepper to allow for easier integer entry
  defaultValue:
  options:
      numericStep: 5
      minNumber: 0
      maxNumber: 100
      displayUnits: seconds

- name: float-test
  label: Float
  type: float           
  description: Creates a JavaScript-based numeric stepper for floats.
  defaultValue:
  options:
      floatStep: 0.01
      minNumber: 0
      maxNumber: 100

- name: boolean-test
  label: Boolean
  type: boolean       
  description: Creates a checkbox for inputting Boolean values
  defaultValue: true

- name: area-test
  label: Area 
  type: text-area
  description: Creates a html text area for entering large blocks of text
  defaultValue:

- name: select-test
  label: Select
  type: select
  description: Creates a select input with items specified in the options
  defaultValue:
  options:
      listItems:
          - one
          - two
          - three
          - four

- name: file-test
  label: File 
  type: file
  description: Creates a custom file upload and file metadata display input
  defaultValue:

- name: date-test
  label: Date
  type: date          
  description: Creates a Date Picker for entering Month / Day / Year Dates
  defaultValue:
  options:
      dateFormat: mm/dd/yy

- name: time-test
  label: Time 
  type: time           
  description: Creates a Time input for Entering a current time (no date)
  defaultValue:
  options:
      minuteStep: 5

- name: serialization-test
  label: Serialization
  type: serialization           
  description: Creates a select input with the type classes from serializations
  defaultValue:

- name: resource-test
  label: Resource 
  type: resource          
  description: Creates a select input with resource entries of the specific type
  defaultValue:
  options:
      resourceFramework: resource-framework-name
      resourceName: resource-name