Extending Scumblr

Scott Behrens edited this page Oct 17, 2016 · 2 revisions

Scumblr can be extended in a variety of ways including adding new workflows and new search providers. These capabilities will be discussed in this section.

Workflow

Scumblr uses the Workflowable gem (www.github.com/netflix/workflowable) to allow defining flexible workflows for actioning results. This section will give a brief overview of workflows. For more information about setting up and defining workflows, see the Workflowable wiki.

In Scumblr, results can be assigned one or more workflows. This is done by adding a workflow flag from the Result view page (as discussed in the Workflow Flags section above). Once a workflow flag has been added, the result can be moved through the various phases from the result view page.

Workflow Concepts

This section will give a brief overview of the concepts used by the Workflowable gem.

Workflow: A process, generally with multiple, ordered stages

Stage: A state within the process

Initial stage: The state the workflow starts in

Action: A function that gets run as part of adding a workflow to an item or moving between stages

Before action: An action that gets run when transitioning into a specific stage

After action: An action that gets run when transitioning out of a specific stage

Global action: An action that gets run when moving between any two stages, including moving into the initial stage (i.e. when a result is first flagged)

Workflow Setup

In order to use workflows, we first need to create one. This can be done from the Workflowable admin page (available at /workflowable or from the Admin menu). Please see the workflowable wiki for detailed instructions on setting up a workflow.

Once the workflow has been set up from the Workflowable admin page, we also need to create a Workflow Flag that can be used in Scumblr. This can be done from the Flag admin page (in the Admin menu).

Workflow Actions

Workflowable allows defining custom actions that can be run when flagging a result with a workflow and/or when moving between stages. These actions are developed as classes and need to conform to the API defined by Workflowable (see the Workflowable wiki). Action classes should be stored in the lib/workflowable/actions folder.

Creating a Workflow Flag

You will need to have already created the workflow through the Workflowable admin page. Once this is complete, go to the Flag admin page and click "New Flag". Here you can choose a name for the workflow (which will be used in Scumblr), a description for the workflow, and add any subscribers who, if Scumblr is set up to send email, will receive notifications when a result is flagged for this workflow. You will additionally need to choose the workflow to associate with the flag by selecting it from the dropdown box. This box is populated based on the workflows created in the Workflowable admin page.

Once the flag has been created, it is ready to be used in Scumblr. See the "Workflow Flags" section above for instructions on how to assign and use workflows from the result view page.

Tasks

It is possible to define new tasks that are capable of syncing or analyzing results.

Base

The Base task is the most common task type. It is single threaded. If you are using it you will need to inherit from:

ScumblrTask::Base

Base Class Provider Methods

A search provider generally needs to implement five functions:

self.task_type_name: This function allows Scumblr to retrieve and present a readable name for the provider

self.task_category: This function allows Scumblr to determine the category (security, sync, etc.).

self.options: This function defines which options can/should be passed into the provider. These options are defined when creating the Task

initialize(query, options={}): This function is often overridden in order to retrieve an access token from the configuration or otherwise setup the task

run: This function performs the search and returns the results

Base Class Task Type Name Function

Parameters: None

Return Value: String

This simple function returns the task type name. Scumblr calls it whenever it needs the human-readable name for the provider. For example, ScumblrTask::CurlAnalyzer.task_type_name will return "Curl Analyzer".

Options Function

Parameters: None

Return Value: options (hash)

This function will return a hash that defines which options can/should be passed in when running a search. Options are defined when an individual search is created.

Options Hash

key: (symbol) A key used to identify the option value. Each option must have a unique key

value: (hash) This will contain information about the option:

value[:name] (string) The name of the option

value[:type] (symbol) The type of UI element to render for the Task. Examples include :choice, :string, :text, :boolean.

value[:description] (string) A description of the option

value[:required] (boolean) Is the option required

Example
    {
      :sync_type => {name: "Sync Type (Organization/User)",
                 description: "Should this task retrieve repos for an organization or for a user?",
                 required: false,
                 type: :choice,
                 default: :both,
                 choices: [:org, :user]},
      :owner => {name: "Organization/User",
                  description: "Specify the organization or user.",
                  required: true,
                  type: :string},
      :members => {name: "Import Organization Members' Repos",
                  description: "If syncing for an organization, should the task also import Repos owned by members of the organization.",
                  required: false,
                  type: :boolean},
      :scope_visibility => {name: "Repo Visibility",
                  description: "Should the task sync public repos, private repos, or both.",
                  required: true,
                  type: :choice,
                  default: :both,
                  choices: [:both, :public, :private]}
    }
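
To illustrate how a definition like the one above can be consumed, here is a minimal sketch that merges defaults into user-supplied values and reports missing or invalid options. The resolve_options helper and the sample definitions are hypothetical and written only for this illustration; this is not how Scumblr itself processes options.

```ruby
# Hypothetical helper (not part of Scumblr): applies defaults from an options
# definition hash and collects validation errors for the supplied values.
def resolve_options(definitions, supplied)
  resolved = {}
  errors = []

  definitions.each do |key, spec|
    # Fall back to the declared default when no value was supplied.
    value = supplied.fetch(key, spec[:default])

    if value.nil?
      errors << "#{spec[:name]} is required" if spec[:required]
      next
    end

    # For :choice options, the value must be one of the declared choices.
    if spec[:type] == :choice && !Array(spec[:choices]).include?(value)
      errors << "#{spec[:name]} must be one of: #{Array(spec[:choices]).join(', ')}"
      next
    end

    resolved[key] = value
  end

  [resolved, errors]
end

# Sample definitions in the same shape as the example above.
sample_definitions = {
  owner: { name: "Organization/User", required: true, type: :string },
  scope_visibility: { name: "Repo Visibility", required: true, type: :choice,
                      default: :both, choices: [:both, :public, :private] }
}

resolved, errors = resolve_options(sample_definitions, { owner: "Netflix" })
puts resolved[:scope_visibility] # the default (:both) is filled in
puts errors.empty?               # no errors for this input
```

Returning the errors alongside the resolved values, rather than raising, lets a caller report every problem at once.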

Initialize Function

Parameters: query (string), options (hash, optional)

Return Value: None

The default initializer assigns the options to @options and all results to @results. These values (@options, @results) can be accessed from the run function. If the initializer is overridden, the overriding function can call "super" to set up these values, or it can set up the appropriate values for the run function itself.

This function is often used to pull API keys or other credentials from the config file. If you follow the convention used by the out-of-the-box search providers, this data should be put in config/initializers/scumblr.rb and would be accessed using:

Rails.configuration.try(:<OPTION NAME>)
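
The following is a minimal sketch of an overridden initializer that calls super and then reads a credential. So that it runs outside a Rails application, it uses stand-ins: TaskSketch::Configuration plays the role of Rails.configuration, and TaskSketch::Base is a simplified imitation of ScumblrTask::Base. Both are assumptions made for illustration, as are the simple_search_token setting value and the TokenTask class.

```ruby
module TaskSketch
  # Stand-in for Rails.configuration (assumption). Like ActiveSupport's `try`,
  # it returns nil when the requested setting has not been defined.
  class Configuration
    def initialize(settings)
      @settings = settings
    end

    def try(name)
      @settings[name]
    end
  end

  class << self
    attr_accessor :configuration
  end
  self.configuration = Configuration.new(simple_search_token: "example-token")

  # Simplified imitation of ScumblrTask::Base (assumption): stores the
  # options and starts with an empty result list.
  class Base
    def initialize(query, options = {})
      @options = options
      @results = []
    end
  end

  class TokenTask < Base
    attr_reader :access_token, :options

    def initialize(query, options = {})
      super # bare super forwards query and options to Base
      @access_token = TaskSketch.configuration.try(:simple_search_token)
    end
  end
end

task = TaskSketch::TokenTask.new("netflix", { max_result: 10 })
puts task.access_token.nil? ? "token missing" : "token loaded"
```

In a real task the lookup would be Rails.configuration.try(...) as shown above; the stand-in only exists to keep the sketch runnable on its own.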

Run Function

Parameters: None

Return Value: results (array of results hashes)

The run function should either perform a sync and return any identified results, or perform a security check, which may create or modify existing results. For security tasks, you can use the @results object to work on existing results.

Options can be accessed using the same key as that used in the options hash. So, for example, if the options hash defined an option ":follow_redirects", this value would be retrieved using @options[:follow_redirects].

Results Hash

title: (string) The title of the result

url: (string) The url for the result

domain: (string) The result's domain

metadata: (hash) An (optional) hash of additional metadata to store with the result

Example
[
    {title: "Netflix/sketchy", url: "http://www.github.com/Netflix/Sketchy", domain: "www.github.com", metadata: {
        github_analyzer: {owner: "Netflix", private: false, account_type: "Organization"}}}
]

This example includes one result identified in GitHub. This result would be imported into Scumblr if it had not been identified previously.
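
As a small illustration of assembling a result in this shape, the sketch below derives the domain from the URL with Ruby's standard URI library. The build_result helper is hypothetical and not part of Scumblr.

```ruby
require "uri"

# Hypothetical helper (not part of Scumblr): builds a result hash with the
# title, url, domain and metadata keys described above.
def build_result(title, url, metadata = {})
  {
    title: title,
    url: url,
    domain: URI.parse(url).host, # e.g. "www.github.com"
    metadata: metadata
  }
end

sample_results = [
  build_result(
    "Netflix/sketchy",
    "http://www.github.com/Netflix/Sketchy",
    { github_analyzer: { owner: "Netflix", private: false, account_type: "Organization" } }
  )
]

puts sample_results.first[:domain]
```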

Sample Sync Task

Below is a simple sample of a search provider class. This class will not perform a sync, but should provide guidance on what a Sync Task class should look like:

class ScumblrTask::FakeSync < ScumblrTask::Base
  # Human-readable name shown for this task type
  def self.task_type_name
    "Fake Sync"
  end

  # Options that can be set when the task is created
  def self.options
    {
      :max_result=>{name: "Max Results", description: "The maximum number of results to retrieve", required: false}
    }
  end

  def initialize(query, options={})
    super
    # Pull the access token from the configuration
    @access_token = Rails.configuration.try(:simple_search_token)
  end

  def run
    if(@access_token.blank?)
      create_error("Unable to run Fake Sync. Please add an access token.")
      return
    end

    # RUN SEARCH HERE

    # Fake results
    @results =
    [
      {title: "Netflix", url: "http://www.netflix.com", domain: "www.netflix.com", metadata: {priority: 1}},
      {title: "Netflix Blog", url: "http://blog.netflix.com/page1", domain: "blog.netflix.com", metadata: {priority: 2}}
    ]

    return @results
  end
end

Sample Security Task

Documentation coming soon.

Sample Maintenance Task

Documentation coming soon.

Creating Events

You can create an event in any task type by calling create_event(event, level). By default, create_event sets the event level to "Error".
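
The sketch below shows the two calling styles implied by that signature. EventSketchTask is a stand-in written for this illustration: its create_event only collects events in memory, whereas the real method records them in Scumblr. The "Warn" level name and the event messages are assumptions.

```ruby
# Stand-in for a Scumblr task (assumption): create_event here only stores
# the event in an array so the calls can be inspected.
class EventSketchTask
  attr_reader :events

  def initialize
    @events = []
  end

  # Mirrors the documented signature: the level defaults to "Error".
  def create_event(event, level = "Error")
    @events << { event: event, level: level }
  end

  def run
    create_event("Unable to reach the API")           # uses the default level
    create_event("Rate limit nearly reached", "Warn") # "Warn" is an assumed level name
  end
end

task = EventSketchTask.new
task.run
task.events.each { |e| puts "#{e[:level]}: #{e[:event]}" }
```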

Stylizing your Data with Partials

Documentation coming soon.

Async

The Async class helps with automatic multithreading.

If you are using it you will need to inherit from:

ScumblrTask::Async

For an Async task, you must define a "perform_work" function that accepts one argument (the object to operate on) and set @results to the list of objects that perform_work should process. Optionally, you can set @workers to the number of worker threads.
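
A minimal sketch of that contract follows. AsyncSketch::Async is a simplified stand-in for ScumblrTask::Async, written with Ruby's Thread and Queue so the example runs on its own; the real class may schedule work differently, and the default of two workers is an assumption. TitleLengthTask and its data are hypothetical.

```ruby
module AsyncSketch
  # Simplified stand-in for ScumblrTask::Async (assumption): calls
  # perform_work for every item in @results using @workers threads.
  class Async
    def run
      queue = Queue.new
      @results.each { |item| queue << item }
      queue.close # once drained, pop returns nil and the workers stop

      threads = Array.new(@workers || 2) do
        Thread.new do
          while (item = queue.pop)
            perform_work(item)
          end
        end
      end
      threads.each(&:join)
    end
  end

  # Hypothetical task: records the title length of each result.
  class TitleLengthTask < Async
    attr_reader :lengths

    def initialize(results)
      @results = results # the list of objects to run perform_work on
      @workers = 3       # optional: number of worker threads
      @lengths = {}
      @lock = Mutex.new
    end

    # Called once per item; the mutex guards the shared hash.
    def perform_work(result)
      length = result[:title].length
      @lock.synchronize { @lengths[result[:url]] = length }
    end
  end
end

task = AsyncSketch::TitleLengthTask.new([
  { title: "Netflix", url: "http://www.netflix.com" },
  { title: "Netflix Blog", url: "http://blog.netflix.com/page1" }
])
task.run
puts task.lengths.size
```

Because perform_work may run on several threads at once, any state it shares across items should be protected, as the mutex does here.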

Async Documentation Coming Soon!