
Overview

The Attivio Intelligence Engine (AIE) includes a set of default workflows that process content, queries, and responses. A workflow is an ordered collection of components that together implement a piece of business logic.

Use the AIE Administrator to modify workflows!

AIE workflows can be created, removed, modified and maintained directly from the AIE Administrator user interface using Dynamic Configuration.


Workflow Definition

A workflow that extracts entities, title-cases the extracted entities, and then classifies a document based on the discovered entities can be pictured as follows:

[Figure: sample workflow]

The above example is deliberately simple. Complex workflows can have many more stages and complex routing rules, and they can call into and be called by other workflows to build the business logic an AIE application needs.

Workflows are defined in AIE configuration files as follows:

<workflows>
  <workflow name="<workflow-name>" type="<workflow-type>">
    <description>workflow description goes here</description>
    <documentTransformer name="<transformer-name>"/>
    ...
    <subflow name="<subflow-name>"/>
    ...
  </workflow>
  ...
</workflows>
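
As an illustration, the entity-extraction workflow described at the start of this section might be filled into the template above roughly as follows. This is only a sketch: the workflow name and all three component names are hypothetical, and each would have to match a component defined elsewhere in the configuration.

<workflows>
  <workflow name="entityClassification" type="ingest">
    <description>Extracts entities, title-cases them, then classifies the document</description>
    <!-- Hypothetical component names, shown for illustration only -->
    <documentTransformer name="extractEntities"/>
    <documentTransformer name="titleCaseEntities"/>
    <documentTransformer name="classifyByEntities"/>
  </workflow>
</workflows>

The components run in the order listed, so the output of extractEntities feeds titleCaseEntities, and so on.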

Workflow Elements

Elements and sub-elements for workflows are related as follows:

Element

Sub-element

Description

<workflows>

<workflow>

The <workflows> element contains one or more <workflow> elements. The sub-elements of each <workflow> are a series of data-processing components that share the following characteristics:

  • Are instantiated and connected at runtime.
  • Define a chain of components where the output of one component feeds the input of the next (the input of a workflow is the input of the first component defined in it).

Each <workflow> element takes the following attributes:

  • name attribute (required) - defines a namespace for components
  • type attribute (required) - defines the type of workflow. Valid values are ingest, query, and response.
  • override attribute (optional) - if true, indicates that this workflow replaces an existing workflow of the same name

<workflow>

<description>

Describe what the workflow does here. The AIE Administrator will show the description when hovering over the workflow name.

 

<documentTransformer>

Component that returns a transformed version of its input and generates status messages. During execution, AIE automatically forwards status messages from the documentTransformer to a service designated as the documentResultHandler. Typically, documentTransformers process inputs asynchronously. As a result, the message originator in the workflow does not need to wait for workflow results. May only appear in ingest workflows.

 

<queryTransformer>

Component that returns a transformed version of its input but no status messages. Because queryTransformers are not expected to provide informational status messages, use them with synchronous clients only. May only appear in query workflows.

 

<responseTransformer>

Component that returns a transformed version of its input but no status messages. Because responseTransformers are not expected to provide informational status messages, use them with synchronous clients only. May only appear in response workflows.

 

<stage>

Generic component that is capable of processing any type of message. Can have arbitrary side effects. May appear in any workflow type.

 

<subflow>

Component used to transfer processing to another workflow. The output of the previous component is transferred to the input of the subflow. When subflow processing completes, output is routed back to the next component in the calling workflow. If subflow is the last part of the workflow, its output becomes the workflow output.

 

<splitter>

Component that routes messages to one or more subflows based on message content. Splitters are used to set up conditional (branching) logic inside a workflow. The most common splitter is SplitDocumentListByFieldValue, which sends documents to different workflows based on field values in the document. For example, the configuration below routes documents to different workflows based on the value of the 'category' field, then rejoins all of the documents after the conditional workflows finish so that they continue to the next stage, log2. If the rejoin property were set to false, documents sent to the other workflows would not be routed back to log2 when those workflows ended, and processing of those documents would stop there. See the Workflow Routing section of the Architecture guide for more information.

<components>
  <component name="categorySplitter"
             class="com.attivio.platform.transformer.ingest.routing.SplitDocumentListByFieldValue">
    <description>splits documents by the category field</description>
    <properties>
      <property name="input" value="category" />
      <property name="rejoin" value="true" />
      <workflowQueueMap name="workflowMap">
        <if value="news" workflow="newsWorkflow" />
        <if value="sports" workflow="sportsWorkflow" />
        <if value="opinion" workflow="opinionWorkflow" />
        <else workflow="otherWorkflow" />
      </workflowQueueMap>
    </properties>
  </component>
</components>

<workflows>
  <workflow name="logAndSplit" type="ingest">
    <description>log and split based on category</description>
    <documentTransformer name="log" />
    <splitter name="categorySplitter" />
    <documentTransformer name="log2" />
  </workflow>
</workflows>
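
The <subflow> element described in the table above can be sketched in the same way. In this minimal example the workflow names cleanup and mainIngest are hypothetical, and the log and log2 component names are simply reused from the splitter example. Documents entering mainIngest pass through log, are handed to the cleanup workflow, and then return to mainIngest to continue with log2.

<workflows>
  <!-- Hypothetical workflow that is called as a subflow -->
  <workflow name="cleanup" type="ingest">
    <documentTransformer name="log" />
  </workflow>

  <!-- Hypothetical calling workflow -->
  <workflow name="mainIngest" type="ingest">
    <documentTransformer name="log" />
    <subflow name="cleanup" />
    <documentTransformer name="log2" />
  </workflow>
</workflows>

If the <subflow> element were the last entry in mainIngest, the output of cleanup would become the output of mainIngest.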

Disabling a Stage within the Workflow

All <workflow> sub-elements support the enabled attribute. If it is set to false, the component still appears at runtime but passes every input to the next stage unchanged.
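
For instance, the first logging stage of the logAndSplit workflow shown earlier on this page could be switched off like this. This is a sketch only; the type value is an assumption based on the rule that documentTransformers appear only in ingest workflows.

<workflow name="logAndSplit" type="ingest">
  <!-- Disabled: documents pass straight through to categorySplitter -->
  <documentTransformer name="log" enabled="false" />
  <splitter name="categorySplitter" />
  <documentTransformer name="log2" />
</workflow>

Setting the attribute back to true, or removing it, restores the stage without any other change to the workflow.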

Overriding Component Properties within the Workflow

A component instantiation may override properties defined in the component definition. In the example below, the queryTransformer queryFacetFinder has its searchWorkflow property overridden.

<workflow name="myTestWorkflow" type="query">
  <queryTransformer name="log"/>
  <queryTransformer name="queryFacetFinder">
    <property name="searchWorkflow" value="searcher"/>
  </queryTransformer>
</workflow>

Overriding a Workflow

A previous workflow definition can be replaced by using the override attribute on the newer workflow definition. This can be useful for disabling or changing default workflows. The example below effectively disables the default AIE linguistics processing workflow by replacing it with an empty workflow.

<workflow name="attivioLinguistics" override="true"/>
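
An override does not have to be empty. The sketch below replaces the same workflow with a single custom step instead of disabling it. The component name myLinguisticsStep is hypothetical, and the ingest type is an assumption about how attivioLinguistics is defined in the base install.

<workflow name="attivioLinguistics" type="ingest" override="true">
  <description>replacement linguistics workflow</description>
  <!-- Hypothetical component; it must be defined elsewhere in the configuration -->
  <documentTransformer name="myLinguisticsStep"/>
</workflow>

Because the workflow name is unchanged, any workflow that calls attivioLinguistics as a subflow would pick up the replacement without being edited.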

Standard Workflows

The following table lists the preconfigured top-level workflows included in the AIE base install. Component definitions can be found in the <attivio-install>\conf\core-app\attivio-components.xml file. In most cases, each of these main workflows is made up of small subflows.

Workflow

Type

Description

ingest

Ingest

Accepts and prepares text content for indexing.

xmlIngest

Ingest

Accepts and prepares XML content for indexing.

textFileIngest

Ingest

Accepts and prepares raw text files for indexing. Defaults to UTF-8 encoding unless overridden by the document's encoding field value or the transformer's defaultEncoding property. In these cases you can use any encoding supported by Java SE 7.

rowsetIngest

Ingest

Accepts row-set XML formatted data.

fileIngest

Ingest

Accepts and prepares binary file content for Advanced Text Extraction and indexing.

xmlScopesIngest

Ingest

Accepts and prepares XML content for indexing and subsequent scope search.

search

Query

Default workflow for searching content. This is the default workflow for the SimpleSearch UI.

admin

Admin

Container for all service components such as connectors, receivers, etc.

Viewing Workflows

The AIE Administrator displays information for all project activities, including connected workflows. The interface displays one project at a time, based on the specified server and port and the associated instance of AIE.
