Introduction
A flow element is an atomic processing component of a Pipeline. It takes an input in the form of evidence and outputs results in the form of element data. A flow element can be seen as a black box where the internal method of processing and the way in which it is used externally are decoupled such that any element can be used in the same manner, regardless of its input, output, or method of processing. These are the building blocks of a Pipeline and do all the processing as instructed by the Pipeline they reside in.
Creation
Flow elements are built using a corresponding element builder, which follows the fluent builder pattern. All configuration of an element occurs in the element builder. By convention, the configuration of an element is immutable once it has been built. However, this is not enforced and is dependent on the implementation of each specific element.
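As an illustration of the fluent builder pattern described above, here is a minimal Python sketch. The names UserAgeElement, UserAgeElementBuilder, set_evidence_key, set_max_age and the "query.dob" key are hypothetical and are not taken from any particular implementation. The frozen dataclass is one possible way to make the built configuration immutable; as noted above, an implementation is not obliged to enforce this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserAgeElement:
    """Hypothetical flow element; frozen so its configuration cannot change after build."""
    evidence_key: str
    max_age: int


class UserAgeElementBuilder:
    """Hypothetical fluent builder: every setter returns the builder itself."""

    def __init__(self) -> None:
        # Assumed defaults, used when a setter is never called.
        self._evidence_key = "query.dob"
        self._max_age = 150

    def set_evidence_key(self, key: str) -> "UserAgeElementBuilder":
        self._evidence_key = key
        return self

    def set_max_age(self, max_age: int) -> "UserAgeElementBuilder":
        self._max_age = max_age
        return self

    def build(self) -> UserAgeElement:
        # All configuration is fixed at this point.
        return UserAgeElement(self._evidence_key, self._max_age)


element = (
    UserAgeElementBuilder()
    .set_evidence_key("cookie.dob")
    .set_max_age(120)
    .build()
)
```

Attempting to assign to element.max_age after build raises an error in this sketch, which is how the immutability convention is enforced here.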
Processing
The primary function of a flow element is to process data. Both the input (evidence) and output (element data) of the processing are contained in a single place called a flow data.
The flow element typically uses the evidence contained within the supplied flow data to determine the values it will populate in the resulting element data, which is then added to the flow data as an output.
However, flow elements may also use existing element data from the flow data as input and are not required to populate any output data.
For example, a 'user age' element might look for a date of birth in the evidence, set the age of the user in an element data instance, and then add that element data to the same flow data that the evidence came from.
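The 'user age' example could be sketched in Python as follows. This is only an illustration under stated assumptions: the FlowData class, the process method, the "query.dob" evidence key, and the use of a plain dictionary as element data are all hypothetical. The optional today parameter exists only to make the example deterministic.

```python
from datetime import date
from typing import Any, Dict, Optional


class FlowData:
    """Hypothetical container holding evidence (input) and element data (output)."""

    def __init__(self, evidence: Dict[str, Any]) -> None:
        self.evidence = dict(evidence)
        self.element_data: Dict[str, Dict[str, Any]] = {}


class UserAgeElement:
    data_key = "user age"  # assumed element data key name

    def process(self, flow_data: FlowData, today: Optional[date] = None) -> None:
        dob_text = flow_data.evidence.get("query.dob")
        if dob_text is None:
            return  # an element is not required to populate any output
        dob = date.fromisoformat(dob_text)
        today = today or date.today()
        # Subtract one year if this year's birthday has not happened yet.
        had_birthday = (today.month, today.day) >= (dob.month, dob.day)
        age = today.year - dob.year - (0 if had_birthday else 1)
        # The output goes back into the same flow data the evidence came from.
        flow_data.element_data[self.data_key] = {"age": age}


flow_data = FlowData({"query.dob": "1990-06-15"})
UserAgeElement().process(flow_data, today=date(2024, 1, 1))
print(flow_data.element_data["user age"]["age"])  # 33
```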
Hierarchy
While an implementation can implement only the basic flow element, useful functionality is built up in layers as shown below. Any of these layers can be added to by an implementation, depending on its requirements.
In languages which support inheritance, this is a structural hierarchy. In other languages, this may be more of a conceptual hierarchy, and not reflected directly in the code.
Properties
The element data produced by an element contains values of properties based on the evidence provided. Each element has a set of properties it can populate values for.
The properties populated by an element can be queried directly to retrieve metadata relating to each property. The data available will vary by implementation but will typically include information such as the property name and data type.
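A minimal Python sketch of querying property metadata might look like this. The PropertyMetaData class, its fields, and the "age" and "isadult" property names are assumptions made for illustration; the metadata actually available depends on the implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PropertyMetaData:
    """Hypothetical metadata record describing one property of an element."""
    name: str
    value_type: type


class UserAgeElement:
    @property
    def properties(self) -> List[PropertyMetaData]:
        # The set of properties this element can populate values for.
        return [
            PropertyMetaData("age", int),
            PropertyMetaData("isadult", bool),
        ]


for prop in UserAgeElement().properties:
    print(prop.name, prop.value_type.__name__)
```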
Evidence Keys
Each element can only make use of certain items of evidence during processing. The 'user age' element in the example above expects a date of birth to be present in the evidence.
The items of evidence which an element can make use of are exposed via an evidence key filter. This is also available in an aggregated form from the parent Pipeline, which is equivalent to combining the evidence key filters from each element of the Pipeline.
Using an evidence key filter means that instead of asking an element 'which items of evidence do you want?', one would ask 'do you want this item of evidence?'. This gives an element a greater degree of flexibility in how it specifies the evidence that it accepts. For example, it allows an element to easily indicate that it can make use of any HTTP headers, regardless of the header name.
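The 'do you want this item of evidence?' approach can be sketched as a predicate over evidence key names. In this Python sketch the EvidenceKeyFilter class, its include method, the aggregate helper, and the "header." and "query." key prefixes are all hypothetical.

```python
from typing import Callable, Iterable


class EvidenceKeyFilter:
    """Hypothetical filter that answers 'do you want this item of evidence?'."""

    def __init__(self, predicate: Callable[[str], bool]) -> None:
        self._predicate = predicate

    def include(self, key: str) -> bool:
        return self._predicate(key)


def aggregate(filters: Iterable[EvidenceKeyFilter]) -> EvidenceKeyFilter:
    """Combine element filters into one Pipeline-level filter."""
    filters = list(filters)
    # A key is wanted by the Pipeline if any of its elements wants it.
    return EvidenceKeyFilter(lambda key: any(f.include(key) for f in filters))


# Accepts any HTTP header without listing header names up front.
header_filter = EvidenceKeyFilter(lambda key: key.startswith("header."))
# Accepts exactly one named item of evidence.
dob_filter = EvidenceKeyFilter(lambda key: key == "query.dob")

pipeline_filter = aggregate([header_filter, dob_filter])
print(pipeline_filter.include("header.user-agent"))  # True
print(pipeline_filter.include("cookie.session"))     # False
```

A fixed list of accepted keys could not express the header filter above, which is the flexibility the predicate form provides.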
Data Keys
Results of an element's processing are stored in the flow data, keyed on the element's element data key. While not required, it is convention that each element has a unique key name. For example, our 'user age' example would likely have the key name 'user age'.
In addition to the name, an element data key also contains the type of element data that the element populates. Note that this is only the case in languages which support this.
Creating Data
When an element adds element data to a flow data, it cannot be assumed that an element data does not already exist for the element. For this reason, an element contains an 'element data factory' which it gives to the flow data when it asks for a new or existing element data. A method is called on the flow data, giving the factory as an argument, and the flow data returns either the element data previously created with the same key, or a new element data from the factory which it has added to its internal structure.
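The 'new or existing' behavior can be sketched in Python as shown here. The names ElementDataKey, AgeData, FlowData and get_or_add are hypothetical, and the type check on the stored value is an added assumption rather than something the text above requires.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ElementDataKey(Generic[T]):
    """Hypothetical key: a name plus the type of element data stored under it."""
    name: str
    data_type: Type[T]


class AgeData:
    """Hypothetical element data for the 'user age' element."""

    def __init__(self) -> None:
        self.age: Optional[int] = None


class FlowData:
    def __init__(self) -> None:
        self._data: Dict[str, object] = {}

    def get_or_add(self, key: ElementDataKey[T], factory: Callable[[], T]) -> T:
        existing = self._data.get(key.name)
        if existing is None:
            # Nothing stored under this key yet: create it via the factory.
            existing = factory()
            self._data[key.name] = existing
        if not isinstance(existing, key.data_type):
            raise TypeError(
                f"'{key.name}' already holds a {type(existing).__name__}"
            )
        return existing


USER_AGE_KEY = ElementDataKey("user age", AgeData)

flow_data = FlowData()
first = flow_data.get_or_add(USER_AGE_KEY, AgeData)
first.age = 33
second = flow_data.get_or_add(USER_AGE_KEY, AgeData)
print(second is first, second.age)  # True 33
```

Here the factory is simply the AgeData constructor. The second call returns the instance created by the first, so values set earlier are preserved.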
Scope
By convention, an element's configuration is immutable once created, although this is not enforced.
An element can be added to any number of Pipelines. A Pipeline is merely an organizational layer which instructs elements to carry out processing on a flow data, so the element acts in isolation without needing to reference the Pipeline.
It is also possible for an element to be added more than once to the same Pipeline. For example, an element which opens a persistent connection to a database, then closes it at another point in the Pipeline would exist more than once in the same Pipeline. In this case, it is the responsibility of the element not to assume that the flow data it receives is a fresh instance, and to access it in a safe manner.
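The following Python sketch shows one element instance shared between two Pipelines and added twice to one of them. The Pipeline, FlowData and VisitCounterElement classes are hypothetical and deliberately minimal; the point is only that the element reuses any element data left by an earlier pass instead of assuming a fresh flow data.

```python
from typing import Any, Dict, List


class FlowData:
    """Hypothetical flow data holding element data keyed by name."""

    def __init__(self) -> None:
        self.element_data: Dict[str, Dict[str, Any]] = {}


class VisitCounterElement:
    data_key = "visit counter"  # assumed element data key name

    def process(self, flow_data: FlowData) -> None:
        # Do not assume a fresh flow data: reuse data from an earlier pass.
        data = flow_data.element_data.setdefault(self.data_key, {"visits": 0})
        data["visits"] += 1


class Pipeline:
    """Hypothetical Pipeline: it only tells each element to process in order."""

    def __init__(self, elements: List[Any]) -> None:
        self._elements = list(elements)

    def process(self, flow_data: FlowData) -> None:
        for element in self._elements:
            element.process(flow_data)


counter = VisitCounterElement()
twice = Pipeline([counter, counter])  # same element twice in one Pipeline
once = Pipeline([counter])            # same element in a second Pipeline

flow_data = FlowData()
twice.process(flow_data)
print(flow_data.element_data["visit counter"]["visits"])  # 2
```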
Thread-Safety
Flow elements are required to be thread-safe in languages that support multi-threaded operation. Multiple Pipelines may call on the same element to carry out processing simultaneously, so each element must be able to handle concurrent calls.
Flow elements also expose whether or not they will carry out concurrent operations, as the Pipeline needs to know this.
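One way to meet the thread-safety requirement is to guard shared state with a lock, as in this Python sketch. The RequestCountingElement class, the is_concurrent attribute name, and the use of a plain dictionary as flow data are assumptions for illustration only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class RequestCountingElement:
    # Hypothetical flag a Pipeline could inspect. This element does all of
    # its work on the calling thread, so it reports False.
    is_concurrent = False

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._processed = 0

    def process(self, flow_data: dict) -> None:
        # Shared state is only touched while holding the lock.
        with self._lock:
            self._processed += 1
        # Each call gets its own flow data, so this write needs no lock.
        flow_data["request count"] = {"seen": True}

    @property
    def processed(self) -> int:
        with self._lock:
            return self._processed


element = RequestCountingElement()
with ThreadPoolExecutor(max_workers=8) as pool:
    # Simulate many Pipelines calling the same element at once.
    list(pool.map(lambda _: element.process({}), range(1000)))
print(element.processed)  # 1000
```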