
Combine


Combine processes data in a number of useful ways. Combine can be used to:
• Restore hierarchies of data flattened by the SPLIT component
• Create a single output record by joining multiple input streams
• Denormalize vectors (including nested vectors)

COMBINE does not use transform functions. It determines what operations to perform on input
data by using DML that contains special introspection methods. This DML is generated for
COMBINE's input ports by the split_dml command-line utility.

COMBINE performs the inverse operations of the SPLIT component. It has a single output port
and a counted number of input ports. COMBINE (optionally) denormalizes each input data
stream, and then performs an outer join on the input records to form the output records.
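The outer-join step described above can be illustrated outside Ab Initio. The following is a minimal Python sketch, not COMBINE itself: it assumes every input stream carries a shared key field and at most one record per key, and the function name `outer_join`, the field `atm_count`, and the sample records are all hypothetical.

```python
from collections import defaultdict


def outer_join(streams, key):
    """Full outer join of several record streams on a shared key field.

    Each stream is a list of dicts. A key that is missing from one stream
    still produces an output record; the fields that stream would have
    supplied are simply absent.
    """
    merged = defaultdict(dict)
    for stream in streams:
        for record in stream:
            # Fold this record's fields into the output record for its key.
            merged[record[key]].update(record)
    # One output record per distinct key, in sorted key order.
    return [merged[k] for k in sorted(merged)]


# Hypothetical sample inputs, invented for illustration only.
managers = [{"region": "East", "mgr": "A"}, {"region": "West", "mgr": "B"}]
totals = [{"region": "East", "atm_count": 3}, {"region": "North", "atm_count": 1}]

joined = outer_join([managers, totals], "region")
for row in joined:
    print(row)
```

Because the join is an outer join, "North" and "West" each appear in the result even though only one of the two inputs contains them.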

Properties of Combine

count: The number of input ports.
reject-threshold: The component's tolerance for reject events.
logging: Specifies whether or not to log component events.

Sample Graph

Example

Consider a file with the following DML:

record
  string("|") region;
  string("|") state;
  string("|") county;
  string("|") addr_line1;
  string("|") atm_id;
  string("\n") mgr;
  string('\0') DML_assignments() =
    'region=region,state=states.state,county=states.counties.county,addr_line1=states.counties.location.addr_line1,atm_id=states.counties.atms.atm_id,mgr=mgr';
  string('\0') DML_key_specifiers() =
    '{region}=,{state}=states[],{county}=states.counties[]';
end
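To make the two introspection strings in this record easier to read, the Python sketch below splits them into lookup tables. It is only an aid for inspecting the strings: the helper names are hypothetical, and the reading of each entry (flat field to nested path, key field to the vector it groups) is inferred from this example rather than taken from the COMBINE documentation.

```python
import re

# Strings copied from the DML above, with the wrapped lines rejoined.
ASSIGNMENTS = (
    "region=region,state=states.state,county=states.counties.county,"
    "addr_line1=states.counties.location.addr_line1,"
    "atm_id=states.counties.atms.atm_id,mgr=mgr"
)
KEY_SPECIFIERS = "{region}=,{state}=states[],{county}=states.counties[]"


def parse_assignments(text):
    """Map each flat input field to its dotted path in the nested record."""
    return dict(pair.split("=", 1) for pair in text.split(","))


def parse_key_specifiers(text):
    """Map each key field to the vector path written after it.

    An empty string means nothing follows the '=' (assumed here to mean
    the top level of the record).
    """
    return dict(re.findall(r"\{([^}]*)\}=([^,]*)", text))


paths = parse_assignments(ASSIGNMENTS)
keys = parse_key_specifiers(KEY_SPECIFIERS)

for field, path in paths.items():
    print(f"{field:12} -> {path}")
print(keys)
```

Read this way, each of the three key fields corresponds to one nesting level: region at the top, state inside `states[]`, and county inside `states.counties[]`.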

Suppose the goal is to roll up the fields that are marked as sort keys (region, state, and county)
into nested vectors. A single COMBINE component can do this, instead of a series of three
rollup actions.
The desired output format is:

record
  string("|") region;
  record
    string("|") state;
    record
      string("|") county;
      record
        string("|") addr_line1;
      end location;
      record
        string("|") atm_id;
      end[decimal(2)] atms;
    end[decimal(2)] counties;
  end[decimal(2)] states;
  string("\n") mgr;
end
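The reshaping from the flat input record to this nested format can be sketched in Python. This is an illustration of the grouping logic only, not how COMBINE is implemented. It assumes that mgr is constant within a region and addr_line1 within a county (the first value seen is kept), and the function name `nest` and all sample rows are hypothetical.

```python
def nest(rows):
    """Group flat rows into region -> states -> counties -> atms."""
    regions = {}
    for r in rows:
        # setdefault creates each level the first time its key is seen.
        region = regions.setdefault(
            r["region"],
            {"region": r["region"], "states": {}, "mgr": r["mgr"]},
        )
        state = region["states"].setdefault(
            r["state"], {"state": r["state"], "counties": {}}
        )
        county = state["counties"].setdefault(
            r["county"],
            {
                "county": r["county"],
                "location": {"addr_line1": r["addr_line1"]},
                "atms": [],
            },
        )
        county["atms"].append({"atm_id": r["atm_id"]})

    # Turn the helper dicts into lists, matching the vectors in the format.
    result = []
    for region in regions.values():
        states = [
            {"state": s["state"], "counties": list(s["counties"].values())}
            for s in region["states"].values()
        ]
        result.append(
            {"region": region["region"], "states": states, "mgr": region["mgr"]}
        )
    return result


# Hypothetical sample rows, invented for illustration only.
rows = [
    {"region": "East", "state": "NY", "county": "Kings",
     "addr_line1": "1 Main St", "atm_id": "A1", "mgr": "Lee"},
    {"region": "East", "state": "NY", "county": "Kings",
     "addr_line1": "1 Main St", "atm_id": "A2", "mgr": "Lee"},
    {"region": "East", "state": "NJ", "county": "Essex",
     "addr_line1": "9 Oak Ave", "atm_id": "B1", "mgr": "Lee"},
]

nested = nest(rows)
print(nested)
```

All three levels are built in a single pass over the input, which mirrors the point made above that one component can replace three successive rollups.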

Input data:

Output data:
