Combine
COMBINE processes data in several ways. It can be used to:
Restore hierarchies of data flattened by the SPLIT component
Create a single output record by joining multiple input streams
Denormalize vectors (including nested vectors)
COMBINE does not use transform functions. It determines what operations to perform on input
data by using DML that contains special introspection methods. This DML is generated for
COMBINE's input ports by the split_dml command-line utility.
COMBINE performs the inverse operations of the SPLIT component. It has a single output port
and a counted number of input ports. COMBINE (optionally) denormalizes each input data
stream, and then performs an outer join on the input records to form the output records.
Properties of Combine
Example
record
string("|") region;
string("|") state;
string("|") county;
string("|") addr_line1;
string("|") atm_id;
string("\n") mgr;
string('\0') DML_assignments() =
'region=region,state=states.state,county=states.counties.county,addr_line1=states.counties.location.addr_line1,atm_id=states.counties.atms.atm_id,mgr=mgr';
string('\0') DML_key_specifiers() = '{region}=,{state}=states[],{county}=states.counties[]';
end
The goal is to roll up the fields marked as sort keys (region, state, and county)
in the record format above into nested vectors. A single COMBINE component can do
this in place of a series of three rollup actions.
The desired output format is:
record
string("|") region;
record
string("|") state;
record
string("|") county;
record
string("|") addr_line1;
end location;
record
string("|") atm_id;
end[decimal(2)] atms;
end[decimal(2)] counties;
end[decimal(2)] states;
string("\n") mgr;
end
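To make the nesting concrete, here is a minimal Python sketch of the same roll-up, written outside Ab Initio. It is not how COMBINE is implemented. It assumes the flat rows arrive already grouped by region, state, and county, and that the non-vector fields (mgr and addr_line1) can be taken from the first row of each group. The source does not state either point. The function name roll_up and the sample values are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def roll_up(flat_rows):
    """Nest flat rows into one record per region (illustrative only).

    Assumes rows are already grouped by region, then state, then county.
    """
    nested = []
    for region, region_group in groupby(flat_rows, key=itemgetter("region")):
        region_rows = list(region_group)
        states = []
        for state, state_group in groupby(region_rows, key=itemgetter("state")):
            counties = []
            for county, county_group in groupby(state_group, key=itemgetter("county")):
                county_rows = list(county_group)
                counties.append({
                    "county": county,
                    # Assumption: one location per county, taken from the first row.
                    "location": {"addr_line1": county_rows[0]["addr_line1"]},
                    # One vector element per flat row in the county.
                    "atms": [{"atm_id": row["atm_id"]} for row in county_rows],
                })
            states.append({"state": state, "counties": counties})
        # Assumption: mgr is constant within a region, taken from the first row.
        nested.append({"region": region, "states": states, "mgr": region_rows[0]["mgr"]})
    return nested


# Made-up sample rows, not taken from the original input data.
sample_rows = [
    {"region": "East", "state": "NJ", "county": "Essex",
     "addr_line1": "9 Oak Ave", "atm_id": "B1", "mgr": "Lee"},
    {"region": "East", "state": "NY", "county": "Kings",
     "addr_line1": "1 Main St", "atm_id": "A1", "mgr": "Lee"},
    {"region": "East", "state": "NY", "county": "Kings",
     "addr_line1": "1 Main St", "atm_id": "A2", "mgr": "Lee"},
]
nested_records = roll_up(sample_rows)
```

Each level of grouping corresponds to one entry in DML_key_specifiers: region keys the top-level record, state keys the states vector, and county keys the states.counties vector.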
Input data:
Output data: