This chapter provides details on how to use operators in a mapping to transform data and how to use the Expression Builder to create expressions. Operators are listed alphabetically.
Some of the data flow operators described in this chapter require that you create expressions. An expression is a statement or clause that transforms data or specifies a restriction. These expressions are portions of SQL that are used inline as part of a SQL statement. Each expression belongs to a type that is determined by the role of the data flow operator. You can create expressions using Expression Builder, or by typing them into the expression field located in the operator or attribute property windows.
You can open the Expression Builder from the operator property windows of operators such as the Filter, Joiner, Splitter, and Aggregator.
You can open the Expression Builder from the attribute property windows of operators such as the Expression, Data Generator, and Constant.
To open the Expression Builder:
From the operator properties window or the attribute property window, click the ... button in the expression field. Figure 8-1 shows the attribute properties window.
Figure 8-1 Attribute Property Window
The Expression Builder displays as shown in Figure 8-2 .
Figure 8-2 Expression Builder Interface
Create an expression by:
Typing text into the expression field.
Dragging items from the Inputs and Transformations tabs on the left panel and dropping them into the Expression field on the right.
Double clicking on items from the Inputs and Transformations tabs on the left panel.
Clicking arithmetic operator buttons available under the Expression field.
The Expression Builder contains the following parts:
In the left panel, the navigation tree displays two tabs:
Inputs Tab: A list of input parameters.
Transformations Tab: A list of predefined functions and procedures located in the Oracle Transformation Library, the Global Shared Library, and a custom Transformation Library. See the Oracle Warehouse Builder Transformation Guide for more information.
Expression Field: At the top of the right panel is the Expression field. Use this field to type and edit expressions.
Arithmetic Operator Buttons: Below the Expression field are buttons for arithmetic operators. Use these buttons to build an expression without typing. The arithmetic operators available vary by the type of data flow operator that is active.
Others: A drop-down list of available SQL clauses that are appropriate for the active expression type.
Beginning in Oracle9i, the CASE function is recommended over the DECODE function because the CASE function generates both SQL and PL/SQL, while DECODE is limited to SQL. If you use the DECODE function in an expression, Warehouse Builder promotes it to CASE where appropriate during code generation. This enables you to deploy the DECODE functionality in all operating modes (such as set-based or row-based) and transparently across Oracle database releases (8.1, 9.0, and higher).
For example, Warehouse Builder converts the function
DECODE (T1.A, 1, 'ABC', 2, 'DEF', 3, 'GHI', 'JKL')
to the following:
CASE T1.A WHEN 1 THEN 'ABC' WHEN 2 THEN 'DEF' WHEN 3 THEN 'GHI' ELSE 'JKL' END
Validation Results Field: At the bottom of the right panel is the Validation Results field. After you click the Validate button to the right of this field, this field displays the validation results.
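The DECODE-to-CASE equivalence described above can be sketched as follows. This is an illustrative example only, not Warehouse Builder output: the table T1 and its data are hypothetical, and SQLite stands in for an Oracle database because both accept the CASE form.

```python
import sqlite3

# Hypothetical table T1 with a single numeric column A.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (A INTEGER)")
conn.executemany("INSERT INTO T1 (A) VALUES (?)", [(1,), (2,), (3,), (9,)])

# The CASE expression carries the same meaning as
# DECODE(T1.A, 1, 'ABC', 2, 'DEF', 3, 'GHI', 'JKL'):
rows = conn.execute("""
    SELECT A,
           CASE A WHEN 1 THEN 'ABC'
                  WHEN 2 THEN 'DEF'
                  WHEN 3 THEN 'GHI'
                  ELSE 'JKL'
           END AS decoded
    FROM T1
    ORDER BY A
""").fetchall()
print(rows)  # [(1, 'ABC'), (2, 'DEF'), (3, 'GHI'), (9, 'JKL')]
```

Each WHEN branch corresponds to one value/result pair of the DECODE call, and the ELSE branch to its default.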
Validate Button: Use this button to validate the current expression in the Expression Builder. Validation ensures that all mapping objects referred to by the expression have associated repository objects. The expressions you create with the Expression Builder are limited to the operator inputs and to any transformations available in a project. This limitation protects the expression from becoming invalid because of changes external to the operator. If the deployment database differs from the design repository, it may not accept the expression. In that case, the expression may validate successfully but still be incorrect against the deployment database; such errors can be found only at deployment time.
Aggregator Operator
The Aggregator operator calculates data aggregations, such as summations and averages, and provides an output row set with the aggregated data.
Because each Aggregator operator shares a GROUP BY and HAVING clause, each attribute in the output group has the same cardinality. The number of rows in the output row set is less than or equal to the number of input rows.
The Aggregator operator has one input group and one output group. Connecting the source to the input group produces the corresponding aggregated row set in the output group. The resulting output includes the column used by the GROUP BY or the HAVING clause.
The Aggregator operator contains the following properties:
Group By Clause: Defines how the incoming row set is grouped to return a single summary row for each group. An ordered list of attributes in the input group specifies how this grouping is performed. Adding an attribute in the GROUP BY clause defaults the aggregation expression to NONE.
Having Clause: A boolean condition restricting the groups of rows returned in the output group to those groups for which this condition is true. If this clause is not specified, all summary rows for all groups are returned in the output group.
Expression: Defines the aggregation functions to be performed on the attribute. For each ungrouped output attribute, select whether the aggregation expression should be a DISTINCT or ALL result. All is the default setting. For example,
ALL:
Select AVG(ALL sal) from emp;
DISTINCT:
Select AVG(DISTINCT sal) from emp;
A DISTINCT result removes all duplicate rows before the average is calculated, ensuring a correct result in the preceding example.
An ALL result returns an average value on all rows.
If no aggregation function is necessary, specify NONE for the function. Specifying NONE on the attribute aggregation automatically adds the attribute to the resulting GROUP BY function.
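The effect of the ALL and DISTINCT settings on an aggregation expression can be sketched as follows. This is an illustrative example, not Warehouse Builder output; SQLite stands in for Oracle here, and ALL is the default when no set quantifier is written.

```python
import sqlite3

# Hypothetical emp table with duplicate salary values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (sal REAL)")
conn.executemany("INSERT INTO emp VALUES (?)",
                 [(1000.0,), (1000.0,), (2000.0,)])

# AVG(sal) averages every row (the ALL behavior);
# AVG(DISTINCT sal) removes duplicate values first.
avg_all, avg_distinct = conn.execute(
    "SELECT AVG(sal), AVG(DISTINCT sal) FROM emp").fetchone()
print(avg_all, avg_distinct)
```

With the sample data, ALL averages three rows (about 1333.33), while DISTINCT averages the two distinct values 1000 and 2000 (1500.0), which is why the two settings can return different results.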
To use an Aggregator operator in a mapping:
Drop an Aggregator operator onto the Mapping Editor canvas.
On the canvas, connect source attributes to the input group of the Aggregator operator.
Right-click the Aggregator and select Edit.
Warehouse Builder displays the Operator Editor.
On the Output Attributes tab, select Add to add output attributes.
Click OK to close the Operator Editor.
From the attribute properties window, define expressions for each output attribute. For detailed instructions, see "Defining the Output Attributes from Aggregator Operators".
Define the Group By Clause for the operator. For detailed instructions, see "Defining the Output Attributes from Aggregator Operators".
Defining the Output Attributes from Aggregator Operators
To define expressions for output attributes:
In the Aggregator operator on the mapping canvas, right-click an output attribute and select Attribute Properties.
Click the … button to the right of the Expression property.
The Expression dialog displays as shown in Figure 8-3.
Figure 8-3 Expression Dialog
Select a Function, ALL or DISTINCT, and a parameter from the drop-down lists.
Click OK.
To define the Group By Clause:
In the Aggregator operator on the mapping canvas, right-click an output attribute and select Attribute Properties.
Click the … button to the right of the Group By Clause property.
The Group By Clause dialog displays as shown in Figure 8-4.
Figure 8-4 Group By Clause Dialog
Move the attributes from the Available Attributes list to the GROUP BY Attributes list.
Click OK.
Click the … button to define a HAVING clause using the Expression Builder.
Create an expression such as
sum(INGRP1.OR_TOTAL) > 10000
as shown in Figure 8-5.
Figure 8-5 Expression Builder Showing a Sum Statement
Map the attributes you edited from the output group of the aggregator operator to the attributes in the target.
Constant Operator
The Constant operator enables you to define constants. Constants are initialized at the beginning of the execution of the mapping. Constant values can be used anywhere in a mapping.
The Constant operator produces a single output group that can contain one or more constant attributes. For any defined constant data type, the output expression must be a valid SQL expression that returns a value of the same data type. For VARCHAR, CHAR, or VARCHAR2 data types, you must enclose constant string literals within single quotes, for example, 'my_string'.
The Constant operator contains the following property:
Expression: Defines the constant value represented by the attribute.
To use a constant operator in a mapping:
Drop a Constant operator onto the Mapping Editor canvas.
Right-click the operator and select Edit.
Warehouse Builder displays the Operator Editor.
On the Output Attributes tab, select Add to create an output attribute.
Click OK to close the Operator Editor.
On the operator in the mapping canvas, right-click an attribute and select Attribute Properties.
The Attributes properties window displays.
Enter an expression in the Expression field or click … to define an expression using the Expression Builder shown in Figure 8-6. The length, precision, and scale properties assigned to the attributes of the Constant operator must match the actual values returned by the expressions defined in the mapping.
Figure 8-6 Expression Builder Showing A Constant
Select OK.
Connect the output attribute to the appropriate target attribute.
Data Generator Operator
Use a Data Generator operator to provide information such as record number, system date, and sequence values.
Recommendation:
For PL/SQL mappings, use a Constant operator or Mapping Sequence operator instead of a Data Generator.
For mappings with flat file sources, the Data Generator operator also provides a place to enter constant information. For mappings with flat file sources and targets, the Data Generator operator connects the mapping to SQL*Loader to generate the data stored in the database record.
The following functions are available:
RECNUM
SYSDATE
SEQUENCE
Warehouse Builder can generate data by specifying only sequences, record numbers, system dates, and constants as field specifications. SQL*Loader inserts as many records as are specified by the LOAD keyword.
Setting a Column to the Data File Record Number
Use the RECNUM keyword to set an attribute to the number of the logical record from which that record was loaded. Records are counted sequentially from the beginning of the first data file, starting with record 1. RECNUM increments as each logical record is assembled; it increments even for records that are discarded, skipped, or rejected, as well as for those that are loaded. For example, if you use the option SKIP=10, the first record loaded has a RECNUM of 11.
Setting a Column to the Current Date
A column specified with SYSDATE gets the current system date, as defined by the SQL language SYSDATE function.
The target column must be of type CHAR or DATE. If the column is of type CHAR, then the date is loaded in the format dd-mon-yy; after the load, you can access it only in that format. If the system date is loaded into a DATE column, then you can access it in a variety of formats, including the time and the date. A new system date/time is used for each array of records inserted in a conventional path load and for each block of records loaded during a direct path load.
Setting a Column to a Unique Sequence Number
The SEQUENCE keyword ensures a unique value for a column. SEQUENCE increments for each record that is loaded or rejected. It does not increment for records that are discarded or skipped.
The combination of column name and the SEQUENCE function is a complete column specification. Table 8-1 lists the options available for sequence values.
Table 8-1 Sequence Value Options
If a record is rejected because of a format error or an Oracle error, the generated sequence numbers are not reshuffled to mask the rejection. For example, if four rows are assigned sequence numbers 10, 12, 14, and 16 in a column, and the row with 12 is rejected, the three rows inserted are numbered 10, 14, and 16, not 10, 12, and 14. The sequence of inserts is preserved despite data errors. When you correct the rejected data and reinsert it, you can manually set the columns to match the sequence.
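The sequence-gap behavior described above can be sketched in a few lines. This is a simulation under stated assumptions, not SQL*Loader code: the record statuses, start value, and increment are hypothetical.

```python
# Simulates SQL*Loader SEQUENCE numbering: the counter advances for records
# that are loaded OR rejected (leaving a gap for rejects), but not for
# records that are skipped or discarded.
def assign_sequence(records, start=10, step=2):
    """records: list of (data, status) pairs;
    status is one of 'loaded', 'rejected', 'skipped', 'discarded'."""
    seq = start
    loaded = []
    for data, status in records:
        if status in ("loaded", "rejected"):
            if status == "loaded":
                loaded.append((seq, data))
            seq += step  # advances even when the record is rejected
        # skipped and discarded records do not advance the counter
    return loaded

rows = assign_sequence([("a", "loaded"), ("b", "rejected"),
                        ("c", "loaded"), ("d", "loaded")])
print(rows)  # [(10, 'a'), (14, 'c'), (16, 'd')] -- 12 is the gap for 'b'
```

This reproduces the documented example: sequence numbers 10, 14, and 16 are inserted, and 12 remains as the gap left by the rejected record.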
Although the Data Generator operator has only one output group, it has predefined attributes corresponding to Record Number, System Date, and a typical Sequence. While modification of these attributes is not recommended, you can create new attributes. The Data Generator operator is only valid for a SQL*Loader mapping.
Note:
There can be only one Data Generator operator in a mapping.
The Data Generator attribute contains the following property:
Expression: Expression to use when you map this attribute. Make sure the value entered for the expression is valid SQL*Loader syntax.
To use a data generator in a mapping:
Drop a Data Generator operator onto the Mapping Editor canvas.
Right-click the operator and select Edit.
Warehouse Builder displays the Operator Editor.
Select the Output Attributes tab.
Warehouse Builder displays the pre-defined output attributes RECNUM, SYS_DATE, and SEQUENCE.
On the Output Attributes tab, define the properties and type an optional description for the predefined output attributes.
Click OK to close the Operator Editor.
On the operator in the mapping canvas, right-click the RECNUM attribute and select Attribute Properties.
Warehouse Builder displays the Attribute Properties window.
In the expression field, click the ... to open the Expression Builder and define an expression.
Repeat steps 6 and 7 for the SEQUENCE attribute.
Deduplicator Operator
The Deduplicator operator enables you to remove duplicate data from a source by placing a DISTINCT clause in the SELECT code represented by the mapping.
To remove duplicates:
Drop the Deduplicator operator onto the Mapping Editor canvas.
Connect the attributes from the source operator to the group input/output of the Deduplicator operator.
Connect the attributes from the Deduplicator operator group to the attributes of the target operator.
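The effect of the Deduplicator, a DISTINCT clause in the generated SELECT, can be sketched as follows. This is illustrative only: the src table is hypothetical, and SQLite stands in for the target database.

```python
import sqlite3

# Hypothetical source row set containing an exact duplicate row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (region TEXT, product TEXT)")
conn.executemany("INSERT INTO src VALUES (?, ?)",
                 [("EAST", "A"), ("EAST", "A"), ("WEST", "B")])

# The Deduplicator corresponds to SELECT DISTINCT over the connected columns:
rows = conn.execute(
    "SELECT DISTINCT region, product FROM src ORDER BY region").fetchall()
print(rows)  # [('EAST', 'A'), ('WEST', 'B')]
```

Only fully identical rows are collapsed; rows that differ in any connected column are all retained.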
Expression Operator
The Expression operator enables you to write SQL expressions that define non-procedural algorithms for one output parameter of the operator.
The expression text can contain combinations of input parameter names, variable names, and library functions. Use the Expression operator to transform the column value data of rows within a row set using SQL-type expressions, while preserving the cardinality of the input row set. To create these expressions, open the Attribute properties window for the output attribute and then open the Expression Builder.
The Expression operator contains only one input group and one output group. These are created automatically when you drop the operator onto the Mapping Editor canvas.
The output expressions for this operator cannot contain any aggregation functions. To use aggregation functions, use the Aggregator operator. See "Aggregator Operator".
Every output attribute in an Expression operator contains the following properties:
Data Type: The data type of the attribute.
Precision: The precision of the attribute, used for numeric type attributes only.
Scale: The scale of the attribute, used for numeric type attributes only.
Length: The length of the attributes, used for character type attributes only.
Expression: The expression template for the output attribute. For code generation, the input attributes are replaced by the input attribute names in the expression template.
To use an expression operator in a mapping:
Drop an Expression operator onto the Mapping Editor canvas.
Connect the appropriate source attributes to the Expression input group.
Warehouse Builder copies the input attributes into the operator.
Right-click the operator in the mapping canvas and select Edit.
Warehouse Builder displays the Operator Editor.
On the Output Attributes tab, select Add and specify the attribute name, data type, and other properties.
Click OK to close the Operator Editor.
From the Expression operator, right-click the output attribute and select the Attribute Properties.
Click the field to the right of the Expression property and enter a filter condition expression or click ... to open the Expression Builder and define an expression.
Close the Attribute Properties window.
Connect the Expression output attribute to the appropriate target attribute.
Filter Operator
You can conditionally filter out rows using the Filter operator.
The Filter operator filters data from a source to a target by placing a WHERE clause in the code represented by the mapping. You connect a source operator to the Filter operator, apply a filter condition, and send a subset of rows to the next operator.
A Filter operator has only one input/output group that can be connected to both a source and target row set. The resulting row set is a filtered subset of the source row set based on a boolean filter condition expression.
The Filter operator contains the following property:
Filter Condition: The boolean condition that determines which rows are sent to the output row set.
When you generate a mapping that includes a Filter operator, the Code Viewer displays filter condition expressions as a WHERE clause for set-based view mode. The filter input names in the original filter condition are replaced by actual column names from the source table, qualified by the source table alias.
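The translation of a filter condition into a WHERE clause can be sketched as follows. This is illustrative only: the orders table, its alias, and the condition on total are hypothetical stand-ins for a generated mapping, with SQLite standing in for Oracle.

```python
import sqlite3

# Hypothetical source table bound to the operator feeding the Filter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 50.0), (2, 150.0), (3, 200.0)])

# A filter condition such as INGRP1.TOTAL > 100 is emitted as a WHERE
# clause, with the filter input name replaced by the qualified column:
rows = conn.execute(
    "SELECT o.order_id, o.total FROM orders o WHERE o.total > 100").fetchall()
print(rows)  # [(2, 150.0), (3, 200.0)]
```

The output row set is the subset of source rows for which the boolean condition is true; no rows are transformed, only restricted.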
To use a filter operator in a mapping:
Drag and drop the Filter operator onto the Mapping Editor canvas.
Connect source attributes to the input attribute for the Filter operator.
Right-click the Filter operator header and select Operator Properties.
The Filter Properties window displays.
Click the field to the right of the Filter Condition property and enter a filter condition expression. Or click ... to define a filter condition using the Expression Builder as shown in Figure 8-7.
Figure 8-7 Expression Builder Showing a Filter Condition
Click OK in the Expression Builder and close the Filter Properties window.
Connect the Filter operator outputs to the input/output group in the target.
Joiner Operator
You can use the Joiner operator to join multiple row sets from different sources with different cardinalities, and produce a single output row set.
The Joiner operator uses a boolean condition that relates column values in each source row set to at least one other row set.
Note:
Operators placed between data sources and a Joiner can generate complex SQL or PL/SQL.
If the input row sets are related through foreign keys, that relationship is used to form a default join condition. You can use this default condition or you can modify it. If the sources are not related through foreign keys, then you must define a join condition.
If the default foreign keys result in duplicate WHERE clauses, the Joiner operator removes the duplicate clauses. This can happen if the join condition references several foreign keys. For example, if table T1 has a foreign key FK1 pointing to unique key UK1 in table T2, and table T2 has a foreign key FK2 pointing to unique key UK2 in T1, the join condition
T1.A = T2.A AND T1.B = T2.B /* All instances of FK1 -> UK1 are reduced to one WHERE clause */ AND T2.B = T1.B AND T2.C = T1.C /* All instances of FK2 -> UK2 are reduced to one WHERE clause */
is generated by the Joiner operator as
T1.A = T2.A AND T1.B = T2.B AND T2.C = T1.C
The Joiner operator contains the following properties:
Join Condition: The text expression template for the Join Condition. For code generation, the input attributes are replaced by the source columns. The expression is a valid SQL expression that can be used in a WHERE clause.
Note:
The join condition is defined in a PL/SQL context. For SAP sources, Warehouse Builder can generate ABAP code by interpreting the PL/SQL join condition in the ABAP context. ABAP can only join over defined foreign key relationships.
Attributes in the Joiner operator contain the following properties:
Data Type: The data type of the attribute.
Precision: The precision of the attribute, used for numeric type attributes only.
Scale: The scale of the attribute, used for numeric type attributes only.
Length: The length of the attributes, used for string type attributes only.
To use a Joiner operator in a mapping:
Drag and drop the Joiner operator onto the Mapping Editor canvas.
Connect an output group from the first source to the Joiner input group.
The output attributes are created with data types matching the corresponding input data types.
Connect a group from the second source operator to the INGRP2 group of the Joiner operator.
Figure 8-8 Joiner Operator in a Mapping
Right-click the Joiner operator header and select Operator Properties.
The Joiner properties window displays.
Enter a join condition in the Join Condition field or click … to define an expression using the Expression Builder.
Close the Joiner property window.
Joiner Restrictions
The join condition expression cannot contain aggregation functions, such as SUM; if it does, compile errors result when you deploy the generated code for the mapping. A Joiner can have an unlimited number of input groups but only one output group.
The order of input groups in a joiner is used as the join order. The major difference between an ANSI join and an Oracle join is that an ANSI join must clearly specify the join order; an Oracle join does not require it.
The filter condition is applied after the join. For example, consider the following join:
Input1.c ---+
Input2.c ---+---> Joiner
Input3.c ---+
with the following conditions:
Condition 1: Input1.c (+) = Input2.c (+)
Condition 2: Input2.c = Input3.c
Condition 3: Input1.c is null
The first two conditions are true joins while the third is a filter condition. If ANSI code is to be generated, Warehouse Builder interprets the statement as
select ... from Input1 full outer join Input2 on (Input1.c = Input2.c) join Input3 on (Input2.c = Input3.c) where Input1.c is not null;
Specifying a Full Outer Join
If your target warehouse is based on Oracle9i or a later version, the Warehouse Builder joiner also supports the full outer join. To specify a full outer join condition, you must place the (+) sign on both sides of a relational operator. For example,
T1.A (+) = T2.B (+)
The results of the full outer join are as follows:
Rows from sources T1 and T2 that satisfy the condition T1.A = T2.B.
Rows from source T1 that do not satisfy the condition. Columns corresponding with T2 are populated with nulls.
Rows from source T2 that do not satisfy the condition. Columns corresponding with T1 are populated with nulls.
Note:
The relational operator is not restricted to equality. You can also use other operators, such as >, <, !=, >=, and <=.
In the Oracle SQL syntax for a partial outer join, such as T1.A = T2.B (+), placing a (+) sign on both sides of the relational operator is invalid. However, Warehouse Builder translates any condition with the double (+) sign into ANSI SQL syntax. For example:
SELECT ... FROM T1 FULL OUTER JOIN T2 ON (T1.A = T2.B);
When using full outer join, keep in mind the following:
If you specify a full outer join condition for a non-Oracle9i target system type, you will receive a validation error and the code will not be generated.
The ANSI join syntax is generated only if you specify a full outer join condition in the joiner. Otherwise, the following Oracle proprietary join syntax is generated:
SELECT ... FROM T1, T2 WHERE T1.A = T2.B;
The input group order is used as the ANSI join order. If you are joining more than two input groups, the order of the input groups determines the join order. Unlike Oracle's proprietary join syntax, the ANSI join syntax requires you to explicitly specify the join order.
When you create a joiner, you must order the input groups exactly in the order you want to join them. For example, if you create three input groups in the order T1, T2, T3 and the join condition is
T1.A (+) = T2.A (+) and T2.A = T3.A
Warehouse Builder generates the following:
SELECT ... FROM T1 FULL OUTER JOIN T2 ON (T1.A=T2.A) JOIN T3 ON (T2.A=T3.A);
If you create the input groups in another order, such as T1, T3, T2, Warehouse Builder generates the following:
SELECT ... FROM T1 JOIN T3 ON (1=1) JOIN T2 ON (T1.A=T2.A and T2.A=T3.A);
When T1 and T3 are joined, no join condition is specified. Warehouse Builder fills in the condition 1=1 (essentially a boolean true), and the two conditions you specified are used to join T2.
You can specify both full outer join and join conditions in the same joiner. However, if both conditions are specified for the same sources, the stronger join type is used for generating code. For example, if you specify
T1.A(+) = T2.A(+) and T1.B = T2.B
Warehouse Builder generates a join statement instead of a full outer join, because T1.B = T2.B is stronger than the full outer join condition between T1 and T2.
You cannot specify a full outer join and a partial outer join condition in the same joiner. If you specify a full outer join, then you cannot specify a partial outer join anywhere in the join condition. For example,
T1.A (+) = T2.A (+) and T2.B = T3.B (+)
causes validation errors, and no code is generated.
Creating Full Outer Join Conditions
In an equijoin, key values from the two tables must match. In a full outer join, key values are matched and nulls are created in the resulting table for key values that cannot be matched. A left or a right outer join retains all rows in the specified table.
In Oracle8i, you create an outer join in SQL using the join condition variable (+):
SELECT ... FROM A, B WHERE A.key = B.key (+);
This example is a left outer join. Rows from table A are included in the joined result even though no rows from table B match them. To create a full outer join in Oracle8i, you must use multiple SQL statements.
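The multiple-statement workaround mentioned above can be sketched as the classic union of a left outer join with the unmatched rows of the other table. This is an illustrative emulation, not generated code: tables A and B are hypothetical, and SQLite stands in for an Oracle8i database that lacks FULL OUTER JOIN syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (k INTEGER, val TEXT);
CREATE TABLE B (k INTEGER, val TEXT);
INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
INSERT INTO B VALUES (2, 'b2'), (3, 'b3');
""")

# Emulate A FULL OUTER JOIN B: the left outer join covers matched rows and
# A-only rows; the second SELECT adds the B-only rows with NULLs for A.
rows = conn.execute("""
    SELECT A.k, A.val, B.val FROM A LEFT OUTER JOIN B ON A.k = B.k
    UNION ALL
    SELECT B.k, NULL, B.val FROM B WHERE B.k NOT IN (SELECT k FROM A)
    ORDER BY 1
""").fetchall()
print(rows)  # [(1, 'a1', None), (2, 'a2', 'b2'), (3, None, 'b3')]
```

The result shows all three row classes the documentation lists: matched rows, left-only rows padded with nulls, and right-only rows padded with nulls.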
The Expression Builder allows the following syntax for a full outer join:
TABLE1.COL1 (+) = TABLE2.COL2 (+)
This structure is not supported by Oracle8i. Later Oracle Database releases are ANSI SQL 1999 compliant, and the ANSI SQL 1999 standard includes syntax for performing full outer joins. The code generator translates the preceding expression into an ANSI SQL 1999 full outer join statement, similar to:
SELECT ... FROM table1 FULL OUTER JOIN table2 ON (table1.col1 = table2.col2)
Because the full outer join statement complies with ANSI SQL 1999, it is valid only if the generated code is deployed to an Oracle Database release that supports that syntax. Specifying a full outer join against an Oracle8i database results in a validation error.
A full outer join and a partial outer join can be used together in a single SQL statement, but only in an AND or an AND/OR condition. If a full outer join and a partial outer join are used in an OR condition, an unexpected AND condition results. For example,
SELECT ... FROM table1 FULL OUTER JOIN table2 ON (A = B or C = D)
is evaluated by Oracle Server as
A (+) = B (+) AND C = D
To use a full outer join in a mapping:
Follow steps one through four for adding a Joiner operator.
Enter a full outer join statement in the Join Condition field or click … to define an expression using the Expression Builder.
Close the Joiner operator property window.
Key Lookup Operator
Use the Key Lookup operator to lookup data from a table, view, cube, or dimension.
For example, you can use the Key Lookup operator when you define a mapping that loads a cube and when you define surrogate keys on the dimension. In this example, you create a Key Lookup operator that looks up the surrogate key in the dimension table and returns the corresponding original record to form the foreign key relationship.
The table, view, cube, or dimension is bound to the Key Lookup operator. You can have multiple Key Lookup operators in the same mapping.
The key that you look up can be any unique value. It does not need to be a primary or unique key, as defined in an RDBMS. The Key Lookup operator reads data from a lookup table using the key input you supply and finds the matching row. This operator returns a row for each input key.
The output of the Key Lookup operator corresponds to the columns in the lookup object. If multiple rows in the lookup table match the key inputs, the cardinality of the output differs from the input. This produces results inconsistent with the data flowing into the target operator and generates an error at runtime. To ensure that only a single lookup row is found for each key input row, use keys in your match condition.
You can use inbound reconciliation on Key Lookup outputs. Outbound reconciliation is disabled. See "Reconciling Operators and Repository Objects" for more information.
Each output attribute for the key lookup has a property called DEFAULT VALUE. The DEFAULT VALUE property is used instead of NULL in the outgoing row set if no value is found in the lookup table for an input value. The generated code uses the NVL function. The Key Lookup always results in an outer-join statement.
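The DEFAULT VALUE behavior described above can be sketched with an outer join plus a null-substitution function. This is illustrative only: the src and lookup tables and the 'UNKNOWN' default are hypothetical, and SQLite's COALESCE stands in for the NVL function the generated Oracle code uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src (cust_key INTEGER);
CREATE TABLE lookup (cust_key INTEGER, cust_name TEXT);
INSERT INTO src VALUES (1), (2);
INSERT INTO lookup VALUES (1, 'Acme');
""")

# The Key Lookup is generated as an outer join, so unmatched input keys
# still produce a row; COALESCE (Oracle: NVL) substitutes the attribute's
# DEFAULT VALUE when no lookup row is found.
rows = conn.execute("""
    SELECT s.cust_key, COALESCE(l.cust_name, 'UNKNOWN')
    FROM src s LEFT OUTER JOIN lookup l ON s.cust_key = l.cust_key
    ORDER BY s.cust_key
""").fetchall()
print(rows)  # [(1, 'Acme'), (2, 'UNKNOWN')]
```

Key 1 finds a lookup row and returns its value; key 2 finds none, so the default replaces the NULL produced by the outer join.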
When you validate this operator:
Warehouse Builder displays a warning if the condition does not use a complete unique key in the lookup table. If a unique key is not used, multiple rows may be retrieved for a single input row.
Warehouse Builder displays a warning if the condition contains an equal comparison between attributes of mismatched data types. Although the Oracle database performs implicit conversions, the results may cause a runtime error.
To use a Key Lookup operator in a mapping:
Drop a Key Lookup operator onto the Mapping Editor canvas.
Warehouse Builder displays the Add Mapping Key Lookup dialog.
Use the Add Mapping Key Lookup dialog to select one or more tables.
Warehouse Builder adds to the mapping canvas one Key Lookup operator for each table you select. For more information on using the Add Mapping Key Lookup dialog, see "Adding Bindable Operators".
Connect the source attributes to the input group of the Key Lookup operator.
Right-click the Key Lookup operator header and select Operator Properties.
Warehouse Builder displays the Key Lookup Properties window.
Click the field to the right of the Lookup Condition and click the ... button.
The Lookup Condition dialog displays as shown in Figure 8-9.
Figure 8-9 Lookup Condition Dialog
Select the lookup entity attribute from the Lookup Table Column or Key drop-down list shown in Figure 8-10.
Choose attributes to compare to the selected lookup table column.
For a non-composite key, select an input attribute or a key from the drop-down list. Click Add to List. The column or input pairs are added to the table at the bottom of the main dialog.
For a composite key, click Add to List. The Match Key Columns to Input for Key dialog displays. Select the key input attributes from the Input Attribute drop-down list. Click OK.
Figure 8-10 Match Key Columns to Input for Key Dialog
Click OK to close the Lookup Condition dialog.
Mapping Advanced Queue Operator
Mapping Advanced Queue (AQ) operators are used to propagate messages in queues from source to target systems.
In Warehouse Builder, you can use AQs as sources or targets. You can map an AQ source object to a target table or a staging table and then deploy the mapping to your target database. When you run the AQ mapping, the changes are propagated from the AQ to the target table in your warehouse. After a mapping runs successfully, subsequent invocations of the same mapping will not see the messages that have already been processed. If the mapping fails, no messages will be dequeued from the source AQs.
Warehouse Builder also enables you to use AQs as targets in your warehouse. Source AQs in your repository may represent a central AQ that integrates data from different messaging systems or applications. Warehouse Builder enables you to map this source AQ to a target AQ and propagate the messages from one type of messaging system to another.
For more information, see:
Creating Definitions for Advanced Queues
Creating AQ Mappings
To use a Mapping Advanced Queue operator in a mapping:
Drag and drop the Mapping Advanced Queue operator icon onto the mapping canvas.
The Add Mapping Advanced Queue dialog displays.
You can choose to import new AQ definitions into the repository or select a previously imported AQ from the repository.
Each AQ bound to an AQ operator in a mapping must belong to the same warehouse module as the mapping. If not, you will receive an error when you validate the mapping.
For more information on how to use the Add Mapping Advanced Queue dialog, see "Adding Bindable Operators".
Click OK.
Connect the output attribute of a Mapping AQ operator to the input group of a target operator.
Example of an AQ in a Mapping
A local company assigns a customer service representative to every customer by region. As the customer base increases, the company hires new service representatives to manage new and existing customers. The new service representative assignments to customers are tracked by using queues. You create an AQ with messages capturing the changes made to the Customer table.
You can import this AQ into Warehouse Builder and use it as a data source in a mapping. Connect the AQ source to the Customer target table on the mapping editor. Warehouse Builder generates PL/SQL scripts for this mapping, which you can deploy to your target warehouse. When you run the mapping, the messages are dequeued and the new assignments of the service representatives in the source Customer table are propagated to the Customer table in the target database.
If you register each mapping in Warehouse Builder as a separate subscriber of each AQ, then different mappings referencing the same AQ do not interfere with each other. Each mapping sees all the messages in the AQ published to all its subscribers.
Reconciling Advanced Queues
You can perform an inbound reconciliation on an AQ operator to update it with any changes in the repository definition of the AQ object to which it is bound. For more information, see "Inbound Reconciliation". Warehouse Builder does not currently enable you to outbound reconcile an AQ operator.
Advanced Queue Operator Properties
You can configure the following properties for a Mapping AQ operator used in a mapping:
From the Mapping Editor, right-click the AQ operator and select Operator Properties.
Warehouse Builder displays the Mapping Advanced Queue Properties dialog containing the Bound Name property. This name is used to identify the AQ during code generation.
From the Mapping Editor, right-click the AQ operator and select Edit.
Warehouse Builder displays the Mapping Advanced Queue Editor containing three tabs: General, Groups, Input/Output.
General
Rename the Mapping AQ operator by highlighting the previous name and typing over it. Type an optional description for the Mapping AQ operator.
Groups
Rename the Mapping AQ operator group by highlighting the previous name and typing over it. Each AQ operator has exactly one INOUT group. This is a read-only field. Type an optional description for the Mapping AQ operator group.
Input/Output
You cannot add more attributes to the INOUT group of an AQ operator. You can view and edit the attribute name, datatype, length, precision, scale, and optional description.
AQ Prerequisites for Mapping Execution
AQs deployed from Warehouse Builder are proxy AQs that represent the AQ source on a remote system. Because Oracle does not support remote dequeuing of messages, you must first deploy an agent of the source AQ in your local schema and register the locally deployed proxy AQ as a subscriber to the AQ on the source system. For the Warehouse Builder AQ mappings to run correctly, you must follow these steps in your source and warehouse systems:
Grant the following privileges to your AQ source system user:
execute on dbms_aq
execute on dbms_aqadm
execute on aq$_agent
Grant the following privileges to your warehouse system user:
execute on dbms_aq
execute on dbms_aqadm
execute on aq$_agent
From your AQ source system, create a database link to the AQ target system.
The target user can now deploy the target proxy AQ to the target instance.
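As a sketch, the grants and database link might look like the following; the user name, password, and link and TNS alias names are illustrative, not from the source:

```sql
-- On both the source and warehouse systems, as a privileged user:
GRANT EXECUTE ON dbms_aq TO aq_user;
GRANT EXECUTE ON dbms_aqadm TO aq_user;
GRANT EXECUTE ON sys.aq$_agent TO aq_user;

-- On the AQ source system, create a link to the target system:
CREATE DATABASE LINK warehouse_link
  CONNECT TO aq_user IDENTIFIED BY password
  USING 'warehouse_tns_alias';
```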
When you configure AQs for deployment, make sure that both the AQ object and the mapping that has AQ operators bound to this AQ object are deployed in the same location.
Before deploying a mapping, you must successfully deploy the temporary tables associated with the AQ object to which the Mapping AQ operator is bound.
After deploying the proxy AQ to the target system, you must register the locally deployed proxy AQ as a subscriber to the AQ on the source system. See "Registering AQ as a Subscriber".
Concurrent executions of the same mapping using AQ source operators are not supported.
If an AQ is a target in a mapping, make sure that this target AQ has at least one subscriber before the mapping is executed. Otherwise, you will receive a runtime error during mapping execution.
The maximum number of subscribers to an AQ cannot exceed 1024. For an AQ q1, the number of mappings that have a source AQ staging component bound to q1 should be less than 1024.
The upgrade of a mapping which contains an AQ may result in the loss of data contained in the AQ before the upgrade. This is because the subscriber of the proxy AQ will be dropped and re-created during the upgrade.
Registering AQ as a Subscriber
If an AQ is defined in your source system, you need to follow these manual steps to register the deployed proxy AQ as a subscriber to the source AQ and schedule propagation of the AQ to your target warehouses:
Create a DB link from your source system to your data warehouse.
Register the AQ as a subscriber to all messages.
Schedule propagation of AQ from source to the target system.
The following is an example of how you can register the AQ as a subscriber and schedule its propagation to the target system.
declare
  subscriber sys.aq$_agent;
begin
  subscriber := sys.aq$_agent('subscribername', 'schema.queue2', 0);
  -- schema.queue2 refers to the queue that is subscribing
  dbms_aqadm.add_subscriber(queue_name => 'QUEUE1', subscriber => subscriber,
                            rule => ' ', transformation => '');
  -- QUEUE1 refers to the queue that queue2 is subscribing to
  dbms_aqadm.schedule_propagation('SCHEMA.QUEUE1', NULL, SYSDATE, NULL, NULL, 60);
  -- SCHEMA.QUEUE1 is the schema-qualified name of queue1. This call starts
  -- message propagation from queue1 to queue2 with a latency of 60 seconds.
end;
/
Mapping Flat File Operator
You can use a Mapping Flat File operator as either a source or target.
However, the two are mutually exclusive within the same mapping. There are differences in code generation languages for flat file sources and targets. Consequently, mappings can contain a mix of flat files, relational objects, and transformations, but with the restrictions discussed further in this section.
Flat File Source Operators
As a source, the Mapping Flat File operator acts as the row set generator that reads from a flat file using the SQL*Loader utility. You cannot use a flat file source operator if you are mapping to a flat file target or to an external table. When you design a mapping with a flat file source, you can use the following operators:
Other relational target objects, excluding the External Table operator.
Note:
If you use the Sequence, Expression, or Transformation operators, you cannot use the SQL*Loader Direct Load setting as a configuration parameter.
When you use a flat file as a source in a mapping, remember to create a connector from the flat file source to the relational target for the mapping to deploy successfully.
Flat File Target Operators
A mapping with a flat file target generates a PL/SQL package that loads data into a flat file instead of loading data into rows in a table. You can either use an existing flat file, or identify a new flat file as a target during design and define it using outbound reconciliation. See "Outbound Reconciliation".
Note:
A mapping can contain a maximum of 50 flat file target operators at one time.
When you use a flat file as a target, understanding both flat files and external tables will help you determine which feature to use. If you are loading large volumes of data, loading to a flat file enables you to use the DIRECT PATH SQL*Loader option, which results in better performance. If you are not loading large volumes of data, you can benefit from many of the relational transformations available in the external table feature. Refer to "External Tables versus Flat File Operators" for more information.
If you use a multiple-record-type flat file as a target, you can only map to one of the record types. If you want to load all of the record types in the flat file from the same source, you can either drop the same flat file into the mapping as a target again and map to a different record type, or create a separate mapping for each record type you want to load.
You have the following options for Mapping Flat File operators:
Using Previously Imported Flat Files
Importing and Binding New Flat Files into Your Mapping
Defining New Flat File Sources or Targets in Mappings
Using Previously Imported Flat Files
This scenario describes using a flat file object that has been previously imported and sampled. You can find instructions for importing and sampling flat file objects in Chapter 4, "Importing Data Definitions".
To use a previously imported flat file as a source or a target:
Drag and drop a Mapping Flat File operator onto your mapping editor canvas.
The Add File dialog displays.
Click Select from existing repository file and bind.
Highlight the desired flat file object.
Click OK.
Prior to deployment, you must follow the steps for "Configuring Flat File Operators" and for "Configuring Mappings Reference".
Importing and Binding New Flat Files into Your Mapping
In this option, you use a flat file as either a source or target, but you have not yet imported or sampled the flat file. You import the flat file as you design your mapping.
To import and bind a new flat file into your mapping:
Drop a Mapping Flat File onto your mapping canvas.
The Add File dialog displays.
Click Import file into repository and bind.
Highlight the module into which you want to import the flat file.
Click OK.
The Import File Wizard Welcome page displays. For more information on importing Flat Files, see "Using the Import Metadata Wizard Flat Files".
At the Welcome page, click Next.
The Import File Wizard Object Selection page displays.
Navigate to the location and select the flat file.
Click Next.
The Import File Wizard Summary and Import page displays. A red ball next to the file indicates that there is no metadata information available about the file structure.
If the flat file has the same structure as another file you have already imported and sampled, click the Same As field and select the identical file from the drop-down list.
If there are no previously imported or sampled files with this file structure, click Sample and follow the instructions for "Using the Flat File Sample Wizard".
Defining New Flat File Sources or Targets in Mappings
As you build your mapping, you can create a new flat file object by selecting the Create unbound mapping file with no attributes option. Using this method, the only option is to create a comma-delimited, single-record-type flat file operator. You can leave the flat file unbound, or you can outbound reconcile to the repository.
Note:
You can only perform outbound reconciliation if the flat file is new to that repository. For more information on reconciliation, refer to "Reconciling Operators and Repository Objects".
To use a new unbound Flat File operator in a mapping:
Drag and drop a Mapping Flat File operator onto the mapping canvas.
The Add File dialog displays.
Click Create unbound mapping File with no attributes.
Enter the new file name.
Click OK.
The Mapping Flat File operator with no attributes appears in your mapping.
Define attributes for the new flat file operator by right-clicking the operator and choosing Edit. Enter the attributes manually on the Input/Output tab, or auto-map them from another operator.
For instructions on defining attributes, refer to "Editing Operators".
Right-click the flat file operator and select Reconcile Outbound to create a new repository flat file.
Select the flat file module in which you want to create the new flat file.
Click OK.
A new comma-delimited flat file is created in your repository.
Prior to deployment, you must follow the steps for "Configuring Flat File Operators" and for "Configuring Mappings Reference".
Extracting a Master-Detail Structure from Flat Files
If you are extracting data from a multiple-record-type file with a master-detail structure and mapping to tables, add a Mapping Sequence operator to the mapping to retain the relationship between the master and detail records through a surrogate primary key or foreign key relationship. A master-detail file structure is one where a master record is followed by its detail records. In Example 8-1, records beginning with "E" are master records with Employee information and records beginning with "P" are detail records with Payroll information for the corresponding employee.
Example 8-1 A Multiple-Record-Type Flat File with a Master-Detail Structure
E 003715 4 153 09061987 014000000 "IRENE HIRSH" 1 08500
P 01152000 01162000 00101 000500000 000700000
P 02152000 02162000 00102 000300000 000800000
E 003941 2 165 03111959 016700000 "ANNE FAHEY" 1 09900
P 03152000 03162000 00107 000300000 001000000
E 001939 2 265 09281988 021300000 "EMILY WELLMET" 1 07700
P 01152000 01162000 00108 000300000 001000000
P 02152000 02162000 00109 000300000 001000000
In Example 8-1, the relationship between the master and detail records is inherent only in the physical record order: payroll records correspond to the employee record they follow. However, if this is the only means of relating detail records to their masters, this relationship is lost when Warehouse Builder loads each record into its target table.
Maintaining Relationships Between Master and Detail Records
You can maintain the relationship between master and detail records if both types of records share a common field. If Example 8-1 contains a field Employee ID in both Employee and Payroll records, you can use it as the primary key for the Employee table and as the foreign key in the Payroll table, thus associating Payroll records to the correct Employee record.
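For instance, if both record types carried such an Employee ID and it was loaded into both targets, the rows could be rejoined with an ordinary key join. A sketch, with illustrative table and column names:

```sql
SELECT e.employee_id, e.name, p.period_start, p.amount
FROM   employee e
JOIN   payroll  p
  ON   p.employee_id = e.employee_id;
```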
However, if your file does not have a common field that can be used to join master and detail records, you must add a sequence column to both the master and detail targets (see Table 8-2 and Table 8-3) to maintain the relationship between the master and detail records. Use the Mapping Sequence operator to generate this additional value.
Table 8-2 represents the target table containing the master records from the file in Example 8-1. The target table for the master records in this case contains employee information. Columns E1-E10 contain data extracted from the flat file. Column E11 is the additional column added to store the master sequence number. Notice that the number increments by one for each employee.
Table 8-2 Target Table Containing Master Records
Table 8-3 represents the target table containing the detail records from the file in Example 8-1. The target table for the detail records in this case contains payroll information, with one or more payroll records for each employee. Columns P1-P6 contain data extracted from the flat file. Column P7 is the additional column added to store the detail sequence number. Notice that the number for each payroll record matches the corresponding employee record in Table 8-2.
Table 8-3 Target Table Containing Detail Records
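A minimal sketch of the two target tables follows. The column names are illustrative stand-ins for the E1-E11 and P1-P7 columns described above; only a few representative data columns are shown:

```sql
-- Simplified master target: employee data plus the extra
-- sequence column (E11 in Table 8-2).
CREATE TABLE master_t (
  rec_type   VARCHAR2(1),
  emp_id     VARCHAR2(6),
  ename      VARCHAR2(30),
  master_seq NUMBER          -- filled from the Mapping Sequence NEXTVAL
);

-- Simplified detail target: payroll data plus the matching
-- sequence column (P7 in Table 8-3).
CREATE TABLE detail_t (
  rec_type     VARCHAR2(1),
  period_start VARCHAR2(8),
  amount       NUMBER,
  master_seq   NUMBER        -- filled from the Mapping Sequence CURRVAL
);
```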
Extracting and Loading Master-Detail Records
This section contains instructions on creating a mapping that extracts records from a master-detail flat file and loads those records into two different tables. One target table stores master records and the other target table stores detail records from the flat file. The Mapping Sequence is used to maintain the master-detail relationship between the two tables.
Note:
These instructions are for conventional path loading. For instructions on using direct path loading for master-detail records, see "Direct Path Loading for Performance".
This procedure outlines general steps for building such a mapping. Additional detailed instructions are available:
For additional information on importing flat file sources, see "Using the Import Metadata Wizard Flat Files".
For additional information on using the Mapping Flat File as a source, see "Flat File Source Operators".
For additional information on using Mapping Tables, see "Adding Bindable Operators".
For additional information on using Mapping Sequences, see "Mapping Sequence Operator".
For additional information on configuring mappings, see "Configuring Mappings Reference".
To extract from a master-detail flat file and maintain master-detail relationships:
Import and sample a flat file source that consists of master and detail records.
When naming the record types as you sample the file, assign descriptive names to the master and detail records, as shown in Figure 8-11. This makes it easier to identify those records in the future.
Figure 8-11 shows the Flat File Sample Wizard for a multiple-record-type flat file containing department and employee information. The master record type (for employee records) is called EmployeeMaster, while the detail record type (for payroll information) is called PayrollDetail.
Figure 8-11 Naming Flat File Master and Detail Record Types
Drop a Mapping Flat File operator onto the mapping editor canvas and specify the master-detail file from which you want to extract data.
Drop a Mapping Sequence operator onto the mapping canvas.
Drop a Mapping Table operator for the master records onto the mapping canvas.
You can either select an existing repository table that you created earlier or create a new unbound mapping table with no attributes. You can then map or copy all required fields from the master record of the file operator to the master table operator (creating columns) and perform an outbound reconciliation to define the table later.
The table must contain all the columns required for the master fields you want to load plus an additional numeric column for loading sequence values, as shown in Figure 8-12.
Figure 8-12 Adding a Sequence Column to the Master and Detail Target Tables
Drop a Mapping Table operator for the detail records onto the mapping canvas.
You can either select an existing repository table that you created earlier or create a new unbound mapping table with no attributes. You can then map or copy all required fields from the detail record of the file operator to the detail table operator (creating columns) and perform an outbound reconcile to define the table later.
The table must contain all the columns required for the detail fields you want to load plus an additional numeric column for loading sequence values.
Map all of the necessary flat file master fields to the master table and detail fields to the detail table, as shown in Figure 8-13.
Map the Mapping Sequence NEXTVAL attribute to the additional sequence column in the master table, as shown in Figure 8-13.
Map the Mapping Sequence CURRVAL attribute to the additional sequence column in the detail table, as shown in Figure 8-13.
Figure 8-13 shows a completed mapping with the flat file master fields mapped to the master target table, the detail fields mapped to the detail target table, and the NEXTVAL and CURRVAL attributes from the Mapping Sequence mapped to the master and detail target tables, respectively.
Figure 8-13 Completed Mapping from Master-Detail Flat File to Two Target Tables
Configure the mapping with the following parameters:
Direct Mode: False
Errors Allowed: 0
Row: 1
Trailing Nullcols: True (for all tables)
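As a rough sketch, these settings surface in the generated SQL*Loader control file along the following lines (not the exact generated output):

```
OPTIONS (DIRECT=FALSE, ERRORS=0, ROWS=1)
-- conventional path, abort on the first error, commit every row;
-- each INTO TABLE clause ends with TRAILING NULLCOLS
```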
Error Handling Suggestions
This section contains error handling recommendations for files with varying numbers of errors.
If your data file almost never contains errors:
Create a mapping with a Sequence operator (see "Mapping Sequence Operator").
Configure the mapping with the following parameters:
Direct Mode = false
ROW = 1
ERRORS ALLOWED = 0
Generate the code and run the SQL*Loader script.
If the data file has errors, loading stops at the first error.
Fix the data file and run the control file again with the following configuration values:
CONTINUE_LOAD=TRUE
SKIP=
number of records already loaded
If your data file is likely to contain a moderate number of errors:
Create a primary key (PK) for the master record based on the seq_nextval column.
Create a foreign key (FK) for the detail record based on the seq_currval column, which references the master table PK.
In this case, master records with errors will be rejected with all their detail records. You can recover these records by following these steps.
Delete all failed detail records that have no master records.
Fix the errors in the bad file and reload only those records.
If there are very few errors, you may choose to load the remaining records and manually update the table with correct sequence numbers.
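A sketch of the recovery delete, assuming the PK/FK columns described above and illustrative table names:

```sql
-- Remove detail rows whose master record was rejected (orphans).
DELETE FROM detail_t d
WHERE NOT EXISTS (
  SELECT 1
  FROM   master_t m
  WHERE  m.master_seq = d.master_seq
);
```

After fixing the bad file, the rejected masters and their details can be reloaded together.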
In the log file, you can identify the records that failed because they violate the integrity constraint. The following is an example of log file records with errors:
Record 9: Rejected - Error on table "MASTER_T", column "C3".
ORA-01722: invalid number
Record 10: Rejected - Error on table "DETAIL1_T".
ORA-02291: integrity constraint (SCOTT.FK_SEQ) violated - parent key not found
Record 11: Rejected - Error on table "DETAIL1_T".
ORA-02291: integrity constraint (SCOTT.FK_SEQ) violated - parent key not found
Record 21: Rejected - Error on table "DETAIL2_T".
ORA-01722: invalid number
If your data file always contains many errors:
Load all records without using the Mapping Sequence operator.
Load the records into independent tables. You can load the data in Direct Mode, with the following parameters that increase loading speed:
ROW>1
ERRORS ALLOWED=MAX
Correct all rejected records.
Reload the file again with a Sequence operator (see "Mapping Sequence Operator").
Direct Path Loading for Performance
If you are using a master-detail flat file where the master record has a unique field (or if the concatenation of several fields can result in a unique identifier), you can use Direct Path Load as an option for faster loading.
For direct path loading, the record number (RECNUM) of each record is stored in the master and detail tables. A post-load procedure uses the RECNUM to update each detail row with the unique identifier of the corresponding master row.
This procedure outlines general steps for building such a mapping. Additional detailed instructions are available:
For additional information on importing flat file sources, see "Using the Import Metadata Wizard Flat Files".
For additional information on using the Mapping Flat File as a source, see "Flat File Source Operators".
For additional information on using Mapping Tables, see "Adding Bindable Operators".
For additional information on using the Data Generator operator, see "Data Generator Operator".
For additional information on using the Constant operator, see "Constant Operator".
For additional information on configuring mappings, see "Configuring Mappings Reference".
To extract from a master-detail flat file using direct path load to maintain master-detail relationships:
Import and sample a flat file source that consists of master and detail records.
When naming the record types as you sample the file, assign descriptive names to the master and detail records, as shown in Figure 8-11. This will make it easier to identify those records in the future.
Drop a Mapping Flat File operator onto the mapping canvas and specify the master-detail file from which you want to extract data.
Drop a Data Generator and a Constant operator onto the mapping canvas.
Drop a Mapping Table operator for the master records onto the mapping canvas.
You can either select an existing repository table that you created earlier, or create a new unbound mapping table with no attributes and perform an outbound reconcile to define the table later.
The table must contain all the columns required for the master fields you plan to load plus an additional numeric column for loading the RECNUM value.
Drop a Mapping Table for the detail records onto the mapping canvas.
You can either select an existing repository table that you created earlier, or create a new unbound mapping table with no attributes and perform an outbound reconcile to define the table later.
The table must contain all the columns required for the detail fields you plan to load plus an additional numeric column for loading a RECNUM value, and a column that will be updated with the unique identifier of the corresponding master table row.
Map all of the necessary flat file master fields to the master table and detail fields to the detail table, as shown in Figure 8-15.
Map the Data Generator operator's RECNUM attribute to the RECNUM columns in the master and detail tables, as shown in Figure 8-15.
Add a constant attribute in the Constant operator.
If the master row unique identifier column is of a CHAR datatype, make the constant attribute a CHAR type with the expression '*'.
If the master row unique identifier column is a number, make the constant attribute a NUMBER with the expression '0'. Figure 8-14 shows the expression property of the constant attribute set to '0'. This constant marks all data rows as "just loaded."
Figure 8-14 Constant Operator Properties
Map the constant attribute from the Constant operator to the detail table column that will later store the unique identifier for the corresponding master table record.
Figure 8-15 shows a completed mapping with the flat file's master fields mapped to the master target table, the detail fields mapped to the detail target table, the RECNUM attributes from the Data Generator operator mapped to the master and detail target tables, respectively, and the constant attribute mapped to the detail target table.
Figure 8-15 Completed Mapping from Master-Detail Flat File with a Direct Path Load
Configure the mapping with the following parameters:
Direct Mode: True
Errors Allowed: 0
Trailing Nullcols: True (for each table)
After you validate the mapping and generate the SQL*Loader script, create a post-update PL/SQL procedure and add it to the Warehouse Builder library.
Run the SQL*Loader script.
Execute an UPDATE SQL statement by running a PL/SQL post-update procedure or manually executing a script.
The following is an example of the generated SQL*Loader control file script:
OPTIONS (DIRECT=TRUE, PARALLEL=FALSE, ERRORS=0, BINDSIZE=50000, ROWS=200, READSIZE=65536)
LOAD DATA
CHARACTERSET WE8MSWIN1252
INFILE 'g:\FFAS\DMR2.dat'
READBUFFERS 4
INTO TABLE "MASTER_TABLE"
APPEND
REENABLE DISABLED_CONSTRAINTS
WHEN "REC_TYPE"='P'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
"REC_TYPE" POSITION (1) CHAR,
"EMP_ID" CHAR,
"ENAME" CHAR,
"REC_NUM" RECNUM
)
INTO TABLE "DETAIL_TABLE"
APPEND
REENABLE DISABLED_CONSTRAINTS
WHEN "REC_TYPE"='E'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
"REC_TYPE" POSITION (1) CHAR,
"C1" CHAR,
"C2" CHAR,
"C3" CHAR,
"EMP_ID" CONSTANT '*',
"REC_NUM" RECNUM
)
The following is an example of the post-update PL/SQL procedure:
create or replace procedure wb_md_post_update(
  master_table               varchar2
 ,master_recnum_column       varchar2
 ,master_unique_column       varchar2
 ,detail_table               varchar2
 ,detail_recnum_column       varchar2
 ,detail_masterunique_column varchar2
 ,detail_just_load_condition varchar2)
is
  v_SqlStmt VARCHAR2(1000);
BEGIN
  v_SqlStmt := 'UPDATE '||detail_table||' l '||
    ' SET l.'||detail_masterunique_column||' = (select i.'||master_unique_column||
    ' from '||master_table||' i '||
    ' where i.'||master_recnum_column||' IN '||
    ' (select max(ii.'||master_recnum_column||') '||
    ' from '||master_table||' ii '||
    ' where ii.'||master_recnum_column||' < l.'||detail_recnum_column||') '||
    ' ) '||
    ' WHERE l.'||detail_masterunique_column||' = '||''''||detail_just_load_condition||'''';
  dbms_output.put_line(v_SqlStmt);
  EXECUTE IMMEDIATE v_SqlStmt;
END;
/
Subsequent Operations
After the initial loading of the master and detail tables, you can use the loaded sequence values to further transform, update, or merge master table data with detail table data. For example, if your master records have a column that acts as a unique identifier (such as an Employee ID), and you want to use it as the key to join master and detail rows (instead of the sequence field you added for that purpose), you can update the detail table(s) to use this unique column. You can then drop the sequence column you created for the initial load. Operators such as the Aggregator, Filter, or Match and Merge operator can help you with these subsequent transformations.
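A sketch of that cleanup, with illustrative table and column names: the detail rows are re-keyed on the natural Employee ID, and the helper sequence columns are then dropped.

```sql
-- Copy the natural key from each master row to its detail rows.
UPDATE detail_t d
SET    d.emp_id = (SELECT m.emp_id
                   FROM   master_t m
                   WHERE  m.master_seq = d.master_seq);

-- The sequence columns used for the initial load are no longer needed.
ALTER TABLE detail_t DROP COLUMN master_seq;
ALTER TABLE master_t DROP COLUMN master_seq;
```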
Mapping Input Parameter Operator
You can introduce information external to Warehouse Builder as input into a mapping using a Mapping Input Parameter. For example, you can use an Input Parameter operator to pass SYSDATE to a mapping that loads data to a staging area. Use the same Input Parameter to pass the timestamp to another mapping that loads the data to a target.
When you generate a mapping, Warehouse Builder creates a PL/SQL package. Mapping input parameters become part of the signature of the main procedure in the package.
The Mapping Input Parameter has a cardinality of one. It creates a single row set that can be combined with another row set as input to the next operator.
The names of the output attributes become the names of the mapping input parameters. You can use the parameters by connecting the attributes of the Mapping Input Parameter operator to other operators within the mapping editor. You can have only one Mapping Input Parameter in a mapping.
The Mapping Input Parameter contains the following properties:
Default Value: The character string value which, if specified, is placed in the generated code as the default value for the specified attribute. For example, if the value entered is '1-JUN-2001' then the generated code contains DEFAULT '1-JUN-2001'.
The default value for the mapping input parameter appears in the DEFAULT clause following the function parameter declarations in the generated PL/SQL package. For example, if a mapping parameter named param1 with data type VARCHAR2 is defined with a default value of 'HELLO', the generated main function in the PL/SQL package appears as:
param1 IN VARCHAR2 DEFAULT 'HELLO'
Data Type: Specifies the data type for this input parameter.
To use a Mapping Input Parameter operator in a mapping:
Drag and drop a Mapping Input operator onto the Mapping Editor canvas.
Right-click the Mapping Input operator and select Edit.
Select the Output tab and click Add to add output attributes.
You can rename the attributes and define the data type and other attribute properties.
Connect the Input Parameter operator MAP_INPUT attribute to a group in the target operator as shown in Figure 8-16.
Figure 8-16 Mapping Editor Showing A Mapping Input Parameter
Mapping Output Parameter Operator
Use the Mapping Output Parameter operator to send values out of a mapping to applications external to Warehouse Builder. When you generate a mapping, Warehouse Builder creates a PL/SQL package. Mapping Output Parameters become part of the signature of the main procedure in the package.
The Mapping Output Parameter has only one input group. You can have only one Mapping Output Parameter in a map. Only attributes that are not associated with a row set can be mapped into a Mapping Output Parameter. For example, constant, input parameter, output from a pre-mapping process, or output from a post process can all contain attributes that are not associated with a row set. A Mapping Output Parameter is not valid for a SQL*Loader mapping.
The default value for the Mapping Output Parameter appears in the DEFAULT clause following the function parameter declarations in the generated PL/SQL package. For example, if a Mapping Output Parameter named param1 with data type VARCHAR2 is defined with a default value of 'HELLO', the generated main function in the PL/SQL package appears as:
param1 OUT VARCHAR2 DEFAULT 'HELLO'
If a Mapping Output Parameter named param1 has data type VARCHAR2 and no default value, the generated main function in the PL/SQL package appears as:
param1 OUT VARCHAR2
The Mapping Output Parameter contains the following properties:
Data Type: Specifies the data type for this output parameter.
Bound Name: Specifies the physical name for this output parameter.
Note:
Mapping Output Parameters cannot be mapped to any operator. They can be mapped from Constants, Mapping Input Parameters, or the output of a Pre- or Post-Mapping Process, including the return value.
To use a Mapping Output Parameter operator in a mapping:
Drop a Mapping Output operator onto the Mapping Editor canvas.
Right-click the Mapping Output operator and select Edit.
Select the Input Attributes tab and click Add to add input attributes.
You can rename the attributes and define the data type and other attribute properties.
See Figure 8-17 for an example of a Mapping Output Parameter in a mapping.
Figure 8-17 Mapping Editor Showing An Output Parameter Operator
Mapping Sequence Operator
A Mapping Sequence operator generates sequential numbers that increment for each row. For example, you can use the Sequence operator to create surrogate keys while loading data into a dimension table.
You can connect a Mapping Sequence to a target operator input or to the inputs of other types of operators. You can combine the sequence outputs with outputs from other operators.
This operator contains an output group containing the following output attributes:
CURRVAL: Returns the current value of the sequence.
NEXTVAL: Generates a row set of consecutively incremented numbers beginning with the next value.
You can bind and reconcile Mapping Sequences to a repository sequence in one of the modules. The repository sequence must be generated and deployed before the mapping containing the Mapping Sequence is deployed to avoid errors in the generated code package. See "Adding Bindable Operators" for more information.
Generate mappings with sequences using Row Based mode. Sequences are incremented even if rows are not selected. If you want a sequence to start from the last number, then do not run your SQL package in Set Based or in Set Based With Failover operating modes. See "Runtime Parameters Reference" for more information on configuring mode settings.
The Mapping Sequence contains the following property:
Bound Name: The name of the sequence database object that is used in the generated code. If the sequence has been reconciled with a repository component, the Bound Name remains the same as the physical name of the repository sequence.
To use a Mapping Sequence operator in a mapping:
Drag and drop the Mapping Sequence operator onto the Mapping Editor canvas.
Warehouse Builder displays the Add Mapping Sequence dialog.
Use the Add Mapping Sequence dialog to create or select a sequence. For more information on these options, see "Adding Bindable Operators".
Connect the sequence to a target attribute.
Match-Merge Operator
The Match-Merge operator is a data quality operator that you can use to first match and then merge data.
When you match records, you determine through business rules which records in a table refer to the same data. When you merge records, you consolidate into a single record the data from the matched records.
This section includes information and examples on how to use the Match-Merge operator in a mapping. The Match-Merge operator together with the Name-Address operator support householding, the process of identifying unique households in name and address data.
Example: Matching and Merging Customer Data
Consider how you could use the Match-Merge operator to manage a customer mailing list. Use matching to find records that refer to the same person in a table of customer data containing 10,000 rows. For example, you can define a match rule that screens records that have similar first and last names. Through matching you may discover that 5 rows refer to the same person. You can merge those records into one new record. For example, you can create a merge rule to retain the values from the record with the longest address among the five matched records. The newly merged table now contains one record for each customer.
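The match-then-merge idea can be sketched outside Warehouse Builder as follows. This is a hypothetical illustration, not the operator's implementation: the sample records, the exact-name match rule, and the longest-address merge rule are all assumptions chosen to mirror the example.

```python
# Sketch: match records that share a name, then merge each matched group
# by retaining the longest address (the merge rule from the example).

records = [
    {"first": "Jane", "last": "Doe", "address": "123 Main St"},
    {"first": "Jane", "last": "Doe", "address": "123 Main Street, Apt 4"},
    {"first": "John", "last": "Smith", "address": "9 Oak Ave"},
]

def match_key(rec):
    # Business rule (assumed): records with the same first and last name match.
    return (rec["first"].lower(), rec["last"].lower())

groups = {}
for rec in records:
    groups.setdefault(match_key(rec), []).append(rec)

# Merge rule: keep the record with the longest address in each group.
merged = [max(group, key=lambda r: len(r["address"]))
          for group in groups.values()]

print(len(merged))  # 2 -- one record per unique customer
```

Running this leaves one row per customer, with Jane Doe's merged row carrying the longer of her two addresses.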
Table 8-4 shows records that refer to the same person prior to using the Match-Merge operator.
Table 8-4 Sample Records
FirstName LastName Address
Table 8-5 shows the single record for Jane Doe after using the Match-Merge operator. Notice that the new record retrieves data from different rows in the sample.
Table 8-5 Match-Merge Results
FirstName LastName Address
Designing Mappings with a Match-Merge Operator
Figure 8-18 shows a mapping you can design using a Match-Merge operator. Notice that the Match-Merge operator is preceded by a Name-Address operator, NAMEADDR, and a staging table, CLN_CUSTOMERS. You can design your mapping with or without a Name-Address operator. Preceding the Match-Merge operator with a Name-Address operator is desirable when you want to ensure your data is clean and standardized before launching time consuming match and merge operations.
Figure 8-18 Match-Merge Operator in a Mapping
Whether you include a Name-Address operator or not, be aware of the following considerations as you design your mapping:
PL/SQL output: The Match-Merge operator can generate two outputs, both PL/SQL outputs only. The MERGE group includes the merged data. The XREF group is an optional group you can design to document the merge process.
Row based operating mode: When the Match-Merge operator matches records, it compares each row with the subsequent row in the source and generates row based code only. These mappings, therefore, can only run in row based mode.
SQL based operators before Match-Merge: The Match-Merge operator generates only PL/SQL outputs. If you want to include operators that generate SQL code only, you must design the mapping such that they precede the Match-Merge operator. For example, operators such as the Joiner, Key Lookup, and Set operators must precede the Match-Merge operator. A mapping designed with operators that generate set based code after a Match-Merge operator is invalid and Warehouse Builder does not generate code for such mappings.
SQL input: With one specific exception, the Match-Merge operator requires SQL input. If you want to precede a Match-Merge with an operator that generates only PL/SQL output such as the Name-Address operator, you must first load the data to a staging table.
Refining Data from Match-Merge operators: To achieve greater data refinement, map the XREF output from one Match-Merge operator into another Match-Merge operator. This scenario is the one exception to the SQL input rule for Match-Merge operators. With additional design elements, the second Match-Merge operator accepts PL/SQL. For more information, see "Refining Data from Match-Merge Operators".
Using the Match-Merge Operator
You have the following options for using a match-merge operator:
Define a new Match-Merge operator: Drag the Match-Merge operator from the Toolbox onto the mapping canvas. The Mapping Editor launches a wizard.
Edit an existing match-merge operator: Right-click the operator and select Edit. The Mapping Editor opens the Match-Merge Editor.
Whether you are using the operator wizard or the Operator Editor, complete the following pages:
The pages are listed in a columnar table that reads down the columns left to right to conserve space.
Groups
By definition, the Match-Merge operator has one input group and two output groups. You can rename the groups and add optional descriptions, but you cannot add or delete groups in the Match-Merge operator. The default name for the input group is INGRP1. The default names for the output groups are MERGE and XREF.
You assign attributes to the INGRP1 on the Input Connections page and then further edit those attributes on the Input Attributes page. You define attributes for the MERGE group on the Merge Output page. You can optionally assign attributes to the XREF group on the Cross Reference Output page.
Input Connections
Use the Input Connections page to select attributes to copy and map into the operator.
To complete the Input connections page for an operator:
Select complete groups or individual attributes from the left panel.
To search for a specific attribute or group by name, type the text in Search for and click Go. To find the next match, click Go again.
Hold the Shift key down to select multiple groups or attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.
Use the left to right arrow button in the middle of the page to move your selections to the right side of the wizard page.
You can use the right to left arrow to move groups or attributes from the input connections list. Warehouse Builder removes the selection from the input group and removes the mapping line between the source operator and the current operator.
Input Attributes
Use the Input Attributes page to modify the attributes you selected in the Input Connections tab or wizard page.
You can perform the following tasks from the Match-Merge Input Attributes page:
Change attribute properties: You can change the attribute name, data type, length, precision and scale.
Add an optional description: Type a description for the input attributes.
Add attributes: Use the Add button to add input attributes that you did not previously select in the Input Connections tab or wizard page.
Cross Reference Output
Use the Cross Reference Output page to optionally select attributes for the XREF group. Although the Match-Merge operator creates the XREF group by default, you have the option of adding attributes to the group or leaving it empty.
The XREF group is an optional group you can define to document the merge process. It enables you to create a foreign key relationship between the original data set to the new merged data set. You can send the attributes from the XREF group to a table that records the corresponding source row for each merged row.
Every row from INGRP1 corresponds to a row in the XREF group. To design the XREF group, select original attribute values and merged attributes from Source Attributes on the left. Warehouse Builder displays the merged attributes in Source Attributes with a default prefix of "MM_". Use Set Prefix at the bottom left corner of the page to change the prefix.
Match Bins
Use the Match Bins page to limit the set of possible matches to a manageable number. When Warehouse Builder matches the rows, it compares each row with the subsequent row for all rows within the same grouping. This can greatly enhance performance since Warehouse Builder searches for matches only within groupings and not throughout the entire data set.
While you want to define Match Bins that separate records into manageable groupings, you also want to avoid separating records that should be matched. The attribute(s) you select for grouping like data depends on your data. For example, if you have a table of customer address with a million rows, you may want to group the data by partial street name, city name, and zip code.
Ideally, you should attempt to keep the number of records in each grouping under 2000. The number of comparisons Warehouse Builder must perform is based on the following formula:
n = (b * (b - 1)) / 2
where n is the number of comparisons and b is the number of records in a bin. To match 5 records, Warehouse Builder must perform 10 comparisons. To match 50 records, Warehouse Builder must perform 1,225 comparisons. To match 500 records, Warehouse Builder must perform 124,750 comparisons.
Match Rules
You can define match rules for a single attribute or multiple attributes in the operator. On the Match Rules tab, create match rules at the top of the page. In the lower portion of Match Rules tab, specify the details for each match rule.
If you create more than one match rule, Warehouse Builder determines two rows match if those rows satisfy any of the match rules. In other words, Warehouse Builder evaluates multiple match rules using OR logic. This is indicated on the Match Rules tab by the OR icon in the left most column. For more information, see "Understanding Matching Concepts".
Warehouse Builder assigns a position number and creates a default name such as MM_MA_0 for each match rule you add. You can edit a rule and drag it to a new position in the rule list. You can designate each match rule as either active or passive. Warehouse Builder does not execute a passive rule unless you call it through a custom match rule. Assign one of the rule types listed in Table 8-7. When you select a rule type, the lower portion of the Match Rules tab activates and you can enter details for the match rule.
When you add details for conditional rule types, Warehouse Builder prompts you to add details for one or more attributes. When you add details for multiple attributes, Warehouse Builder displays the AND icon in the left most column. This indicates that Warehouse Builder matches rows only when all the condition details are satisfied.
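The OR-across-rules, AND-within-details evaluation can be sketched as follows. This is a hypothetical illustration, not Warehouse Builder code; the attribute names and predicates are invented:

```python
# Rows match if ANY rule passes (OR); a rule with several attribute
# conditions passes only if ALL of them hold (AND).

def rows_match(row_a, row_b, rules):
    # rules: each rule is a list of per-attribute predicates.
    return any(
        all(cond(row_a, row_b) for cond in rule)
        for rule in rules
    )

rule_1 = [lambda a, b: a["last"] == b["last"],   # AND of two conditions
          lambda a, b: a["zip"] == b["zip"]]
rule_2 = [lambda a, b: a["ssn"] == b["ssn"]]

a = {"last": "Doe", "zip": "10001", "ssn": "111"}
b = {"last": "Doe", "zip": "10002", "ssn": "111"}
print(rows_match(a, b, [rule_1, rule_2]))  # True: rule_2 alone suffices
```

Here rule_1 fails because the ZIP codes differ, but the rows still match because rule_2 passes.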
Passive Match Rules
Warehouse Builder does not directly execute passive match rules. Instead, you can create custom match rules that are active and call passive match rules. For each match rule that you define, Warehouse Builder creates a corresponding function with the same name as the match rule. Custom rules can call the passive match rule using this function.
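The active/passive relationship can be pictured like this. The sketch below is an assumption-laden analogy, not generated Warehouse Builder code: the rule names and attributes are invented, and the generated PL/SQL functions are stood in for by plain Python functions.

```python
# A passive rule is a named predicate that is never run on its own;
# an active custom rule calls it and may add conditions of its own.

def similar_names(this, that):
    # Passive rule (hypothetical): case-insensitive last-name comparison.
    return this["last"].lower() == that["last"].lower()

def custom_rule(this, that):
    # Active custom rule: call the passive rule, then require equal ZIPs.
    return similar_names(this, that) and this["zip"] == that["zip"]

print(custom_rule({"last": "Doe", "zip": "10001"},
                  {"last": "DOE", "zip": "10001"}))  # True
```

In Warehouse Builder the passive rule would be reached through the function that the tool generates under the rule's name, playing the role of similar_names here.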
Custom Match Rule
Use this editor to create a custom comparison algorithm for the match rule. Select Edit to launch the Custom Match Rule Editor. Double click or drag and drop desired functions and parameters from the navigation tree on the left to the Implementation editor on the right. You can write a custom match rule that references active and passive match rules, functions, and parameters such as THIS_ and THAT_ which represent the two records from INGRP1 that are being compared.
You can also validate your custom rule by selecting Test and Validation from the Custom Match Rule Editor
Merge Rules
Use the Merge Rules tab to select values for the attributes in the merged record.
On the Merge Rules tab, create merge rules at the top of the page. In the lower portion of the Merge Rules tab, specify the details for each merge rule.
Warehouse Builder assigns a position number and creates a default name such as MM_ME_0 for each merge rule you add. Warehouse Builder executes merge rules in the order of their position numbers. You can edit and drag the rule to a new position in the rule list. Assign one of the rule types listed in Table 8-6.
Table 8-6 Merge Rule Types
Merge Rule Type Select Single or Multiple Attributes Description
When you select a rule type, the lower portion of the Merge Rules tab activates and you can enter details for the merge rule.
When you define a merge rule, you can define one rule for all the attributes in the merged record or define a rule for each attribute. For instance, if the merged record is a customer record, it may have attributes such as ADDRESS1, ADDRESS2, CITY, STATE, and ZIP. You can write five rules, one for each attribute. Or, you can write one rule such that all five attributes come from the same matched row.
When you write a rule for a single attribute, specify the attribute in the Attribute list box at the top of the page. Then, if necessary, specify the details for the rule in the lower portion of the tab.
When you write a rule for multiple attributes, Warehouse Builder disables the Attribute list box at the top of the page and you must define the details in the lower portion of the tab.
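The difference between per-attribute rules and a single record rule can be sketched as follows. This is an invented illustration (the column names and "longest non-blank value" rule are assumptions), not operator output:

```python
# Two matched rows for the same customer, with gaps in different columns.
matches = [
    {"ADDRESS1": "123 Main St", "CITY": "Springfield", "ZIP": ""},
    {"ADDRESS1": "123 Main Street", "CITY": "", "ZIP": "62704"},
]

# Style 1: one rule per attribute -- take the longest value of each column,
# so the merged record can mix values from different matched rows.
per_attribute = {col: max((m[col] for m in matches), key=len)
                 for col in matches[0]}

# Style 2: one rule for the record -- every attribute comes from the same
# matched row, here the row with the longest ADDRESS1.
whole_record = max(matches, key=lambda m: len(m["ADDRESS1"]))

print(per_attribute["ZIP"])    # taken from the second row
print(whole_record["CITY"])    # blank: the whole record is the second row
```

The per-attribute style fills every column but mixes rows; the record style keeps the columns mutually consistent at the cost of inheriting that row's gaps.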
Custom Merge Rule
Use this editor to create a custom merge rule that assigns a value based on a PL/SQL code that you write and returns the attribute type you previously selected.
Select Edit to launch the Custom Merge Rule Editor. Double click or drag and drop desired functions and attributes from the navigation tree on the left to the Implementation editor on the right. You can write a custom merge rule that references source attributes, other merge attributes, and functions.
You can also validate your custom rule by selecting Test and Validation from the Custom Merge Rule Editor.
Following is an example of a custom merge rule for an attribute:
BEGIN
  RETURN M_MATCHES(1)."TAXID";
END;
Following is an example of a custom merge rule for a record:
BEGIN
  RETURN M_MATCHES(1);
END;
Understanding Matching Concepts
When you use Warehouse Builder to match records, you can define a single match rule or multiple match rules. If you create more than one match rule, Warehouse Builder determines two rows match if those rows satisfy any of the match rules. In other words, Warehouse Builder evaluates multiple match rules using OR logic. Table 8-7 lists the match rules you can specify.
Table 8-7 Match Rule Types
Match Rule Description
Weight: Matches rows based on scores that you assign to attributes. Warehouse Builder compares each attribute using a similarity algorithm that returns a score between 0 and 100 to represent the similarity between the rows. For two rows to be considered a match, the total counts must be greater than the overall score you designate. For details, see "Weight Match Rule".
Example of Multiple Match Rules
The following example illustrates how Warehouse Builder evaluates multiple match rules using OR logic.
In the top portion of the Match Rules tab, create two match rules as described in Table 8-8:
Table 8-8 Two Match Rules
Position Rule Type Usage Description
In the lower portion of the tab, assign the details to Rule_1 as described in Table 8-9:
Table 8-9 Details for Rule_1
Attribute Position Algorithm Similarity Score Blank Matching
Example of Transitive Matching
The general rule is, if A matches B, and B matches C, then A matches C. Assign a conditional match rule based on similarity such as described in Table 8-12:
Table 8-12 Conditional Match Rule
Attribute Position Algorithm Similarity Score Blank Matching
Address Match Rule
Use the Address match rule to match records based on postal addresses. Matching by address is most effective when you first correct the address data using the Name-Address operator before the Match-Merge operator. The Name-Address operator identifies addresses as existing in a postal matching database and designates the records with the Is Found flag. The Match-Merge operator processes addresses with the Is Found role faster because the data is known to be syntactically correct, legal, and existing.
To define an Address match rule, complete the following steps:
On the Match Rules tab, select Address as the Rule Type.
The Address Attributes tab and Details tab display at the bottom of the page.
In the left panel of the Address Attributes tab, select the attribute that represents the primary address and click the left to right arrow button.
Right-click Roles and designate that attribute as the Primary Address.
You must perform this step. If you do not assign the Primary Address role, the match rule is ineffective and you cannot access the Details tab.
Add other attributes and designate their roles as necessary. If you used the Name-Address operator to cleanse the address data, assign the Is Found role to the appropriate attribute. See Table 8-14 for the types of roles you can assign.
Select the Details tab and select the applicable options as listed in Table 8-15.
Table 8-14 describes the Address Roles you can assign for the Address match rule.
Table 8-14 Address Roles
Address Role Description
Unit_number: For addresses with matching primary addresses, the operator compares unit numbers such as suite numbers, floor numbers, or apartment numbers. When the unit numbers are blank, the operator considers them a match. If only one unit number is blank, it is not considered a match unless you select the Match on blank secondary address option.
City: Assign this role only if you also assign the State role. The matching behavior for this role depends on whether you previously used the Name-Address operator to cleanse the address data. For uncorrected address data, the operator compares each City. For corrected addresses, the operator compares Cities only when the postal codes do not match. If both City and State match, then the operator compares the address roles. Cities are considered a match if both are blank but not if only one is blank.
State: Assign this role only if you also assign the City role. The matching behavior for this role depends on whether you previously used the Name-Address operator to cleanse the address data. For uncorrected address data, the operator compares each State. For corrected addresses, the operator compares States only when the postal codes do not match. If both City and State match, then the operator compares the address roles. States are considered a match if both are blank but not if only one is blank.
Postal_code: The matching behavior for this role depends on whether you previously used the Name-Address operator to cleanse the address data. For uncorrected address data, the operator does not use the Postal_code. For corrected addresses, the operator compares each Postal_code first. If the Postal_codes match, then the operator compares the address roles. If the Postal_codes do not match, then the operator compares City and State to determine whether it should compare address roles such as the Primary_address.
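The postal-code/city/state decision described above can be condensed into a small sketch. This is a simplified reading of the role descriptions (the dictionary keys and the exact short-circuit order are assumptions), not the operator's actual logic:

```python
# Decide whether to go on to compare address roles for corrected data:
# matching postal codes suffice; otherwise City and State must both agree.

def should_compare_addresses(a, b):
    if a["postal_code"] and a["postal_code"] == b["postal_code"]:
        return True
    return a["city"] == b["city"] and a["state"] == b["state"]

x = {"postal_code": "94065", "city": "Redwood City", "state": "CA"}
y = {"postal_code": "94066", "city": "Redwood City", "state": "CA"}
print(should_compare_addresses(x, y))  # True: city and state agree
```

Here the postal codes differ, so the fallback comparison of City and State decides the outcome.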
Conditional Match Rule
Use the Conditional Match Rule to combine multiple attribute comparisons into one composite rule. When you assign multiple attributes for comparison, all the comparisons must be true for the records to be considered a match.
To define a Conditional match rule, complete the following steps:
On the top portion of the Match Rules tab, select Conditional for the rule type.
The operator displays a Details section at the bottom of the tab.
Click Add to add and select an attribute.
For Algorithm, select one of the options as listed in Table 8-16.
If you select Similarity or Standardized Similarity, specify a similarity score.
In Blank Matching, specify how the operator should handle blank values.
Table 8-16 Algorithms for Match Rules
Algorithm Description
Similarity: Enter a similarity score between 0 and 100. If the similarity of the two attributes is equal to or greater than the score, then the attribute values are considered matched. The similarity algorithm computes the edit distance between two strings. Edit distance is the number of deletions, insertions, or substitutions required to transform one string into another. A similarity value of 100 indicates that the two values are identical. A similarity value of zero indicates no similarity whatsoever. For example, if the string "tootle" is compared with the string "tootles", then the edit distance is 1. The length of the string "tootles" is 7. The similarity value is therefore 6/7*100 or 85.
Partial Name: The values of a string attribute are considered a match if the value of one attribute is contained within the other, starting with the first word. For example, "Midtown Power" would match "Midtown Power and Light," but would not match "Northern Midtown Power". The comparison ignores case and non-alphanumeric characters. Before attempting to match a partial name, this algorithm performs a Standardized Exact comparison on the entire string.
Abbreviation: The values of a string attribute are considered a match if one string contains words that are abbreviations of corresponding words in the other. Before attempting to find an abbreviation, the operator performs a Standardized Exact comparison on the entire string. The comparison ignores case and non-alphanumeric characters.
For each word, the operator looks for abbreviations, as follows. If the larger of the words contains all of the letters from the shorter word and the letters appear in the same order as the shorter word, then the words are considered a match. For example, "Intl. Business Products" would match "International Bus Prd".
Acronym: The values of a string attribute are considered a match if one string is an acronym for the other. Before attempting to identify an acronym, the operator performs a Standardized Exact comparison on the entire string. If no match is found, then each word of one string is compared to the corresponding word in the other string. If the entire word does not match, each character of the word in one string is compared to the first character of each remaining word in the other string. If the characters are the same, the names are considered a match. For example, "Chase Manhattan Bank NA" matches "CMB North America". The comparison ignores case and non-alphanumeric characters.
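The edit-distance similarity described above is easy to sketch. This is a generic Levenshtein implementation scaled to 0-100 using the longer string's length, as the "tootle"/"tootles" example implies; it is a stand-in for, not a copy of, the vendor's algorithm:

```python
# Classic dynamic-programming Levenshtein distance.
def edit_distance(s, t):
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]

# Similarity scaled to 0-100 by the longer string's length (assumed scaling).
def similarity(s, t):
    longest = max(len(s), len(t))
    return (longest - edit_distance(s, t)) * 100 // longest

print(similarity("tootle", "tootles"))  # 85, as in the text's example
```

One insertion turns "tootle" into "tootles", so the distance is 1 and the score is 6/7 of 100, truncated to 85.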
Firm Match Rule
Use the Firm match rule to match records by business name. Matching by business name is most effective when you first correct the address data using the Name-Address operator before the Match-Merge operator.
To define a Firm match rule, complete the following steps:
On the Match Rules tab, select Firm as the Rule Type.
The Firm Attributes tab and Details tab display at the bottom of the page.
In the left panel of the Firm Attributes tab, select the attribute that represents the firm name and click the left to right arrow button.
Right-click Roles and designate that attribute as Firm 1.
By default, the operator compares the values in Firm 1 for exact matches. You can change this default behavior by making selections on the Details tab.
Add another attribute and designate its role as Firm 2, if necessary.
For the match rule to be valid, you must assign at least one attribute either as Firm 1 or Firm 2.
Select the Details tab and select the applicable options.
If you select Strip noise words, the operator ignores words in the business names such as "the" and "and". If you select Cross match firm 1 and firm 2, the operator compares business names in firm 1 with business names in firm 2.
See Table 8-16 for descriptions of the remaining options on the Details tab.
Person Match Rule
Use the Person match rule to match records based on names. Matching by names is most effective when you first correct the address data using the Name-Address operator before the Match-Merge operator.
To define a Person match rule, complete the following steps:
On the Match Rules tab, select Person as the Rule Type.
The Person Attributes tab and Details tab display at the bottom of the page.
In the left panel of the Person Attributes tab, select the attribute that represents the last name and click the left to right arrow button.
Right-click Roles and designate that attribute as the Last Name.
Add other attributes and designate their roles as necessary. You must define either the Last Name or First Name Standardized for the match rule to be effective. See Table 8-17 for the types of roles you can assign.
Select the Details tab and select the applicable options as listed in Table 8-18.
Table 8-17 describes the Name Roles you can assign for the Name match rule.
Table 8-17 Name Roles
Description
Weight Match Rule
Use this rule to match rows based on a weight value that you assign. A weighted match rule is most useful when you compare a large number of attributes and do not want a mismatch in any single attribute to rule out a match, as can happen with conditional rules.
To use the Weight match rule, complete the following steps:
On the Match Rules tab, select Weight as the Rule Type.
The Details tab displays at the bottom of the page.
Select Add at the bottom of the page to add an attribute to the rule.
In Maximum Score, assign a weight to each attribute you want to include in the comparison. Warehouse Builder compares each attribute using a similarity algorithm that returns a score between 0 and 100 to represent the similarity between the rows. A value of 100 indicates that the two values are identical. A value of zero indicates there is no similarity.
In Required score to match, assign an overall score for the match.
For two rows to be considered a match, the total counts must be greater than the Required score you designate.
Example of a Weight Match Rule
Assume you want to apply the Weight match rule to the data in Table 8-19:
Table 8-19 Example Records for Matching
Record Number Attr_1 Attr_2
For Maximum score, you assign a value of 50 to both Attr_1 and Attr_2. You assign a value of 80 for the Required score to match. You can expect the following results:
Rec_1 is the driver record. The operator reads it first.
In Rec_2, the value for Attr_1 is CA. That value has a similarity of 100 with the value in the driver record, Rec_1. Since the weight value for Attr_1 is 50, its score is 50 (100% of 50).
In Rec_2, the value for Attr_2 is QQ, which has a similarity of 100 with the value in Rec_1. The weight value for Attr_2 is also 50, so its score is 50 (100% of 50). The total weighted score is 100 (50 + 50). This equals or exceeds the Required score to match, so Rec_2 and Rec_1 are considered matched.
In Rec_3, Attr_1 is CA and has a similarity of 100 with Rec_1. Since the weight value for Attr_1 is 50, its weighted score is 50 (100% of 50).
In Rec_3, the value for Attr_2 is QR and that has a similarity of 50. The maximum value for Attr_2 is 50, so its score is 25 (50% of 50). The total weighted score is 75 (50+25). This is less than the value of the Required score to match. Therefore, Rec_3 and Rec_1 do not match.
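The arithmetic in this example can be reproduced with a short sketch. The similarity function below is a simple positional stand-in for the operator's algorithm (adequate for these two-character codes), and the record values come from the example:

```python
# Stand-in similarity: fraction of agreeing positions, scaled 0-100.
def similarity(s, t):
    diffs = sum(a != b for a, b in zip(s, t)) + abs(len(s) - len(t))
    longest = max(len(s), len(t))
    return (longest - diffs) * 100 // longest

weights = {"Attr_1": 50, "Attr_2": 50}    # Maximum score per attribute
required = 80                             # Required score to match

driver = {"Attr_1": "CA", "Attr_2": "QQ"}   # Rec_1, the driver record

def weighted_score(candidate):
    # Each attribute contributes (similarity% of its weight).
    return sum(weights[attr] * similarity(driver[attr], candidate[attr]) // 100
               for attr in weights)

rec_2 = {"Attr_1": "CA", "Attr_2": "QQ"}
rec_3 = {"Attr_1": "CA", "Attr_2": "QR"}
print(weighted_score(rec_2), weighted_score(rec_2) >= required)  # 100 True
print(weighted_score(rec_3), weighted_score(rec_3) >= required)  # 75 False
```

Rec_2 scores 50 + 50 = 100 and matches; Rec_3 scores 50 + 25 = 75, below the required 80, and does not.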
Refining Data from Match-Merge Operators
Once you pass data through a Match-Merge operator, you may want to further refine the data. For example, when householding name and address data, you may need to merge the data first for addresses and then again for names. Assuming you map the MERGE output to a target table, you can map the XREF group to either a staging table or to another Match-Merge operator. Although mapping to a staging table is relatively easy to design, it can lead to significant loss of performance. Map the XREF group directly to another Match-Merge operator to avoid loss of performance.
Figure 8-19 shows a mapping that relies on a staging table between two Match-Merge operators MM and MM_1.
Figure 8-19 Householding Data: Mapping with Staging Table
Figure 8-20 shows a mapping that achieves the same results with better performance. The XREF group from MM is mapped directly to MM_1. For this mapping to be valid, you must assign the Match Id generated for the first XREF group as the Match Bin rule on the second Match-Merge operator.
Figure 8-20 Householding Data: XREF Group Mapped to Second Match-Merge Operator
Name and Address Operator
This section contains introductory material followed by instructions for using the operator. For advanced information on the Name and Address operator, see Chapter 20, "Data Quality: Name and Address Cleansing".
About the Name and Address Operator
Oracle Warehouse Builder enables you to perform name and address cleansing on your data with the Name and Address operator. The Name and Address operator identifies and corrects errors and inconsistencies in name and address source data by comparing input data to the data libraries supplied by the third-party name and address cleansing software vendors. You can purchase the data libraries directly from these vendors.
Taking advantage of the Name and Address operator requires separate licensing and additional installation steps. Refer to the Oracle Warehouse Builder Installation and Configuration Guide for more information.
The errors and inconsistencies corrected by the Name and Address operator include variations in address formats, use of abbreviations, misspellings, outdated information, inconsistent data, or transposed names. The operator fixes these errors and inconsistencies by:
Parsing, or separating, name and address input data into individual elements.
Standardizing name and address data, such as standardized versions of nicknames and business names or standard abbreviations of address components, as approved by the Postal Service of the appropriate country. Standardized versions of names and addresses facilitate matching, householding, and ultimately help you obtain a single view of your customer.
Correcting address information such as street names and city names. Filtering out incorrect or undeliverable addresses can lead to savings on marketing campaigns.
Augmenting names and addresses with additional data such as gender, ZIP+4, country code, apartment identification, or business and consumer identification. You can use this and other augmented address information, such as census geocoding, for marketing campaigns that are based on geographical location.
Augmenting addresses with geographic information facilitates geography-specific marketing initiatives, such as marketing only to customers in large metropolitan areas (for example, within an n-mile radius of large cities), or only to customers served by a company's stores (within an x-mile radius of those stores). Oracle Spatial, an option with Oracle Database, and Oracle Locator, packaged with Oracle Database, are two products that you can use in conjunction with this feature.
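For instance, once latitude and longitude have been added to each record, a downstream process can restrict a campaign to customers within a fixed radius of a store. The following sketch uses the haversine formula; all names, coordinates, and record layouts are hypothetical, and Warehouse Builder itself does not generate this code.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3959 * asin(sqrt(a))  # 3959 = approximate Earth radius in miles

# Hypothetical store location and geocoded (augmented) customer records
store_lat, store_lon = 44.9778, -93.2650  # Minneapolis
customers = [
    {"name": "JOSEPH SMITH", "lat": 44.8680, "lon": -93.3499},
    {"name": "JANE DOE", "lat": 41.8781, "lon": -87.6298},  # Chicago
]
# Keep only customers within a 30-mile radius of the store
nearby = [c for c in customers
          if haversine_miles(store_lat, store_lon, c["lat"], c["lon"]) <= 30]
```

In practice such a filter would be expressed with Oracle Spatial or Oracle Locator operators against the augmented columns rather than in application code.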
Finally, the Name and Address operator enables you to generate postal reports for countries that support address correction and postal matching. Using postal reports in countries that support this feature often qualifies you for mailing discounts. For more information, see "Postal Reporting".
Example: Following a Record Through the Name and Address Operator
This example follows a record through a mapping using the Name and Address operator. This mapping also uses a Splitter operator to demonstrate a highly recommended data quality error handling technique. For more details about how data is processed by the Name and Address operator, see Chapter 20, "Data Quality: Name and Address Cleansing".
Example Input
In this example, your source data contains a Customer table with the row of data shown in Table 8-20.
Table 8-20 Sample Input to Name and Address Operator
Address Column Address Component
This data contains a nickname, a last name, and part of a mailing address, but it lacks the customer's full name, complete street address, and the state in which he lives. The data also lacks geographic information such as latitude and longitude, which can be used for distance calculations for truckload shipping. In order to complete the name and address data, you can use the Name and Address operator.
Example Steps
This example uses a mapping with a Name and Address operator followed by a Splitter operator to cleanse name and address records and then load them into separate targets depending on whether they were successfully parsed. This section explains the general steps required to design such a mapping. For detailed information on each type of operator, refer to that operator's description in this chapter.
To make the listed changes to the sample record:
In the Mapping Editor, begin by adding the following operators to the canvas:
A CUSTOMERS table from which you extract the records. This is your data source. It contains the data in "Example Input".
A Name and Address operator. This action launches the Name and Address Wizard. Complete it following the instructions in "Using the Name and Address Operator in a Mapping".
A Splitter operator. For information on using this operator, see "Splitter Operator".
Three target operators to which you load (respectively):
The successfully parsed records
The records with parsing errors
The records whose addresses are parsed but not found in the postal matching software
Map the attributes from the CUSTOMERS table to the Name and Address operator ingroup. Map the attributes from the Name and Address operator outgroup to the Splitter operator ingroup.
You are not required to use the Splitter operator in conjunction with the Name and Address operator, but the combination highlights the Name and Address error handling capabilities.
Define the split conditions for each of the outgroups in the Splitter operator and map the outgroups to the targets.
Figure 8-21 shows a mapping designed for this example. The data is mapped from the source table to the Name and Address operator, and then to the Splitter operator. The Splitter operator separates the successfully parsed records from those that have errors. The output from OUTGRP1 is mapped to the CUSTOMERS_GOOD target. The Split Condition for OUTGRP2 is shown at the bottom of the screen: records whose Is Parsed flag is False are loaded to the NOT_PARSED target. Records in the REMAINING_RECORDS group are successfully parsed, but their addresses are not found by the postal matching software. These records are loaded to the PARSED_NOT_FOUND target.
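The three-way split described above can be sketched as a routing function. The flag and target names below follow the example; in Warehouse Builder the actual split conditions are entered as SQL expressions on the Splitter operator's outgroups, not as Python.

```python
def route(record):
    """Route one cleansed record to a target table based on its quality flags.

    Mirrors the split conditions in the example (attribute names assumed):
    parsing failures go to NOT_PARSED, parsed-but-unmatched addresses go to
    PARSED_NOT_FOUND, and everything else goes to CUSTOMERS_GOOD.
    """
    if not record["is_parsed"]:
        return "NOT_PARSED"
    if not record["is_found"]:
        return "PARSED_NOT_FOUND"
    return "CUSTOMERS_GOOD"

records = [
    {"name": "JOSEPH SMITH", "is_parsed": True,  "is_found": True},
    {"name": "%%GARBLED%%",  "is_parsed": False, "is_found": False},
    {"name": "JANE DOE",     "is_parsed": True,  "is_found": False},
]
targets = [route(r) for r in records]
```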
Figure 8-21 Name and Address Operator Used with a Splitter Operator in a Mapping
Example Output
If you run the mapping designed in this example, the Name and Address operator standardizes, corrects, and completes the address data from the source table. In this example, the target table contains the address data as shown in Table 8-21 (compare it with the input record from Table 8-20).
Table 8-21 Sample Output from Name and Address Operator
Address Column Address Component
Joe Smith was separated into the First_Name_Standardized and Last_Name columns.
Joe was standardized into JOSEPH and Suite was standardized into STE.
Normandale Lake was corrected to Normandale Lake BLVD.
The first portion of the postal code, 55437, was augmented with the ZIP+4 code to read 55437-3813.
Latitude and longitude locations were added.
Flags were added to distinguish records that parsed successfully from those that did not, and a separate target was loaded with records that had errors.
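The standardizations listed above can be illustrated with small lookup tables. The entries below are assumptions for illustration only; in Warehouse Builder the real standardization rules come from the vendor-supplied data libraries, not from code you write.

```python
NICKNAMES = {"JOE": "JOSEPH", "BILL": "WILLIAM"}      # sample entries only
UNIT_ABBREVS = {"SUITE": "STE", "APARTMENT": "APT"}   # USPS-style designators

def standardize_first_name(name):
    """Replace a nickname with its standardized given name, uppercased."""
    upper = name.upper()
    return NICKNAMES.get(upper, upper)

def standardize_unit(designator):
    """Abbreviate a secondary-address designator, uppercased."""
    upper = designator.upper()
    return UNIT_ABBREVS.get(upper, upper)
```

With these tables, standardize_first_name("Joe") yields 'JOSEPH' and standardize_unit("Suite") yields 'STE', matching the example output.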
Using the Name and Address Operator in a Mapping
You have the following options for using the Name and Address operator:
Define a new Name and Address operator: Drag the Name and Address operator from the Toolbox onto the mapping. The Mapping Editor launches a wizard.
Edit an existing Name and Address operator: Right-click the operator and select Edit. The Mapping Editor opens the Name and Address Editor.
Whether you are using the operator wizard or the Operator Editor, complete the following pages:
The pages are listed in a columnar table that reads down the columns left to right to conserve space.
Definitions
Characterize the nature of your input data by assigning general definitions to this Name and Address operator.
Figure 8-22 shows the Definitions page containing sample values for the data described in "Example: Following a Record Through the Name and Address Operator". Here, the Parsing Type is set to 'Name and Address', Primary Country is set to 'United States', and Dual Address Assignment is set to 'P.O. Box'.
Figure 8-22 Name and Address Operator Definitions Page
Note:
You can only specify the Parsing Type when you first add the Name and Address operator to your mapping. When you edit a Name and Address operator in the operator editor, you cannot modify the Parsing Type.
Name Only: Select this when the input data contains only name data. Names can include both personal and business names. Selecting this option instead of the more generic Name and Address option increases performance and accuracy.
Address Only: Select this when the input data contains only address data and no name data. Selecting this option instead of the more generic Name and Address option increases performance and accuracy.
Name and Address: Select this when the input data contains both name and address data. If your input data contains only one or the other, select one of the other options for optimal performance and accuracy.
Primary Country
Select the country which best represents the country distribution of your data. The primary country is used by some providers of name and address cleansing software as a hint for the appropriate parser or parsing rules to use on the initial parse of the record. For other name and address service providers, external configuration of their installation controls this behavior.
Dual Address Assignment
A dual address contains both a Post Office (PO) box and a street address for the same address record. For records that have dual addresses, your selection determines which address becomes the normal address and which address becomes the dual address. A sample dual address is:
PO Box 2589 4439 Mormon Coulee Rd La Crosse WI 54601-8231
Note that your choice for Dual Address Assignment affects which postal codes are assigned during postal code correction, because the street address and PO box address may correspond to different postal codes.
Street Assignment: The street address is considered the normal address and the PO Box address is considered the dual address. This means that the address component is assigned the street address. In the preceding example, it is assigned 4439 MORMON COULEE RD. This will cause the postal code to be corrected to 54601-8220.
PO Box Assignment: The PO Box address is considered the normal address and the street address is considered the dual address. This means that the address component is assigned the Post Office (PO) box address. In the preceding example, it is assigned PO BOX 2589. This will cause the postal code to be corrected to 54602-2589.
Closest to Last Line: Whichever address occurs closest to the last line is considered the normal address; the other is considered the dual address. This means that the address component is assigned the address line closest to the last line. In the preceding example, it is assigned the street address, 4439 MORMON COULEE RD. This will cause the postal code to be corrected to 54601-8220.
This option has no effect for records having a single street or PO box address. Note that this option may not be supported by all name and address cleansing software providers.
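In pseudocode terms, the three Dual Address Assignment choices behave roughly as follows. This is a sketch of the selection logic only, not the vendor implementation, and postal code correction is omitted.

```python
def assign_addresses(street_line, po_box_line, mode, street_is_last=False):
    """Return (normal_address, dual_address) for a dual-address record.

    mode: 'street', 'po_box', or 'closest_to_last_line'.
    street_is_last: True when the street line sits nearer the last
    (city/state/postal code) line than the PO box line does.
    """
    if mode == "street":
        return street_line, po_box_line
    if mode == "po_box":
        return po_box_line, street_line
    # closest_to_last_line: whichever line is nearer the last line wins
    if street_is_last:
        return street_line, po_box_line
    return po_box_line, street_line

# The sample record lists the PO box first, so the street line is closest
# to the last line:
normal, dual = assign_addresses("4439 MORMON COULEE RD", "PO BOX 2589",
                                "closest_to_last_line", street_is_last=True)
```

For the sample record, 'closest_to_last_line' therefore selects the street address as the normal address, as described above.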
Groups
By definition, the Name and Address operator has one input group and one output group. You cannot add or delete groups in the Name and Address operator, although you can rename them. The input group is called INGRP1 and the output group is called OUTGRP1. If your input data requires multiple groups, create a separate Name and Address operator for each group.
You assign attributes to INGRP1 on the Input Connections page and then edit those attributes on the Input Attributes page. You assign attributes to OUTGRP1 on the Output Attributes page.
Input Connections
Use the Input Connections page to select attributes from any operator in your mapping that you want to copy and map into the operator. The Available Attributes panel enables you to select attributes from any operator in your mapping. The Mapped Attributes panel represents the Name and Address operator. When you move attributes from the left panel to the right, you map them to the operator.
Figure 8-23 shows the Input Connections page containing sample values for the example described in "Example: Following a Record Through the Name and Address Operator". Notice that the CUSTOMERS table columns are mapped as the input attributes.
Figure 8-23 Name and Address Operator Input Connections Page
To complete the Input Connections page for an operator:
Select complete groups or individual attributes from the Available Attributes panel. The Available Attributes panel enables you to select attributes from any operator in your mapping.
To search for a specific attribute or group by name, type the text in Search for and click Go. To find the next match, click Go again.
Hold the Shift key down to select multiple groups or attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.
Use the left to right arrow button between the two panels to move your selections to the Mapped Attributes panel.
Input Attributes
Use the Input Attributes page to further modify the attributes you selected in the Input Connections page, and to assign input roles to each input attribute.
You can perform the following tasks from the Name and Address Input Attributes page:
Add attributes: Use the Add button to add input attributes.
Change attribute properties: You can change the following properties: attribute name, input role, length. Leave data type as VARCHAR2. You cannot change Precision or Scale. You must assign an Input Role for every input attribute.
Input roles indicate what kind of name or address information resides in a line of data. For each attribute, select the Input Role that most closely matches the data contained in the source attribute. Refer to Table 20-1 for a complete list of input roles and their descriptions.
You can select either non-discrete (line oriented) input roles for free-form data, or discrete roles (such as first name, primary address, or city) for specific input attributes. Whenever possible, choose discrete input roles (such as 'Person'), rather than non-discrete roles (such as 'Line1'). Discrete roles give the Name and Address operator more information about the content of the source attribute.
Add an optional description: Type a description for the input attributes.
Figure 8-24 shows the Input Attributes page containing sample values for the data described in "Example: Following a Record Through the Name and Address Operator". In this example, the source table has only one NAME column containing both the first and last names, such as Joe Smith, so this column is assigned the 'Person' input role. In the example, the source table also contains the entire street address portion of the mailing address in the STREETADDRESS column, so that column is assigned the 'Address' input role.
Figure 8-24 Name and Address Operator Input Attributes Page
Output Attributes
Use the Output Attributes page to define output attributes that determine how the Name and Address operator handles parsed data. Specifically, the output attribute properties characterize the data that is extracted from the parser output.
The output attribute collection is initially empty. You can create and edit attributes.
To create output attributes, select Add.
To edit an attribute name, click the appropriate cell and overwrite the default name.
You must specify an output component for every output attribute. To do this, click the ... button to the right of each attribute to open the Output Components dialog, where you can assign a component. See Table 20-2 for a complete list of the output components and descriptions of their functions.
You can also adjust the field length to match the length of the target attribute to which you intend to map the output attribute. Adjusting the length to match the target attribute helps avoid data truncation warnings during code generation, or errors during execution.
You cannot change the data type.
Figure 8-25 shows the Output Attributes page containing sample values for the data described in "Example: Following a Record Through the Name and Address Operator".
Figure 8-25 Name and Address Operator Output Attributes Page
Notice that every output attribute is assigned an output component. For example, the FirstName attribute is assigned the 'First Name Standardized' component. Notice also that 'Latitude' and 'Longitude' attributes are added to augment the address information.
Finally, notice that several error handling flags are added, such as Is Parsed, Is Good Name, and Is Good Address. These flags can be used with the Splitter operator to separate successfully parsed records from records with errors and load them into different targets.
Output Components
Use the Output Attributes components dialog to define components for the output attributes you create. Each output component represents a discrete name or address entity, such as a title, a standardized first name, a street number, a street name, or a street type, and indicates which component of a name or address an attribute constitutes. The component you select is assigned to the output attribute.
Output Component: Use the component tree on the left side of the Output Components dialog to expand the category housing your component, and then select the name or address component you want to assign to the attribute in question. You can select any node on the tree that is denoted by the envelope icon with a green border, even if that component expands to reveal other nodes. See Table 20-2, "Name and Address Operator Output Components" for a description of these components.
Address Type: This option applies only to dual addresses—addresses containing both a street address and a Post Office (PO) box or route-box address. Like Dual Address Assignment, it may not be supported by all name and address cleansing software providers. The Dual Address Assignment option you specified on the Definitions page determines which address—the street address or the PO box address—is used as the dual address. Select either the normal or the dual address. For more information on dual addresses, see "Dual Address Assignment".
Instance: Specify which instance of an output component to use when there are multiple occurrences of the same attribute in a single record. The instance control applies to all name components and several address components, such as Miscellaneous Address and Complex. Instance enables you to extract numerous attributes of the same nature.
For example, an input record containing John and Jane Doe would have two name occurrences: John Doe and Jane Doe. You can extract the first person with any name component by assigning Instance 1 to that component. Similarly, you can also extract the second person using any name component by assigning Instance 2. The number of instances allowed for various components depends on the vendor of the name and address cleansing software you use. Miscellaneous address may also have multiple instances, for example, if both an email address and phone number are present.
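A deliberately naive sketch of instance extraction for the "John and Jane Doe" case follows. Real instance extraction is performed by the vendor parsing libraries, which handle far more variation than this; the function and field names are assumptions.

```python
def name_instances(line):
    """Split a shared-surname line such as 'John and Jane Doe' into instances.

    Instance 1 is the first person found, Instance 2 the second. Only the
    simple 'X and Y Surname' pattern is handled here.
    """
    words = line.replace(",", " ").split()
    if "and" in words:
        i = words.index("and")
        first1 = " ".join(words[:i])
        first2, last = words[i + 1], " ".join(words[i + 2:])
        return [{"first": first1, "last": last},
                {"first": first2, "last": last}]
    return [{"first": " ".join(words[:-1]), "last": words[-1]}]

instances = name_instances("John and Jane Doe")
# instances[0] corresponds to Instance 1, instances[1] to Instance 2
```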
Figure 8-26 shows the Output Components page selecting the component for the first sample output attribute used in "Example: Following a Record Through the Name and Address Operator".
Figure 8-26 Name and Address Operator Output Attributes Components Dialog
In "Example: Following a Record Through the Name and Address Operator", the address type is Normal. The example calls for the following output components: First Name Standardized, Last Name, Primary Address, Secondary Address, City, State, Postal Code, Latitude, Longitude, Is Parsed, Is Good Name, Name Warning, Is Good Address, Is Found, Street Warning, and City Warning.
Postal Report
Postal reporting applies only to countries that support address correction and postal matching. Country certification varies with different vendors of name and address cleansing software. The most common country certifications are United States, Canada, and Australia. The process provides mailers a common platform to measure the quality of address-matching software, focusing on the accuracy of postal codes (in the case of the United States, of five-digit ZIP Codes and ZIP+4 Codes), delivery point codes, and carrier route codes applied to all mail. Some vendors of name and address cleansing software may ignore these parameters and require external setup for generating postal reports. For more information, see "Postal Reporting".
Postal Report: If you select Yes for Postal Report, the Primary Country you chose in the Definitions page determines the country for which the postal report is generated. Only one postal report can be active.
Processor Name: The use of this field varies with vendors of name and address cleansing software. Typically, the value supplied here appears on the United States CASS report.
List Name: The list name is an optional reference field that appears on the United States and United Kingdom reports under the List Name section, but is not output on other reports. The list name provides a reference for tracking multiple postal reports; for example, 'July 2003 Promotional Campaign'.
Processor Address Lines: The four address lines may appear on various postal reports. Various name and address cleansing software vendors use these fields differently. These lines often contain the full address of your company.
Figure 8-27 shows the Postal Report page containing sample values.
Figure 8-27 Name and Address Operator Postal Report Page
To exit the Name and Address Wizard, click Finish. To exit the Name and Address Editor, click OK.
Postal Reporting
You can specify a postal report using the Postal Report page in the Name and Address operator editor or wizard. The postal report is generated when the mapping containing the Name and Address operator is executed.
Postal reporting applies to only those countries that support address correction and postal matching. Those countries vary among vendors of name and address cleansing software. The most common certifications are for United States, Canada, and Australia.
The process provides mailers a common platform to measure the quality of address-matching software, focusing on the accuracy of postal codes (in the case of the United States, of five-digit ZIP Codes and ZIP+4 Codes), delivery point codes, and carrier route codes applied to all mail.
All address lists used to produce mailings for automation rates must be matched by postal report-certified software. Oracle Warehouse Builder Name and Address is built on name and address software and data supplied by third-party software vendors specializing in name and address cleansing. Certifications therefore depend on the vendor, and may include the following:
United States: Coding Accuracy Support System (CASS) certification with the United States Postal Service. The CASS report is a text file specified by the USPS and produced by Oracle Warehouse Builder Name and Address. To meet USPS requirements, the mailer must submit a CASS report in its original form to the USPS.
Canada: Software Evaluation and Recognition Program (SERP) certification with Canada Post. Customers who utilize Incentive Lettermail, Addressed Admail, or Publications Mail must meet the Address Accuracy Program requirements. Customers can obtain a Statement of Accuracy by comparing their database to Canada Post's address data.
Australia: Address Matching Approval System (AMAS®) certification with Australia Post. PreSort Letter Service prices require that customers use AMAS Approved Software and that unique Delivery Point Identifiers (DPIDs) be current against the latest version of the Postal Address File (PAF).
Accessing the Postal Report Files
To access the postal report, you must have access to the file system where the Name and Address Server resides. The reports are processed by the Name and Address Server and are written to the owb/bin/admin/reports folder, located in the Oracle home path specified during the Warehouse Builder Server-Side installation. For more information on installation parameters, see the Oracle Warehouse Builder Installation and Configuration Guide.
For each report, Warehouse Builder creates a unique file name using the Country Code, Group Name, and the date and time the file is created, for example: p_CAN_TESTGROUP1_20021219_0130.txt. This naming convention may not apply to all vendors of postal matching software, because in some cases the file naming is controlled through external configuration of the vendor installation.
Postal Report Restrictions for International Data
A postal report-enabled mapping should process only data for the country specified as the Primary Country on the Definitions page of the Name and Address wizard. If your source contains international data and the source records contain country codes, map the Country Code source column to an input group attribute of the Name and Address operator and assign the 'Country Code' input role to that attribute.
Pivot Operator
The pivot operator enables you to transform a single row of attributes into multiple rows. Use this operator in a mapping when you want to transform data that is contained across attributes instead of rows. This situation can arise when you extract data from non-relational data sources such as data in a crosstab format.
Example: Pivoting Sales Data
The external table SALES_DAT, shown in Figure 8-28, contains data from a flat file. There is a row for each sales representative and separate columns for each month. For more information on external tables, see "Using External Tables".
Figure 8-28 SALES_DAT
Table 8-22 shows a sample of the data after Warehouse Builder performs a pivot operation. The data that was formerly contained across multiple columns (M1, M2, M3...) is now contained in a single attribute (Monthly_Sales). A single ID row in SALES_DAT corresponds to 12 rows in pivoted data.
Table 8-22 Pivoted Data
MONTH MONTHLY_SALES REGION
To perform the pivot transformation in this example, create a mapping like the one shown in Figure 8-29.
Figure 8-29 Pivot Operator in a Mapping
In this mapping, Warehouse Builder reads the data from the external table once, pivots the data, aggregates the data, and writes it to a target in set-based mode. It is not necessary to load the data to a target directly after pivoting it. You can use the pivot operator in a series of operators before and after directing data into the target operator. You can place operators such as filter, joiner, and set operation before the pivot operator. Because pivoting in Warehouse Builder is not a row-by-row operation, you can also execute the mapping in set-based mode.
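Conceptually, the pivot step turns each SALES_DAT row into twelve rows, one per month. A minimal sketch follows; the column names M1 through M12 and the output attribute names are assumptions based on the example, and Warehouse Builder performs this transformation in SQL rather than in application code.

```python
MONTH_COLUMNS = [f"M{i}" for i in range(1, 13)]
MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def pivot_row(row):
    """Pivot one source row (one column per month) into 12 output rows."""
    return [{"ID": row["ID"], "REGION": row["REGION"],
             "MONTH": name, "MONTHLY_SALES": row[col]}
            for col, name in zip(MONTH_COLUMNS, MONTH_NAMES)]

# Hypothetical source row with monthly sales figures in M1..M12
source_row = {"ID": 1, "REGION": "West",
              **{f"M{i}": 100 + i for i in range(1, 13)}}
rows = pivot_row(source_row)
```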
The Row Locator
In the pivot operator, the row locator is an output attribute that you create to correspond to the repeated set of data from the source. When you use the pivot operator, Warehouse Builder transforms a single input attribute into multiple rows and generates values for the row locator. In this example, since the source contains attributes for each month, you can create an output attribute named 'MONTH' and designate it as the row locator. Each row from SALES_DAT then yields 12 rows of pivoted data in the output.
Table 8-23 shows the data from the first row of SALES_DAT after Warehouse Builder pivots the data with 'MONTH' as the row locator.
Table 8-23 Pivoted Data
MONTH MONTHLY_SALES REGION
Define a new pivot operator: Use the Pivot Wizard to add a new pivot operator to a mapping. Drag a pivot operator from the Toolbox onto the mapping. The Mapping Editor launches the Pivot Wizard.
Edit an existing pivot operator: Use the Pivot Editor to edit a pivot operator you previously created. Right click the operator and select Edit. The Mapping Editor opens the Pivot Editor.
Whether you are using the Pivot operator wizard or the Pivot Editor, complete the following pages:
The pages are listed in a columnar table that reads down the columns left to right to conserve space.
Groups
Use the Groups page to specify one input and one output group.
In a pivot operator, the input group represents the data from the source that is contained across multiple attributes. The output group represents that data transformed into rows.
You can rename and add descriptions to the input and output groups. Since each pivot operator must have exactly one input and one output group, the wizard prevents you from adding or removing groups or changing group direction.
Input Connections
Use the Input Connections page to copy and map attributes into the pivot operator. The attributes you select become mapped to the pivot input group. The left side of the page displays a list of all the operators in the mapping. Figure 8-30 shows a group from the external table SALES_DAT selected as input for the pivot operator.
Figure 8-30 Pivot Operator Input Connections Page
To complete the Input Connections page for a pivot operator:
Select complete groups or individual attributes from the left panel.
To search for a specific attribute or group by name, type the text in Search for and select Go. To find the next match, select Go again.
Hold down the Shift key to select multiple groups or attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.
Use the left to right arrow button in the middle of the page to move your selections to the right side of the wizard page.
Use the right to left arrow to remove groups or attributes from the input connections list. Warehouse Builder removes the selection from the input group and removes the mapping line between the source operator and the pivot operator.
Figure 8-31 shows a group from SALES_DAT copied and mapped into the PIVOTSALES operator.
Figure 8-31 Attributes Copied and Mapped into Pivot In Group
Input Attributes
Use the Input Attributes page to modify the attributes you selected in the Input connections tab or wizard page.
You can perform the following tasks from the pivot Input Attributes page:
Add attributes: Use the Add button to add input attributes.
Change attribute properties: You can change the attribute name, data type, length, precision, and scale.
Add an optional description: Type a description for the input attributes.
Designate attribute keys: As an option, use the Key check box to indicate an attribute that uniquely identifies the input group.
Output Attributes
Use the Output Attributes page to create the output attributes for the pivot operator. If you designated any input attributes as keys on the Input Attributes tab or wizard page, Warehouse Builder displays those input attributes as output attributes that you cannot edit or delete. Figure 8-32 displays the output attributes with MONTH selected as the row locator.
Figure 8-32 Pivot Output Attributes Page
You can perform the following tasks from the pivot Output Attributes Page:
Change attribute properties: Except for attributes you designated as keys on the previous tab or wizard page, you can change the attribute name, data type, length, precision, and scale.
Add an optional description: Type a description for the output attributes.
Designate a row locator: Although you are not required to designate a row locator for the pivot operator, it is recommended. When you identify the row locator on the Output Attributes page or tab, it is easier for you to match your output data to the input data.
In the pivot operator, the row locator is an output attribute that corresponds to the repeated set of data from the source. For example, if the source data contains separate attributes for each month, create an output attribute 'MONTH' and designate it as the row locator.
Pivot Transform
Use the Pivot Transform page to write expressions for each output attribute.
By default, Warehouse Builder displays two rows. Use Add to specify how many rows of output you want from a single row in the source. For example, if your source contains an attribute for each quarter in a year, you can specify 4 rows of output for each row in the source. If the source data contains an attribute for each month in the year, you can specify 12 rows of output for each row in the source.
Figure 8-33 shows the Pivot Transform tab with the pivot expressions defined for a source with an attribute for each month.
Figure 8-33 Pivot Transform Page
Write pivot expressions based on the following types of output:
Row locator: Specify a name for each row where the name is a value you want to load into the table. For example, if the row locator is 'MONTH', type 'Jan' for the first row.
Pivoted output data: Select the appropriate expression from the list box. For example, for the row you define as 'Jan', select the expression that returns the set of values for January.
Attributes previously specified as keys: Warehouse Builder defines the expression for you.
Unnecessary data: If the Pivot Transform page contains data that you do not want as output, use the expression 'NULL'. Warehouse Builder outputs a repeated set of rows with no data for attributes you define as 'NULL'.
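The Pivot Transform table can be pictured as a list of pairs: a row-locator value and the source expression that supplies the pivoted measure, with 'NULL' marking unwanted output. A sketch with assumed attribute names:

```python
# Three sample rows of a Pivot Transform table. In Warehouse Builder the
# expressions are SQL; here they are simplified to source column names.
transform = [("Jan", "M1"), ("Feb", "M2"), ("Mar", "NULL")]

def apply_pivot_transform(row, transform):
    """Evaluate each transform row against one source row.

    'NULL' produces an empty value, matching the behavior described for
    unnecessary data above.
    """
    return [{"MONTH": label,
             "MONTHLY_SALES": None if expr == "NULL" else row[expr]}
            for label, expr in transform]

out = apply_pivot_transform({"M1": 100, "M2": 150}, transform)
```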
When using the wizard to create a new pivot operator, click Finish when you want to close the wizard. The Mapping Editor displays the operator you defined.
When using the Pivot Editor to edit an existing pivot operator, click OK when you have finished editing the operator. The Mapping Editor updates the operator with the changes you made.
Post-Mapping Process Operator
Use a Post-Mapping Process operator to define a procedure to be executed after running a mapping. For example, you can use a Post-Mapping Process operator to reenable and build indexes after a mapping completes successfully and loads data into the target.
The Post-Mapping Process operator calls a function or procedure that is defined in Warehouse Builder after the mapping is executed. The output parameter group provides the connection point for the returned value (if implemented through a function) and the output parameters of the function or procedure. There are no restrictions on the connections of these output attributes.
The Post-Mapping Process operator contains groups corresponding to the number and direction of the parameters associated with the selected PL/SQL procedure or function. This list of groups and attributes can only be modified through reconciliation.
You can only define one Post-Mapping Process operator for a mapping. If you want to run more than one procedure after a mapping, you must wrap the procedures into one procedure.
You can map constants, data generators, mapping input parameters, and output from a Pre-Mapping Process into a Post-Mapping Process operator. The Post-Mapping Process operator is not valid for an SQL*Loader mapping.
After you add a Post-Mapping Process operator to the Mapping Editor, use the operator properties dialog to specify run conditions in which to execute the process.
To use a Post-Mapping Process operator in a mapping:
Drop a Post-Mapping Process operator onto the Mapping Editor canvas.
Warehouse Builder displays the Add Mapping Transformation dialog.
Use the Add Mapping Transformation dialog to select or create a transformation. For more information on how to use the Add Mapping Transformation dialog, see "Adding Bindable Operators".
Connect the output attribute of a source operator to the input/output group of the Post-Mapping Process operator.
Set the run conditions for the operator.
To set run conditions for a Post-Mapping Process operator:
From the mapping canvas, right-click a Post-Mapping Process operator and select Operator Properties.
Click Post-Mapping Process Run Condition and select one of the following run conditions:
Always: The process runs regardless of errors from the mapping.
On Success: The process runs only if the mapping completes without errors.
On Error: The process runs only if the mapping completes with errors exceeding the number of allowed errors set for the mapping.
On Warning: The process runs only if the mapping completes with errors that are less than the number of allowed errors set for the mapping.
If you select On Error or On Warning and the mapping runs in row based mode, you must verify the Maximum Number of Errors set for the mapping. To view the number of allowed errors, right-click the mapping in the navigation tree, select Configure, and expand Runtime Parameters.
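Conceptually, the run condition gates the call to the bound procedure in the generated code. The following PL/SQL is a sketch of the 'On Success' case only; Warehouse Builder generates its own package, and the names here are hypothetical.

```sql
-- Conceptual sketch, not generated code. run_mapping and
-- rebuild_indexes are hypothetical names.
DECLARE
  v_errors NUMBER;
BEGIN
  run_mapping(v_errors);      -- the mapping load; returns the error count
  IF v_errors = 0 THEN        -- 'On Success': no errors reported
    rebuild_indexes;          -- the bound post-mapping procedure
  END IF;
END;
/
```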
Pre-Mapping Process Operator
Use a Pre-Mapping Process operator to define a procedure to be executed before running a mapping. For example, you can use a Pre-Mapping Process operator to truncate tables in a staging area before running a mapping that loads tables to that staging area. You can also use a Pre-Mapping Process operator to disable indexes before running a mapping that loads data to a target. You can then use a Post-Mapping Process operator to reenable and build the indexes after running the mapping that loads data to the target.
The Pre-Mapping Process operator calls a function or procedure whose metadata is defined in Warehouse Builder prior to executing a mapping. The output parameter group provides the connection point for the returned value (if implemented with a function) and the output parameters of the function or procedure. There are no restrictions on the connections of these output attributes.
When you drop a Pre-Mapping Process operator onto the Mapping Editor canvas, a dialog opens displaying the available libraries, categories, functions, and procedures. After you select a function or procedure from the tree, the operator displays with predefined input and output parameters.
The Pre-Mapping Process operator contains groups corresponding to the number and direction of the parameters associated with the selected PL/SQL procedure or function.
A mapping can only contain one Pre-Mapping Process operator. Only constants, mapping input parameters, and output from a Pre-Mapping Process can be mapped into a Post-Mapping Process operator.
After you add a Pre-Mapping Process operator to the Mapping Editor, use the operator property dialog to specify conditions in which to execute the mapping.
To use a Pre-Mapping Process operator in a mapping:
Drop a Pre-Mapping Process operator onto the Mapping Editor canvas.
The Add Mapping Transformation dialog displays.
Use the Add Mapping Transformation dialog to select or create a transformation. For more information on how to use the Add Mapping Transformation dialog, see "Adding Bindable Operators".
Connect the output attribute of the Pre-Mapping Process operator to the input group of a target operator.
Set the run conditions for the operator.
To set run conditions for a mapping with a Pre-Mapping Process operator:
In the mapping canvas, right-click a Pre-Mapping Process operator and select Operator Properties.
Click Mapping Run Condition and select one of the following run conditions:
Always: Warehouse Builder runs the mapping after the process completes, regardless of the errors.
On Success: Warehouse Builder runs the mapping only if the process completes without errors.
On Error: Warehouse Builder runs the mapping only if the process completes with errors.
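A pre-mapping procedure such as the truncate example above could be defined like this; the procedure and table names are hypothetical.

```sql
CREATE OR REPLACE PROCEDURE truncate_staging IS
BEGIN
  -- TRUNCATE is DDL, so it must be issued dynamically from PL/SQL.
  EXECUTE IMMEDIATE 'TRUNCATE TABLE stg_sales';
END;
/
```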
Set Operation Operator
The Set Operation operator enables you to use the following set operations in a mapping:
By default, the Set Operation operator contains two input groups and one output group. You can add input groups by using the operator editor. Mapping attributes to a Set Operation input group creates corresponding attributes with the same name and data type in the Set Operation output group. The number of attributes in the output group matches the number of attributes in the input group containing the most number of attributes.
To use the Set Operation operator:
All sets must have the same number of attributes.
The data types of corresponding attributes must match.
Corresponding attributes are determined by the order of the attributes within an input group. For example, attribute 1 in input group 1 corresponds to attribute 1 in input group 2.
You must apply the set operation in top-down order. The order of the input groups determines the execution order of the set operation. This order only affects the minus operation. For example, A minus B is not the same as B minus A. The order of the attributes within the first input group determines the structure of a set. For example, {empno, ename} is not the same as {ename, empno}.
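For example, the generated code for a minus operation reflects the input-group order directly; the table and column names below are illustrative.

```sql
-- Input group 1 becomes the left operand, input group 2 the right.
-- A MINUS B returns rows in A that are not in B, so order matters.
SELECT empno, ename FROM emp_current    -- input group 1
MINUS
SELECT empno, ename FROM emp_archive;   -- input group 2
```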
To use the Set Operation operator in a mapping:
Drag and drop a Set Operation operator onto the Mapping Editor canvas.
Connect source attributes to the Set Operation operator groups.
Right-click the operator header and select Operator Properties...
The Set Operation properties window displays.
Click the field to the right of the Set Operation property and select an operation from the drop-down list.
Close the Set Operation properties window.
Connect the Set Operation output group to a target input group.
Sorter Operator
You can produce a sorted row set using the Sorter operator. The Sorter operator enables you to specify which input attributes are sorted and whether the sorting is performed in ascending or descending order. Warehouse Builder sorts data by placing an ORDER BY clause in the code generated by the mapping.
The Sorter operator has one input/output group. You can use the Sorter operator to sort data from any relational database source. You can place any operator after the Sorter operator.
The Sorter operator contains the following property:
Order By Clause: An ordered list of attributes in the input/output group to specify that sorting is performed in the same order as the ordered attribute list. You can set ascending or descending sorting for each attribute.
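For instance, an Order By Clause listing HIREDATE (descending) followed by ENAME (ascending) yields generated code of this shape; the table and column names are illustrative.

```sql
SELECT empno, ename, hiredate
FROM emp_src
ORDER BY hiredate DESC, ename ASC;
```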
To use the Sorter operator in a mapping:
Drop the Sorter operator onto the Mapping Editor canvas.
Connect a source operator group to the Sorter input/output group as shown in Figure 8-34.
Figure 8-34 Mapping Editor with a Sorter Operator
Right-click the Sorter operator header and select Operator Properties.
The Sorter properties window displays.
Click the ... button in the Order By Clause field.
The Order By Clause dialog displays.
Select the attributes you want to sort.
Select an attribute from the Available Attributes list and click the right arrow button. Or, click the double right arrow button to select all of the Available Attributes.
Apply an ORDER BY clause to the attribute.
Select the attribute in the ORDER BY Attributes list and select ASC (ascending) or DESC (descending) from the drop-down list.
Click OK.
Splitter Operator
You can use the Splitter operator to split data from one source to several targets. The operator splits a single input row set into several output row sets using a boolean split condition. Each output row set has a cardinality less than or equal to the input cardinality.
You can configure Warehouse Builder mappings that split data from one source to multiple targets to take advantage of Oracle9i functionality that optimizes SQL code and improves performance. For more information, see "Example: Creating Mappings with Multiple Targets".
The Splitter operator creates an output group called REMAINING_ROWS containing all input rows not included in any of the other output groups. You can delete this output group, but you cannot edit it.
The Splitter Operator contains the following properties:
Split Condition: The text expression template for the Split Condition. For code generation, the source columns are substituted by the input attribute names in the expression template. The expression is a valid SQL expression that can be used in a WHERE clause.
Data Type: The data type of the attribute.
Precision: The precision of the attribute, used for numeric type attributes only.
Scale: The scale of the attribute, used for numeric type attributes only.
Length: The length of the attributes, used for string-type attributes only.
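For example, a split condition routing one subset of rows to an output group might look like the following; the group and attribute names are hypothetical.

```sql
-- A split condition for one output group: any valid WHERE-clause
-- expression over the input attributes.
INGRP1.REGION = 'EAST' AND INGRP1.MONTHLY_SALES IS NOT NULL
```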
To use the Splitter operator in a mapping:
Drag and drop the Splitter operator onto the Mapping Editor canvas.
Connect a group from a source operator to the input group of the Splitter operator.
The output attributes are created with data types matching the corresponding input data types.
Right-click the Splitter operator header and select Operator Properties.
The Splitter properties window displays as shown in Figure 8-35.
Figure 8-35 Group Properties Window for a Split Condition
Enter an expression in the Split Condition field. Or click ... to define an expression using the Expression Builder as shown in Figure 8-36.
Figure 8-36 Expression Builder Showing A Split Condition
Close the Splitter Properties window.
Define expressions for each of the output groups except for the REMAINING_ROWS group.
Connect the output groups to the targets.
Figure 8-37 Mapping with a Single Source and Multiple Targets
Example: Creating Mappings with Multiple Targets
When you design and configure a mapping with the Splitter operator, Warehouse Builder generates a
multi_table_insert
statement. This SQL statement takes advantage of parallel query and parallel DML services in the Oracle9i database server.
To create a mapping with multiple targets:
Configure an Oracle target module that contains the mapping to validate and generate Oracle9i SQL.
From Warehouse Builder, right-click the target module on the navigation tree and select Configuration Properties. Under Target Database Type, select Oracle9i.
In the Mapping Editor, design a mapping with a single source, a Splitter operator, and multiple targets.
The targets must be tables, not views or materialized views. Each target table must have fewer than 999 columns. Between the Splitter operator and the targets, do not include any operators that change the cardinality. For example, you can place a Filter between the Splitter and the targets as shown in Figure 8-38, but not a Joiner or Aggregator operator.
Figure 8-38 Example Mapping with Multiple Targets
From the Warehouse Builder console, select the mapping from the navigation tree, select Object from the menu bar, and select Configure. You can also right-click the mapping you want to configure and select Configure.
Warehouse Builder displays the configuration properties dialog for a mapping as shown in Figure 8-39.
Figure 8-39 Configuration Properties Window for Mapping Tables
Expand Runtime Parameters Reference and set Default Operating Mode to set based.
Expand Code Generation Options Reference and set Optimize Code to true.
When you run this mapping and view the generation results, Warehouse Builder returns one total SELECT and INSERT count for all targets.
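The set-based, optimized code for a single source feeding multiple targets resembles an Oracle conditional multi-table INSERT. The following is a sketch only; the table and column names are illustrative, and the actual generated statement will differ.

```sql
-- Each WHEN branch corresponds to one Splitter output group;
-- the ELSE branch corresponds to REMAINING_ROWS.
INSERT ALL
  WHEN region = 'EAST' THEN
    INTO sales_east (month, monthly_sales) VALUES (month, monthly_sales)
  WHEN region = 'WEST' THEN
    INTO sales_west (month, monthly_sales) VALUES (month, monthly_sales)
  ELSE
    INTO sales_other (month, monthly_sales) VALUES (month, monthly_sales)
SELECT month, monthly_sales, region FROM sales_src;
```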
Table Function Operator
Table function operators enable you to manipulate a set of input rows and return another set of rows possibly of different cardinality. Unlike conventional functions, table functions can return a set of output rows that can be queried like a physical table.
Using table functions can greatly improve performance when loading your data warehouse.
Table Functions have the following characteristics:
They do not support the passing of parameters by name.
If the return type is TABLE of PLS Record, the names you select must match the names of the PLS Record fields. You can select only a subset of the fields of the PLS Record in the select list.
If the return type is TABLE of T1%ROWTYPE, the names you select must match the names of the columns of the table T1.
If the return type is TABLE of Object Type, the names in your select list must match the names of the Object Type attributes.
If the return type is TABLE of Scalar (such as TABLE of NUMBER), only SELECT COLUMN_VALUE can be used to retrieve the scalar values returned by the table function.
Prerequisites for Using the Table Function Operator
Before you can use the Mapping Table Function operator in a mapping, you need to create the table function in your target, external to Warehouse Builder. The table functions in the database that are supported by the unbound table function operator must meet the following requirements:
Input
Ref Cursor returning a PLS Record (the fields of the PLS Record must be scalar data types supported by Warehouse Builder) (0..n).
There must be at least one input parameter.
PLS Record (the fields of the PLS Record should be scalar data types supported by Warehouse Builder).
Object Type (the attributes of the Object Type should be scalar data types supported by Warehouse Builder).
Scalar data types supported by Warehouse Builder.
ROWTYPE
For an unbound Mapping Table Function operator in a mapping:
You must add one parameter group for each ref cursor type parameter.
Multiple scalar parameters can be part of a single scalar type parameter group.
The parameter groups and the parameters in a group can be entered in any order.
The positioning of the parameters in the mapping table function operator must be the same as the positioning of the parameters in the table function created in your target warehouse.
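As a concrete illustration, a table function satisfying the requirements above might take a ref cursor input and return a TABLE of Object Type. This is a hedged sketch created outside Warehouse Builder; all type, function, and column names are hypothetical.

```sql
-- Schema-level types backing the TABLE of Object Type return.
CREATE TYPE sales_row AS OBJECT (month VARCHAR2(3), monthly_sales NUMBER);
/
CREATE TYPE sales_tab AS TABLE OF sales_row;
/
-- Pipelined table function with one ref cursor input parameter.
CREATE OR REPLACE FUNCTION scale_sales (p_cur SYS_REFCURSOR)
  RETURN sales_tab PIPELINED
IS
  v_month VARCHAR2(3);
  v_sales NUMBER;
BEGIN
  LOOP
    FETCH p_cur INTO v_month, v_sales;
    EXIT WHEN p_cur%NOTFOUND;
    PIPE ROW (sales_row(v_month, v_sales * 1.1));  -- example transformation
  END LOOP;
  CLOSE p_cur;
  RETURN;
END;
/
```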
Table Function Operator Properties
The Mapping Table Function operator contains the following properties:
General
General properties include the name and description. Specify the name of the table function located in the target database. The description is optional.
Input Parameter Group
The table function operator accepts the following types of input parameters:
Input Parameter Type: Valid input parameter types are REF_CURSOR_TYPE or SCALAR_TYPE.
REF_CURSOR_TYPE: Returns a PLS Record {0...N}. The fields of the PLS Record must be a scalar data type supported by Warehouse Builder.
SCALAR_TYPE: Scalar data types supported by Warehouse Builder.
Parameter Position: Indicates the position of the parameter in the table function signature corresponding to this parameter group.
Type Attribute Name: The name of the field of the PLS Record, attribute of the Object Type, or column of the ROWTYPE. This property is not applicable if the return type is TABLE of SCALAR. This name is used to invoke the table function.
To use a Table Function operator in a mapping:
Drag and drop a Mapping Table Function operator onto the Mapping Editor canvas.
Connect the appropriate source attributes to the input group of the mapping table function operator.
This automatically creates the input attributes.
Right-click the Table Function operator and select Edit.
From the Groups tab, select Add to add an output group.
Before you deploy the mapping containing the mapping table function operator, you must manually create the table function in the target warehouse. The mapping table function operator is bound to the actual table function object through the code generated by the mapping.
Transformation Operator
You use the Mapping Transformation operator to transform the column value data of rows within a row set using a PL/SQL function, while preserving the cardinality of the input row set.
The Mapping Transformation operator must be bound to a function or procedure contained by one of the modules in the repository. The inputs and outputs of the Mapping Transformation operator correspond to the input and output parameters of the bound repository function or procedure. If the Mapping Transformation operator is bound to a function, a result output is added to the operator that corresponds to the result of the function. The bound function or procedure must be generated and deployed before the mapping can be deployed, unless the function or procedure already exists in the target system.
Warehouse Builder provides pre-defined PL/SQL library functions in the runtime schema that can be selected as a bound function when adding a Mapping Transformation operator onto a mapping. In addition, you can choose a function or procedure from the Global Shared Library.
The Mapping Transformation operator contains the following properties:
Function Call: The text template for the function call that is generated by the code generator with the attribute names listed as the calling parameters. For the actual call, the attribute names are replaced with the actual source or target columns that are connected to the attributes.
Function Name: The name of the function or procedure, to which this operator is bound.
Procedure: A boolean value indicating, if true, that the bound transformation is a procedure rather than a function. A procedure has no return value.
Data Type: Indicates the data type of the input, output, or result parameter of the bound function that corresponds to the given attribute. If the output of a mapping transformation is of CHAR data type, then Warehouse Builder applies an RTRIM on the result before moving the data to a target. This ensures that no extra spaces are contained in the output result.
Default Value: The default value (blank if none) for the given attribute.
Optional Input: A boolean value indicating, if true, that the given attribute is optional. If the attribute is optional, it need not be connected in the mapping.
Function Return: A boolean value indicating, if true, that the given output attribute is the result attribute for the function. The result attribute is a named result. Use this property if another output is a named result, or if you change the name of the result output.
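For example, binding the operator to a function like the following would give it two input attributes and a result output corresponding to the RETURN value. The function name and signature are hypothetical.

```sql
CREATE OR REPLACE FUNCTION format_name (p_first VARCHAR2, p_last VARCHAR2)
  RETURN VARCHAR2
IS
BEGIN
  -- The operator's inputs map to p_first and p_last; the RETURN
  -- value surfaces as the result output attribute.
  RETURN INITCAP(p_last) || ', ' || INITCAP(p_first);
END;
/
```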
To use a Mapping Transformation operator in a mapping:
Drag and drop a Mapping Transformation operator onto the Mapping Editor canvas.
The Add Mapping Transformation dialog displays.
Use the Add Mapping Transformation dialog to create a new transformation or select one or more transformations. Warehouse Builder adds a transformation operator for each transformation you select. For more information on these options, see "Adding Bindable Operators".
Connect the source attributes to the inputs of the Mapping Transformation operator.
(Optional step) Right-click one of the inputs and select Attribute Properties.
The Mapping Transformation properties window displays.
Select an input attribute. If the Procedure property is set to True, then do not connect the input parameter.
Close the attribute property window.
Connect the Transformation operator output attributes to the target attributes.
Unpivot Operator
The unpivot operator converts multiple input rows into one output row. The unpivot operator enables you to extract from a source once and produce one row from a set of source rows that are grouped by attributes in the source data. Like the pivot operator, the unpivot operator can be placed anywhere in a mapping.
Example: Unpivoting Sales Data
Table 8-24 shows a sample of data from the SALES relational table. In the crosstab format, the 'MONTH' column has 12 possible character values, one for each month of the year. All sales figures are contained in one column, 'MONTHLY_SALES'.
Table 8-24 Data in a Crosstab Format
MONTH MONTHLY_SALES REGION
Figure 8-40 depicts data from the relational table 'SALES' after Warehouse Builder unpivoted the table. The data formerly contained in the 'MONTH' column (Jan, Feb, Mar...) corresponds to 12 separate attributes (M1, M2, M3...). The sales figures formerly contained in the 'MONTHLY_SALES' column are now distributed across the 12 attributes for each month.
Figure 8-40 Data Unpivoted from Crosstab Format
The Row Locator
When you use the unpivot operator, Warehouse Builder transforms multiple input rows into a single row based on the row locator. In the unpivot operator, the row locator is an attribute that you must select from the source to correspond with a set of output attributes that you define. A row locator is required in an unpivot operator. In this example, the row locator is 'MONTH' from the 'SALES' table and it corresponds to attributes M1, M2, M3... M12 in the unpivoted output.
Define a new unpivot operator: Drag an unpivot operator from the Toolbox onto the mapping. The Mapping Editor launches a wizard.
Edit an existing unpivot operator: Right-click the operator and select Edit. The Mapping Editor opens the Unpivot Editor.
Whether you are using the Unpivot operator wizard or the Unpivot Editor, complete the following pages:
The pages are listed in a columnar table that reads down the columns left to right to conserve space.
Groups
Use the Groups page to specify one input and one output group.
In an unpivot operator, the input group represents the source data in crosstab format. The output group represents the target data distributed across multiple attributes.
You can rename and add descriptions to the input and output groups. Since each unpivot operator must have exactly one input and one output group, the wizard prevents you from adding or removing groups or changing group direction.
Input Connections
Use the Input Connections page to select attributes to copy and map into the unpivot operator.
To complete the Input connections page for an unpivot operator:
Select complete groups or individual attributes from the left panel.
To search for a specific attribute or group by name, type the text in Search for and click Go. To find the next match, click Go again.
Hold the Shift key down to select multiple groups or attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.
Use the left to right arrow button in the middle of the page to move your selections to the right side of the wizard page.
You can use the right to left arrow to remove groups or attributes from the input connections list. Warehouse Builder removes the selection from the input group and removes the mapping line between the source operator and the unpivot operator.
Input Attributes
Use the Input Attributes page to modify the attributes you selected in the Input connections tab or wizard page.
You can perform the following tasks from the Unpivot Input Attributes Page:
Add attributes: Use the Add button to add input attributes.
Change attribute properties: You can change the attribute name, data type, length, precision and scale.
Add an optional description: Type a description for the input attributes.
Designate key attribute(s): You must designate one or more key attributes for unpivot operators. Use the Key check box to indicate the attribute(s) that uniquely identifies the input group. Input rows with the same value in their key attribute(s) produce one unpivoted output row.
Row Locator
Use the Row locator page to select a row locator and assign values to the distinct values contained in the row locator. Figure 8-41 shows the attribute MONTH selected as the row locator with values such as 'Jan', 'Feb', or 'Mar'.
Figure 8-41 Unpivot Row Indicator Page
To complete the Unpivot Row Locator page:
Select an attribute from the Row locator list box.
In the unpivot operator, the row locator is the attribute from the source data that corresponds to a set of output attributes.
Use Add to specify the number of distinct values that exist in the row locator.
For each row locator value, type in the value as it appears in your source dataset.
For string values, enclose the text in single quotes. For example, if the row locator is 'MONTH', there would be a total of 12 distinct values for that attribute. Click Add to add a row for each distinct value. Type each row locator value exactly as it appears in the source dataset. For instance, the row locator values shown in Table 8-24 are 'Jan', 'Feb', and 'Mar'.
Output Attributes
Use the Output Attributes page shown in Figure 8-42 to create the output attributes for the unpivot operator.
Figure 8-42 Unpivot Output Attributes Page
If you designated any input attributes as keys on the Input Attributes tab or wizard page, Warehouse Builder displays those input attributes as output attributes that you cannot edit or remove.
You can perform the following tasks from the Unpivot Output Attributes page:
Add attributes: Use Add to increase the number of output attributes to accommodate the rows you specified on the Row locator tab or wizard page. If you specified 12 rows, specify 12 output attributes plus attributes for any other input attributes that you did not designate as a key.
Change attribute properties: Except for attributes you designated as keys on the Input Attributes tab or wizard page, you can change the attribute name, data type, length, precision, and scale.
Add an optional description: Type a description for the output attributes.
Unpivot Transform
Use the Unpivot Transform page shown in Figure 8-43 to write expressions for each output attribute.
Figure 8-43 Unpivot Transform Page
For attributes you designated as keys, Warehouse Builder defines the matching row and expression for you. Warehouse Builder displays the first row as the match for a key attribute. For all other output attributes, specify the matching row and the expression.
Matching row: Select the appropriate option from the list box. For example, for the attribute you define as the first month of the year, 'M1', select 'Jan' from the list box.
Expression: Select the appropriate expression from the list box. For all the new attributes you created to unpivot the data, select the same input attribute that contains the corresponding data. For example, the unpivot attributes M1, M2, M3... M12 would all share the same expression, INGRP1.MONTHLY_SALES. For all other output attributes, select the corresponding attribute from the list of input attributes.
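Taken together, the matching rows and expressions describe an aggregation of this shape over the key attribute. The names follow the SALES example above; the sketch is illustrative, not the exact code Warehouse Builder generates.

```sql
-- One output row per key value (the designated key attribute is REGION);
-- each CASE routes MONTHLY_SALES to the attribute whose matching row
-- equals the row locator value in MONTH.
SELECT region,
       MAX(CASE WHEN month = 'Jan' THEN monthly_sales END) AS m1,
       MAX(CASE WHEN month = 'Feb' THEN monthly_sales END) AS m2,
       MAX(CASE WHEN month = 'Mar' THEN monthly_sales END) AS m3
       -- ... one column per matching row, through m12
FROM sales
GROUP BY region;
```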