What Is Key Generation in BODS
Various Transforms in Data Services
Key Generation Transform. The Key Generation transform generates artificial keys for new rows in a table. The transform looks up the maximum existing key value of the surrogate key column from the table and uses it as the starting value to generate new keys for new rows in the input dataset. Although the transform does execute SQL, it executes one and only one statement, at the start of the transform, essentially a select max(key) from table; in almost all cases this key is the primary key column, which is indexed. Note the difference between the Key_Generation transform and the key_generation() function: the transform is more obvious because it is an object in the dataflow, and it generates a new key value only for insert rows, whereas the function works for every row it encounters. A surrogate key is an auto-generated value, usually an integer, in a dimension table. It is made the primary key of the table and is used to join the dimension to a fact table. Among other benefits, surrogate keys allow you to maintain history in a dimension table.
Below is an example where most of these transforms are used, considering Customer as the source.
a) Key generation
The Surrogate Key (similar to a Surrogate ID) is generated by the Key Generation transform. The target table name is selected in this transform along with the increment value.
The Key Generation transform generates artificial keys for new rows in a table. It looks up the maximum existing key value of the surrogate key column from the table and uses it as the starting value to generate new keys for the new rows in the input dataset. The transform expects a column with the same name as the Generated key column of the source table to be part of the input schema.
The source table must be imported into the DS repository before it can be selected as the source table for this transform. We can also set the Increment value, i.e. the interval between the generated key values; by default it is 1, and a variable placeholder can be used for this option. We will use this transform frequently while populating the surrogate key values of slowly changing dimension tables.
For each new row identified by the column EMP_ID, the EMP_SURR_KEY (Surrogate Key) value is incremented by the increment value.
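To make this concrete, below is a minimal Python sketch of the behaviour just described. It is illustrative only, not DS internals: the dict-based rows and table are assumptions, and the initial max() lookup stands in for the single select max(key) SQL the transform issues at the start.

```python
# Minimal sketch of Key_Generation behaviour (illustrative, not DS internals).
# Rows are plain dicts; the target "table" is a list of dicts.

def key_generation(input_rows, target_table, key_column, increment=1):
    """Assign new surrogate keys starting after the current maximum key."""
    # One lookup at the start, equivalent to: SELECT MAX(key_column) FROM target_table
    max_key = max((row[key_column] for row in target_table), default=0)
    next_key = max_key + increment
    for row in input_rows:
        row[key_column] = next_key   # fill the generated key column
        next_key += increment
    return input_rows

# Usage: with a current maximum of 3, the new rows get EMP_SURR_KEY 4 and 5.
target = [{"EMP_ID": 1111, "EMP_SURR_KEY": 1},
          {"EMP_ID": 2222, "EMP_SURR_KEY": 2},
          {"EMP_ID": 3333, "EMP_SURR_KEY": 3}]
new_rows = [{"EMP_ID": 4444, "EMP_SURR_KEY": None},
            {"EMP_ID": 5555, "EMP_SURR_KEY": None}]
print(key_generation(new_rows, target, "EMP_SURR_KEY"))
```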
b) Table comparison
The Table Comparison transform compares two data sets and generates the difference between them as a resultant data set, with rows flagged as INSERT, UPDATE, or DELETE. This transform can be used to ensure that rows are not duplicated in a target table, or to compare the changed records of a data warehouse dimension table. It helps to detect and forward all changes, or only the latest ones, that have occurred since the last time the comparison table was updated. We will use this transform frequently while implementing slowly changing dimensions and while designing dataflows for recovery.
There are three methods for accessing the comparison table, namely Row-by-row select, Cached comparison table, and Sorted input. Below is a brief guide on when to select each option.
- Row-by-row select is best if the target table is large compared to the number of rows the transform will receive as input. In this case, for every input row the transform fires a SQL query to look up the target table.
- Cached comparison table is best when we are comparing the entire target table. DS uses pageable cache as the default; if the table fits in the available memory, we can change the Cache type property of the dataflow to In-Memory.
- Sorted input is best when the input data is pre-sorted on the primary key columns. DS reads the comparison table in the order of the primary key columns using a single sequential read.
NOTE: The sort order of the input data set must exactly match the order of all primary key columns specified in the Table Comparison transform.
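As a rough illustration of the comparison logic, here is a minimal Python sketch assuming the Cached comparison table strategy, dict rows, and a single primary key column. DELETE detection (an optional setting of the transform) is omitted for brevity, and all names are assumptions.

```python
# Minimal sketch of Table_Comparison under the cached-comparison strategy.
# Unchanged rows are not forwarded; DELETE detection is omitted.

def table_comparison(input_rows, comparison_table, pk, compare_cols):
    """Flag each input row as INSERT or UPDATE relative to the comparison table."""
    cache = {row[pk]: row for row in comparison_table}   # cached comparison table
    flagged = []
    for row in input_rows:
        existing = cache.get(row[pk])
        if existing is None:
            flagged.append(("INSERT", row))              # primary key not found
        elif any(row[c] != existing[c] for c in compare_cols):
            flagged.append(("UPDATE", row))              # key exists, data changed
    return flagged
```

In the real transform, choosing Row-by-row select instead would replace the in-memory cache with one SQL lookup per input row.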
c) History Preservation – converts ‘UPDATE’ to ‘INSERT’
The output of history preservation is that we get two additional columns, Effective from and Effective to.
As in the example above,
Employee ID 2222 (Name: KV) belongs to Region R2 for the interval 05.04.2011 – 04.03.2012, and to Region R3 from 05.03.2012 till date.
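A minimal Python sketch of this UPDATE-to-INSERT conversion follows. The EFFECTIVE_FROM/EFFECTIVE_TO column names and the high date are illustrative assumptions; the actual transform is configured through its options rather than code.

```python
# Minimal sketch of History_Preserving (SCD Type 2): an incoming UPDATE on a
# tracked column closes the current dimension row and becomes a new INSERT.
from datetime import date, timedelta

HIGH_DATE = date(9999, 12, 31)   # stands for "till date"

def history_preserving(flagged_rows, dimension, pk, tracked_cols, today):
    for opcode, row in flagged_rows:
        if opcode == "INSERT":
            dimension.append(dict(row, EFFECTIVE_FROM=today, EFFECTIVE_TO=HIGH_DATE))
        elif opcode == "UPDATE":
            # assumes exactly one open (current) version exists for this key
            current = next(r for r in dimension
                           if r[pk] == row[pk] and r["EFFECTIVE_TO"] == HIGH_DATE)
            if any(row[c] != current[c] for c in tracked_cols):
                # close the old version the day before the new one starts,
                # as in the Region R2 -> R3 example above
                current["EFFECTIVE_TO"] = today - timedelta(days=1)
                dimension.append(dict(row, EFFECTIVE_FROM=today,
                                      EFFECTIVE_TO=HIGH_DATE))
    return dimension
```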
d) Validation – filtering erroneous data (data cleansing)
The Validation transform is used to filter or replace the source dataset based on criteria or validation rules in order to produce the desired output dataset. It enables us to create validation rules on the input dataset and to generate the output based on whether rows have passed or failed the validation condition. This transform is typically used for NULL checks on mandatory fields, pattern matching, checking the existence of a value in a reference table, validating data types, etc.
The Validation transform can generate three output datasets: Pass, Fail, and Rule Violation. The Pass output schema is identical to the input schema. The Fail output schema has two additional columns, DI_ERRORACTION and DI_ERRORCOLUMNS. The Rule Violation output has three columns: DI_ROWID, DI_RULENAME, and DI_COLUMNNAME.
The rule for the Validation is entered in the highlighted area, and the action on FAIL is also specified here.
Example rule above – the Zip code should be in the format ‘99999’.
The output of the PASS schema is,
The output of the FAIL schema is,
These records do not match the ‘99999’ zip code format.
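To illustrate, here is a minimal Python sketch of the zip-code rule with the three outputs described above. The DI_* columns follow the Fail and Rule Violation schemas listed earlier; the rule name, the error-action value, and the row structure are assumptions.

```python
# Minimal sketch of the Validation transform for the zip-code format rule.
import re

# (rule name, column it checks, predicate that must hold for a row to pass)
ZIP_RULE = ("Zip_Format_Rule", "ZIP",
            lambda v: bool(re.fullmatch(r"\d{5}", v or "")))

def validate(input_rows, rules=(ZIP_RULE,)):
    passed, failed, violations = [], [], []
    for rowid, row in enumerate(input_rows, start=1):
        broken = [(name, col) for name, col, check in rules
                  if not check(row.get(col))]
        if not broken:
            passed.append(row)                       # Pass schema == input schema
        else:
            failed.append(dict(row,
                               DI_ERRORACTION="F",   # assumed action code
                               DI_ERRORCOLUMNS=",".join(c for _, c in broken)))
            for name, col in broken:
                violations.append({"DI_ROWID": rowid,
                                   "DI_RULENAME": name,
                                   "DI_COLUMNNAME": col})
    return passed, failed, violations

# Usage: '12345' passes; '1234A' goes to the Fail and Rule Violation outputs.
print(validate([{"ZIP": "12345"}, {"ZIP": "1234A"}]))
```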
Vast is the ocean, and so is the world of knowledge. With my diving suit packed, loaded with imaginative visions and lots of curiosity, I started diving deep into the world of BODS. Lots of work is going on there. I got attracted to the “Key_Generation” transform and was fascinated by its features. Now it was time for me to fuse and adapt myself into its world.
THE KEY_GENERATION TRANSFORM:-
This transform is categorized under the “Data Integrator Transforms”. It generates new keys for source data, starting from a value based on the existing keys in the table we specify.
When artificial keys need to be generated in a table, the Key_Generation transform looks up the maximum existing key value from the table and uses it as the starting value to generate new keys.
The transform expects the generated key column to be part of the input schema.
STEPS TO USE KEY GENERATION TRANSFORM:-
Scenario:- Here the target data source, to which the keys need to be added, has certain newly added rows without a Customer_ID. This can be easily understood from the following snap:-
Our aim here is to automatically generate the keys (Customer_ID in this case) for the newly inserted records which have no Customer_ID. Accordingly, we have taken the following as our input (the modified data without Customer_ID):
INPUT DATA (to be staged in the db):-
TARGET TABLE (which contains the data initially present in the source table, before the entry of new records into the database):-
THE GENERATED DATA FLOW:-
CONTENT OF SOURCE DATA:- (containing the modified entry alone)
CONTENT OF QUERY_TRANSFORM:-
CONTENT OF THE KEY_GENERATION TRANSFORM:-
THE CONTENTS OF THE TARGET TABLE PRIOR TO JOB EXECUTION:-
The JOB_EXECUTION:-
THE OUTPUT AFTER THE JOB EXECUTION:-
We can now see from the output how keys have been generated automatically for those records which did not have a Customer_ID initially.
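For completeness, here is a minimal Python sketch of what the walkthrough achieves, assuming the staged rows carry Customer_ID = None where the key is missing (the column name is taken from the snaps above; everything else is illustrative):

```python
# Minimal sketch of the walkthrough: only rows missing a Customer_ID receive
# newly generated keys, continuing from the target table's current maximum.

def fill_missing_customer_ids(staged_rows, target_rows, increment=1):
    next_id = max((r["Customer_ID"] for r in target_rows), default=0) + increment
    for row in staged_rows:
        if row["Customer_ID"] is None:    # only the newly inserted records
            row["Customer_ID"] = next_id
            next_id += increment
    return staged_rows
```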
I explored this little process of the Key_Generation transform, and it seems a savior at times when huge amounts of data have missing entries (with respect to keys or any sequential column fields).
Now it's time to go back to the surface of the waters…….