Surrogate key

Why do we implement a surrogate key in the ETL process?

Questions by prasannavld


vivek1708

  • Jul 29th, 2008
 

Reusing the source primary key in the DW does not actually work as a unique constraint. The data inside a warehouse represents snapshots over time, so the same primary key used in the operational data store (ODS) can appear in more than one warehouse row, which violates the constraint.
Hence we use a surrogate key, also called the data warehousing key, when moving data from the ODS to the DW, so that the constraints and logical integrity are met.


ETL is not a one-time process. It is used to make periodic updates to
dimension as well as fact data.


Dimensions usually contain lookup data that rarely changes, but if it does
change and you want to maintain the change history, a surrogate key will
help with that. Read about SCD (slowly changing dimensions) for more details.
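
To make the change-history point concrete, here is a minimal sketch of one common way to keep history (often called SCD Type 2), using Python's built-in sqlite3 module. The table dim_customer, its columns, the helper apply_change, and the sample values are all hypothetical and chosen only for illustration; a real warehouse would likely do this in its ETL tool or in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_sk INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        customer_id TEXT NOT NULL,                      -- natural key from the source
        city        TEXT NOT NULL,
        valid_from  TEXT NOT NULL,
        valid_to    TEXT,                               -- NULL while the row is current
        is_current  INTEGER NOT NULL DEFAULT 1
    )
""")

def apply_change(conn, customer_id, city, load_date):
    """Hypothetical helper: expire the current row if it changed, then add a new version."""
    row = conn.execute(
        "SELECT customer_sk, city FROM dim_customer "
        "WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()
    if row is not None:
        if row[1] == city:
            return row[0]  # nothing changed, keep the current version
        conn.execute(
            "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
            "WHERE customer_sk = ?",
            (load_date, row[0]),
        )
    cur = conn.execute(
        "INSERT INTO dim_customer (customer_id, city, valid_from) VALUES (?, ?, ?)",
        (customer_id, city, load_date),
    )
    return cur.lastrowid

# The same source customer is loaded twice with a changed attribute.
sk1 = apply_change(conn, "C100", "Pune", "2008-01-01")
sk2 = apply_change(conn, "C100", "Mumbai", "2008-07-29")

history = conn.execute(
    "SELECT customer_sk, customer_id, city, is_current "
    "FROM dim_customer ORDER BY customer_sk"
).fetchall()
```

Both versions share the natural key C100, which would be impossible if customer_id were the primary key. The surrogate key customer_sk is what keeps the two rows distinct.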


Also, a surrogate key helps uniquely identify a row in the table. The ETL
extraction process is often designed to pull data from multiple sources, and
it is possible for the same rows to be pulled from different sources. If you
use the same key column as the source, you are likely to get a primary key
violation error. A surrogate key lets you pull all the data, and you can clean
it during the transformation/cleansing process.
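
The multi-source case can be sketched in a few lines with Python's sqlite3 module. This is only an illustration under assumed names: the tables stg_natural and stg_surrogate, the source labels erp and crm, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Staging table keyed on the source key: a repeated product_id is rejected.
conn.execute("CREATE TABLE stg_natural (product_id TEXT PRIMARY KEY, name TEXT)")

# Staging table keyed on a surrogate: the source key is just another column.
conn.execute("""
    CREATE TABLE stg_surrogate (
        product_sk    INTEGER PRIMARY KEY AUTOINCREMENT,
        source_system TEXT NOT NULL,
        product_id    TEXT NOT NULL,
        name          TEXT
    )
""")

# The same product arrives from two hypothetical source systems.
extracted = [
    ("erp", "P1", "Widget"),
    ("crm", "P1", "widget "),
]

natural_key_failed = False
try:
    for _, product_id, name in extracted:
        conn.execute("INSERT INTO stg_natural VALUES (?, ?)", (product_id, name))
except sqlite3.IntegrityError:
    natural_key_failed = True  # primary key violation on the second row

# With the surrogate key, both rows load.
conn.executemany(
    "INSERT INTO stg_surrogate (source_system, product_id, name) VALUES (?, ?, ?)",
    extracted,
)
loaded = conn.execute("SELECT COUNT(*) FROM stg_surrogate").fetchone()[0]

# The cleansing step can then find the source keys that arrived more than once.
duplicates = conn.execute(
    "SELECT product_id, COUNT(*) FROM stg_surrogate "
    "GROUP BY product_id HAVING COUNT(*) > 1"
).fetchall()
```

Here the load into stg_natural stops at the second row, while stg_surrogate accepts both and leaves the duplicate for the cleansing step to resolve.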


mallikkd

  • Aug 14th, 2010
 

It is a unique primary key that is not derived from the data it represents, so changes to the data will not change the primary key.

In a star schema, it is used to join the dimension and fact tables.
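
As a rough illustration of that join, here is a small sketch using Python's sqlite3 module. The tables dim_product and fact_sales and their contents are made up for the example; the point is only that the fact table stores the surrogate key and the join goes through it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_sk INTEGER PRIMARY KEY,   -- surrogate key
        product_id TEXT NOT NULL,         -- natural key from the source
        name       TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        product_sk INTEGER NOT NULL REFERENCES dim_product (product_sk),
        quantity   INTEGER NOT NULL
    );
    INSERT INTO dim_product VALUES (1, 'P1', 'Widget'), (2, 'P2', 'Gadget');
    INSERT INTO fact_sales VALUES (1, 5), (1, 3), (2, 4);
""")

# Join the fact table to the dimension on the surrogate key and aggregate.
totals = conn.execute("""
    SELECT d.name, SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_sk = f.product_sk
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()
```

Because the fact rows carry only product_sk, a change to product_id or name in the source does not require touching the fact table.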
