GeekInterview.com
  I am new, Sign me up!
 
GeekInterview.com  >  Interview Questions  >  Data Warehousing  >  ETL
Next Question 
 ETL  |  Question 1 of 65    Print  
What is a staging area? Do we need it? What is the purpose of a staging area?

Answered by Laxmi on 2005-05-13 16:42:53: Staging area is place where you hold temporary tables on data warehouse server. Staging tables are connected to work area or fact tables. We basically need staging area to hold the data , and perform data cleansing and merging , before loading the data into warehouse.



  
Total Answers and Comments: 17 Last Update: May 01, 2008   
  
 Sponsored Links

 
 Best Rated Answer
Submitted by: jryan999
 
The staging area is:-

  • One or more database schema(s) or file stores used to “stage” data extracted from the source OLTP systems prior to being published to the “warehouse” where it is visible to end users.
  • Data in the staging area is NOT visible to end users for queries, reports or analysis of any kind.  It does not hold completed data ready for querying.
  • It may hold intermediate results, (if data is pipelined through a process)
  • Equally it may hold “state” data – the keys of the data held on the warehouse, and used to detect whether incoming data includes New or Updated rows.  (Or deleted for that matter).
  • It is likely to be equal in size (or maybe larger) than the “presentation area” itself.
  • Although the “state” data – eg. Last sequence loaded may be backed up, much of the staging area data is automatically replaced during the ETL load processes, and can with care avoid adding to the backup effort.  The presentation area however, may need backup in many cases.
  • It may include some metadata, which may be used by analysts or operators monitoring the state of the previous loads (eg. audit information, summary totals of rows loaded etc).
  • It’s likely to hold details of “rejected” entries – data which has failed quality tests, and may need correction and re-submission to the ETL process.
  • It’s likely to have few indexes (compared to the “presentation area”), and hold data in a quite normalised form.  The presentation area (the bit the end users see), is by comparison likely to be more highly indexed (mainly bitmap indexes), with highly denormalised tables (the Dimension tables anyway).

 

The staging area exists to be a separate “back room“ or “engine room”  of the warehouse where the data can be transformed, corrected and prepared for the warehouse. 

 
It should ONLY be accessible to the ETL processes working on the data, or administrators monitoring or managing the ETL process.

In summary.  A typical warehouse generally has three distinct areas:-

  1. Several source systems which provide data.  This can include databases (Oracle, SQL Server, Sybase etc) or files or spreadsheets
  2. A single “staging area” which may use one or more database schemas or file stores (depending upon warehouse load volumes).  
  3. One or more “visible” data marts or a single “warehouse presentation area” where data is made visible to end user queries.  This is what many people think of as the warehouse – although the entire system is the warehouse – it depends upon your perspective.


The “staging area” is the middle bit.



Above answer was rated as good by the following members:
njvijay, butair
  Sorting Options  
  Page 1 of 2   « First    1    2    >     Last »  
May 13, 2005 16:42:53   #1  
Laxmi        

RE: What is a staging area? Do we need it? What is the purpose of a staging area?
Staging area is place where you hold temporary tables on data warehouse server. Staging tables are connected to work area or fact tables. We basically need staging area to hold the data and perform data cleansing and merging before loading the data into warehouse.
 
Is this answer useful? Yes | No
July 20, 2005 07:18:00   #2  
Kesavan,H.        

RE: What is a staging area? Do we need it? What is the purpose of a staging area?
In the absence of a staging area the data load will have to go from the OLTP system to the OLAP system directly which in fact will severely hamper the performance of the OLTP system. This is the primary reason for the existence of a staging area. In addition it also offers a platform for carrying out data cleansing.
 
Is this answer useful? Yes | No
August 18, 2005 12:14:12   #3  
chandru        

RE: What is a staging area? Do we need it? What is the purpose of a staging area?
Staging area is a temp schema used to
1. Do Flat mapping i.e dumping all the OLTP data in to it without applying any business rules.pushing data into staging will take less time becasuse there is no business rules or transformation applied on it.

2. Used for data cleansing and validation using First Logic.

 
Is this answer useful? Yes | No
August 23, 2005 17:13:00   #4  
Sriharsha Hanumanth        

RE: What is a staging area? Do we need it? What is the purpose of a staging area?
A staging area is like a large table with data separated from their sources to be loaded into a data warehouse in the required format. If we attempt to load data directly from OLTP it might mess up the OLTP because of format changes between a warehouse and OLTP. Keeping the OLTP data intact is very important for both the OLTP and the warehouse.
 
Is this answer useful? Yes | No
September 16, 2005 01:45:30   #5  
Dilip Member Since: September 2005   Contribution: 1    

RE: What is a staging area? Do we need it? What is the...

According to the complexity of the business rule we may require staging area the basic need of staging area is to clean the OLTP source data and gather in a place. Basically it s a temporary database area. Staging area data is used for the further process and after that they can be deleted.


 
Is this answer useful? Yes | No
October 15, 2005 00:01:40   #6  
Aqil Raza        

RE: What is a staging area? Do we need it? What is the...

Staging area is a common place to keep data from different source systems before performing transformation (cleansing)operations. The decision to store data in physical staging area versus processing it in memory is ultimately the choice of the ETL architect. The main reasons for staging data before it is loaded into the data warehouse: 1. Recoverability


 
Is this answer useful? Yes | No
December 29, 2005 18:01:20   #7  
bvrp        

RE: What is a staging area? Do we need it? What is the...
Staging area is a place where the data will be stored before applying transformation operations and from there the data will be stored to the data warehouse
 
Is this answer useful? Yes | No
January 29, 2006 11:15:31   #8  
kishore        

RE: What is a staging area? Do we need it? What is the...

satging is also a data base. if the sources r originated from different sources for example db2 oracle teradata.. etc all thouse can be

satanderdize in staging area. also joining of the tables can be done there. if there is only one source system then staging is not nessesary.


 
Is this answer useful? Yes | No
May 06, 2006 00:28:47   #9  
shaik        

RE: What is a staging area? Do we need it? What is the...

Staging area is a place to hold data from OLTP to OLAP until all transformation had been complete and without staging we can't do modify filter cleansing as per my knowledge


 
Is this answer useful? Yes | No
June 29, 2006 12:47:39   #10  
kmalla Member Since: March 2006   Contribution: 8    

RE: What is a staging area? Do we need it? What is the...
These are the intermediate tables into which the data is loaded
 
Is this answer useful? Yes | No
  Page 1 of 2   « First    1    2    >     Last »  


 
Go To Top


 Sponsored Links

 
About Us -  Privacy Policy -  Terms and Conditions -  Contact -  Ask Question -  Propose Category -  Site Updates 

Copyright © 2005 - 2009 GeekInterview.com. All Rights Reserved

Page copy protected against web site content infringement by Copyscape