GeekInterview.com
Series: Subject: Topic:
Question: 76 of 76

What is a staging area? Do we need it? What is the purpose of a staging area?


Answered by Laxmi on 2005-05-13 16:42:53: Staging area is place where you hold temporary tables on data warehouse server. Staging tables are connected to work area or fact tables. We basically need staging area to hold the data , and perform data cleansing and merging , before loading the data into warehouse.
Asked by: Interview Candidate | Asked on: Aug 23rd, 2005
Showing Answers 1 - 17 of 17 Answers
Laxmi

Answered On : May 13th, 2005

Staging area is place where you hold temporary tables on data warehouse server. Staging tables are connected to work area or fact tables. We basically need staging area to hold the data , and perform data cleansing and merging , before loading the data into warehouse.

  
Login to rate this answer.
Kesavan,H.

Answered On : Jul 20th, 2005

In the absence of a staging area, the data load will have to go from the OLTP system to the OLAP system directly, which in fact will severely hamper the performance of the OLTP system. This is the primary reason for the existence of a staging area. In addition, it also offers a platform for carrying out data cleansing.

Yes  2 Users have rated as useful.
  
Login to rate this answer.
chandru

Answered On : Aug 18th, 2005

Staging area is a temp schema used to  
1. Do Flat mapping i.e dumping all the OLTP data in to it without applying any business rules.pushing data into staging will take less time becasuse there is no business rules or transformation applied on it. 
 
2. Used for data cleansing and validation using First Logic.

  
Login to rate this answer.
Sriharsha Hanumanth

Answered On : Aug 23rd, 2005

A staging area is like a large table with data separated from their sources to be loaded into a data warehouse in the required format. If we attempt to load data directly from OLTP, it might mess up the OLTP because of format changes between a warehouse and OLTP. Keeping the OLTP data intact is very important for both the OLTP and the warehouse.

  
Login to rate this answer.
Dilip

Answered On : Sep 16th, 2005

View all answers by Dilip

According to the complexity of the business rule, we may require staging area, the basic need of staging area is to clean the OLTP source data and gather in a place. Basically its a temporary database area. Staging area data is used for the further process and after that they can be deleted.

  
Login to rate this answer.
Aqil Raza

Answered On : Oct 15th, 2005

Staging area is a common place to keep data from different source systems before performing transformation (cleansing)operations. The decision to store data in physical staging area versus processing it in memory is ultimately the choice of the ETL architect. The main reasons for staging data before it is loaded into the data warehouse:     1. Recoverability

  
Login to rate this answer.
bvrp

Answered On : Dec 29th, 2005

Staging area is a place where the data will be stored before applying transformation operations and from there the data will be stored to the data warehouse

  
Login to rate this answer.
kishore

Answered On : Jan 29th, 2006

satging is also a data base. if the sources r originated from different sources for example ,,db2,oracle,teradata.. etc all thouse can be

satanderdize in staging area. also joining of the tables can be done there. if there is only one source system then staging is not nessesary.

  
Login to rate this answer.
shaik

Answered On : May 6th, 2006

Staging area is a place  to hold data from OLTP to OLAP until all transformation had been complete and without staging we can't do modify,filter,cleansing as per my knowledge

  
Login to rate this answer.
kmalla

Answered On : Jun 29th, 2006

View all answers by kmalla

These are the intermediate tables into which the data is loaded

  
Login to rate this answer.
ielosta

Answered On : Aug 14th, 2006

View all answers by ielosta

The staging area is an intermediate database temp-table or allocated flat file to hold temporary pre-loaded data phase before the loading of the final phase into the Database tables. The ETL/ECTL or migrations process commonly uses it.

  
Login to rate this answer.
ravi

Answered On : Mar 2nd, 2007

We need staging area, mainly for Data cleansing and Query purposes

  
Login to rate this answer.
uma

Answered On : Apr 12th, 2007

staging area is the place where data transformation activities takes place.

  
Login to rate this answer.
Siraj Khan

Answered On : Aug 9th, 2007

Staging layer is a layer between Source System and the target System (data warehouse).There are two types of layers in data Warehouse Architecture .
Two types of layers are :
a) Staging layer
b) ODS (Operational data store).

ODS is used to store the Recent Data whereas Staging layer is used to store the data periodically.

Staging Layer is used for Cleansing Purpose. For ewample - RTRIM & LTRIM.

  
Login to rate this answer.
Ramsuresh Konijarla

Answered On : Sep 7th, 2007

Staging is a temporary database where we can create a temp tables and then populate the raw data which we need to apply the bussiness rules from a flat files/any other data source/ another database tables

In this stage tables we will apply the bussiness rules and according to the satisfactory of the rules we will update the Target tables if it fails to satisfy the bussiness rule we will popuate them in the Exception Database by mapping with a Exception code ...

  
Login to rate this answer.
Mamatha

Answered On : Oct 24th, 2007

Hi,
  Staging area is place where you hold temporary tables on data warehouse server.
In the absence of a staging area, the data load will have to go from the OLTP system to the OLAP system directly, which in fact will severely hamper the performance of the OLTP system.
This is the primary reason for the existence of a staging area
and also perform clensing of data as per the target like adding audit columns,removing null values,formatting the datatypes according to the target and finally applies the business logic and loads in to the target tables.

Regards
Mamatha

  
Login to rate this answer.

The staging area is:-

  • One or more database schema(s) or file stores used to “stage” data extracted from the source OLTP systems prior to being published to the “warehouse” where it is visible to end users.
  • Data in the staging area is NOT visible to end users for queries, reports or analysis of any kind.  It does not hold completed data ready for querying.
  • It may hold intermediate results, (if data is pipelined through a process)
  • Equally it may hold “state” data – the keys of the data held on the warehouse, and used to detect whether incoming data includes New or Updated rows.  (Or deleted for that matter).
  • It is likely to be equal in size (or maybe larger) than the “presentation area” itself.
  • Although the “state” data – eg. Last sequence loaded may be backed up, much of the staging area data is automatically replaced during the ETL load processes, and can with care avoid adding to the backup effort.  The presentation area however, may need backup in many cases.
  • It may include some metadata, which may be used by analysts or operators monitoring the state of the previous loads (eg. audit information, summary totals of rows loaded etc).
  • It’s likely to hold details of “rejected” entries – data which has failed quality tests, and may need correction and re-submission to the ETL process.
  • It’s likely to have few indexes (compared to the “presentation area”), and hold data in a quite normalised form.  The presentation area (the bit the end users see), is by comparison likely to be more highly indexed (mainly bitmap indexes), with highly denormalised tables (the Dimension tables anyway).

 

The staging area exists to be a separate “back room“ or “engine room”  of the warehouse where the data can be transformed, corrected and prepared for the warehouse. 

 
It should ONLY be accessible to the ETL processes working on the data, or administrators monitoring or managing the ETL process.

In summary.  A typical warehouse generally has three distinct areas:-

  1. Several source systems which provide data.  This can include databases (Oracle, SQL Server, Sybase etc) or files or spreadsheets
  2. A single “staging area” which may use one or more database schemas or file stores (depending upon warehouse load volumes).  
  3. One or more “visible” data marts or a single “warehouse presentation area” where data is made visible to end user queries.  This is what many people think of as the warehouse – although the entire system is the warehouse – it depends upon your perspective.


The “staging area” is the middle bit.

Yes  3 Users have rated as useful.
  
Login to rate this answer.

Give your answer:

If you think the above answer is not correct, Please select a reason and add your answer below.

Related Open Questions

Ads

Connect

twitter fb Linkedin GPlus RSS

Ads

Interview Question

 Ask Interview Question?

 

Latest Questions

Interview & Career Tips

Get invaluable Interview and Career Tips delivered directly to your inbox. Get your news alert set up today, Once you confirm your Email subscription, you will be able to download Job Inteview Questions Ebook . Please contact me if you there is any issue with the download.