What is a staging area? Do we need it? What is the purpose of a staging area?


Answered by Laxmi on 2005-05-13 16:42:53: Staging area is place where you hold temporary tables on data warehouse server. Staging tables are connected to work area or fact tables. We basically need staging area to hold the data , and perform data cleansing and merging , before loading the data into warehouse.

  
Showing Answers 1 - 29 of 29 Answers

Laxmi

  • May 13th, 2005
 

Staging area is place where you hold temporary tables on data warehouse server. Staging tables are connected to work area or fact tables. We basically need staging area to hold the data , and perform data cleansing and merging , before loading the data into warehouse.

Kesavan,H.

  • Jul 20th, 2005
 

In the absence of a staging area, the data load will have to go from the OLTP system to the OLAP system directly, which in fact will severely hamper the performance of the OLTP system. This is the primary reason for the existence of a staging area. In addition, it also offers a platform for carrying out data cleansing.

chandru

  • Aug 18th, 2005
 

Staging area is a temp schema used to  
1. Do Flat mapping i.e dumping all the OLTP data in to it without applying any business rules.pushing data into staging will take less time becasuse there is no business rules or transformation applied on it. 
 
2. Used for data cleansing and validation using First Logic.

  Was this answer useful?  Yes

Sriharsha Hanumanth

  • Aug 23rd, 2005
 

A staging area is like a large table with data separated from their sources to be loaded into a data warehouse in the required format. If we attempt to load data directly from OLTP, it might mess up the OLTP because of format changes between a warehouse and OLTP. Keeping the OLTP data intact is very important for both the OLTP and the warehouse.

  Was this answer useful?  Yes

Dilip

  • Sep 16th, 2005
 

According to the complexity of the business rule, we may require staging area, the basic need of staging area is to clean the OLTP source data and gather in a place. Basically it?s a temporary database area. Staging area data is used for the further process and after that they can be deleted.

  Was this answer useful?  Yes

Aqil Raza

  • Oct 15th, 2005
 

Staging area is a common place to keep data from different source systems before performing transformation (cleansing)operations. The decision to store data in physical staging area versus processing it in memory is ultimately the choice of the ETL architect. The main reasons for staging data before it is loaded into the data warehouse:     1. Recoverability

bvrp

  • Dec 29th, 2005
 

Staging area is a place where the data will be stored before applying transformation operations and from there the data will be stored to the data warehouse

  Was this answer useful?  Yes

kishore

  • Jan 29th, 2006
 

satging is also a data base. if the sources r originated from different sources for example ,,db2,oracle,teradata.. etc all thouse can be

satanderdize in staging area. also joining of the tables can be done there. if there is only one source system then staging is not nessesary.

  Was this answer useful?  Yes

shaik

  • May 6th, 2006
 

Staging area is a place  to hold data from OLTP to OLAP until all transformation had been complete and without staging we can't do modify,filter,cleansing as per my knowledge

  Was this answer useful?  Yes

ielosta

  • Aug 14th, 2006
 

The staging area is an intermediate database temp-table or allocated flat file to hold temporary pre-loaded data phase before the loading of the final phase into the Database tables. The ETL/ECTL or migrations process commonly uses it.

  Was this answer useful?  Yes

ravi

  • Mar 2nd, 2007
 

We need staging area, mainly for Data cleansing and Query purposes

  Was this answer useful?  Yes

uma

  • Apr 12th, 2007
 

staging area is the place where data transformation activities takes place.

  Was this answer useful?  Yes

Siraj Khan

  • Aug 9th, 2007
 

Staging layer is a layer between Source System and the target System (data warehouse).There are two types of layers in data Warehouse Architecture .
Two types of layers are :
a) Staging layer
b) ODS (Operational data store).

ODS is used to store the Recent Data whereas Staging layer is used to store the data periodically.

Staging Layer is used for Cleansing Purpose. For ewample - RTRIM & LTRIM.

  Was this answer useful?  Yes

Ramsuresh Konijarla

  • Sep 7th, 2007
 

Staging is a temporary database where we can create a temp tables and then populate the raw data which we need to apply the bussiness rules from a flat files/any other data source/ another database tables

In this stage tables we will apply the bussiness rules and according to the satisfactory of the rules we will update the Target tables if it fails to satisfy the bussiness rule we will popuate them in the Exception Database by mapping with a Exception code ...

  Was this answer useful?  Yes

Mamatha

  • Oct 24th, 2007
 

Hi,
  Staging area is place where you hold temporary tables on data warehouse server.
In the absence of a staging area, the data load will have to go from the OLTP system to the OLAP system directly, which in fact will severely hamper the performance of the OLTP system.
This is the primary reason for the existence of a staging area
and also perform clensing of data as per the target like adding audit columns,removing null values,formatting the datatypes according to the target and finally applies the business logic and loads in to the target tables.

Regards
Mamatha

  Was this answer useful?  Yes

The staging area is:-

  • One or more database schema(s) or file stores used to “stage” data extracted from the source OLTP systems prior to being published to the “warehouse” where it is visible to end users.
  • Data in the staging area is NOT visible to end users for queries, reports or analysis of any kind.  It does not hold completed data ready for querying.
  • It may hold intermediate results, (if data is pipelined through a process)
  • Equally it may hold “state” data – the keys of the data held on the warehouse, and used to detect whether incoming data includes New or Updated rows.  (Or deleted for that matter).
  • It is likely to be equal in size (or maybe larger) than the “presentation area” itself.
  • Although the “state” data – eg. Last sequence loaded may be backed up, much of the staging area data is automatically replaced during the ETL load processes, and can with care avoid adding to the backup effort.  The presentation area however, may need backup in many cases.
  • It may include some metadata, which may be used by analysts or operators monitoring the state of the previous loads (eg. audit information, summary totals of rows loaded etc).
  • It’s likely to hold details of “rejected” entries – data which has failed quality tests, and may need correction and re-submission to the ETL process.
  • It’s likely to have few indexes (compared to the “presentation area”), and hold data in a quite normalised form.  The presentation area (the bit the end users see), is by comparison likely to be more highly indexed (mainly bitmap indexes), with highly denormalised tables (the Dimension tables anyway).

 

The staging area exists to be a separate “back room“ or “engine room”  of the warehouse where the data can be transformed, corrected and prepared for the warehouse. 

 
It should ONLY be accessible to the ETL processes working on the data, or administrators monitoring or managing the ETL process.

In summary.  A typical warehouse generally has three distinct areas:-

  1. Several source systems which provide data.  This can include databases (Oracle, SQL Server, Sybase etc) or files or spreadsheets
  2. A single “staging area” which may use one or more database schemas or file stores (depending upon warehouse load volumes).  
  3. One or more “visible” data marts or a single “warehouse presentation area” where data is made visible to end user queries.  This is what many people think of as the warehouse – although the entire system is the warehouse – it depends upon your perspective.


The “staging area” is the middle bit.

Give your answer:

If you think the above answer is not correct, Please select a reason and add your answer below.

 

Related Answered Questions

 

Related Open Questions