Clean Data before Loading

Why is it necessary to clean data before loading it into the warehouse?

Questions by riyazz.shaik


SQLGal

  • Jun 7th, 2009
 

Warehouse data is the source data for analysis and reporting. The data is organized into groups and categories (dimensions) and then summarized over those groups (aggregation). Grouping is based on exact matches: two values land in the same group only if they are identical.

For example, "house", "houses", and "home" would fall into separate groups because they are not exact matches. Logically, though, they mean the same thing and should belong to the same group. Data cleansing corrects this. This is only one example of what data cleansing does.


The point is that if data is not cleansed, the resulting reports and OLAP cubes will contain too many categories, making them hard to read. The results would also be skewed, because the facts (totals, counts, etc.) would be split across the good and bad categories. Once bad data has been loaded into the data warehouse, it is very difficult, if not impossible, to correct.
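To make the point concrete, here is a minimal sketch in Python. The rows, the CANONICAL mapping, and the function names are all made up for illustration; a real warehouse load would do this in the ETL tool or in SQL.

```python
from collections import defaultdict

# Hypothetical source rows: (category as typed at the source, sale amount).
rows = [
    ("house", 100),
    ("houses", 250),
    ("home", 50),
    ("House ", 75),
    ("apartment", 300),
]

# Hypothetical mapping from known variants to one canonical category.
CANONICAL = {
    "house": "house",
    "houses": "house",
    "home": "house",
    "apartment": "apartment",
}

def clean_category(raw):
    """Trim whitespace, lower-case, then map variants to the canonical value."""
    key = raw.strip().lower()
    return CANONICAL.get(key, key)

def totals_by_category(rows, clean=False):
    """Sum amounts per category, optionally cleansing the category first."""
    totals = defaultdict(int)
    for category, amount in rows:
        if clean:
            category = clean_category(category)
        totals[category] += amount
    return dict(totals)

raw_totals = totals_by_category(rows)
clean_totals = totals_by_category(rows, clean=True)

print(raw_totals)    # five groups, the "house" total is split four ways
print(clean_totals)  # two groups, each with its full total
```

Without cleansing, the report shows five categories and no single row carries the real total for houses. With cleansing, there are two categories and the totals are correct.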


Prashant Khare

  • Oct 3rd, 2012
 

Data cleansing is the process of detecting and correcting corrupt or inaccurate data in a table or database.
It typically involves the following steps:
1) Data auditing
2) Workflow Specification
3) Workflow Execution
4) Post-processing and controlling
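
One possible reading of these four steps, sketched in Python. The records, the rules, and every function name here are assumptions made for illustration, not part of any particular tool.

```python
# Hypothetical source records; all values arrive as text.
records = [
    {"id": 1, "city": " london", "age": "34"},
    {"id": 2, "city": "London", "age": "-5"},
    {"id": 3, "city": "", "age": "41"},
]

def is_valid_age(value):
    """An age is valid if it is a whole number between 0 and 120."""
    return value.isdigit() and 0 <= int(value) <= 120

def audit(records):
    """Step 1, data auditing: list every (id, field, problem) found."""
    issues = []
    for r in records:
        if r["city"] != r["city"].strip().title():
            issues.append((r["id"], "city", "not normalized"))
        if not r["city"].strip():
            issues.append((r["id"], "city", "missing"))
        if not is_valid_age(r["age"]):
            issues.append((r["id"], "age", "invalid"))
    return issues

# Step 2, workflow specification: declare which fix applies to which field.
workflow = [
    ("city", lambda value: value.strip().title()),
]

def execute(records, workflow):
    """Step 3, workflow execution: apply each fix to a copy of every record."""
    cleaned = []
    for r in records:
        fixed = dict(r)
        for field, fix in workflow:
            fixed[field] = fix(fixed[field])
        cleaned.append(fixed)
    return cleaned

issues_before = audit(records)
cleaned = execute(records, workflow)

# Step 4, post-processing and controlling: audit again; whatever is still
# flagged could not be fixed automatically and needs manual review.
remaining = audit(cleaned)

print(issues_before)
print(remaining)
```

In this sketch the city formatting is fixed automatically, while the negative age and the missing city are still flagged after execution and would be passed on for manual correction.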
