Data Cleansing Phase

What are some typical considerations during the data cleansing phase?

Questions by riyazz.shaik


smsreddy

  • May 20th, 2009
 

Before starting the cleansing process, do a quality assessment of the source data, e.g. check for NULL values in important columns and for negative or zero values.
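As a rough illustration of that kind of pre-cleansing check, here is a minimal sketch in plain Python. The sample rows, the column names (`customer_id`, `amount`) and the `profile` helper are all made up for the example; a real assessment would run against the actual source tables.

```python
# Hypothetical sample rows; column names are illustrative only.
rows = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": None, "amount": 35.5},
    {"customer_id": 3, "amount": -10.0},
    {"customer_id": 4, "amount": 0.0},
]

def profile(rows, required, positive):
    """Count NULLs in required columns and zero/negative values in numeric columns."""
    report = {}
    for col in required:
        report[f"{col}_nulls"] = sum(1 for r in rows if r.get(col) is None)
    for col in positive:
        report[f"{col}_non_positive"] = sum(
            1 for r in rows if r.get(col) is not None and r[col] <= 0
        )
    return report

report = profile(rows, required=["customer_id"], positive=["amount"])
print(report)
```

The counts give a quick picture of how dirty the source is before any cleansing rules are written.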


SQLGal

  • Jun 7th, 2009
 

The ultimate goal of data cleansing is to improve the organization's confidence
in its data. First, set the bar for the level of quality you are trying to
obtain. I usually shoot for a 99% level of confidence in my data. Then list the
types of data errors that need to be addressed, such as:

1) Missing data - nulls, zeros, zero-length strings, and corrupted rows
2) Data that contains unwanted junk such as an apostrophe, a comma, or an extra
space
3) Numeric data errors such as a negative value that should be positive
4) Telephone numbers in the wrong format

Some errors are database errors and others are business rule errors.
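To make the error types listed above concrete, here is a small sketch of row-level checks in Python. Everything in it is an assumption for illustration: the field names (`name`, `quantity`, `phone`), the `find_errors` helper, and especially the `NNN-NNN-NNNN` phone format, which stands in for whatever format your business rules actually require.

```python
import re

# Assumed target format for phone numbers; replace with your own rule.
PHONE_PATTERN = re.compile(r"^\d{3}-\d{3}-\d{4}$")

def find_errors(record):
    """Return a list of error labels for one record (a dict)."""
    errors = []

    # 1) Missing data and 2) unwanted junk in a text field
    name = record.get("name")
    if name is None or name.strip() == "":
        errors.append("missing name")
    elif name != name.strip() or "  " in name or "'" in name or "," in name:
        errors.append("junk characters in name")

    # 3) Numeric value that should not be negative
    quantity = record.get("quantity")
    if quantity is None:
        errors.append("missing quantity")
    elif quantity < 0:
        errors.append("negative quantity")

    # 4) Phone number in the wrong format
    phone = record.get("phone")
    if not phone:
        errors.append("missing phone")
    elif not PHONE_PATTERN.match(phone):
        errors.append("phone in wrong format")

    return errors

print(find_errors({"name": " O'Brien,", "quantity": -1, "phone": "5551234567"}))
```

Whether an apostrophe really counts as junk is a business rule rather than a database error, so each check should be confirmed with the data owners before anything is fixed automatically.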


Next, write aggregate queries to find the errors (or use an ETL tool). Analyze
the query results or transformation reports, measure the impact if the errors
go unfixed, and so on.
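An aggregate query of this kind might look like the sketch below, run here through Python's built-in `sqlite3` module so it is self-contained. The `orders` table, its columns and the sample rows are hypothetical, and reading the 99% target as "share of rows with no detected error" is only one possible interpretation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Ann", 50.0), (2, None, 20.0), (3, "Bob", -5.0), (4, "", 10.0), (5, "Cy", 30.0)],
)

# One pass over the table: count each error type and the rows with any error.
total, missing_customer, negative_amount, bad_rows = conn.execute(
    """
    SELECT COUNT(*),
           SUM(CASE WHEN customer IS NULL OR TRIM(customer) = '' THEN 1 ELSE 0 END),
           SUM(CASE WHEN amount < 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN customer IS NULL OR TRIM(customer) = '' OR amount < 0
                    THEN 1 ELSE 0 END)
    FROM orders
    """
).fetchone()
conn.close()

TARGET = 0.99  # assumed quality bar: share of rows with no detected error
clean_rate = 1 - bad_rows / total
meets_target = clean_rate >= TARGET

print(f"missing customer: {missing_customer}, negative amount: {negative_amount}")
print(f"clean rate: {clean_rate:.0%}, meets target: {meets_target}")
```

Comparing the clean rate with the target shows how far the data is from the bar you set, and the per-error counts suggest which fixes would pay off first.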
