GeekInterview.com

Question:  Data Cleansing Phase

What are some typical considerations during the data cleansing phase?


June 06, 2009 15:38:36 #2
 SQLGal   Member Since: October 2008    Total Comments: 6 

RE: Data Cleansing Phase
 
The ultimate goal of data cleansing is to improve the organization's confidence
in its data. First, set the bar for the level of quality you are trying to
reach. I usually shoot for a 99% level of confidence in my data. Then list the
types of data errors that need to be addressed, such as:

1) Missing data - nulls, zeros, zero length strings, and corrupted rows


2) Data that contains unwanted characters, such as a stray apostrophe, a comma,
or an extra space


3) Numeric data errors such as a negative value that should be positive


4) Telephone numbers in the wrong format

Some of these errors are database errors and others are business rule errors.
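The four error types listed above can be sketched as simple per-row checks. This is only a rough illustration in Python, not a definitive rule set: the column names (name, amount, phone), the junk pattern, and the 555-123-4567 phone format are all assumptions I made up for the example, and your own business rules will differ.

```python
import re

# Hypothetical rules for illustration only; real rules come from your own
# data and business requirements.
JUNK_PATTERN = re.compile(r"[',]|^\s|\s$|\s{2,}")   # apostrophe, comma, extra space
PHONE_PATTERN = re.compile(r"^\d{3}-\d{3}-\d{4}$")  # assumed target format


def is_missing(value):
    """True for None, zero, or a zero-length / whitespace-only string."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    return value == 0


def has_junk(text):
    """True if the text contains an apostrophe, a comma, or extra spaces."""
    return bool(JUNK_PATTERN.search(text))


def is_bad_phone(phone):
    """True if the phone number does not match the assumed format."""
    return PHONE_PATTERN.match(phone) is None


def classify(row):
    """Return the list of error types found in one row (a dict)."""
    errors = []
    name, amount, phone = row.get("name"), row.get("amount"), row.get("phone")
    if is_missing(name) or is_missing(amount) or is_missing(phone):
        errors.append("missing")
    if isinstance(name, str) and has_junk(name):
        errors.append("junk")
    if isinstance(amount, (int, float)) and amount < 0:
        errors.append("negative")
    if isinstance(phone, str) and phone.strip() and is_bad_phone(phone):
        errors.append("phone_format")
    return errors


if __name__ == "__main__":
    sample = {"name": "O'Brien ", "amount": -5, "phone": "5551234567"}
    print(classify(sample))
```

Whether a zero really counts as "missing" or a negative value is really wrong is a business rule decision, so treat each check as a placeholder to be confirmed with the data owners.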


Next, write aggregate queries to find the errors (or use an ETL tool). Analyze
the query results or transformation reports, and measure the impact of leaving
the errors unfixed.
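As a rough sketch of the aggregate-query step, here is one way it could look using Python's built-in sqlite3 module and an in-memory table. The customers table, its columns, the sample rows, and the phone format are hypothetical. I am also assuming that a 99% confidence target translates to an error rate of at most 1% per check, which is only one possible reading.

```python
import sqlite3

# One aggregate query that counts each (assumed) error type in a single pass.
QUERY = """
SELECT
    COUNT(*) AS total_rows,
    SUM(CASE WHEN name IS NULL OR TRIM(name) = '' THEN 1 ELSE 0 END) AS missing_name,
    SUM(CASE WHEN name LIKE '%''%' OR name LIKE '%,%' THEN 1 ELSE 0 END) AS junk_name,
    SUM(CASE WHEN amount < 0 THEN 1 ELSE 0 END) AS negative_amount,
    SUM(CASE WHEN phone NOT GLOB
        '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'
        THEN 1 ELSE 0 END) AS bad_phone
FROM customers
"""


def build_demo_db():
    """Create an in-memory table with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER, name TEXT, amount REAL, phone TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, ?)",
        [
            (1, "Ann", 120.0, "555-123-4567"),      # clean row
            (2, None, 75.5, "555-987-6543"),        # missing name
            (3, "O'Brien", -40.0, "555-222-3333"),  # apostrophe, negative amount
            (4, "Lee", 10.0, "5551112222"),         # phone in the wrong format
        ],
    )
    return conn


def error_report(conn, target_error_rate=0.01):
    """Return the count and rate per check, flagged against the target rate."""
    cur = conn.execute(QUERY)
    columns = [d[0] for d in cur.description]
    row = dict(zip(columns, cur.fetchone()))
    total = row.pop("total_rows")
    if not total:
        return {}  # empty table: nothing to measure
    return {
        check: {
            "count": count,
            "rate": count / total,
            "over_target": count / total > target_error_rate,
        }
        for check, count in row.items()
    }


conn = build_demo_db()
report = error_report(conn)
for check, stats in report.items():
    print(check, stats)
```

The rate per check gives a crude way to rank which errors to fix first. Measuring the real business impact of leaving them unfixed still takes judgment that a query cannot supply.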

     

 
