GeekInterview.com


Question:  Sequential file with Duplicate Records

A sequential file has 8 records in a single column. The values in the column, separated by spaces, are:
1 1 2 2 3 4 5 6

In a parallel job, after reading the sequential file, two more sequential files should be created: one containing the duplicated records and the other containing the records that occur only once.
File 1 records, separated by spaces: 1 1 2 2
File 2 records, separated by spaces: 3 4 5 6
How would you do it?


April 04, 2009 08:53:56 #1
 khasimsyda   Member Since: April 2009    Total Comments: 1 

RE: Sequential file with Duplicate Records
 
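We can split the file into two files by using the Aggregator stage: one file with the records whose count is 1 and the other with the records whose count is greater than 1.

In the Properties tab of the Aggregator stage, group on the column, set the Aggregation Type to "Count Rows", and set the Count Output Column to "Count". Then use a Transformer stage to divide the rows as desired.

The Transformer constraint for the non-duplicate rows is DSLink.Count=1; the remaining rows are the duplicates. Note that the Aggregator emits one row per group, so to write every original occurrence (1 1 2 2 rather than 1 2) the count has to be joined back to the input rows on that column, for example with a Join or Lookup stage, before the Transformer.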
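
The count-then-split logic can be sketched outside DataStage in a few lines of Python. This is only an illustration of the idea, not DataStage code: the function name split_by_duplicates is hypothetical, the Counter plays the role of the Aggregator's row count, and the two list filters play the role of the Transformer constraints.

```python
from collections import Counter


def split_by_duplicates(values):
    """Return (duplicates, uniques), keeping every original occurrence in order."""
    # Count how many times each value occurs (the Aggregator "Count Rows" step).
    counts = Counter(values)
    # Attach the count back to each original row and route it
    # (the join plus Transformer constraint step).
    duplicates = [v for v in values if counts[v] > 1]
    uniques = [v for v in values if counts[v] == 1]
    return duplicates, uniques


records = "1 1 2 2 3 4 5 6".split()
duplicates, uniques = split_by_duplicates(records)
print("File 1:", " ".join(duplicates))  # File 1: 1 1 2 2
print("File 2:", " ".join(uniques))     # File 2: 3 4 5 6
```

Because the counts are looked up per original row, every occurrence of a duplicated value reaches File 1, which is what the question asks for.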
     

 
