GeekInterview.com
Sequential file with Duplicate Records
A sequential file has 8 records in a single column. The values in the column, separated by spaces, are:
1 1 2 2 3 4 5 6

In a parallel job, after reading the sequential file, two more sequential files should be created: one containing the duplicate records and the other containing the records without duplicates.
File 1 records, separated by spaces: 1 1 2 2
File 2 records, separated by spaces: 3 4 5 6
How would you do it?



  
Total Answers and Comments: 4 Last Update: October 22, 2009     Asked by: rajivkumar23us 
  
April 16, 2009 08:53:56   #1  
khasimsyda Member Since: April 2009   Contribution: 1    

RE: Sequential file with Duplicate Records
We can split the file into two files by using the Aggregator stage: one file for records with a count of 1 and the other for records with a count greater than 1.

In the Properties tab of the Aggregator stage, set the Aggregation type to Count Rows and the Count Output Column to Count. Then use a Transformer to divide the rows as desired.

The constraint used in the Transformer is DSLink.Count = 1 for the non-duplicate rows. The rest of the records are duplicates.
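As a rough illustration outside DataStage, the count-then-split idea can be sketched in Python. This is only an analogy, not DataStage code: `Counter` stands in for the Aggregator's Count Rows, and the two list comprehensions stand in for the Transformer constraints. The variable names are made up for the example. Note that grouping yields one row per key value.

```python
from collections import Counter

# The eight input records from the question.
values = [1, 1, 2, 2, 3, 4, 5, 6]

# Stand-in for the Aggregator stage: group by the column and count rows per key.
counts = Counter(values)

# Stand-in for the Transformer constraints, applied to the aggregated rows
# (one row per distinct key).
non_duplicates = [key for key, count in counts.items() if count == 1]
duplicates = [key for key, count in counts.items() if count > 1]

print(non_duplicates)  # [3, 4, 5, 6]
print(duplicates)      # [1, 2]
```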

 
August 20, 2009 05:43:33   #2  
rameshkm Member Since: April 2009   Contribution: 7    

RE: Sequential file with Duplicate Records
We can get close to this with the Aggregator alone, but the output is 1 2 in one file and 3 4 5 6 in the other, not 1 1 2 2.
 
August 20, 2009 06:10:30   #3  
rameshkm Member Since: April 2009   Contribution: 7    

RE: Sequential file with Duplicate Records

Using a Transformer, split the data from the source sequential file into two links (Link A and Link B). Link A feeds an Aggregator with the Aggregation type set to Count Rows and the count output column named XXX. Then perform a left outer join between Link B and the link coming from the Aggregator. Finally, use a Transformer to split the joined data in two with the constraints XXX = 1 and XXX > 1, so the output is 1 1 2 2 in one file and 3 4 5 6 in the other.
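The same flow can be sketched in Python to check the expected result. This is an analogy under stated assumptions, not DataStage code: `key_counts` plays the role of the Aggregator output, the list of pairs plays the role of the join that attaches each count back to the original rows, and `file_1` / `file_2` are hypothetical names for the two outputs.

```python
from collections import Counter

# Source rows; every row is kept (Link B), and the same rows are also counted (Link A).
rows = [1, 1, 2, 2, 3, 4, 5, 6]

# Link A -> Aggregator: count rows per key (the column called XXX in the answer).
key_counts = Counter(rows)

# Join: attach the count to every original row, so duplicates keep all their copies.
joined = [(value, key_counts[value]) for value in rows]

# Final Transformer constraints: XXX > 1 for duplicates, XXX = 1 for the rest.
file_1 = [value for value, count in joined if count > 1]
file_2 = [value for value, count in joined if count == 1]

print(file_1)  # [1, 1, 2, 2]
print(file_2)  # [3, 4, 5, 6]
```

Joining the counts back to the unaggregated rows is what preserves both copies of 1 and 2; splitting the aggregated rows directly would leave only one row per key.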


 
October 22, 2009 06:16:42   #4  
nagoosk Member Since: November 2007   Contribution: 15    

RE: Sequential file with Duplicate Records

1) There is a stage called the Remove Duplicates stage, which can be used to delete the duplicate records.

2) Use the Aggregator stage and specify the column on which you want to remove the duplicates.
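For comparison, a minimal Python sketch of plain de-duplication on the sample data is shown here. It is only an analogy for what removing duplicates on the key column does, and the variable names are made up for the example. It yields a single list of distinct values, so on its own it does not produce the two-file split the question asks for.

```python
# The eight input records from the question.
rows = [1, 1, 2, 2, 3, 4, 5, 6]

# Keep the first occurrence of each value, preserving input order.
deduplicated = list(dict.fromkeys(rows))

print(deduplicated)  # [1, 2, 3, 4, 5, 6]
```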


 


 

Copyright © 2005 - 2009 GeekInterview.com. All Rights Reserved
