Question: Sequential file with Duplicate Records
A sequential file has 8 records with one column. The values in the column, separated by spaces, are: 1 1 2 2 3 4 5 6
In a parallel job, after reading the sequential file, two more sequential files should be created: one with the duplicate records and the other with the records that have no duplicates. File 1 records, separated by spaces: 1 1 2 2. File 2 records, separated by spaces: 3 4 5 6. How would you do it?
April 04, 2009 08:53:56
#1
khasimsyda
Member Since: April 2009 Total Comments: 1
RE: Sequential file with Duplicate Records
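We can split the file into two files by using the Aggregator stage: one file with the records whose value occurs once (count 1) and the other with the records whose value occurs more than once (count greater than 1).
In the Properties tab of the Aggregator stage, group on the column, select the Aggregation Type "Count Rows", and set the Count Output Column to "Count". Then use a Transformer stage to divide the records as desired.
The constraint used in the Transformer is DSLink.Count=1 for the non-duplicate rows; the rest of the records are duplicates. Note that the Aggregator emits one row per distinct value, so to write every duplicate row (1 1 2 2) rather than one row per value (1 2), the count has to be attached back to the original records before the Transformer, for example with a Copy stage feeding a Join or Lookup stage on the column.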
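The count-then-split logic described above can be sketched outside DataStage in plain Python. This only illustrates the idea and is not DataStage code: the function name split_by_duplicates is hypothetical, and the Counter lookup stands in for the Aggregator count joined back to each original row.

```python
from collections import Counter


def split_by_duplicates(values):
    """Split values into (duplicates, uniques) based on how often each occurs.

    Hypothetical helper for illustration only: Counter plays the role of the
    Aggregator "Count Rows" output, and looking up counts[v] for every input
    row plays the role of joining that count back to the original records.
    """
    counts = Counter(values)  # occurrences per distinct value
    # Rows whose value occurs more than once go to the "duplicates" output.
    duplicates = [v for v in values if counts[v] > 1]
    # Rows whose value occurs exactly once go to the "non-duplicates" output.
    uniques = [v for v in values if counts[v] == 1]
    return duplicates, uniques


records = "1 1 2 2 3 4 5 6".split()
duplicates, uniques = split_by_duplicates(records)
print(" ".join(duplicates))  # 1 1 2 2
print(" ".join(uniques))     # 3 4 5 6
```

Both list comprehensions iterate over the original rows rather than over the distinct values, which is why the duplicates output keeps every repeated row instead of one row per value.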