Remove duplicates using transformer

How do you remove duplicates using the Transformer stage in DataStage?

Questions by nagakalyan   answers by nagakalyan


vira_venkat

  • Jun 25th, 2008
 

You have to define the column on which you want to remove duplicates as the primary key.
That will remove the duplicates. If you want to catch the rejected rows, you can apply a 'rejected' constraint in the constraint tab of the Transformer.


Double-click the Transformer stage and open Stage Properties (the first icon in the header toolbar). Go to Inputs, then Partitioning, and select a partitioning method other than Auto. Now enable Perform sort, then enable Unique, and pick the required key column. The output will then contain only unique values, so the duplicates are removed.
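For anyone who wants to see what the sort with the Unique option amounts to, here is a rough Python sketch. It is not DataStage code; the function name `sort_unique` and the sample rows are made up for illustration, and it assumes the first row of each key group is the one that is kept.

```python
from itertools import groupby
from operator import itemgetter


def sort_unique(rows, key):
    """Sort rows on `key` and keep only the first row of each key group.

    Illustrative stand-in for "Perform sort" + "Unique" on an input link;
    the name and behaviour here are assumptions, not the DataStage API.
    """
    # sorted() is stable, so rows with equal keys keep their arrival order.
    ordered = sorted(rows, key=itemgetter(key))
    # groupby() groups adjacent rows with the same key; take the first of each.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(key))]


rows = [
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b2"},
]
unique_rows = sort_unique(rows, "id")
print(unique_rows)
```

Note that the dropped rows are simply gone in this approach; nothing is routed to a reject link.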
           

To capture the rejected duplicates, use a Transformer. Partition and sort on your primary key. In the Transformer, keep the primary key stored in a stage variable and compare each incoming primary key to that stored value. If they are the same, output the incoming row as a duplicate; if they are different, output the row as unique and save the new primary key.

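You need at least two stage variables, one to do the comparison and the other to store the key value. Stage variables are evaluated top to bottom, so the comparison has to come before the variable that saves the key: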

Variable: Derivation
IsDuplicate: input.keyfield = SavedKey
SavedKey: input.keyfield
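
The same logic can be sketched outside DataStage in Python. This is only an emulation of the two stage variables, assuming the input is already sorted on the key; `split_duplicates`, `keyfield` and the sample rows are hypothetical names used for illustration.

```python
_UNSET = object()  # sentinel: no key has been saved yet


def split_duplicates(sorted_rows, key):
    """Split key-sorted rows into (unique, duplicates).

    Mirrors the stage variables above: is_duplicate is computed first,
    then saved_key is overwritten, in that order, for every row.
    """
    saved_key = _UNSET  # plays the role of SavedKey
    unique, duplicates = [], []
    for row in sorted_rows:
        is_duplicate = row[key] == saved_key  # IsDuplicate derivation
        saved_key = row[key]                  # SavedKey derivation
        if is_duplicate:
            duplicates.append(row)  # would go to the duplicates/reject link
        else:
            unique.append(row)      # would go to the main output link
    return unique, duplicates


rows = [
    {"keyfield": 1, "val": "a"},
    {"keyfield": 1, "val": "b"},
    {"keyfield": 2, "val": "c"},
]
unique, duplicates = split_duplicates(rows, "keyfield")
print(unique)
print(duplicates)
```

If the two assignments inside the loop are swapped, every row compares equal to itself and is flagged as a duplicate, which is why the order of the stage variables matters.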


Special thanks to Vincent

