How do you eliminate duplicate rows?


Pavan

  • Jul 7th, 2005
 

DataStage Enterprise Edition provides a Remove Duplicates stage. Using that stage, we can eliminate duplicates based on a key column.
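To make the idea concrete outside DataStage, here is a rough Python sketch of key-based duplicate removal: sort on the key, then keep one row per key. The function name `remove_duplicates`, the `retain` parameter, and the sample columns are my own assumptions for illustration. This is not the stage's implementation.

```python
from itertools import groupby
from operator import itemgetter

def remove_duplicates(rows, key, retain="first"):
    """Keep one row per key value (hypothetical helper, not a DataStage API)."""
    # sorted() is stable, so rows sharing a key keep their input order
    ordered = sorted(rows, key=itemgetter(key))
    result = []
    for _, group in groupby(ordered, key=itemgetter(key)):
        group = list(group)
        # retain either the first or the last row of each key group
        result.append(group[0] if retain == "first" else group[-1])
    return result

rows = [
    {"cust_id": 2, "city": "Pune"},
    {"cust_id": 1, "city": "Delhi"},
    {"cust_id": 2, "city": "Mumbai"},
]
deduped_first = remove_duplicates(rows, "cust_id")
deduped_last = remove_duplicates(rows, "cust_id", retain="last")
```

Which row survives depends on the sort order and on whether the first or last row of each group is retained, so it is worth deciding that up front.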


Gokul tendulkar

  • Jul 15th, 2005
 

Duplicates can be eliminated by loading the corresponding data into a hashed file. Specify the columns on which you want to eliminate duplicates as the keys of the hashed file.
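The keyed-file approach above behaves much like a dictionary keyed on the chosen columns: a later row with the same key replaces the earlier one. A minimal Python sketch, assuming that overwrite behaviour; `dedupe_via_keyed_store` and the column names are hypothetical and only illustrate the idea.

```python
def dedupe_via_keyed_store(rows, key_columns):
    """Mimic a keyed store: one slot per key, later writes overwrite earlier ones."""
    store = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        store[key] = row  # same key seen again: the newer row replaces the older one
    return list(store.values())

source_rows = [
    {"id": 1, "code": "A"},
    {"id": 2, "code": "B"},
    {"id": 1, "code": "C"},
]
unique_rows = dedupe_via_keyed_store(source_rows, ["id"])
```

Note that here the last row per key survives, so if you need the first one instead, the input order (or the overwrite rule) has to change.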


Mujeebur

  • Aug 15th, 2005
 

Duplicates can be removed in two ways:
1. Use the Remove Duplicates stage
or
2. Use GROUP BY on all the columns in the SELECT list; exact duplicate rows then collapse into one.
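The second option can be tried out quickly with an in-memory SQLite database from Python. The table `orders` and its columns are made up for this sketch, and SQLite stands in for whatever source database the job actually reads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "pen"), (1, "pen"), (2, "ink"), (1, "cap")],
)

# Grouping by every selected column collapses rows that match on all of them
distinct_rows = conn.execute(
    "SELECT order_id, item FROM orders "
    "GROUP BY order_id, item "
    "ORDER BY order_id, item"
).fetchall()
conn.close()
```

This only removes rows that are identical in every selected column; rows that share a key but differ elsewhere are all kept.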


SQL: delete from tablename a where a.rowid > (select min(b.rowid) from tablename b where b.key_column = a.key_column);
(key_column stands for whichever columns define a duplicate)
DataStage server: use the Sort stage (option: Allow Duplicates = yes/no) or a hashed file.

Parallel: use the Remove Duplicates stage.
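The rowid-based delete can be sketched against SQLite from Python, since SQLite also exposes a `rowid` on ordinary tables. The table `emp` and its columns are assumptions for illustration, and the exact syntax on other databases may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(10, "Asha"), (20, "Ravi"), (10, "Asha K"), (20, "Ravi"), (30, "Meena")],
)

# For each key, keep the row with the smallest rowid and delete the rest
conn.execute(
    """
    DELETE FROM emp
    WHERE rowid > (
        SELECT MIN(b.rowid) FROM emp AS b WHERE b.emp_id = emp.emp_id
    )
    """
)
remaining = conn.execute("SELECT emp_id, name FROM emp ORDER BY emp_id").fetchall()
conn.close()
```

The correlation on the key column inside the subquery is what matters: without it, every row except the very first one in the table would be deleted.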

Regards,
Madhava


tisha24

  • Dec 29th, 2008
 

In server jobs, we can use the Hashed File stage to eliminate duplicate rows.

In parallel jobs, we can use the Remove Duplicates stage to eliminate duplicates.
We can also use the Sort stage to eliminate duplicate records.

(NOTE: It is often advisable to remove duplicate records in the database itself using a query, although that depends on the requirement.)

