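Duplicates can be eliminated by loading the corresponding data into a hash file. Specify the columns on which you want to eliminate duplicates as the key columns of the hash file. Rows that share the same key overwrite one another, so only one row per key remains.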
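To illustrate the keyed-overwrite idea outside DataStage, here is a minimal Python sketch. It is not DataStage code: a dict stands in for the hash file, and the function name `dedupe_by_key` and the sample columns are hypothetical. A later row with the same key replaces an earlier one, so the last row written per key is the one that survives.

```python
def dedupe_by_key(rows, key_columns):
    """Keep one row per key, imitating a write to a keyed hash file."""
    hashed = {}
    for row in rows:
        # Build the key from the chosen key columns.
        key = tuple(row[col] for col in key_columns)
        # A row with an existing key overwrites the earlier row.
        hashed[key] = row
    return list(hashed.values())


# Hypothetical sample data: cust_id 1 appears twice.
rows = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 1, "name": "Ann B."},
]
unique_rows = dedupe_by_key(rows, ["cust_id"])
```

If you need the first row per key rather than the last, sort the input accordingly before it reaches the hash file.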
Duplicates can be removed in two ways: 1. Use the Remove Duplicates stage, or 2. use GROUP BY on all the columns used in the SELECT, and the duplicates will go away.
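The GROUP BY option can be tried out with a small sketch like the one below. It assumes an in-memory SQLite database driven from Python as a stand-in for the real source database; the table `emp` and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Bob")],  # first two rows are exact duplicates
)

# Grouping by every selected column collapses identical rows into one.
grouped = conn.execute(
    "SELECT id, name FROM emp GROUP BY id, name ORDER BY id"
).fetchall()
conn.close()
```

Note that this only removes duplicates from the query result; the rows in the table itself are left unchanged.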
SQL: delete from tablename a where rowid > (select min(rowid) from tablename b where a.key_values = b.key_values). DataStage server: use the Sort stage (option: Allow Duplicates (yes/no)) or a hash file.
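A runnable sketch of the rowid-based delete, assuming SQLite (which also exposes a `rowid`) in place of the original database. The table `emp` and the key column `id` are hypothetical, and the outer table is referenced by name instead of an alias to stay within widely supported SQLite syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Bob"), (2, "Bob"), (2, "Bob")],
)

# For each key, keep the row with the smallest rowid and delete the rest.
conn.execute(
    """
    DELETE FROM emp
    WHERE rowid > (
        SELECT MIN(b.rowid) FROM emp b WHERE b.id = emp.id
    )
    """
)
remaining = conn.execute("SELECT id, name FROM emp ORDER BY rowid").fetchall()
conn.close()
```

Unlike the GROUP BY approach, this physically removes the extra rows from the table, so it is worth testing on a copy first.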