GeekInterview.com > Tech FAQs > DataStage
Question: How can I identify duplicate rows in a sequential or comma-delimited file? The source has four values (agent ID, agent name, etc.), and our requirement is that the ID must not be repeated. How can I identify the duplicate rows, set a flag, and send the rejects to a specified reject file? The source system's data is given to us directly, which is why we are getting these duplicates. If a primary key had already been set up, this would have been very easy. Thanks in advance.
| July 07, 2008 03:21:31 |
#3 |
| ds_ng |
Database Expert Member Since: July 2008 Total Comments: 1 |
RE: How can I identify the duplicate rows in a seq or comma delimited file?
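| If you're working with server jobs, use a Sort stage followed by an Aggregator stage. Sort and group on the agent ID, and use a function like 'First' or 'Last' on the other columns so that only one row per ID is kept. The duplicate rows will then be removed.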
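To make the logic concrete outside of DataStage, here is a minimal Python sketch of the same idea: keep the first row seen for each ID and flag every later row with that ID as a reject. It is only an illustration, not DataStage code. The function name `split_duplicates`, the `DUPLICATE` flag value, the sample data, and the assumption that the agent ID is the first column are all made up for this example.

```python
import csv
import io


def split_duplicates(rows, key_index=0):
    """Split rows into (kept, rejects) by the value in column key_index.

    The first row seen for each key is kept; every later row with the
    same key goes to rejects with a "DUPLICATE" flag appended.
    """
    seen = set()
    kept, rejects = [], []
    for row in rows:
        if not row:  # skip blank lines in the input
            continue
        key = row[key_index]
        if key in seen:
            rejects.append(row + ["DUPLICATE"])
        else:
            seen.add(key)
            kept.append(row)
    return kept, rejects


# Hypothetical sample data: agent id, agent name, state, grade.
sample = "101,Smith,NY,A\n102,Jones,TX,B\n101,Smith,CA,A\n"
rows = list(csv.reader(io.StringIO(sample)))
kept, rejects = split_duplicates(rows)
```

With a real file you would read the rows with `csv.reader(open(path, newline=""))` and write `rejects` to the reject file with `csv.writer`. Note that this keeps the first occurrence, which matches using 'First' in the Aggregator; to mimic 'Last' you would process the rows in reverse order.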