Question: How can I identify duplicate rows in a sequential or comma-delimited file? The source has four values, such as agent ID, agent name, etc. Our requirement is that the ID must not be repeated. How can I identify the duplicate rows, set a flag, and send the rejects to a specified reject file? The source system's data is given to us directly, which is why we are getting these duplicates. If a primary key were already set up, this would have been very easy. Thanks in advance.
| January 01, 2007 14:24:28 |
#1 |
| Mansoor |
RE: How can I identify the duplicate rows in a seq or ... |
| Sort the sequential file on the key AGENT_ID and set the option "Create Key Change Column" to TRUE in the Sort stage. The first record in each AGENT_ID group gets the value 1 in the KeyChange field; every subsequent record with the same AGENT_ID, i.e. each duplicate, gets the value 0 (zero). Now reject the records that have the value 0. |
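
To make the key-change logic concrete, here is a minimal sketch in plain Python. It is not DataStage code, only an emulation of the same idea: sort on the key, flag the first row of each key group with 1 and repeats with 0, then route the 0 rows to a reject list. The function names, the `keyChange` column name, the AGENT_NAME column, and the sample data are all assumptions made for illustration.

```python
import csv
import io


def flag_key_changes(rows, key):
    """Sort rows by `key` and add a keyChange flag:
    1 for the first row of each key group, 0 for every repeat."""
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    # sorted() is stable, so the first occurrence in the input stays first
    for row in sorted(rows, key=lambda r: r[key]):
        current = row[key]
        flagged.append({**row, "keyChange": 1 if current != previous else 0})
        previous = current
    return flagged


def split_duplicates(rows, key="AGENT_ID"):
    """Return (accepted, rejected): rows flagged 1 and rows flagged 0."""
    flagged = flag_key_changes(rows, key)
    accepted = [r for r in flagged if r["keyChange"] == 1]
    rejected = [r for r in flagged if r["keyChange"] == 0]
    return accepted, rejected


# Hypothetical comma-delimited input with one repeated AGENT_ID
sample = io.StringIO(
    "AGENT_ID,AGENT_NAME\n"
    "101,Smith\n"
    "102,Jones\n"
    "101,Smith\n"
    "103,Lee\n"
)
rows = list(csv.DictReader(sample))
accepted, rejected = split_duplicates(rows)
```

In this sketch `accepted` plays the role of the main output link and `rejected` the role of the reject file; in a real job the split would be done by a downstream stage that filters on the key-change column.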