How can I identify duplicate rows in a sequential or comma-delimited file? The case is this: the source has 4 values such as agent ID, agent name, etc. Our requirement is that the ID must not be repeated. So how can I identify the duplicate rows, set a flag, and send the rejects to a specified reject file? The source system's data is given to us directly, which is why we are getting these duplicates. If a primary key had already been set up, this would have been very easy. Thanks in advance.


Mansoor

  • Jan 29th, 2007
 

Sort the sequential file on the key AGENT_ID and set the option "Create Key Change Column" to True in the Sort stage. The first record of each AGENT_ID group gets the value 1 in the KeyChange field, and every repeated record in that group gets the value 0 (zero). Now reject the records that have the value 0.
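To make the key-change logic concrete, here is a minimal sketch in Python of what the sort plus the flag does. It is not DataStage code. The function names, the sample data, and the column name AGENT_NAME are assumptions made for illustration only.

```python
import csv
import io

def flag_duplicates(rows, key="AGENT_ID"):
    """Sort rows by key and add a KeyChange flag:
    1 for the first row of each key value, 0 for every repeat."""
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    # sorted() is stable, so rows with the same key keep their input order
    for row in sorted(rows, key=lambda r: r[key]):
        key_change = 1 if row[key] != previous else 0
        flagged.append({**row, "KeyChange": key_change})
        previous = row[key]
    return flagged

def split_rejects(rows, key="AGENT_ID"):
    """Return (kept, rejects): rows flagged 1 are kept, rows flagged 0 are rejected."""
    flagged = flag_duplicates(rows, key)
    kept = [r for r in flagged if r["KeyChange"] == 1]
    rejects = [r for r in flagged if r["KeyChange"] == 0]
    return kept, rejects

# Hypothetical sample input standing in for the comma-delimited source file
sample = "AGENT_ID,AGENT_NAME\n101,Ann\n102,Bob\n101,Ann B\n"
rows = list(csv.DictReader(io.StringIO(sample)))
kept, rejects = split_rejects(rows)
```

Here `rejects` holds the second row for agent 101, which is the row you would route to the reject file.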


the_xxx

  • Jul 3rd, 2007
 

Hi, if you're working on Server jobs, sort the data first and use the Hashed File stage, which has the property of eliminating duplicates.
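As a rough illustration of why a keyed store removes duplicates, here is a small Python sketch. It assumes the usual behavior where a later row with the same key overwrites the earlier one. The function name and sample rows are hypothetical, and this is not DataStage code.

```python
def dedupe_by_key(rows, key="AGENT_ID"):
    """Keep one row per key value; a later row replaces an earlier one."""
    unique = {}
    for row in rows:
        unique[row[key]] = row  # same key: overwrite the stored row
    return list(unique.values())

# Hypothetical sample rows
rows = [
    {"AGENT_ID": "101", "AGENT_NAME": "Ann"},
    {"AGENT_ID": "102", "AGENT_NAME": "Bob"},
    {"AGENT_ID": "101", "AGENT_NAME": "Ann B"},
]
result = dedupe_by_key(rows)
```

Note that this approach discards the duplicates silently, so on its own it does not produce the flag or the reject file the question asks for.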
