GeekInterview.com


Question:  How can I identify the duplicate rows in a sequential or comma-delimited file?
The case is this: the source has 4 values (agent ID, agent name, etc.), and our requirement is that the ID must not be repeated. How can I identify the duplicate rows, set a flag, and send the rejects to a specified reject file? The source system's data is given to us directly, which is why we are getting these duplicates. If a primary key had already been set up, this would have been very easy.
Thanks in advance.




July 07, 2008 03:21:31 #3
 ds_ng Database Expert  Member Since: July 2008    Total Comments: 1 

 
If you're working with server jobs, use a Sort stage followed by an Aggregator stage. Sort on the ID, then use a property like 'Last' or 'First' in the Aggregator so that only one row per ID is kept. The duplicate rows will then be removed.
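For anyone who wants to see the logic outside DataStage, here is a rough Python sketch of the same "keep the first row per ID" idea, with the repeats flagged and set aside as rejects. This is not DataStage code. The function name split_duplicates, the "DUPLICATE" flag value, the sample rows, and the assumption that the agent ID is the first column are all made up for illustration.

```python
import csv
import io


def split_duplicates(rows, key_index=0):
    """Return (unique_rows, rejected_rows), keeping the first row seen per key."""
    seen = set()
    unique, rejects = [], []
    for row in rows:
        key = row[key_index].strip()
        if key in seen:
            # Repeat of an ID already seen: flag it and route it to the rejects.
            rejects.append(row + ["DUPLICATE"])
        else:
            seen.add(key)
            unique.append(row)
    return unique, rejects


# Hypothetical sample data: agent id, agent name, and two other values.
sample = io.StringIO(
    "101,Alice,East,A\n"
    "102,Bob,West,A\n"
    "101,Alice B,East,I\n"
)
unique, rejects = split_duplicates(list(csv.reader(sample)))
```

With a real file, the rows would come from csv.reader on the opened source file, and the two lists could be written out with csv.writer to the target and reject files. Unlike the Sort plus Aggregator approach, this does not need sorted input, but it does hold every ID seen so far in memory.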
     

 
