Find out duplicate records using aggregator

How to find out duplicate records using aggregator. Pls explain with example. What is the need for placing joiner after aggregator. I just tried like the following:

Count the records and group by all the columns. if the count=1 then it will be a valid record. what to do if the count >2.means file contains duplicate records.
Why we need to use joiner after aggregator?

Questions by buvanalogu

Showing Answers 1 - 12 of 12 Answers

arun.menon

  • Oct 24th, 2008
 

With respect to your query. Try getting the session log details regarding the session run. In case if you are not able to get sufficient informations, running the workflow in Verbose data mode. That will give you an exact account where things went wrong.

Cheers
Arun

  Was this answer useful?  Yes

Hi,

Its similar to the SQL query,

SELECT * FROM ,,....
FROM TABLE_NAME
GROUP BY ,,.....
HAVING COUNT(*)=1

Similarly in Informatica Aggregator transformation, select group by for all the columns and add one output port,OUT_CNT_RCRDS=count(*)

In the next transformation, use a Router transformation and put a condition,
G1_OUT_CNT_RCRDS=1
G2_OUT_CNT_RCRDS>1

G1_OUT_CNT_RCRDS --> TGT_NO_DUPLICATES
G2_OUT_CNT_RCRDS --> TGT_DUPLICATES

Hope this helps.
Thanks.

VISHNU VARDHAN

  • Nov 17th, 2015
 

Use HAVING Condition in SELECT clause, You can write code like below to test, if it is greater than one then it is duplicate otherwise Unique.

Code
  1. SIMPLE EXAMPLE I AM WRITTEN HERE

  2. SELECT EMPNO,COUNT(*) FROM EMP

  3. GROUP BY EMPNO

  4. HAVING COUNT(*)>1;

  Was this answer useful?  Yes

Give your answer:

If you think the above answer is not correct, Please select a reason and add your answer below.

 

Related Answered Questions

 

Related Open Questions