Aggregate Value

Explain a mapping design to start aggregating the values before getting all the rows from the source.

Questions by vikramthemass

Showing Answers 1 - 21 of 21 Answers

The Purpose of ETL tool is to Extract-Transform-Load, meaning to extract the data, then make it undergo transformation and then load the data. 

The question fails to satisfy the basic rule of the ETL architecture. 

Any aggregation should be done once you extract from the source in the Extract Phase (frequently in SQ query) or in the Transformation phase using an Aggregator.

  Was this answer useful?  Yes

Mallikar

  • Jun 17th, 2010
 

Hi,

When you want to add aggregate values to the source rows, you can do aggregation in one pipeline from the same source and use join transformation in the other pipeline.Each row from the source will be joined with the aggregate value of the same source.

Ex: Lets say a source which is Employee table contains emp_id,salary and dept_no and you want to load in a taget whose salaries are less than their department avg salary.
In order to achive this create an aggregate Transformation from the source group by dept_no and join out put of this to Source and use filter transformation to filter out less than avg salary of their dept.

Thnaks,
Mallik

  Was this answer useful?  Yes

satoru

  • Jun 17th, 2010
 

the question should be more "Explain a mapping design to start aggregating the values before getting all the rows IN THE TARGET"

in fact you have many possibilities. You can put a sorter and then an aggregator. Each group by will be done one after an other.

You can also identify all the insert/update rows (or/and if you prefer all the delete/reject) in and update strategy and then put a sorter (if you're in V8 or later of informatica) and the aggregate. The goal with an aggregator is to optimize it the most possible.

you can also filter all the rows you don't want to go thru the aggregator before using it.

and so on

Yes, this can be achieved by using an Expression transformation. Use sorter to sort values and then stage this data in an Expression.

In expression define variable port, this variable port will hold the value of previous row for a particular column while Input port will hold value of the current row being processed.

Use another port to sum up value of Input port and variable port. This way we can aggregate the rows as they come in from Source.

Here we taken this context as two ways 
aggregate value means for example if we take sales
sum(sales)
min(sales)
max(sales)

and the second context is

for the  better performance we use the sorter before aggregator incoming values of the source

  Was this answer useful?  Yes

Agg t/f when using unsorted data, it do not perform agg calculation until it receives all the source rows from the pipeline, on the other hand when using sorted data for Agg t/f agg calculation starts in the memory (not using the cache). Agg transformation in this case do not wait to receive the last row from the pipeline.

  Was this answer useful?  Yes

Give your answer:

If you think the above answer is not correct, Please select a reason and add your answer below.

 

Related Answered Questions

 

Related Open Questions