GeekInterview.com
Which partitioning method should we use for the Aggregator stage in parallel jobs?

  
Total Answers and Comments: 4 Last Update: February 19, 2008     Asked by: izack 
  
January 28, 2007 23:14:20   #1  
srinivasguptha Member Since: January 2007   Contribution: 2    

RE: Which partition we have to used for Aggregate Stag...
By default this stage uses Auto partitioning. The best partitioning method depends on the execution mode of this stage and of the preceding stage. If the Aggregator is running in sequential mode, it first collects the data, using the default Auto collection method, before processing it. If the Aggregator is running in parallel mode, any partitioning type can be chosen from the drop-down list on the Partitioning tab. Generally Auto or Hash is used.



Thanks

Srinivas

 
August 09, 2007 03:08:12   #2  
harishsj Member Since: July 2007   Contribution: 6    

RE: Which partition we have to used for Aggregate Stag...
I think the above answer is a little misleading. Most of the time you'll be using the Aggregator stage in parallel mode. If you use Auto partitioning, there is no guarantee that rows with the same grouping key values will land in the same partition. Each partition then aggregates only part of a group, so the result will not be a correct aggregation.

1) Identify the grouping keys you want to aggregate on.
2) In a stage prior to the Aggregator, hash partition the data on the grouping keys. This ensures that all rows with the same group key values lie in the same partition.
3) The result of the aggregation will then be correct.
4) I even think the Entire partitioning method could be used, but it copies the full data set to every partition, so it carries higher overhead than hash partitioning.
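
The effect of the partitioning key can be illustrated with a small Python simulation. This is not DataStage code: it assumes a round-robin spread as a stand-in for a key-unaware partitioning choice, and the function names and sample rows are made up for illustration.

```python
from collections import defaultdict
from zlib import crc32

# Hypothetical sample rows: (grouping key, amount)
ROWS = [("A", 10), ("B", 5), ("A", 7), ("C", 1), ("B", 3), ("A", 2)]

def round_robin_partition(rows, n):
    """Deal rows out to n partitions in turn, ignoring the key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_partition(rows, n):
    """Send each row to the partition chosen by a hash of its grouping key."""
    parts = [[] for _ in range(n)]
    for key, amount in rows:
        parts[crc32(key.encode()) % n].append((key, amount))
    return parts

def aggregate_per_partition(parts):
    """Sum amounts by key inside each partition independently, then
    concatenate the per-partition outputs, as a parallel stage would."""
    output = []
    for part in parts:
        sums = defaultdict(int)
        for key, amount in part:
            sums[key] += amount
        output.extend(sorted(sums.items()))
    return output

rr_result = aggregate_per_partition(round_robin_partition(ROWS, 2))
hash_result = aggregate_per_partition(hash_partition(ROWS, 2))
```

With the round-robin spread, key A has rows in both partitions, so `rr_result` contains two partial sums for A instead of one total. With hash partitioning on the key, every key appears exactly once in `hash_result` with its full total.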

Hope that helps....

Thanks
Harish

 
February 12, 2008 05:08:39   #3  
manoharkolukula Member Since: January 2008   Contribution: 32    

RE: Which partition we have to use for Aggregate Stage in parallel jobs ?
Same as Harish's answer.
 
February 19, 2008 04:53:08   #4  
swapnilverma Member Since: February 2008   Contribution: 2    

RE: Which partition we have to use for Aggregate Stage in parallel jobs ?
It is always preferable to use a Sort stage before the Aggregator stage.
So, depending on the aggregation logic, we should hash partition the incoming data on the grouping keys and sort it on those keys.

Then we can use Same partitioning on the Aggregator stage.

This is the most commonly used approach.
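
The flow described in this answer (hash partition on the keys, sort within each partition, then aggregate without repartitioning) can be sketched as a Python simulation. It is not DataStage code, and the function names and sample rows are assumptions for illustration only.

```python
from itertools import groupby
from operator import itemgetter
from zlib import crc32

# Hypothetical sample rows: (grouping key, amount)
ROWS = [("A", 10), ("B", 5), ("A", 7), ("C", 1), ("B", 3), ("A", 2)]

def hash_partition(rows, n):
    """Send each row to the partition chosen by a hash of its grouping key."""
    parts = [[] for _ in range(n)]
    for key, amount in rows:
        parts[crc32(key.encode()) % n].append((key, amount))
    return parts

def sort_partitions(parts):
    """Sort each partition on the grouping key (the role of the Sort stage)."""
    return [sorted(part, key=itemgetter(0)) for part in parts]

def aggregate_sorted(parts):
    """Aggregate each already-sorted partition where it is (Same partitioning).
    Rows are not moved between partitions, and because each partition is
    sorted, a group is complete as soon as the key changes."""
    output = []
    for part in parts:
        for key, group in groupby(part, key=itemgetter(0)):
            output.append((key, sum(amount for _, amount in group)))
    return output

totals = aggregate_sorted(sort_partitions(hash_partition(ROWS, 3)))
```

Because the data is partitioned and sorted on the same keys, the aggregation step only ever needs to hold one group at a time, and each key yields a single total.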

 

Copyright © 2005 - 2009 GeekInterview.com. All Rights Reserved
