How to use max core number of rows?


K Vijay Kumar

  • Oct 28th, 2006
 

 

The max-core parameter is found in Sort, Join and Rollup components. There is no single, optimal value for the max-core parameter, since a good value depends on your particular graph and the environment in which it runs.

The Sort component works in memory and the Rollup and Join components have the option to do so. These components have a parameter called max-core that determines the maximum amount of memory they will consume per partition before they spill to disk. When the value of max-core is exceeded in any of the in-memory components, all of the inputs are dropped to disk. This can have a dramatic impact on performance, but this does not mean that it is always better to increase the value of max-core.

The higher you set the value of max-core, the more memory the component can use. Using more memory generally improves performance, up to a point. Beyond this point, performance will not improve and may even decrease. If the value of max-core is set too high, operating system swapping can occur and the graph may even fail if memory on the machine is exhausted.

Sort Component

For the Sort component 100 MB is the default value for max-core. This default is used to cover a wide variety of situations and may not be ideal for your particular circumstances. Increasing the value of max-core will not increase performance unless the full dataset can be held in memory or the data volume is so large that a reduction in the number of temporary files improves performance.

You can estimate the number of temporary files by multiplying the data volume being sorted by three and dividing by the value of max-core, since data is written to disk in blocks that are one third the size of the max-core setting. This number should be less than 1000. For example, suppose that you are sorting 1 GB of data with the default max-core setting of 100 MB and the process is running in serial. The number of temporary files that will be created is:

3 × 1,000 MB / 100 MB = 30 files
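
The estimate above can be checked with a few lines of Python. This is only a rough sketch of the arithmetic described in this answer, not anything provided by Ab Initio: the function names are made up for illustration, and the `partitions` argument assumes the data is split evenly across partitions (max-core applies per partition).

```python
import math

def estimate_sort_temp_files(data_mb, max_core_mb=100.0, partitions=1):
    """Estimate temporary files per partition for a Sort.

    Hypothetical helper: applies the rule that data is written to disk
    in blocks one third the size of max-core. Assumes an even split of
    the data across partitions.
    """
    if data_mb < 0 or max_core_mb <= 0 or partitions <= 0:
        raise ValueError("data_mb must be >= 0; max_core_mb and partitions must be > 0")
    per_partition_mb = data_mb / partitions
    # 3 * volume / max-core, rounded up to a whole number of files
    return math.ceil(3 * per_partition_mb / max_core_mb)

def within_file_limit(file_count, limit=1000):
    """True if the estimate stays under the suggested limit of 1000 files."""
    return file_count < limit

# The example from the text: 1 GB sorted serially with max-core = 100 MB
files = estimate_sort_temp_files(1000, 100)
print(files, within_file_limit(files))
```

With the same default of 100 MB, a 50 GB serial sort would give an estimate of 1,500 files, which is over the limit and is the kind of case where raising max-core may help.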

You should decrease the value of a Sort component's max-core if an in-memory Rollup or Join component in the same phase would benefit from additional memory. The net performance gain will be greater.

 


mukund

  • Dec 22nd, 2006
 

Hi,

max-core is an essential parameter in components such as Sort, Rollup and Scan.

max-core sets the amount of memory a component can use to process data before spilling to disk, so it needs an optimum value. Setting max-core too high degrades performance.

Components generally take a default value of max-core (for example, 100 MB for Sort).

Set max-core to an appropriate value using the formula:

 (total memory available) / (number of partitions) / 2 = max-core
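
As a rough sketch, the rule of thumb above can be written out in Python, reading it as total memory divided by the number of partitions, then halved. The function name and the sample figures are only illustrative assumptions, and the result is a starting point to tune rather than a guaranteed optimum.

```python
def suggested_max_core_mb(total_memory_mb, partitions):
    """Rule-of-thumb max-core per partition, in MB.

    Hypothetical helper: total memory / number of partitions / 2,
    as suggested in this answer.
    """
    if total_memory_mb <= 0 or partitions <= 0:
        raise ValueError("total_memory_mb and partitions must be > 0")
    return total_memory_mb / partitions / 2

# Assumed example figures: 8 GB of memory, 4-way partitioned graph
print(suggested_max_core_mb(8192, 4))
```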

cheers,

Mukund
