1. By query optimization i.e you check your extract and insert query
2. From Server side you can increase the memory
3. By removing unnecessary stages which are not required i.e with the proper job design
4. By using the constrain you can also increase the performance
5. Use parametrized jobs and sequence with the proper parallel/sequential mapping flow.
Prabhakar Achyuta
Jul 23rd, 2013
1. First filter then extract. But dont extract and filter. Use SQL instead of table method when extracting. Say 1 million records are coming from input table but there is a filter condition (Acct_Type=S) in job as per business documents which results only few records say (100).
2. Reduce as many as transformer stages.
3. Reduce stage variables.
4. Remove sort stage and apply the partition techniques at stage level (Ex: for join-hash, lookup-entire).
5. Be careful while operating with Joins. Be specific to inner join untill business needs left outer.
Use Copy stage instead of a Transformer for simple operations like :
•placeholder between stages
•renaming Columns
•dropping Columns
•implicit (default) type Conversions
USe Stage variables wisely.The more teir number, the slower the transformer and the job is. A job should not be overloaded with stages, so split job design into smaller jobs.
Reading or writing data from/to a sequential file is slow and this can be a bottleneck in caseof huge data. So, in such cases, to have faster reading from the Sequential File stage the number of readers per node can be increased (default value is one). Or Read from multiple nodes can be set.
Ensure that RCP is propagating unnecessary metadata to down stream stages.
Have a volumetric analysis done when you introduce Lookup(Normal/Sparse), Join and Merge stage in the design.
Always use CONNECTOR stages for connecting to databases as they are more robust and fast. If not available, use Enterprise stage. Plug-in stages come next.
Examine the execution plans for SQL queries used in jobs and create indices for appropriate columns. Having indices improves performance drastically.
Some times dropping the index and loading the table and re-creating once load is done may be a good option.
Lalit
Oct 6th, 2017
1)Set Proper configuration file
2)Handle null and duplicate values properly
3)Reduce number of transformers as much as possible
4)Do not take more than 20 stages in a perticuler job
5)Use Dataset stage instead of sequential file stage
6)Use join merge and lookup stages properly
Datastage Performance Tuning
Questions by rameshkk
Related Answered Questions
Related Open Questions