Decrease Graph Execution Time

I have 10 million records in one file. A graph that reads this file takes a long time to execute, and I want to decrease the execution time. How should I proceed?


Rajesh Matcha

  • Jun 10th, 2015

You have not mentioned whether your graph is serial or parallel. If it is serial, make it parallel. If you apply specific business rules, filter the data as early as possible. Use Partition by Key and sort the data on the composite key early in the graph, so that you avoid multiple sorts later on. Also make sure you follow the usual performance techniques in your graph. If you do not have an SLA to meet, schedule the job outside periods of high CPU utilization. If you still need help, let me know what exactly you are looking for so that we can discuss it.
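To illustrate the partition-then-sort idea outside of the graph, here is a minimal Python sketch. It is not Ab Initio code; the function names, the CRC32 hash, and the record layout are assumptions made only for the example. Records with the same key always land in the same partition, so each partition can be sorted once on the composite key and processed independently.

```python
import zlib


def partition_by_key(records, key_fields, n_partitions):
    """Route each record to a partition chosen by hashing its key fields.

    Hypothetical helper: it mimics the idea of partitioning by key,
    not the behaviour of any specific component.
    """
    partitions = [[] for _ in range(n_partitions)]
    for rec in records:
        key = "|".join(str(rec[f]) for f in key_fields)
        # crc32 is deterministic, so equal keys always map to the same partition
        idx = zlib.crc32(key.encode("utf-8")) % n_partitions
        partitions[idx].append(rec)
    return partitions


def sort_partitions(partitions, sort_fields):
    """Sort every partition once on the composite key."""
    return [
        sorted(part, key=lambda r: tuple(r[f] for f in sort_fields))
        for part in partitions
    ]
```

Sorting once on the full composite key (for example customer, then day) means later steps that need either the customer order or the customer-and-day order can reuse the same sorted flow.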



  • Jun 26th, 2015

Depending on the situation, you may use the following performance improvement techniques:
1. If the input is a serial file, convert it to a multifile using Partition by Key.
2. Filter out all records that the process does not need. Eliminating records early reduces the work done by every later component.
3. If there are joins with other tables/files, try to use lookup files for the smaller tables/files.
For joins between bigger tables/files, use the larger file as the driver port.
4. Use an in-memory sort for joins on smaller files.
5. If you are unloading from a table, you may use ORDER BY in the SQL, which removes the need for a Sort component in the graph.
I hope this helps. Please correct me if I am wrong.
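Points 2 and 3 can be sketched in plain Python. This is only an illustration of the idea, not Ab Initio code; the function names and the record fields are hypothetical. The small table is loaded into memory once as a lookup, the large file is streamed as the driving input, and unwanted records are dropped before the lookup is consulted.

```python
def build_lookup(small_rows, key):
    """Load the small table into an in-memory dict keyed on the join field."""
    return {row[key]: row for row in small_rows}


def filter_and_enrich(large_rows, lookup, key, keep):
    """Stream the large input, drop unwanted records early, then enrich.

    `keep` is a caller-supplied predicate (an assumption of this sketch).
    Rows without a match in the lookup are skipped, as in an inner join.
    """
    for row in large_rows:
        if not keep(row):
            continue  # filter before doing any join work
        match = lookup.get(row[key])
        if match is not None:
            yield {**row, **match}
```

Because the large input is never sorted or held in memory, the cost of the join is one pass over the big file plus the memory needed for the small table.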

