GeekInterview.com
DataStage  |  Question 63 of 390
What other performance tuning have you done in your last project to speed up slow-running jobs?
  1. Staged the data coming from ODBC/OCI/DB2UDB stages (or any database on the server) in Hash/Sequential files, both for optimum performance and for data recovery in case a job aborts.
  2. Tuned the 'Array Size' and 'Rows per Transaction' values in the OCI stage for faster inserts, updates and selects.
  3. Tuned the 'Project Tunables' in Administrator for better performance.
  4. Used sorted data for the Aggregator.
  5. Sorted the data as much as possible in the DB and reduced the use of DS-Sort for better job performance.
  6. Removed unused data from the source as early as possible in the job.
  7. Worked with the DB admin to create appropriate indexes on tables for better performance of DS queries.
  8. Converted some of the complex joins/business logic in DS to stored procedures in the database for faster execution of the jobs.
  9. If an input file has an excessive number of rows and can be split up, use standard logic to run jobs in parallel.
  10. Before writing a routine or a transform, make sure the required functionality is not already available in one of the standard routines supplied in the sdk or ds utilities categories.
    Constraints are often said to be CPU intensive and slow to process. That can be true when the constraint calls routines or external macros, but for inline code the overhead is minimal.
  11. Try to put the constraints in the 'Selection' criteria of the jobs themselves. This keeps unnecessary records from even getting in before the joins are made.
  12. Tuning should occur on a job-by-job basis.
  13. Use the power of the DBMS.
  14. Try not to use a Sort stage when you can use an ORDER BY clause in the database.
  15. Using a constraint to filter a record set is much slower than performing a SELECT … WHERE ….
  16. Make every attempt to use the bulk loader for your particular database. Bulk loaders are generally faster than using ODBC or OLE.
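
To make points 14 and 15 concrete, here is a minimal sketch in Python, with the standard sqlite3 module standing in for the source database. It is not DataStage code, and the `orders` table and its columns are hypothetical. It compares extracting every row and filtering/sorting downstream against letting the database apply WHERE and ORDER BY.

```python
import sqlite3

def make_source(n=1000):
    """Build an in-memory table standing in for the source database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
    )
    rows = [(i, "EU" if i % 4 == 0 else "US", (i * 37) % 500) for i in range(n)]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

def filter_in_job(conn):
    """Anti-pattern: extract everything, then filter and sort downstream."""
    all_rows = conn.execute("SELECT id, region, amount FROM orders").fetchall()
    kept = [r for r in all_rows if r[1] == "EU" and r[2] > 400]
    return sorted(kept, key=lambda r: (r[2], r[0])), len(all_rows)

def filter_in_db(conn):
    """Preferred: WHERE and ORDER BY run in the database, so only needed rows travel."""
    rows = conn.execute(
        "SELECT id, region, amount FROM orders "
        "WHERE region = ? AND amount > ? ORDER BY amount, id",
        ("EU", 400),
    ).fetchall()
    return rows, len(rows)

conn = make_source()
job_rows, job_transferred = filter_in_job(conn)
db_rows, db_transferred = filter_in_db(conn)
print(job_transferred, db_transferred)  # rows pulled out of the database in each case
```

Both functions return the same records; the difference is how many rows leave the database before the job touches them.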



  
Total Answers and Comments: 1 Last Update: November 14, 2005   
  
November 14, 2005 05:21:28   #1  
sistlasatish Member Since: November 2005   Contribution: 1    

RE: What are other Performance tunings you have done i...
  1. Minimise the use of the Transformer; use Copy, Modify, Filter or Row Generator instead.
  2. Use SQL code while extracting the data.
  3. Handle the nulls.
  4. Minimise the warnings.
  5. Reduce the number of lookups in a job design.
  6. Use no more than 20 stages in a job.
  7. Use an IPC stage between two passive stages; it reduces processing time.
  8. Drop indexes before loading data and recreate them after the data is loaded into the tables.
  9. Generally we cannot avoid lookups if the requirements make them compulsory.
  10. There is no hard limit on the number of stages, such as 20 or 30, but we can break the job into smaller jobs and use Data Set stages to store the data in between.
  11. The IPC stage is provided in Server jobs, not in Parallel jobs.
  12. Check the write cache of the hash file. If the same hash file is used both for lookup and as a target, disable this option.
  13. If the hash file is used only for lookup, enable "Preload to memory". This will improve performance. Also check the order of execution of the routines.
  14. Don't use more than 7 lookups in the same transformer; introduce new transformers if it exceeds 7 lookups.
  15. Use the Preload to memory option in the hash file output.
  16. Use Write to cache in the hash file input.
  17. Write into the error tables only after all the transformer stages.
  18. Reduce the width of the input record: remove the columns that you will not use.
  19. Cache the hash files you are reading from and writing into. Make sure your cache is big enough to hold the hash files.
  20. Use ANALYZE.FILE or HASH.HELP to determine the optimal settings for your hash files.

This would also minimise overflow on the hash file.
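
The hash file caching advice above comes down to loading the reference data once and probing it in memory, instead of going back to storage for every input row. A rough analogy in Python, with a dict standing in for a preloaded hash file; this is not DataStage code, and `customer_dim` and the other names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_dim (cust_id INTEGER PRIMARY KEY, segment TEXT)")
conn.executemany(
    "INSERT INTO customer_dim VALUES (?, ?)",
    [(i, "gold" if i % 10 == 0 else "standard") for i in range(1, 101)],
)

# Incoming rows: (order_id, cust_id). Customer 999 has no match in the lookup.
stream = [(1, 10), (2, 11), (3, 10), (4, 999)]

def lookup_per_row(rows):
    """One query per input row, like an uncached lookup."""
    out, queries = [], 0
    for order_id, cust_id in rows:
        hit = conn.execute(
            "SELECT segment FROM customer_dim WHERE cust_id = ?", (cust_id,)
        ).fetchone()
        queries += 1
        out.append((order_id, cust_id, hit[0] if hit else None))
    return out, queries

def lookup_preloaded(rows):
    """Load the reference data once, then probe a dict in memory."""
    cache = dict(conn.execute("SELECT cust_id, segment FROM customer_dim"))
    out = [(order_id, cust_id, cache.get(cust_id)) for order_id, cust_id in rows]
    return out, 1  # a single query, however many input rows there are

slow_rows, slow_queries = lookup_per_row(stream)
fast_rows, fast_queries = lookup_preloaded(stream)
print(slow_queries, fast_queries)  # number of trips to the lookup source
```

As with the real cache, this only pays off when the reference data fits comfortably in memory.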

  1. If possible break the input into multiple threads and run multiple instances of the job.
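
Splitting the input and running several instances of the same job can be pictured as follows. This is a minimal sketch in Python, not DataStage: `run_job_instance` is a hypothetical stand-in for one job invocation, and the rows are partitioned by key modulo the instance count so that each instance gets a disjoint slice.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, instances):
    """Split rows into disjoint slices by key modulo the instance count."""
    slices = [[] for _ in range(instances)]
    for key, value in rows:
        slices[key % instances].append((key, value))
    return slices

def run_job_instance(rows):
    """Hypothetical stand-in for one job instance: sums the values it was given."""
    return sum(value for _, value in rows)

rows = [(i, i * 2) for i in range(1, 101)]
slices = partition(rows, instances=4)

# Each slice is handled by its own worker; partial results are combined at the end.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(run_job_instance, slices))

total = sum(partials)
print(total)
```

The split only helps when the slices are independent, which is why the partitioning key matters more than the number of instances.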

 

Copyright © 2005 - 2009 GeekInterview.com. All Rights Reserved
