GeekInterview.com
Difference Between Hash Join & Merge Join
Merge Join :

Oracle joins two sets of row data using the merge join algorithm. The
inputs are two separate sets of row data, and the output is the result
of the join. Both inputs must be sorted on the join column. Oracle then
reads rows from the two inputs in an alternating fashion and merges
matching rows to generate the output.

Hash Join :

Oracle joins two sets of row data using the hash join algorithm. The
inputs and output are the same as for a merge join. Oracle reads all rows
from the second input and builds a hash structure (like a hash table in
Java) before reading the first input one row at a time. For each row
from the first input, the hash structure is probed, and matching rows
generate output.
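
The build and probe phases described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Oracle's implementation; the function name hash_join and the sample rows are made up for the example.

```python
from collections import defaultdict

def hash_join(first, second, key):
    """Equijoin two lists of dict rows on the column named by key."""
    # Build phase: read every row of the second input into a hash structure.
    buckets = defaultdict(list)
    for row in second:
        buckets[row[key]].append(row)

    # Probe phase: read the first input one row at a time and look up matches.
    joined = []
    for row in first:
        for match in buckets.get(row[key], []):
            joined.append({**row, **match})
    return joined

# Hypothetical sample data.
orders = [{"order_id": 1, "customer_id": 10}, {"order_id": 2, "customer_id": 20}]
items = [
    {"order_id": 1, "quantity": 3},
    {"order_id": 1, "quantity": 1},
    {"order_id": 3, "quantity": 7},
]
result = hash_join(items, orders, "order_id")
```

Only the two items with order_id 1 find a match in the hash structure, so result holds two joined rows. Neither input has to be sorted, but the lookup only works for equality on the key.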



  
Total Answers and Comments: 1 Last Update: November 02, 2007   
  
November 02, 2007 06:56:23   #1  
Manoj Seemar Member Since: November 2007   Contribution: 2    

RE: Difference Between Hash Join & Merge Join
Merge Joins

Sort merge joins can be used to join rows from two independent sources. Hash joins generally perform better than sort merge joins. On the other hand, sort merge joins can perform better than hash joins if both of the following conditions exist:

  • The row sources are sorted already.
  • A sort operation does not have to be done.

However, if a sort merge join involves choosing a slower access method (an index scan as opposed to a full table scan), then the benefit of using a sort merge might be lost.

Sort merge joins are useful when the join condition between two tables is an inequality condition such as <, <=, >, or >= (but not a nonequality such as <> or !=). Sort merge joins perform better than nested loop joins for large data sets. Hash joins cannot be used unless the join has an equality condition.
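
To see why sorted input helps with an inequality condition, here is a rough Python sketch (an illustration only, not how Oracle implements it). Once the right-hand keys are sorted, all keys greater than a given left-hand key sit in one contiguous range, which a hash lookup cannot give you. The function name less_than_join is made up for the example.

```python
from bisect import bisect_right

def less_than_join(left, right):
    """Return all (l, r) pairs with l < r from two lists of comparable keys."""
    right_sorted = sorted(right)  # sort step on the join key
    pairs = []
    for l in sorted(left):
        # Every element from this position onward is strictly greater than l.
        start = bisect_right(right_sorted, l)
        for r in right_sorted[start:]:
            pairs.append((l, r))
    return pairs

pairs = less_than_join([5, 1], [3, 1, 6])
```

Here pairs comes out as [(1, 3), (1, 6), (5, 6)].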

In a merge join, there is no concept of a driving table. The join consists of two steps:

  1. Sort join operation: Both the inputs are sorted on the join key.
  2. Merge join operation: The sorted lists are merged together.

If the input is already sorted by the join column, then a sort join operation is not performed for that row source.
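
The two steps can be sketched in Python as follows. This is a simplified, in-memory illustration under my own assumptions: the function name and the left_sorted / right_sorted flags are hypothetical and only stand in for the case where a row source already arrives sorted.

```python
def sort_merge_join(left, right, key, left_sorted=False, right_sorted=False):
    """Equijoin two lists of dict rows on key using sort and merge steps."""
    # Step 1, sort join: sort each input on the join key unless already sorted.
    if not left_sorted:
        left = sorted(left, key=lambda r: r[key])
    if not right_sorted:
        right = sorted(right, key=lambda r: r[key])

    # Step 2, merge join: walk both sorted lists together.
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair that run
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    joined.append({**left[i], **r})
                i += 1
            j = j_end
    return joined

# Hypothetical sample data with duplicate keys on both sides.
left_rows = [{"k": 2, "a": "x"}, {"k": 1, "a": "y"}, {"k": 2, "a": "z"}]
right_rows = [{"k": 2, "b": 1}, {"k": 3, "b": 2}, {"k": 2, "b": 3}]
result = sort_merge_join(left_rows, right_rows, "k")
```

Neither input plays the role of a driving table here: both lists are advanced in step, and each is read once after sorting.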

The optimizer can choose a sort merge join over a hash join for joining large amounts of data if any of the following conditions are true:

  • The join condition between two tables is not an equi-join.
  • OPTIMIZER_MODE is set to RULE.
  • HASH_JOIN_ENABLED is false.
  • Because of sorts already required by other operations, the optimizer finds it is cheaper to use a sort merge than a hash join.
  • The optimizer thinks that the cost of a hash join is higher, based on the settings of HASH_AREA_SIZE and SORT_AREA_SIZE.

To advise the optimizer to use a sort merge join, apply the USE_MERGE hint. You might also need to give hints to force an access path.

There are situations where it is better to override the optimizer with the USE_MERGE hint. For example, the optimizer can choose a full scan on a table and avoid a sort operation in a query. However, there is an increased cost because a large table is accessed through an index and single block reads, as opposed to faster access through a full table scan.

Hash Joins

Hash joins are used for joining large data sets. The optimizer uses the smaller of two tables or data sources to build a hash table on the join key in memory. It then scans the larger table, probing the hash table to find the joined rows.

This method is best used when the smaller table fits in available memory. The cost is then limited to a single read pass over the data for the two tables.

However, if the hash table grows too big to fit into the memory, then the optimizer breaks it up into different partitions. As the partitions exceed allocated memory, parts are written to temporary segments on disk. Larger temporary extent sizes lead to improved I/O when writing the partitions to disk; the recommended temporary extent is about 1 MB. Temporary extent size is specified by INITIAL and NEXT for permanent tablespaces and by UNIFORM SIZE for temporary tablespaces.

After the hash table is complete, the following processes occur:

  1. The second, larger table is scanned.
  2. It is broken up into partitions like the smaller table.
  3. The partitions are written to disk.

When the hash table build is complete, it is possible that an entire hash table partition is resident in memory. Then, you do not need to build the corresponding partition for the second (larger) table. When that table is scanned, rows that hash to the resident hash table partition can be joined and returned immediately.

Each hash table partition is then read into memory, and the following processes occur:

  1. The corresponding partition for the second table is scanned.
  2. The hash table is probed to return the joined rows.

This process is repeated for the rest of the partitions. The cost can increase to two read passes over the data and one write pass over the data.
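
The partitioned approach can be sketched in Python like this. It is a simplified illustration, not Oracle's code: the partitions are plain in-memory lists, where a database would write the ones that do not fit in memory to temporary segments on disk. The function names and the n_partitions parameter are made up for the example.

```python
from collections import defaultdict

def partition(rows, key, n_partitions):
    """Split rows into n_partitions lists by hashing the join key."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts

def partitioned_hash_join(a, b, key, n_partitions=4):
    """Equijoin two lists of dict rows partition by partition."""
    # Use the smaller input as the build side and the larger as the probe side.
    build, probe = (a, b) if len(a) <= len(b) else (b, a)

    # Both inputs use the same hash function, so rows with equal keys
    # always land in partitions with the same index.
    build_parts = partition(build, key, n_partitions)
    probe_parts = partition(probe, key, n_partitions)

    joined = []
    for build_part, probe_part in zip(build_parts, probe_parts):
        # Build a hash table for one partition only, then probe it with
        # the corresponding partition of the larger input.
        table = defaultdict(list)
        for row in build_part:
            table[row[key]].append(row)
        for row in probe_part:
            for match in table.get(row[key], []):
                joined.append({**match, **row})
    return joined

# Hypothetical sample data.
orders = [{"order_id": i, "customer_id": 100 + i} for i in range(1, 4)]
items = [
    {"order_id": 1, "quantity": 2},
    {"order_id": 3, "quantity": 5},
    {"order_id": 3, "quantity": 1},
    {"order_id": 9, "quantity": 4},
]
result = partitioned_hash_join(orders, items, "order_id")
```

Each pair of partitions is joined independently, so only one partition's hash table has to be in memory at a time. The order of the rows in result depends on how the keys hash into partitions.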

If the hash table does not fit in memory, parts of it may need to be swapped in and out, depending on the rows retrieved from the second table. Performance in this scenario can be extremely poor.

The optimizer uses a hash join to join two tables if they are joined using an equijoin and if either of the following conditions is true:

  • A large amount of data needs to be joined.
  • A large fraction of the table needs to be joined.


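For example, the following query joins orders and order_items with an equijoin on order_id, so it is a candidate for a hash join:

SELECT o.customer_id, l.unit_price * l.quantity
  FROM orders o, order_items l
 WHERE l.order_id = o.order_id;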

Apply the USE_HASH hint to advise the optimizer to use a hash join when joining two tables together. If you are having trouble getting the optimizer to use hash joins, investigate the values for the HASH_AREA_SIZE and HASH_JOIN_ENABLED parameters.


 