Tuesday, February 4, 2014

JOINS (Hash,Merge,Nested Loop)

As a developer one definitely would have encountered performance issues with queries run on SQL Server database. Queries that were working during the initial design phases start to slow down once projects are deployed to production. There could be several factors affecting the performance of a query, it is essential for one to understand the execution plans generated by the sql server optimizer. It is important to know what types of joins are being utilised by the optimizer and why. The joins referred to in this article are related to Hash, Merge and Nested Loop joins. The focus is on what happens behind the scenes when INNER JOIN/ OUTER JOIN queries are executed. In order to see the execution plan, one use the Include Execution Plan option in SSMS, once the query is completed, the Execution Plan is displayed in the results tab. Now lets look at what each of the joins mean.
MERGE JOIN: Match rows from two suitably sorted input tables exploiting their sort order.
HASH MATCH(Hash Join): Using Input row from the top input table to build a hash table, and each row from the bottom input table to probe into the Hash table, outputting all matching rows.
NESTED LOOP: For Each row in the top(outer) input, scan the bottom (inner) input , and output matching rows.
The following article might be dated, but it has a very relevant information with a test performed with different data set sizes and indexes: This article was written by Muthukkumaran kaliyamoorthy.

http://www.sqlserverblogforum.com/2011/10/merge-join-vs-hash-join-vs-nested-loop-join/

One can take the examples provided in the article and try to apply it to one's database and see the difference in execution plan. The difference in the joins chosen by the optimizer depends on the data size, indexes of each table involved in the join. Hope the referenced article helps developers to figure performance issues related to joins and enhance the queries better.

1 comment: