Wednesday, February 19, 2014

Schema on Read vs Schema on Write...

There are more and more case studies/examples of Big data implementations. Big data is not suitable for all scenarios. Where Big data concepts are applicable, it is important to know the difference between:
Schema on Read and Schema on Write: The differences between the two was captured from a presentation/discussion on Big data by Oracle. Regardless of SQL Server or oracle the concept described below is very important to understand.

The main focus on the difference between the 2 is that when is the value realized faster, of course there could be debates surrounding this concept.

Monday, February 10, 2014

Power BI - Continued...

Today Microsoft announced the availability of Power BI for Office 365. Power BI recently has been gaining a lot of traction, I attended a session with SQL PASS on 24 hours of BI where in one of the lectures was on Power BI and its components. Please use the following link for More details.
As per Microsoft: a cloud-based business intelligence service that gives people a powerful new way to work with data in the tools they use every day, Excel and Office 365. With Excel you can now discover, analyze, and visualize data like never before to find valuable business insights. With Power BI for Office 365 you can easily deploy a cloud-based BI environment. Couple of features that I would like to mention here:
  • Quickly create collaborative BI sites – enable anyone to quickly create a collaborative BI site to share workbooks containing data and insights.
  • Keep reports up to date with scheduled data refresh.
  • Please visit the the link mentioned above to know more about Power BI and it s capabilities.
    In order to understand the differences between Power BI and tableau: Please read the article below written by BI Expert Jen Underwood, very detailed analysis...

    Tuesday, February 4, 2014

    JOINS (Hash,Merge,Nested Loop)

    As a developer one definitely would have encountered performance issues with queries run on SQL Server database. Queries that were working during the initial design phases start to slow down once projects are deployed to production. There could be several factors affecting the performance of a query, it is essential for one to understand the execution plans generated by the sql server optimizer. It is important to know what types of joins are being utilised by the optimizer and why. The joins referred to in this article are related to Hash, Merge and Nested Loop joins. The focus is on what happens behind the scenes when INNER JOIN/ OUTER JOIN queries are executed. In order to see the execution plan, one use the Include Execution Plan option in SSMS, once the query is completed, the Execution Plan is displayed in the results tab. Now lets look at what each of the joins mean.
    MERGE JOIN: Match rows from two suitably sorted input tables exploiting their sort order.
    HASH MATCH(Hash Join): Using Input row from the top input table to build a hash table, and each row from the bottom input table to probe into the Hash table, outputting all matching rows.
    NESTED LOOP: For Each row in the top(outer) input, scan the bottom (inner) input , and output matching rows.
    The following article might be dated, but it has a very relevant information with a test performed with different data set sizes and indexes: This article was written by Muthukkumaran kaliyamoorthy.

    One can take the examples provided in the article and try to apply it to one's database and see the difference in execution plan. The difference in the joins chosen by the optimizer depends on the data size, indexes of each table involved in the join. Hope the referenced article helps developers to figure performance issues related to joins and enhance the queries better.