One of the more popular terms doing the rounds these days is Data Scientist. It seems to have emerged alongside the growth of Big Data. A data scientist is expected to have a good knowledge of statistical tools and of data modelling concepts grounded in mathematics, plus exposure to tools like SAS, SPSS and R. One of the tools I came across recently that is being used in the Big Data world is R. Today I had the opportunity to attend a session organized by PASS on SQL Server and R, presented by Tomaz Kastrun. The session focused on how R can be used with SQL Server to perform data analysis. As per Tomaz:
"The Language R is a powerful language for data analysis with all the great features for data import from SQL Server Environment". Combined with SQL Server data, R gives data scientists access to a wide range of statistical techniques. T-SQL does include statistical functions, but they tend to be the standard, simpler ones. If we need more complex statistical functions, the T-SQL code gets longer and more complicated, which leads to maintainability issues. This is where a tool like R comes in: it can be used with SQL Server to produce very effective results. Here is a definition/purpose of the R language (which was presented as part of the topic).
R language (needs to be installed) - R itself is distributed through CRAN at https://cran.r-project.org; RStudio, a popular IDE for working with R, is available at http://www.rstudio.com
Data (of course) - it is a good idea to cleanse the data before using it for analysis with R.
The session showed the different opportunities that exist to leverage current tools like SQL Server and combine them with a powerful statistical language like R. This combination gives data scientists and developers additional analytical capabilities.