Tuesday, August 26, 2014

SPLUNK...Operational Data

In the data warehousing and BI world, there are numerous instances where we get text files or Excel files as data inputs to a data warehouse or data mart. As part of the handshake between the source systems and target systems, we also receive log files. These log files contain summary information about the data being sent. The target system uses the log file to validate the data it has received, and once the data is validated the loading process starts. There are numerous techniques for doing this type of validation. In some organizations there can be many processes that produce log files for downstream systems that need to consume the data. In addition to the log files, there is also a large amount of machine data produced by different systems and applications, which needs to be monitored so that business groups across the organization can put it to use. To manage this type of monitoring there is a tool provided by SPLUNK (web site: www.splunk.com). The company provides different types of monitoring tools: tools for operational intelligence, cloud, big data and business analytics. Please review the links below for additional details.
Tools like SPLUNK give businesses an opportunity to make sense of their operational and machine data and to act on what it shows.
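To make the log file handshake described above a little more concrete, here is a minimal sketch in Python of how a target system might check a delimited extract against its summary log before loading. The log layout (`key=value` lines), the keys `row_count` and `amount_total`, and the function names are all assumptions made for illustration; real source systems define their own formats, and this is not tied to SPLUNK or any particular ETL tool.

```python
import csv
import io
from decimal import Decimal


def parse_summary(log_text):
    """Parse 'key=value' lines from a (hypothetical) summary log into a dict."""
    summary = {}
    for line in log_text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not key=value
        key, value = line.split("=", 1)
        summary[key.strip()] = value.strip()
    return summary


def validate_extract(data_text, summary, amount_column="amount"):
    """Compare a CSV extract with its summary log; return a list of mismatches."""
    rows = list(csv.DictReader(io.StringIO(data_text)))
    problems = []

    # Check 1: the number of data rows must match the count in the log.
    expected_rows = int(summary["row_count"])
    if len(rows) != expected_rows:
        problems.append(f"row count: expected {expected_rows}, got {len(rows)}")

    # Check 2 (optional): a control total on a numeric column.
    if "amount_total" in summary:
        expected_total = Decimal(summary["amount_total"])
        actual_total = sum((Decimal(r[amount_column]) for r in rows), Decimal("0"))
        if actual_total != expected_total:
            problems.append(
                f"amount total: expected {expected_total}, got {actual_total}"
            )
    return problems


# Sample data and log, inlined here so the sketch runs on its own.
sample_data = "id,amount\n1,10.50\n2,4.50\n"
sample_log = "row_count=2\namount_total=15.00\n"

problems = validate_extract(sample_data, parse_summary(sample_log))
print("OK to load" if not problems else problems)
```

An empty list means the extract agrees with the log and the load can start; otherwise each entry names the check that failed. Decimal is used for the control total so that currency amounts are compared exactly rather than as floating point values.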

Thursday, August 7, 2014

SSIS Parameters-SQL Server

Given the nature of today's ETL operations, a lot of complex SSIS packages are being built to handle different aspects of data processing. One of the key concepts when working with multiple SSIS packages is the passing of parameters between the packages. My friend Pinal Dave, a SQL Server expert, has brought out an excellent post on this topic in his famous SQL Server blog. Here is the link to the article.

Friday, August 1, 2014

Data Integration

One of the areas that I have been increasingly working on these days is data integration. Businesses have a variety of database systems, and it is very rare to find a company running on a single database system or platform. There is an increasing need for different database systems and architectures to co-exist. The battle to replace one database or BI platform with another in its entirety can become very political. This is where data integration comes in, and there are a lot of tools available in the market today in this space.

The Gartner report for data integration tools is available now. Based on the report, Microsoft is in the Challengers quadrant; the tools that make up its data integration framework as per the study are SSIS and BizTalk Server. I have used SSIS a lot, but I have very rarely come across BizTalk Server in my projects. The reason Microsoft is in the Challengers space is a lack of breadth of functionality. Quoting the report: "Although Microsoft addresses core data integration requirements, non-bulk/batch data integration styles remain a gap in its offering relative to market demand. Microsoft's product strategy for aligning capabilities to data movement, transformation and orchestration (referred to as "information production") aims to broaden its market positioning".

The leaders in the data integration space are 1. Informatica 2. IBM 3. Oracle. Oracle's main data integration tools are Oracle Data Integrator (ODI), Oracle Data Service Integrator, Oracle GoldenGate and Oracle Warehouse Builder (OWB). One of the weaknesses of the Oracle integration suite is the shortage of people skilled in the tools mentioned above. Companies like Cisco (Composite Studio) and Actian fall in the Visionaries category. Cisco is a new entrant in the data information management technologies market. Adeptia and Syncsort fall in the Niche Players category.
For the complete report, please use the link below:


The domain of data integration encompasses a wide variety of services, and that variety is growing every year.