Thursday, September 10, 2020

 AI, Machine Learning, Data Governance

Artificial intelligence (AI) and machine learning (ML) have continued to penetrate all walks of life, and technology has undergone a tremendous amount of change. It is often said that data is the new oil, and it is data that has propelled AI and ML to greater heights. To use AI and ML more effectively in business today, it is imperative that all the stakeholders, consumers and technologists understand the importance of data. There should be very good collaboration between all the parties involved so that data is put to good use and carried forward into effective AI and ML work.

For data to be used effectively in an organization, we need proper guardrails to source the data, clean it, remove unwanted data, and store and provision it to various users. This is where data governance comes in: there has to be an enterprise-wide appreciation for having such a process and standard. It should not come off as process heavy or bureaucratic, but as something that is efficient and at the same time able to manage data effectively. As organizations grow, data governance will be implemented both vertically and horizontally, and the two need to be in sync. This is essential for AI and ML efforts because it makes the outcomes more meaningful to the organization. In addition, better contexts will be defined, which makes AI and ML projects more viable, reduces inefficiencies and provides cost benefits.

One of the important steps in achieving the above is to have very good data cataloguing measures in place: persisting all the logical and business entities, and the lineage of all the data being sourced. The data also needs to be classified as NPI or non-NPI depending on the business context. Today the majority of this work is manual, and a lot of time is spent trying to get SME inputs and approvals. This causes delays and increases project cost, which can be alleviated by using the data discovery tools that are available today.

There are quite a few tools available, but the one whose capabilities I have started to look into more closely is from Atlan. Atlan provides an excellent platform for performing data discovery, lineage, profiling, governance and exploration. From what I have seen of the tool and the demo provided to me, the whole data life cycle has been very nicely captured. The user interface is very intuitive, and the tool helps the user navigate through the different screens without any technical input. The search is very Google-like in terms of looking up the different data assets that are available. I will be working through some more use cases and doing a deep dive into the tool over the next couple of weeks, and will provide more updates.
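
As a rough illustration of the cataloguing, lineage and NPI classification ideas discussed above, here is a minimal Python sketch. It is not how Atlan or any other tool works internally. The class name, the field names, the list of NPI column hints and the sample assets are all hypothetical and only meant to show the kind of information a catalog entry might hold.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical column names treated as NPI. A real list would come from
# the business and compliance teams, not from code.
NPI_COLUMN_HINTS = {"ssn", "account_number", "date_of_birth", "email"}


@dataclass
class CatalogEntry:
    """One catalogued data asset (illustrative fields only)."""

    name: str                # physical table or file name
    business_entity: str     # business term the asset maps to
    columns: list[str]
    upstream: list[str] = field(default_factory=list)  # direct source assets

    def classification(self) -> str:
        """Return 'NPI' if any column name matches a hint, else 'non-NPI'."""
        lowered = {c.lower() for c in self.columns}
        return "NPI" if lowered & NPI_COLUMN_HINTS else "non-NPI"


def lineage(catalog: dict[str, CatalogEntry], name: str) -> list[str]:
    """Walk upstream links and return every ancestor of `name`, nearest first."""
    seen: list[str] = []
    queue = list(catalog[name].upstream)
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        if current in catalog:
            queue.extend(catalog[current].upstream)
    return seen


# Made-up sample assets for demonstration.
catalog = {
    "raw_customers": CatalogEntry(
        "raw_customers", "Customer", ["id", "ssn", "email"]
    ),
    "clean_customers": CatalogEntry(
        "clean_customers", "Customer", ["id", "email"], ["raw_customers"]
    ),
    "sales_report": CatalogEntry(
        "sales_report", "Sales", ["region", "total"], ["clean_customers"]
    ),
}

for entry in catalog.values():
    print(entry.name, entry.classification(), lineage(catalog, entry.name))
```

Matching on column names is only a first pass. In practice the SME review mentioned above would still confirm each classification, but a catalog like this gives reviewers a starting point instead of a blank sheet.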

