Saturday, July 31, 2021

Kafka - Event Processing

The traditional data world, which began to take shape in the 1980s and became very popular during the 2000s, was built on the concept of storing data as entities in a structured schema.
The field of Business Intelligence grew rapidly during this period and was the main area driving analytics. Reporting and dashboards grew as BI flourished. The infrastructure behind this was mainly servers, data centers, and relational databases, and operations were scaled with web farms and clustering. With this type of data structure, tracking changes in data or entities meant relying on triggers and stored procedures. This was fine initially for providing such information to the business and stakeholders. As the amount of data grew and requirements became more complex and time sensitive, there was a need to move to a more scalable architecture. With the advent of big data technologies, one technology that has grown in popularity and usage is Kafka.
Here are some basic concepts in Kafka. There are a lot of online tutorials with more in-depth explanations.
Producer - An application that sends message records (data) to Kafka. To Kafka, each record is just an array of bytes.
For example, to send all the records in a table, collect the result of a query and send each row as a message.
You need to create a Producer application to do this.
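As a rough sketch of the "each row becomes a message" idea, the snippet below turns query rows into byte arrays, which is the form a producer would hand to Kafka. The choice of JSON, the `rows_to_records` helper, and the table columns are all assumptions made up for illustration. The actual send call of a Kafka client library is left out on purpose.

```python
import json


def rows_to_records(rows):
    """Turn each query-result row (a dict) into one message value as bytes."""
    # JSON is only one possible encoding; Kafka itself just sees bytes.
    return [json.dumps(row, sort_keys=True).encode("utf-8") for row in rows]


# Hypothetical query result: two rows from an orders table.
rows = [
    {"order_id": 1, "store": "NYC-01", "amount": 25.5},
    {"order_id": 2, "store": "LON-07", "amount": 12.0},
]

records = rows_to_records(rows)
for record in records:
    print(record)  # one bytes object per row
```

In a real Producer application, each element of `records` would be passed to the client library's send method along with the topic name.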
Consumer - An application that receives the data.
The Producer sends data to the Kafka server, and the Consumer requests data from it.
Producer -> Kafka Server -> Consumer
Broker - Another name for the Kafka server. It acts as a broker between the Producer and the Consumer.
Cluster - A group of brokers. A cluster can contain multiple brokers.
Topic - A name given to a data set/stream.
For example, a topic could be called Global Orders (written as something like global-orders in practice, since topic names cannot contain spaces).
Partition - A single broker could have trouble storing a very large amount of data, so Kafka can break a topic into partitions. How many partitions are needed is a decision we have to make for each topic. Every partition sits on a single machine.
Offset - A sequence number of a message within a partition. Offsets start from 0 and are local to the partition. To locate a message you need three things: topic name, partition number, and offset.
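To make the partition and offset idea concrete, here is a small in-memory model. It is a toy, not the Kafka API: the `ToyTopic` class and the topic name `global-orders` are made up for this post. It only shows that offsets start at 0, grow by one per message, and are counted separately in each partition.

```python
class ToyTopic:
    """In-memory stand-in for a topic; an illustration, not the Kafka API."""

    def __init__(self, name, num_partitions):
        self.name = name
        # One append-only list per partition.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, value):
        """Append a message to one partition and return its offset."""
        log = self.partitions[partition]
        log.append(value)
        return len(log) - 1  # offsets are local to this partition

    def read(self, partition, offset):
        """Look up a message by partition number and offset."""
        return self.partitions[partition][offset]


topic = ToyTopic("global-orders", num_partitions=2)
first = topic.append(0, b"order-1")
second = topic.append(0, b"order-2")
other = topic.append(1, b"order-3")

print(first, second, other)  # 0 1 0
print(topic.read(0, 1))      # b'order-2'
```

Note that `other` is 0 even though it is the third message written, because it is the first message in partition 1.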

Consumer Group - A group of consumers whose members share the work.
Example: a retail chain
Each store has billing counters.
There is a Producer for each billing location, and it sends the billing messages.
A Consumer receives those messages.
To handle the volume, create a cluster and partition the topic.
The consumers in a Consumer Group can then each read from their own set of partitions.
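The way a consumer group shares the work can be sketched as handing out partitions so that each partition is read by exactly one consumer in the group. The round-robin rule and the `assign_partitions` helper below are assumptions for illustration only. Kafka decides the real assignment itself, and its strategy is configurable.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Hypothetical setup: a billing topic with 6 partitions, 3 consumers in the group.
assignment = assign_partitions(
    [0, 1, 2, 3, 4, 5],
    ["consumer-a", "consumer-b", "consumer-c"],
)

for consumer, owned in assignment.items():
    print(consumer, owned)
# consumer-a [0, 3]
# consumer-b [1, 4]
# consumer-c [2, 5]
```

With this split, adding consumers up to the number of partitions spreads the billing messages across more readers. A consumer beyond that number would have nothing to read.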


These are some basic concepts in Kafka...

