Understanding Kafka Partitions

Kafka partitions are one of the core mechanisms for keeping the events your application produces in order.

Partitions are how Kafka distributes events across the nodes of a cluster while keeping events from the same entity together, so that consumers receive those events in the order they happened.

Partitions

In Kafka, events are stored in what are essentially log files, separated and organized into partitions. Imagine partitions as office drawers with dividers that organize papers by company: papers related to company A go in one section, and papers related to company B in another.
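
To make the idea concrete, here is a minimal sketch that models a topic as a set of append-only logs. It is a toy model in plain Python, not Kafka's API: the `Topic` and `Partition` classes and the invoice events are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """An append-only log; an event's offset is its position in the log."""
    events: list = field(default_factory=list)

    def append(self, event: str) -> int:
        self.events.append(event)
        return len(self.events) - 1  # offset of the event just written


@dataclass
class Topic:
    """A named set of partitions (a toy model, not Kafka's API)."""
    name: str
    partitions: list


orders = Topic("order-event", [Partition(), Partition()])

# Offsets are assigned per partition, so they only order events
# that live in the same partition.
first = orders.partitions[0].append("company A: invoice 1")
second = orders.partitions[0].append("company A: invoice 2")
other = orders.partitions[1].append("company B: invoice 1")

print(first, second, other)  # 0 1 0
```

Notice that the event for company B also gets offset 0: offsets say nothing about how events in different partitions relate to each other.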

With all these events spread across partitions, you may be wondering: “How do I know the order of these events in my application?”

Let’s imagine a business scenario. We have an application that processes customer orders: the client selects some products and places an order, and the order application generates an event (order-event). Another application listens for these events. It handles the product inventory and checks whether the ordered products still have units available.

After the inventory application finishes its job, it generates a new event, and so on. We could have another application that deals with payment and waits for both events before continuing the process. For this example, though, these services are enough.

Now imagine a scenario where the order event is generated and sent to partition A, and the inventory service listens for this event, processes it, and sends its own event to partition B.

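The checkout service is waiting for both events before it lets the user continue to payment. Since Kafka only guarantees ordering within a single partition, nothing ensures that the service sees the two events in the order they happened.

In this scenario, the user might not be able to pay and the sale would never be completed.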

To guarantee the order of events, Kafka offers the partition key, which helps us solve this kind of scenario.

Figure: Kafka partitions without a partition key

Partition Key

Also known as the message key, it is a value used to decide which partition a message goes to, and it is provided together with the message. All messages with the same key are stored in the same partition, which guarantees the order of those events.
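
A minimal sketch of the idea, assuming a topic with three partitions and two made-up order ids as keys. Kafka's Java client hashes the key bytes with murmur2; the sketch uses CRC32 purely as a deterministic stand-in, so the partition numbers it produces are not the ones Kafka would pick. The `partition_for` helper is hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition: same key, same partition.

    CRC32 is a stand-in for the murmur2 hash used by Kafka's Java client.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3  # assumed partition count for the example topic
partitions = {p: [] for p in range(NUM_PARTITIONS)}

# Events of two hypothetical orders, interleaved as they are produced.
events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "inventory-checked"),
    ("order-2", "inventory-checked"),
    ("order-1", "paid"),
]

for key, event in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, event))

# All events of order-1 sit in one partition, in the order they were sent.
order_1 = partitions[partition_for("order-1", NUM_PARTITIONS)]
print([event for key, event in order_1 if key == "order-1"])
# ['created', 'inventory-checked', 'paid']
```

Because the partition is derived from the key and the partition count, adding partitions to a topic later can move a key to a different partition.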

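If you don’t provide a key, the producer spreads the messages across the partitions. Older clients do this with a round-robin algorithm; since Kafka 2.4, the default partitioner instead sticks to one partition for a batch of messages before moving on to another.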
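
Here is a small simulation of plain round-robin distribution for messages without a key. It is a sketch in plain Python rather than real producer code, and the partition count and event names are assumptions.

```python
from itertools import cycle

NUM_PARTITIONS = 3  # assumed partition count
partitions = {p: [] for p in range(NUM_PARTITIONS)}
next_partition = cycle(range(NUM_PARTITIONS))

# Keyless events of one hypothetical order end up in different partitions.
for event in ["created", "inventory-checked", "paid", "shipped"]:
    partitions[next(next_partition)].append(event)

print(partitions)
# {0: ['created', 'shipped'], 1: ['inventory-checked'], 2: ['paid']}
```

The events of a single order are now scattered, so a consumer reading partition 2 could see "paid" before another consumer has seen "created".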

So whenever you need ordering for a given entity, you can use a partition key. However, if you require global ordering across all events of a topic, you’ll need to configure the topic with a single partition, which may result in lower write throughput and less parallelism when reading.
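
The read-side cost can be sketched as follows. Within a consumer group, each partition is read by only one consumer at a time, so a single partition leaves the other consumers idle. The `assign` function is a simplified, hypothetical assignment for illustration, not one of Kafka's real assignors, and the consumer names are made up.

```python
def assign(num_partitions: int, consumers: list) -> dict:
    """Spread partitions over consumers; each partition gets exactly one owner.

    A simplified round-robin assignment, not one of Kafka's real assignors.
    """
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


consumers = ["consumer-a", "consumer-b", "consumer-c"]  # hypothetical names

print(assign(3, consumers))
# {'consumer-a': [0], 'consumer-b': [1], 'consumer-c': [2]}

print(assign(1, consumers))
# {'consumer-a': [0], 'consumer-b': [], 'consumer-c': []}
```

With three partitions every consumer has work to do; with one partition, only the first consumer reads anything.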
