First of all, this article is for people who use Kafka today or may use it in upcoming projects, so you should already have some basic knowledge of Kafka and the CAP theorem.
If you know the CAP theorem well, you know that in a distributed system we can move towards either consistency (CP) or high availability (AP); we can't have both at the same time. In practice, though, we can tune our system to get as much consistency as possible (eventual consistency) without losing high availability, and that is the scope of this article: how to practice Kafka the right way.
Since Kafka is an open-source distributed event streaming platform, it is subject to the CAP theorem and can fully satisfy only two of the three guarantees. So in which area of CAP is Kafka located?
As you will see, Kafka leans towards CA, while using certain mechanisms to tolerate partition faults as much as possible. So our scope now is guaranteeing high availability together with data consistency. A Kafka cluster ships with many default configurations, but we need to adjust them to our system's requirements to achieve the best possible consistency and availability.
To do that, we first need to understand Kafka's key concepts.
- Kafka cluster: a cluster is a set of N nodes (Kafka brokers), and we should have at least three brokers per cluster. Why at least three and not two? Say your Kafka cluster has 2 brokers. To maintain high availability and data consistency, we set both the replication factor and the ISR to 2. The interesting part is the min-ISR setting. If you set min-ISR to 1 and the leader fails, you likely have no remaining in-sync replica. If you set min-ISR to 2, then when either the leader or the follower fails, producers and consumers cannot work. So the broker count should be an odd number of at least three.
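The two-broker dilemma above can be sketched as a small check. This is an illustrative function, not a Kafka API; `can_accept_writes` and its parameters are names I made up to model the rule that an acks=all write needs at least `min.insync.replicas` live, in-sync replicas:

```python
# Hypothetical model: can a partition still accept acks=all writes,
# given how many of its replicas are alive and the topic's min-ISR?
def can_accept_writes(alive_replicas: int, min_isr: int) -> bool:
    # A partition needs at least one live replica (the leader) and at
    # least min.insync.replicas replicas in sync to accept the write.
    return alive_replicas >= max(min_isr, 1)

# 2-broker cluster, RF=2: either choice of min-ISR is problematic.
assert can_accept_writes(alive_replicas=1, min_isr=2) is False  # one broker down: writes blocked
assert can_accept_writes(alive_replicas=1, min_isr=1) is True   # writes continue, but with no backup replica

# 3-broker cluster, RF=3, min-ISR=2: survives one broker failure.
assert can_accept_writes(alive_replicas=2, min_isr=2) is True
```

With three brokers, RF=3 and min-ISR=2, losing one broker blocks neither producers nor the consistency guarantee, which is exactly why two brokers are not enough.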
For the ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. To create a deployment that can tolerate the failure of F machines, you should deploy 2F+1 machines. Thus, a deployment of three machines can handle one failure, and a deployment of five machines can handle two failures. Note that a deployment of six machines can still only handle two failures, since three machines are not a majority of six. For this reason, ZooKeeper deployments are usually made up of an odd number of machines.
Since ZooKeeper is a quorum-based system, you can also read about the quorum technique here: https://en.wikipedia.org/wiki/Quorum_(distributed_computing)
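The 2F+1 rule boils down to simple majority arithmetic, which can be verified with a one-line function (illustrative code, not part of ZooKeeper):

```python
# A majority-based ensemble of n nodes tolerates the failure of any minority.
def tolerated_failures(n: int) -> int:
    # A majority of n is n // 2 + 1, so up to n - (n // 2 + 1) = (n - 1) // 2
    # nodes can fail while a majority still survives.
    return (n - 1) // 2

assert tolerated_failures(3) == 1  # three machines handle one failure
assert tolerated_failures(5) == 2  # five machines handle two failures
assert tolerated_failures(6) == 2  # six machines still only handle two
```

Note how going from five to six machines buys you nothing; this is why odd ensemble sizes are the norm.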
Partitions: partitions are the unit of parallelism in Kafka. On both the producer and the broker side, writes to different partitions can be done fully in parallel, so the more partitions there are in a Kafka cluster, the higher the throughput you can achieve. To size a topic, measure the throughput you can achieve on a single partition for production (call it p) and for consumption (call it c), and let your target throughput be t. Then you need at least max(t/p, t/c) partitions. The per-partition throughput achievable on the producer depends on configurations such as batch size, compression codec, type of acknowledgment, replication factor, and so on.
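The sizing rule above can be expressed directly. The function name and the example throughput numbers are mine, purely for illustration:

```python
import math

# Minimum partition count for a target throughput t, given the measured
# per-partition producer throughput p and consumer throughput c
# (all three in the same units, e.g. MB/s).
def min_partitions(t: float, p: float, c: float) -> int:
    return math.ceil(max(t / p, t / c))

# Made-up example: target 100 MB/s, 10 MB/s per partition on the producer
# side, 20 MB/s on the consumer side -> the producer is the bottleneck,
# so you need 10 partitions.
assert min_partitions(t=100, p=10, c=20) == 10
```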
However, partitions can also lead to unavailability. Kafka achieves high availability via replication, as explained before: each partition can have multiple replicas, each stored on a different broker, so when the leader fails another broker becomes the leader, and electing a new leader takes some time. Consider a broker with a total of 2000 partitions, each with 2 replicas. Roughly, this broker will be the leader for about 1000 partitions. When this broker fails uncleanly, all those 1000 partitions become unavailable at exactly the same time. Suppose it takes 5 ms to elect a new leader for a single partition; it will then take up to 5 seconds to elect new leaders for all 1000 partitions. So, for some partitions, the observed unavailability can be 5 seconds plus the time taken to detect the failure.
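The back-of-the-envelope arithmetic in that example looks like this (the 5 ms figure is the illustrative assumption from the text, not a measured constant):

```python
# Worst-case unavailability after an unclean broker failure.
partitions_on_broker = 2000
replication_factor = 2
leaders_on_broker = partitions_on_broker // replication_factor  # ~half are leaders
election_ms_per_partition = 5  # assumed time to elect one new leader

worst_case_unavailability_ms = leaders_on_broker * election_ms_per_partition
assert leaders_on_broker == 1000
assert worst_case_unavailability_ms == 5000  # up to 5 s, plus failure-detection time
```

This is why a very high partition count per broker directly worsens your worst-case availability.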
More partitions also mean more memory pressure on clients. As the number of partitions increases, the producer accumulates messages in more per-partition buffers and needs to allocate memory proportional to the partition count. Likewise, the consumer fetches a batch of messages per partition, so the more partitions a consumer consumes, the more memory it needs.
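A rough estimate of the consumer side: each assigned partition can buffer up to `max.partition.fetch.bytes` (1 MiB by default in the Java consumer), so memory grows linearly with the assignment. The helper function is illustrative:

```python
# Rough upper bound on fetch-buffer memory for a consumer,
# based on the max.partition.fetch.bytes setting (default 1 MiB).
def consumer_fetch_memory_bytes(assigned_partitions: int,
                                max_partition_fetch_bytes: int = 1_048_576) -> int:
    return assigned_partitions * max_partition_fetch_bytes

# 1000 partitions at the default fetch size -> about 1 GB of buffered data.
assert consumer_fetch_memory_bytes(1000) == 1_048_576_000
```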
Replication factor: by default, the number of replicas per partition is 1. For high-availability production systems, the recommended setting is at least three replicas, which requires at least three Kafka brokers, as explained before. Note that with N brokers the replication factor cannot exceed N (RF ≤ N), since each replica must live on a different broker.
Min in-sync replicas (min-ISR): the minimum number of in-sync replicas (including the leader) that must be available for the producer to successfully send messages to the partition. This inversely impacts availability: the lower the min-ISR, the more availability and the less consistency, and vice versa. We should always keep min-ISR lower than the replication factor; for topics with RF 3, a min-ISR of 2 is recommended.
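The RF=3 / min-ISR=2 recommendation can be written out as topic settings. The property names `min.insync.replicas` and `unclean.leader.election.enable` are real Kafka topic configs; the replication factor is passed at topic-creation time, and the exact client call depends on which library you use, so this dict is only a sketch of the values:

```python
# Illustrative settings for a topic that balances availability and consistency.
topic_settings = {
    "replication.factor": 3,                  # three copies, one per broker (set at creation time)
    "min.insync.replicas": 2,                 # acks=all succeeds while 2 of 3 replicas are in sync
    "unclean.leader.election.enable": False,  # never promote an out-of-sync replica to leader
}

# The recommendation only works if min-ISR stays below RF,
# so one broker can fail without blocking writes.
assert topic_settings["min.insync.replicas"] < topic_settings["replication.factor"]
```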
Preferred leader election: Kafka is designed with the expectation that, at some point, machines and network communications fail. When a broker goes offline, one of the replicas becomes the new leader for the partition. When the broker comes back online, it has no leader partitions. Kafka keeps track of which broker is configured to be the preferred leader for each partition. Once the original broker is back up and in a good state, Kafka restores the data it missed in the interim and makes it the partition leader once more. Preferred leader election is enabled by default and occurs automatically unless you actively disable it. Typically, leadership is restored within five minutes of the broker coming back online. If the preferred leader has been offline for a very long time, though, it might need additional time to restore its missing data from the replicas. There is a small possibility that some messages are lost when switching back to the preferred leader; you can minimize the chance of data loss by setting the producer's acks property to all.
Unclean leader election: if all ISR replicas fail, an out-of-sync replica can be elected as leader. Setting this to true is not recommended at all, as it sacrifices the consistency of the system. It should be used only if you need 100% availability irrespective of consistency; as always, it depends on how your system's requirements balance A and C.
Acknowledgments: when configuring the producer, you choose how many replicas must commit a new message before it is acknowledged, using the acks configuration property. Take care with this property because it impacts consistency. If you set it to 0, the acknowledgment is sent without any guarantee that the message was written, which can lead to data loss; setting it to 1 means the message must be written at least to the leader replica. If you want a high level of consistency, set it to all, which means the message must be written to all in-sync replicas; this may reduce availability. The design always depends on system requirements, but either way, setting it to 0 is not recommended.
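The three acks levels can be summarized as "how many replicas must persist the record before the send succeeds". This is an illustrative function of mine, not a Kafka API:

```python
# Model of the acks setting: how many replicas must persist a record
# before the producer considers the send successful.
def replicas_that_must_ack(acks, min_isr: int) -> int:
    if acks == 0:
        return 0        # fire-and-forget: no write guarantee, data can be lost
    if acks == 1:
        return 1        # leader only: lost if the leader dies before replicating
    if acks == "all":
        return min_isr  # all in-sync replicas, enforced as at least min.insync.replicas
    raise ValueError(f"unknown acks setting: {acks!r}")

assert replicas_that_must_ack(0, min_isr=2) == 0
assert replicas_that_must_ack(1, min_isr=2) == 1
assert replicas_that_must_ack("all", min_isr=2) == 2  # strongest consistency of the three
```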
In the end, I would say that if you need to balance high availability with a good level of data consistency, you should run a Kafka cluster of at least three nodes (an odd number of brokers), with three replicas and a min-ISR of 2.
I hope I explained it clearly and that this article helps you understand how to configure your Kafka cluster the right way to achieve CA. I highly recommend reading the books Effective Kafka (a hands-on guide to building event-driven applications) and Designing Data-Intensive Applications. Also, if you need to understand CAP, you can read my article https://regoo707.medium.com/introduction-to-nosql-773d2c34ee00