Kafka paper notes
Very good and not too complicated read. The paper is outdated by now: I wasn’t aware that the initial version had no built-in message replication between brokers. But the paper came out in 2011, and Kafka has developed a lot over the years.
Notes
- distributed messaging system
- a stream of messages of a particular type is defined by a topic
- to balance load, a topic is divided into partitions
- published messages are then stored at a set of servers called brokers
- messages are just payloads of bytes; it’s up to the user to define how to serialize them
- overall architecture is simple:
  - producers push messages to a topic
  - consumers poll the topic for any unread messages
// Sample producer code:
producer = new Producer(…);
message = new Message("test message str".getBytes());
set = new MessageSet(message);
producer.send("topic1", set);
// Sample consumer code:
streams[] = Consumer.createMessageStreams("topic1", 1);
for (message : streams[0]) {
  bytes = message.payload();
  // do something with the bytes
}
- storage is very simple: each partition of a topic corresponds to a logical log
- log is implemented as a set of segment files of approximately the same size (e.g. 1GB)
- for better performance, messages are flushed to disk only after a certain amount of time has passed
- consumers only see flushed messages
- there is no explicit message id; each message is addressed by its logical offset in the log
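Since a log is a set of segment files but consumers address messages by logical offset, the broker needs to map an offset to the segment that holds it. A minimal sketch of that lookup (class and method names are my own, not from the paper): keep the base offset of each segment sorted, then binary-search for the largest base offset not exceeding the requested one.

```java
import java.util.Arrays;

// Hypothetical sketch: a partition log as a sorted array of segment base
// offsets (the logical offset of each segment's first message). The segment
// holding a given offset is the one with the largest base offset <= offset.
class SegmentIndex {
    static int segmentFor(long[] baseOffsets, long offset) {
        int i = Arrays.binarySearch(baseOffsets, offset);
        // binarySearch returns -(insertionPoint) - 1 on a miss;
        // the owning segment is the one just before the insertion point.
        return i >= 0 ? i : -(i + 2);
    }
}
```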
- consumers always consume sequentially; once an offset is committed, it is acknowledged that all messages up to that offset have been consumed
- committed offset is always the offset of the next message to consume that still hasn’t been processed
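Because offsets in the original design are byte positions in the log, the offset to commit after a batch is just the starting offset plus the sizes of the messages processed. A tiny sketch of that arithmetic (names are mine, not the paper's):

```java
import java.util.List;

// Hypothetical sketch: the committed offset is always the offset of the NEXT
// message to consume, so after processing a batch the consumer commits the
// starting offset plus the size of every message it has handled.
class OffsetMath {
    static long nextOffset(long start, List<byte[]> messages) {
        long offset = start;
        for (byte[] m : messages) {
            offset += m.length; // the next message begins right after this one
        }
        return offset;
    }
}
```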
- each consumer usually pulls multiple messages at a time, up to a certain total size
- messages are never cached at the Kafka layer; it relies on the file system page cache and the Linux sendfile API to copy data from disk to the socket
- Kafka counts on near-real-time message consumption, so thanks to the write-through page cache the data is usually still cached when it’s read
- Kafka has the concept of consumer groups, which consist of one or more consumers that jointly consume a set of topics; each message is delivered to only one consumer in the group
- at any given time, messages from one partition can be consumed by only a single consumer within a consumer group
- a partition in a topic is the smallest unit of parallelism
- the original design uses Zookeeper for coordination, but this scaled poorly because each offset commit triggered the ZAB atomic broadcast protocol for quorum replication
- the initial rebalance design caused a hard stop of all processing, to avoid split-brain situations where two consumers in a group end up consuming from the same partition
- after the pause, partitions were reassigned across consumers in a deterministic way
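A deterministic assignment means every consumer can compute its own share without coordinating with the others: sort the member list, find your rank, and take the partitions that map to it. A minimal sketch (my own simplified round-robin, not the paper's exact algorithm):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a deterministic rebalance: each consumer sorts the
// group membership the same way, so everyone independently computes the same
// assignment and no extra coordination round is needed.
class DeterministicAssign {
    static List<Integer> assign(int numPartitions, List<String> members, String me) {
        List<String> sorted = new ArrayList<>(members);
        Collections.sort(sorted);
        int rank = sorted.indexOf(me);
        List<Integer> mine = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            if (p % sorted.size() == rank) mine.add(p); // round-robin over ranks
        }
        return mine;
    }
}
```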
- at-least-once delivery guarantee
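At-least-once means a consumer can see the same message again after a crash or rebalance (anything processed but not yet committed gets replayed). One common way to cope, sketched here as a hypothetical example rather than anything from the paper, is to track the last offset applied and skip duplicates:

```java
// Hypothetical sketch: making processing idempotent under at-least-once
// delivery by remembering the last logical offset already applied.
class DedupConsumer {
    private long lastApplied = -1;

    // Returns true if the message at `offset` should be processed,
    // false if it is a replayed duplicate.
    boolean apply(long offset) {
        if (offset <= lastApplied) return false; // already seen, skip
        lastApplied = offset;
        return true;
    }
}
```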