Data
Data streamed in topics
Each data record is a key-value pair
Timestamps and additional headers supported
Topics split into partitions
Each data record stored in one partition
Record addressed by offset within partition
Configurable assignment of records to partitions
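
The data model above can be sketched as a small in-memory simulation. This is an illustration only, not a client for any real broker: the names `Record`, `Topic`, `choose_partition`, `append`, and `read` are hypothetical, and the CRC32 hash is a stand-in for whatever partitioning function a real system uses.

```python
import time
import zlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """A key-value pair with a timestamp and optional headers."""
    key: Optional[bytes]
    value: bytes
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))
    headers: dict = field(default_factory=dict)


class Topic:
    """A topic split into partitions; each partition is an append-only list."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]
        self._next = 0  # round-robin cursor for records without a key

    def choose_partition(self, record):
        # Assumed policy: hash the key so equal keys share a partition,
        # and spread keyless records round-robin.
        if record.key is None:
            partition = self._next
            self._next = (self._next + 1) % len(self.partitions)
            return partition
        return zlib.crc32(record.key) % len(self.partitions)

    def append(self, record):
        # Store the record in exactly one partition and return its address.
        partition = self.choose_partition(record)
        self.partitions[partition].append(record)
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition, offset):
        # A record is addressed by (partition, offset).
        return self.partitions[partition][offset]


topic = Topic("orders", num_partitions=3)
p1, o1 = topic.append(Record(b"customer-1", b"created"))
p2, o2 = topic.append(Record(b"customer-1", b"paid"))
p3, o3 = topic.append(Record(None, b"ping"))
```

Because both keyed records share a key, they land in the same partition at consecutive offsets, which is what preserves per-key ordering. Replacing `choose_partition` is the "configurable assignment" hook in this sketch.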
Brokers
Replicated broker cluster
Each broker stores data logs of some topic partitions
Data log retention period configurable
Partition replication configurable
Leader-follower architecture
Partition access done on the partition's leader broker
Leader election in case of leader failure
Producer may require a minimum number of in-sync replicas
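
The replication points above can be illustrated with a toy model of a single partition. It is a sketch under strong simplifying assumptions: followers copy each write synchronously, a failed broker simply drops out of the in-sync set, and the next in-sync replica in list order becomes leader. The names `PartitionReplicas`, `NotEnoughReplicas`, and `min_in_sync` are hypothetical.

```python
class NotEnoughReplicas(Exception):
    """Raised when a write cannot reach the required number of replicas."""


class PartitionReplicas:
    """One partition replicated across several brokers."""

    def __init__(self, broker_ids, min_in_sync=1):
        self.logs = {b: [] for b in broker_ids}  # one data log per broker
        self.in_sync = list(broker_ids)          # order doubles as election order
        self.leader = broker_ids[0]
        self.min_in_sync = min_in_sync

    def append(self, value):
        # Writes are accepted only while a leader and enough in-sync replicas exist.
        if self.leader is None or len(self.in_sync) < self.min_in_sync:
            raise NotEnoughReplicas(
                f"{len(self.in_sync)} in sync, {self.min_in_sync} required"
            )
        for broker in self.in_sync:
            self.logs[broker].append(value)
        return len(self.logs[self.leader]) - 1   # offset of the new record

    def fail(self, broker_id):
        # A failed broker leaves the in-sync set; if it led, elect a new leader.
        if broker_id in self.in_sync:
            self.in_sync.remove(broker_id)
        if broker_id == self.leader:
            self.leader = self.in_sync[0] if self.in_sync else None


partition = PartitionReplicas([1, 2, 3], min_in_sync=2)
partition.append("a")
partition.fail(1)            # leader fails, broker 2 takes over
leader_after_first_failure = partition.leader
partition.append("b")
partition.fail(2)            # only broker 3 remains in sync
try:
    partition.append("c")
    rejected = False
except NotEnoughReplicas:
    rejected = True
```

With `min_in_sync=2`, the third write is refused once only one replica is left, trading availability for the guarantee that every acknowledged record exists on at least two brokers.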
Clients
Producers
Can batch records when so configured
Can guarantee exactly-once delivery semantics
Can wait for confirmation from zero, one, or all in-sync replicas
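
Batching and the three confirmation levels can be sketched as follows. This is a minimal illustration, not a real producer: `BatchingProducer`, `required_confirmations`, the `transport` callback, and the `acks` values `0`, `1`, and `"all"` are assumptions chosen for the example, and the transport here only records what would be sent.

```python
def required_confirmations(acks, in_sync_count):
    """Translate a confirmation level into a number of replicas to wait for."""
    if acks == "all":
        return in_sync_count  # every in-sync replica must confirm
    if acks == 1:
        return 1              # the leader alone must confirm
    if acks == 0:
        return 0              # fire and forget
    raise ValueError(f"unsupported acks value: {acks!r}")


class BatchingProducer:
    """Buffers records and hands them to the transport one batch at a time."""

    def __init__(self, transport, batch_size, acks, in_sync_count):
        self.transport = transport  # hypothetical callable(batch, needed)
        self.batch_size = batch_size
        self.needed = required_confirmations(acks, in_sync_count)
        self.buffer = []

    def send(self, record):
        # Records accumulate until the batch is full.
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered, even if the batch is not full.
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.transport(batch, self.needed)


requests = []
producer = BatchingProducer(
    transport=lambda batch, needed: requests.append((batch, needed)),
    batch_size=2,
    acks="all",
    in_sync_count=3,
)
for value in ["a", "b", "c"]:
    producer.send(value)
producer.flush()
```

Three records with a batch size of two produce two requests: one full batch and one flushed remainder. Larger batches reduce per-request overhead at the cost of latency, and higher confirmation levels reduce the risk of loss at the cost of waiting longer.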
Consumers
Each consumer maintains its own position in each topic partition
Consumers in a group split the topic partitions among themselves
Can update topic position together with output in a single transaction
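
The consumer behavior above can be sketched in two parts: a round-robin split of partitions among group members, and a consumer that advances its position only together with its output. Both are simplified illustrations. `assign_partitions` and `TransactionalConsumer` are hypothetical names, round-robin is just one possible assignment strategy, and the "transaction" is simulated by computing the result first and then updating both fields in one step.

```python
def assign_partitions(partitions, consumers):
    """Split partitions among group members so each partition has one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


class TransactionalConsumer:
    """Reads one partition log and keeps its own position in it."""

    def __init__(self, log):
        self.log = log
        self.position = 0  # offset of the next record to process
        self.output = []

    def process_next(self, transform):
        if self.position >= len(self.log):
            return False
        # If transform raises, neither output nor position has changed,
        # so the same record is processed again on retry.
        result = transform(self.log[self.position])
        self.output, self.position = self.output + [result], self.position + 1
        return True


assignment = assign_partitions([0, 1, 2, 3, 4], ["c1", "c2"])

consumer = TransactionalConsumer(["a", "b"])
consumer.process_next(str.upper)


def failing(_record):
    raise RuntimeError("simulated processing failure")


try:
    consumer.process_next(failing)
except RuntimeError:
    pass
position_after_failure = consumer.position

consumer.process_next(str.upper)  # retry succeeds and advances the position
```

Because output and position move together, a failure between reading and writing neither loses the record nor emits it twice, which is the property that position updates inside a transaction are meant to provide.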
Stream processors