Kafka Cluster - Architecture
- record (also called a message or log entry)
  - schema: {key, value, timestamp}
  - immutable once written
  - kept for a configurable retention period
- broker - a node in the cluster that hosts one or more partitions
- producer writes records to a broker
- consumer reads records from a broker (pull instead of push)
- topic - logical name with 1 or more partitions
- partitions are replicated (normally 3x)
- ordering is guaranteed within a partition (not across a whole topic)
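
A minimal in-memory sketch of the topic/partition model above, written to illustrate why ordering holds per partition but not per topic. It does not talk to a real cluster: the `Topic` class and its `crc32`-based partitioner are hypothetical stand-ins (Kafka's Java client hashes keys with murmur2), and replication is left out.

```python
import zlib
from typing import List, Tuple


class Topic:
    """Toy topic: a logical name over several append-only partition logs."""

    def __init__(self, name: str, num_partitions: int):
        self.name = name
        self.partitions: List[List[Tuple[bytes, str]]] = [
            [] for _ in range(num_partitions)
        ]

    def partition_for(self, key: bytes) -> int:
        # Assumption: a stable hash of the key modulo the partition count.
        # Records with the same key always land in the same partition.
        return zlib.crc32(key) % len(self.partitions)

    def append(self, key: bytes, value: str) -> Tuple[int, int]:
        """Append a record and return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


topic = Topic("orders", num_partitions=3)
for i in range(6):
    topic.append(b"user-a", f"a{i}")
    topic.append(b"user-b", f"b{i}")

# Within one partition, records for a key keep their write order.
pa = topic.partition_for(b"user-a")
user_a_values = [v for k, v in topic.partitions[pa] if k == b"user-a"]
```

Because each key maps to a single partition, a consumer of that partition sees `a0 .. a5` in write order, while nothing is promised about how `user-a` and `user-b` records interleave across partitions.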
Message Offset
- unique sequential id per partition
- each consumer keeps track of offset for each assigned partition
- this allows:
  - replays
  - consumers of different speeds
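
The offset bookkeeping above can be sketched as follows. This is a toy model rather than the Kafka client: `PartitionLog` and `Consumer` are hypothetical names, and `seek` only mirrors the idea of rewinding a consumer's position.

```python
class PartitionLog:
    """Append-only log; the list index of a record is its offset."""

    def __init__(self):
        self._records = []

    def append(self, value):
        self._records.append(value)
        return len(self._records) - 1  # offset assigned to the new record

    def read(self, offset, max_records):
        return self._records[offset:offset + max_records]


class Consumer:
    """Each consumer tracks its own position; the log itself is unchanged by reads."""

    def __init__(self, log, batch_size):
        self.log = log
        self.batch_size = batch_size
        self.offset = 0

    def poll(self):
        batch = self.log.read(self.offset, self.batch_size)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        self.offset = offset


log = PartitionLog()
for i in range(5):
    log.append(f"event-{i}")

fast = Consumer(log, batch_size=5)
slow = Consumer(log, batch_size=2)
fast_batch = fast.poll()   # reads all five records
slow_batch = slow.poll()   # reads only the first two

slow.seek(0)               # replay: rewind and read the same records again
replayed = slow.poll()
```

Since reading does not remove records, the fast and slow consumers progress independently, and a replay is just a matter of moving an offset backwards.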
Message Delivery Guarantees
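producer
- async (no guarantee) - the producer does not wait for any acknowledgement (acks=0)
- committed to leader - the partition leader has written the record (acks=1)
- committed to leader & quorum - the leader and all in-sync replicas have written the record (acks=all)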
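
As a rough illustration, the three producer levels correspond to the `acks` producer setting. The helper below only builds a configuration dictionary; the level names (`async`, `leader`, `all_in_sync_replicas`), the `producer_config` function, and the `localhost:9092` address are assumptions made for this sketch, while `acks` and `bootstrap.servers` are standard Kafka configuration keys.

```python
# Map the guarantee levels from these notes to values of the `acks` setting.
PRODUCER_ACKS = {
    "async": "0",                   # fire and forget, no acknowledgement
    "leader": "1",                  # acknowledged once the leader has the record
    "all_in_sync_replicas": "all",  # acknowledged once all in-sync replicas have it
}


def producer_config(guarantee: str, bootstrap_servers: str = "localhost:9092") -> dict:
    """Return a producer configuration dict for the requested guarantee level."""
    return {
        "bootstrap.servers": bootstrap_servers,  # placeholder address
        "acks": PRODUCER_ACKS[guarantee],
    }


strongest = producer_config("all_in_sync_replicas")
```

Stronger settings trade latency for durability: `acks=all` waits for the in-sync replicas, so it is usually paired with a suitable `min.insync.replicas` on the topic or broker.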
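consumer
- at-least-once (default) - offsets are committed after processing, so a failure can cause duplicates
- at-most-once - offsets are committed before processing, so a failure can lose records
- effectively-once - at-least-once delivery plus idempotent processing
- exactly-once (maybe) - relies on idempotent producers and transactions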
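
The difference between the consumer guarantees comes down to whether the offset is committed before or after a record is processed. The simulation below is a sketch under that assumption; `run_consumer` and its `crash_at` parameter are hypothetical and stand in for a consumer that fails between the two steps and later restarts from its last committed offset.

```python
def run_consumer(log, committed, processed, commit_before_processing, crash_at=None):
    """Consume `log` starting at offset `committed`; return the last committed offset.

    `crash_at` simulates a failure between the commit and process steps
    for the record at that offset.
    """
    offset = committed
    while offset < len(log):
        if commit_before_processing:            # at-most-once
            committed = offset + 1
            if offset == crash_at:
                return committed                # crashed: record never processed
            processed.append((offset, log[offset]))
        else:                                   # at-least-once
            processed.append((offset, log[offset]))
            if offset == crash_at:
                return committed                # crashed: offset never committed
            committed = offset + 1
        offset += 1
    return committed


log = ["a", "b", "c"]

# at-least-once: crash after processing "b" but before committing its offset
seen = []
resume = run_consumer(log, 0, seen, commit_before_processing=False, crash_at=1)
run_consumer(log, resume, seen, commit_before_processing=False)

# at-most-once: crash after committing the offset of "b" but before processing it
lost = []
resume = run_consumer(log, 0, lost, commit_before_processing=True, crash_at=1)
run_consumer(log, resume, lost, commit_before_processing=True)

# effectively-once: at-least-once delivery into an idempotent sink keyed by offset
store = {}
for off, value in seen:
    store[off] = value  # writing the same record twice has no extra effect
```

After the restart, `seen` contains `"b"` twice (a duplicate), `lost` is missing `"b"` entirely, and `store` ends up with each record exactly once because the duplicate write overwrites itself.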
Kafka Cluster - 5 Core APIs
- Producer API allows an application to publish a stream of records to one or more Kafka topics
- Consumer API allows an application to subscribe to one or more topics and process the stream of records produced to them
- Streams API allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams
- Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table
- Admin API allows managing and inspecting topics, brokers and other Kafka objects
