Kafka
What is Kafka
Apache Kafka is a distributed event streaming platform.
It is open source.
It was developed to deal efficiently with real-time data feeds.
This means that it is largely used to build real-time data pipelines and data streaming apps.
Kafka can store records for a configurable amount of time.
This data is also persisted on disk, allowing recovery in case of failure.
Before using Kafka, consider whether you must preserve the order of messages (records). Kafka guarantees order only within a single partition, not across all the partitions of a topic.
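To make the idea of a configurable storage time more concrete, here is a minimal sketch of time-based retention on an append-only log. It is a toy model, not how Kafka is implemented: the names `RetainedLog`, `Record`, `retention_ms` and `enforce_retention` are hypothetical, and the sketch drops individual records, whereas a real broker removes whole log segments according to settings such as the topic-level `retention.ms`.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    timestamp_ms: int  # time the record was written (hypothetical field name)
    value: str


@dataclass
class RetainedLog:
    """Toy append-only log with time-based retention (illustrative only)."""

    retention_ms: int
    records: list = field(default_factory=list)

    def append(self, timestamp_ms: int, value: str) -> None:
        # Records are only ever added at the end of the log.
        self.records.append(Record(timestamp_ms, value))

    def enforce_retention(self, now_ms: int) -> int:
        """Drop records older than the retention window; return the number dropped."""
        cutoff = now_ms - self.retention_ms
        kept = [r for r in self.records if r.timestamp_ms >= cutoff]
        dropped = len(self.records) - len(kept)
        self.records = kept
        return dropped


log = RetainedLog(retention_ms=1000)
log.append(0, "a")
log.append(500, "b")
log.append(1200, "c")

# With now_ms=1600 the cutoff is 600, so "a" and "b" fall outside the window.
dropped = log.enforce_retention(now_ms=1600)
remaining = [r.value for r in log.records]
```

Until the retention window passes, records stay available, so a consumer can read them again after they have already been processed.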
Main Components
Producer
Sends records (messages) to Kafka topics.
Consumer
Reads and processes the records of one or more topics.
Topics
Are categories or channels where records are stored.
A topic as a whole does not maintain order: ordering is guaranteed only within each partition, not across the partitions of a topic.
Each topic can be divided into partitions.
Partitions
Are subdivisions of a topic that allow distribution and parallelization of data. Each partition is an ordered and immutable sequence of records.
Broker
A Kafka server that stores the data and serves the consumers.
Cluster
A group of brokers that work together to provide high availability and scalability.
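The way these components fit together can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not a Kafka client: the classes `Topic`, `Producer` and `Consumer` are hypothetical, no broker or network is involved, and the partition is chosen with a CRC32 hash of the record key as a stand-in for whatever partitioner a real client uses.

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy in-memory topic split into partitions (illustrative only)."""

    def __init__(self, name: str, num_partitions: int):
        self.name = name
        # Each partition is an append-only list of (key, value) records.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Assumption: stable hash of the key modulo the partition count.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)


class Producer:
    def send(self, topic: Topic, key: str, value: str) -> int:
        """Append a record to the partition chosen for its key; return that partition."""
        partition = topic.partition_for(key)
        topic.partitions[partition].append((key, value))
        return partition


class Consumer:
    def __init__(self):
        # (topic name, partition index) -> index of the next record to read
        self.positions = defaultdict(int)

    def poll(self, topic: Topic, partition: int) -> list:
        """Return the records not yet read from one partition, in partition order."""
        start = self.positions[(topic.name, partition)]
        records = topic.partitions[partition][start:]
        self.positions[(topic.name, partition)] = len(topic.partitions[partition])
        return records


orders = Topic("orders", num_partitions=3)
producer = Producer()
for i in range(3):
    producer.send(orders, "customer-a", f"a{i}")
    producer.send(orders, "customer-b", f"b{i}")

consumer = Consumer()
seen = []
for p in range(3):
    seen.extend(consumer.poll(orders, p))

# Records with the same key share a partition, so their relative order is kept.
values_a = [value for key, value in seen if key == "customer-a"]
values_b = [value for key, value in seen if key == "customer-b"]
```

In this model, records that share a key always land in the same partition and are read back in the order they were sent, while nothing is promised about how records from different partitions interleave. If ordering matters for a group of records, giving them a common key is one way to keep them together.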