Introduction to Kafka

manba_ yqq 2022-01-26 13:16:14

  1. What is Kafka? What are its use cases?
    Kafka is a high-throughput distributed message queuing system. It follows the producer-consumer model, guarantees first-in, first-out (FIFO) ordering within a partition, does not lose data, and by default deletes data once it is older than 7 days. Common use cases for a message queue are decoupling systems, buffering load peaks, and asynchronous communication.
  2. How Kafka produces, stores, and consumes messages
    The Kafka architecture is made up of the following parts: the producer (message producer); the consumer (message consumer); the broker (a server in the Kafka cluster that handles message read and write requests and stores messages; the Kafka cluster layer actually contains many brokers); the topic (a message queue or category, equivalent to a queue, following the producer-consumer model); and ZooKeeper (which stores metadata, including consumer offsets in older Kafka versions, topic information, and partition information).

Kafka organizes messages by topic. Put simply, you can picture a topic as a queue. Each topic is divided into many partitions, which allows parallelism. Within each partition, messages are strictly ordered, like an ordered queue, and every message has a sequence number called the offset, for example 0 to 12. Reads start from the front and writes go to the back. Each partition is assigned to one broker, and one broker can manage multiple partitions. For instance, if a topic has 6 partitions and there are two brokers, each broker takes care of 3 partitions.

A partition can be pictured simply as a file. When data arrives, it is appended to the end of that partition. Messages are not held in an in-memory buffer; they are written directly to the file. Kafka differs from many other messaging systems here: many of them delete a message once it has been consumed, whereas Kafka deletes messages according to a time-based policy rather than after consumption. In Kafka there is no concept of a message being "consumed", only of it being expired.
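The behaviour described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Kafka's implementation or API: the names `PartitionLog` and `assign_partitions` are hypothetical, the log lives in a Python list instead of a file, and the round-robin spread of partitions over brokers is an assumed, simplified policy.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy append-only log standing in for one partition (hypothetical, not Kafka's API)."""
    retention_seconds: float
    base_offset: int = 0                          # offset of the oldest retained message
    records: list = field(default_factory=list)   # (timestamp, payload) pairs, oldest first

    def append(self, payload, now=None):
        """Write at the tail and return the offset assigned to the message."""
        now = time.time() if now is None else now
        self.records.append((now, payload))
        return self.base_offset + len(self.records) - 1

    def read(self, offset):
        """Return the payload at `offset`; reading never deletes anything."""
        index = offset - self.base_offset
        if index < 0 or index >= len(self.records):
            raise IndexError(f"offset {offset} is expired or not yet written")
        return self.records[index][1]

    def expire(self, now=None):
        """Drop messages older than the retention window, whether or not they were read."""
        now = time.time() if now is None else now
        while self.records and now - self.records[0][0] > self.retention_seconds:
            self.records.pop(0)
            self.base_offset += 1


def assign_partitions(num_partitions, brokers):
    """Spread partitions over brokers round-robin (an assumed, simplified policy)."""
    assignment = {broker: [] for broker in brokers}
    for partition in range(num_partitions):
        assignment[brokers[partition % len(brokers)]].append(partition)
    return assignment
```

With 6 partitions and two brokers, `assign_partitions(6, ["broker-0", "broker-1"])` gives each broker 3 partitions, matching the example in the text. Note that `read` can be called repeatedly on the same offset, and only `expire` removes messages.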

The producer decides which partition to write to. Several strategies are available, hashing being one example. Each consumer maintains its own consumption offset. Every consumer belongs to a group. Within a group, the queue consumption model applies: each consumer consumes different partitions, so a message is consumed only once within the group. Between groups, the publish-subscribe consumption model applies: each group consumes independently and groups do not affect one another, so a message is consumed once by each group.
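As a rough illustration of these two ideas, the sketch below picks a partition by hashing a message key and then simulates delivery to consumer groups. Everything here is hypothetical: `choose_partition`, `assign_within_group`, and `deliver` are made-up names, CRC32 is only a stand-in for whatever hash a real client uses, and the round-robin assignment inside a group is an assumption.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a key to a partition with a stable hash (CRC32 is an assumed stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_within_group(partitions, consumers):
    """Give each partition to exactly one consumer of the group (round-robin)."""
    owners = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        owners[consumers[i % len(consumers)]].append(partition)
    return owners


def deliver(partitioned_messages, groups):
    """Return {group: {consumer: [messages]}}.

    Inside a group each partition has one owner (queue model);
    every group independently receives all partitions (publish-subscribe model).
    """
    result = {}
    for group, consumers in groups.items():
        owners = assign_within_group(sorted(partitioned_messages), consumers)
        result[group] = {
            consumer: [m for p in parts for m in partitioned_messages[p]]
            for consumer, parts in owners.items()
        }
    return result


if __name__ == "__main__":
    messages = {0: ["m0", "m3"], 1: ["m1"], 2: ["m2"]}
    print(deliver(messages, {"g1": ["c1", "c2"], "g2": ["c3"]}))
```

Because the same key always hashes to the same partition, messages sharing a key keep their relative order. In the simulated delivery, group `g1` splits the three partitions between its two consumers, while group `g2` receives every message through its single consumer.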

  3. Kafka's characteristics
    1. System model: producer-consumer, FIFO
    Ordering is FIFO within a partition but not across partitions. Of course, you can configure a topic with a single partition, which gives strict FIFO ordering.
    2. High performance: a single node supports thousands of clients and hundreds of MB/s of throughput, very close to the network card limit.
    3. Persistence: messages are persisted directly to ordinary disks, with good performance. Each write is appended straight to disk. The first advantage is that messages are persisted immediately, so data is not lost. The second is that writes are sequential and consumers then read sequentially as well; because disks perform better on sequential access, persistence and good performance go together.
    4. Distributed: replica redundancy, traffic load balancing, and scalability. With data replication, the same data can be stored on different brokers, so when one disk fails the data is not lost. With 3 replicas, for example, data is lost only if the disks on all 3 machines fail, which is very valuable under heavy use. Load balancing and scalability mean the cluster can be expanded online without stopping the service.
    5. Very flexible: long-lived message persistence plus client-maintained consumption state make consumption very flexible. First, messages are persisted for a relatively long time, such as a day or a week. Second, the client maintains its own consumption state and decides where to consume from, so it can set its own consumption offset.
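Characteristics 4 and 5 can be made concrete with a small sketch. It assumes a partition is just a Python list of retained messages; `OffsetTrackingConsumer` and `data_survives` are hypothetical names for illustration and are not part of any Kafka client.

```python
class OffsetTrackingConsumer:
    """Toy consumer that owns its read position (hypothetical, not a Kafka client)."""

    def __init__(self, log, start_offset=0):
        self.log = log              # shared view of a partition's retained messages
        self.offset = start_offset  # next offset this consumer will read

    def poll(self, max_messages=1):
        """Return up to max_messages starting at the current offset, then advance."""
        batch = self.log[self.offset:self.offset + max_messages]
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Move the read position, e.g. back to 0 to replay retained messages."""
        if not 0 <= offset <= len(self.log):
            raise ValueError(f"offset {offset} is outside the log")
        self.offset = offset


def data_survives(replica_brokers, failed_brokers):
    """Data is lost only when every broker holding a replica has failed."""
    return any(broker not in failed_brokers for broker in replica_brokers)
```

Since consuming only moves the consumer's own offset and never deletes anything, calling `seek(0)` lets the same consumer read the retained messages again. With replicas on three brokers, `data_survives` stays true until all three are in the failed set.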
Copyright: author [manba_ yqq]. Please include the original link when reprinting, thank you.