There is a comprehensive white paper from Confluent which explains how to increase throughput and which configurations to look at.
Basically, you have already taken the right steps by increasing `batch.size` and tuning `linger.ms`. Depending on how much potential data loss you can tolerate, you may also reduce `retries`. Another important factor for increasing throughput is to set a `compression.type` in your producer while keeping `compression.type=producer` at the broker level, so the broker stores the compressed batches as the producer sent them instead of recompressing them.
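For reference, this is what the relevant broker-side line in `server.properties` looks like (`producer` is also the broker default, so usually nothing needs to change there):

```
# server.properties (broker-level; "producer" is the default and means
# the broker retains whatever codec the producer used)
compression.type=producer
```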
Remember that Kafka scales with the number of partitions, and this can only happen if you have enough brokers in your cluster. Having many partitions that are all located on the same broker will not increase throughput.
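The partition count is set at topic creation; a sketch of the command, assuming a local broker on port 9092 (the topic name and counts are illustrative, not a recommendation):

```shell
# Spread load across 6 partitions, replicated over 3 brokers
kafka-topics.sh --create \
  --topic my-throughput-topic \
  --partitions 6 \
  --replication-factor 3 \
  --bootstrap-server localhost:9092
```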
To summarize, the white paper mentions the following producer configurations to increase throughput:
- `batch.size`: increase to 100000 - 200000 (default: 16384)
- `linger.ms`: increase to 10 - 100 (default: 0)
- `compression.type=lz4` (default: `none`)
- `acks=1` (default: 1)
- `retries=0` (default: 0)
- `buffer.memory`: increase if there are a lot of partitions (default: 33554432)
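Put together, these settings can be sketched as plain producer properties. A minimal, illustrative Java snippet using only string keys (the class name and the concrete values are assumptions, picked from the ranges above; in a real producer you would pass this `Properties` object to `KafkaProducer`):

```java
import java.util.Properties;

public class ProducerTuning {

    /** Throughput-oriented producer settings; values are illustrative. */
    public static Properties throughputProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        p.put("batch.size", "131072");       // larger batches (default 16384)
        p.put("linger.ms", "50");            // wait up to 50 ms to fill a batch
        p.put("compression.type", "lz4");    // compress batches on the producer
        p.put("acks", "1");                  // leader ack only
        p.put("retries", "0");               // accept potential data loss
        p.put("buffer.memory", "67108864");  // more buffer for many partitions
        return p;
    }

    public static void main(String[] args) {
        Properties p = throughputProps();
        System.out.println(p.getProperty("compression.type"));
    }
}
```

Note the trade-off encoded here: `acks=1` and `retries=0` favor throughput over delivery guarantees, which only fits if your use case tolerates data loss.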
Keep in mind that, in the end, each cluster behaves differently. In addition, each use case has a different message structure (volume, frequency, byte size, ...). Therefore, it is important to understand the producer configurations mentioned above and test their sensitivity on your actual cluster.