Throttle Kafka consumption to the file writing speed
Created by: danesss
Issue
ECDC-3253: We currently read from Kakfa at max speed and buffer into memory, which can cause out-of-memory issues if Kafka is faster than the GPFS filesystem.
Description of work
We adapt the consumption rate to the file writing rate:
- We add a new configuration setting for the size of write queue
--max-queued-writes
- We add a method in MessageWriter to be able to check current queue size
- We extend StreamController to
- Perform a periodic check of the write queue size
- Pause all consumers for 200ms if queue is larger than configured
-
Topic
andPartition
classes have new methods to supportpause
andresume
of the consumers. - Unit tests are added to verify pause/resume behaviour at the
Partition
level.- To test the "queue_size -> pause" logic further changes are needed so they will be explored in separate PR: https://github.com/ess-dmsc/kafka-to-nexus/pull/689
Nominate for Group Code Review
-
Nominate for code review
Reminder
Changes should be documented in changes.md