Symptoms
When Event Streams is configured to use GlusterFS as a storage volume, the Kafka broker logs contain error messages similar to the following:
[2020-05-12 06:40:19,249] ERROR [ReplicaManager broker=2] Error processing fetch with max size 1048576 from consumer on partition <TOPIC-NAME>-0: (fetchOffset=10380908, logStartOffset=-1, maxBytes=1048576, currentLeaderEpoch=Optional.empty) (kafka.server.ReplicaManager)
org.apache.kafka.common.KafkaException: java.io.EOFException: Failed to read `log header` from file channel `sun.nio.ch.FileChannelImpl@a5e333e6`. Expected to read 17 bytes, but reached end of file after reading 0 bytes. Started read from position 95236164.
These errors indicate that Kafka was unable to read log segment files from the Gluster volume. This can cause replicas to fall out of sync.
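To confirm that a broker is affected, search its logs for the exception. The following is a minimal sketch that assumes Event Streams is running on Kubernetes; <kafka-broker-pod> and <namespace> are placeholders for your own pod and namespace names:
kubectl logs <kafka-broker-pod> --namespace <namespace> | grep EOFException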
Cause
See Apache Kafka issue KAFKA-7282 for background. GlusterFS has performance settings that allow requests for data to be served from Gluster replicas even when they are not in sync with the leader. This causes problems for Kafka when it attempts to read a replica log segment before it has been fully written by Gluster.
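You can inspect the current value of any of these settings with the gluster CLI. For example, the following command, where <volumeName> is a placeholder for the volume backing a broker, shows the write-behind setting, which is on by default:
gluster volume get <volumeName> performance.write-behind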
Resolving the problem
Apply the following settings to each Gluster volume that is used by an Event Streams Kafka broker:
gluster volume set <volumeName> performance.quick-read off
gluster volume set <volumeName> performance.io-cache off
gluster volume set <volumeName> performance.write-behind off
gluster volume set <volumeName> performance.stat-prefetch off
gluster volume set <volumeName> performance.read-ahead off
gluster volume set <volumeName> performance.readdir-ahead off
gluster volume set <volumeName> performance.open-behind off
gluster volume set <volumeName> performance.client-io-threads off
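The eight settings above can also be applied in one pass. The following is a minimal sketch, assuming a Bash shell and using <volumeName> as a placeholder for the target volume:
for opt in quick-read io-cache write-behind stat-prefetch read-ahead readdir-ahead open-behind client-io-threads; do
  gluster volume set <volumeName> performance.$opt off
done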
These settings can be applied while the Gluster volume is online. The Kafka brokers do not need to be modified; they will be able to read from the volume after the change is applied.
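One way to verify that the settings took effect is to list the volume's options (the all keyword lists every option) and confirm that each of the eight performance settings is now off:
gluster volume get <volumeName> all | grep performance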