Symptoms
When you run the kubectl get pods -n <namespace> -o wide command to check the Kafka broker pods, the output is similar to the following:
kubectl get pods -n <namespace> -o wide
NAME                    READY   STATUS    RESTARTS   AGE     IP              NODE                    NOMINATED NODE   READINESS GATES
affinity-controller-0   1/1     Running   0          2m46s   10.144.16.142   worker0.my-cluster.cp   <none>           <none>
affinity-controller-1   1/1     Running   0          2m46s   10.144.28.180   worker0.my-cluster.cp   <none>           <none>
affinity-xhmp6          1/1     Running   0          2m42s   10.144.28.136   worker0.my-cluster.cp   <none>           <none>
Kafka broker pods are scheduled on the same worker node, which reduces fault tolerance and resilience.
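To make the distribution easier to read on larger clusters, you can count pods per node. The following is a minimal sketch, not part of the product: the function name count_pods_per_node is hypothetical, and it assumes two-column input (pod name, node name) such as kubectl produces with the -o custom-columns and --no-headers options.

```shell
# Hypothetical helper: reads "POD_NAME NODE_NAME" lines on stdin and
# prints "<count> <node>" for each node, sorted by node name.
count_pods_per_node() {
  awk 'NF >= 2 { count[$2]++ } END { for (n in count) print count[n], n }' | sort -k2
}

# Example usage (replace <namespace> with your namespace before running):
#   kubectl get pods -n <namespace> --no-headers \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName | count_pods_per_node
```

A single output line that accounts for every broker pod indicates the symptom that is described here. Note that this sketch counts every pod in the namespace, so filter the input if you want to include only the Kafka broker pods.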
Causes
In Event Streams 12.0.x, the default pod anti-affinity rules are not optimized for KRaft-based deployments. If no custom rules are configured, the Kubernetes scheduler might place multiple Kafka broker pods on the same node.
Resolving the problem
To ensure Kafka broker pods are distributed across multiple nodes, you must configure custom pod anti-affinity rules manually in each Kafka Node Pool.
Update your EventStreams custom resource to include anti-affinity rules under template.pod.affinity in each entry of spec.strimziOverrides.nodePools.
For example:
apiVersion: eventstreams.ibm.com/v1beta2
kind: EventStreams
# ...
spec:
  strimziOverrides:
    kafka:
      # ...
    nodePools:
      - name: kafka
        replicas: 2
        roles:
          - broker
        storage:
          # ...
        template:
          pod:
            affinity:
              podAntiAffinity:
                preferredDuringSchedulingIgnoredDuringExecution:
                  - weight: 10
                    podAffinityTerm:
                      labelSelector:
                        matchExpressions:
                          - key: eventstreams.ibm.com/pool-name
                            operator: In
                            values:
                              - kafka
                      topologyKey: kubernetes.io/hostname
                  - weight: 5
                    podAffinityTerm:
                      labelSelector:
                        matchExpressions:
                          - key: eventstreams.ibm.com/cluster
                            operator: In
                            values:
                              - min-prod-scram
                          - key: eventstreams.ibm.com/component-type
                            operator: In
                            values:
                              - kafka
                      topologyKey: kubernetes.io/hostname
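The previous example defines two preferred pod anti-affinity rules that improve Kafka pod distribution across nodes. Because the rules are preferred rather than required, the scheduler can still place pods on the same node if no other suitable node is available.
- Rule 1 (weight 10): Aims to place pods from the same node pool (kafka in this example) on different hosts.
- Rule 2 (weight 5): Aims to place Kafka pods from the same Event Streams instance (min-prod-scram in this example) on different hosts.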
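The way the two weights combine can be illustrated with a simplified sketch. It assumes that each existing pod on a candidate node adds the weight of every rule it matches to a penalty for that node, and that nodes with a lower total penalty are preferred. The real Kubernetes scheduler also normalizes these scores and combines them with other scoring criteria, so treat the numbers as illustrative only. The function name anti_affinity_penalty and its arguments are hypothetical.

```shell
# Hypothetical helper: sums the weights of the example rules that one
# existing pod on a candidate node would match.
#   $1 = value of the pod's eventstreams.ibm.com/pool-name label
#   $2 = value of the pod's eventstreams.ibm.com/cluster label
#   $3 = value of the pod's eventstreams.ibm.com/component-type label
anti_affinity_penalty() {
  penalty=0
  # Rule 1 (weight 10): the pod belongs to the "kafka" node pool.
  if [ "$1" = "kafka" ]; then
    penalty=$((penalty + 10))
  fi
  # Rule 2 (weight 5): both expressions must match (same instance AND Kafka component).
  if [ "$2" = "min-prod-scram" ] && [ "$3" = "kafka" ]; then
    penalty=$((penalty + 5))
  fi
  echo "$penalty"
}
```

Under these assumptions, a node that already runs a broker from the same pool and instance has a penalty of 15, a node that runs a Kafka pod from a different pool in the same instance has a penalty of 5, and a node with no matching pods has a penalty of 0. The scheduler therefore favors the empty node first.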