Kubernetes release 1.31.0

Urgent Upgrade Notes

Custom scheduler plugin developers **must** implement a QueueingHint for Pod/Update events if rejections from their plugins could be resolved by updating unscheduled Pods. This ensures the scheduler efficiently re-evaluates Pods after updates.

The --keep-terminated-pod-volumes kubelet flag has been removed. Ensure this flag is not in use before upgrading.
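One way to confirm the flag is not in use is to scan the kubelet command line on each node before upgrading. The following is a minimal sketch; the helper name `uses_removed_flag` and the sample command line are hypothetical, and how you collect the command line (systemd unit, process list, and so on) is left to your environment.

```python
import shlex

# Flag removed in 1.31, per the note above.
REMOVED_FLAG = "--keep-terminated-pod-volumes"


def uses_removed_flag(kubelet_cmdline: str) -> bool:
    """Return True if the given kubelet command line sets the removed flag."""
    for arg in shlex.split(kubelet_cmdline):
        # Match both the bare form and the "--flag=value" form.
        if arg == REMOVED_FLAG or arg.startswith(REMOVED_FLAG + "="):
            return True
    return False


if __name__ == "__main__":
    # Illustrative command line only; substitute the one from your node.
    sample = "kubelet --config=/var/lib/kubelet/config.yaml --keep-terminated-pod-volumes=true"
    print(uses_removed_flag(sample))
```

Any node for which this returns True needs its kubelet arguments cleaned up before the upgrade.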

If you use the RecoverVolumeExpansionFailure alpha feature, clear status.allocatedResourceStatuses on any existing PersistentVolumeClaims that carry the value "ControllerResizeFailed" or "NodeResizeFailed" **before** upgrading to avoid issues with volume expansion recovery.
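To find the affected claims, you could inspect a JSON dump of all PersistentVolumeClaims (for example, the output of `kubectl get pvc -A -o json`). The sketch below assumes the field is serialized as `status.allocatedResourceStatuses`, a map from resource name to status string; the function name `pvcs_needing_cleanup` is hypothetical.

```python
# Values that the note above says must be cleared before upgrading.
STALE_VALUES = {"ControllerResizeFailed", "NodeResizeFailed"}


def pvcs_needing_cleanup(pvc_list: dict) -> list[tuple[str, str]]:
    """Return (namespace, name) pairs for PVCs whose status holds a stale value.

    `pvc_list` is assumed to be a parsed PersistentVolumeClaimList object.
    """
    hits = []
    for pvc in pvc_list.get("items", []):
        # Assumed shape: {"storage": "NodeResizeFailed"}; may be absent or null.
        statuses = (pvc.get("status") or {}).get("allocatedResourceStatuses") or {}
        if STALE_VALUES & set(statuses.values()):
            meta = pvc["metadata"]
            hits.append((meta["namespace"], meta["name"]))
    return hits
```

The function only reports claims; clearing the status field itself is a separate step against the PVC status subresource.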

Critical Deprecations and Removals

The volume.beta.kubernetes.io/mount-options annotation for PersistentVolumes is deprecated. Migrate to the spec.mountOptions field on the PersistentVolume, or to the mountOptions field of the StorageClass for dynamically provisioned volumes.
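As an illustration of the migration, the sketch below rewrites a parsed PersistentVolume manifest so that the annotation's options move into `spec.mountOptions`. It assumes the annotation value is a comma-separated list of options and that `spec.mountOptions` is the intended target; the function name `migrate_mount_options` is hypothetical. It only transforms a manifest in memory and does not talk to the cluster.

```python
import copy

ANNOTATION = "volume.beta.kubernetes.io/mount-options"


def migrate_mount_options(pv: dict) -> dict:
    """Return a copy of the PV manifest with annotation options moved to spec.mountOptions."""
    out = copy.deepcopy(pv)
    annotations = (out.get("metadata") or {}).get("annotations") or {}
    raw = annotations.pop(ANNOTATION, None)
    if raw is None:
        # Nothing to migrate; return the copy unchanged.
        return out
    # Assumed format: "opt1,opt2=value,opt3".
    options = [opt.strip() for opt in raw.split(",") if opt.strip()]
    spec = out.setdefault("spec", {})
    existing = spec.get("mountOptions") or []
    # Preserve existing options and append only the ones not already present.
    spec["mountOptions"] = existing + [opt for opt in options if opt not in existing]
    return out
```

Review the resulting manifest before applying it, since not every volume type honors mount options.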

The CephFS (kubernetes.io/cephfs) and CephRBD (kubernetes.io/rbd) volume plugins are removed. Migrate to the Ceph CSI driver (https://github.com/ceph/ceph-csi/).

Non-CSI volume limit scheduler plugins (AzureDiskLimits, CinderLimits, EBSLimits, GCEPDLimits) are deprecated. Replace them with the NodeVolumeLimits plugin in your scheduler configuration.
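To check whether a scheduler configuration still references the deprecated plugins, you could walk a parsed KubeSchedulerConfiguration. This sketch assumes the configuration has already been loaded into a dict (for example with a YAML parser of your choice) and that plugins are listed under `profiles[].plugins.<extensionPoint>.enabled[].name`; the function name is hypothetical.

```python
# Plugin names listed as deprecated in the note above.
DEPRECATED_PLUGINS = {"AzureDiskLimits", "CinderLimits", "EBSLimits", "GCEPDLimits"}


def deprecated_plugins_in_use(config: dict) -> set[str]:
    """Return the deprecated plugin names enabled anywhere in the scheduler config."""
    found = set()
    for profile in config.get("profiles") or []:
        # Each key under "plugins" is an extension point such as "filter" or "multiPoint".
        for point in (profile.get("plugins") or {}).values():
            for entry in (point or {}).get("enabled") or []:
                if entry.get("name") in DEPRECATED_PLUGINS:
                    found.add(entry["name"])
    return found
```

An empty result means no profile explicitly enables a deprecated plugin; any names returned should be replaced with NodeVolumeLimits.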

The kubeadm RootlessControlPlane feature gate is deprecated. Use the core Kubernetes UserNamespacesSupport feature gate instead.

Legacy cloud provider integration code is removed. Use external cloud providers instead.

Major API Changes

The Dynamic Resource Allocation (DRA) driver's DaemonSet now requires a ServiceAccount with permissions to write ResourceSlice and read ResourceClaim objects.
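The permissions described above could be expressed as a ClusterRole bound to the driver's ServiceAccount. The sketch below builds such a manifest as a Python dict and prints it as JSON, which kubectl accepts. The role name and the exact verb lists are assumptions chosen for illustration; check your driver's documentation for the verbs it actually needs.

```python
import json


def dra_driver_cluster_role(name: str) -> dict:
    """Build a ClusterRole granting write access to ResourceSlices and read access to ResourceClaims."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {
                # Write access to ResourceSlice objects (verb list is an assumption).
                "apiGroups": ["resource.k8s.io"],
                "resources": ["resourceslices"],
                "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
            },
            {
                # Read-only access to ResourceClaim objects (verb list is an assumption).
                "apiGroups": ["resource.k8s.io"],
                "resources": ["resourceclaims"],
                "verbs": ["get", "list", "watch"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-dra-driver" is a placeholder name.
    print(json.dumps(dra_driver_cluster_role("example-dra-driver"), indent=2))
```

A ClusterRoleBinding that ties this role to the DaemonSet's ServiceAccount is still required and is not shown here.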

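The API server now treats Ingress.spec.defaultBackend as an atomic struct for server-side apply, so a field owner that sets any value inside it owns the whole struct. Controllers that modify the default backend port (for example, switching between a port number and a port name) should be aware of this change.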

The kube-proxy --nodeport-addresses option now supports the value "primary", causing it to listen only on the node's primary IP addresses. Review your kube-proxy configuration if you were relying on the default behavior of listening on all interfaces.
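When reviewing a configuration, it can help to classify what a given --nodeport-addresses value means. The sketch below is a simplified model, not kube-proxy's own parsing: it assumes an empty value means all local IPs, that "primary" is used on its own, and that every other entry is a CIDR. The function name is hypothetical.

```python
import ipaddress


def describe_nodeport_addresses(values: list[str]) -> str:
    """Describe where NodePort connections would be accepted for the given setting."""
    if not values:
        # Option unset: the default behavior described in the note above.
        return "all local IPs"
    if values == ["primary"]:
        return "primary node IPs only"
    for value in values:
        # Raises ValueError for anything that is not a valid CIDR,
        # including "primary" mixed with other entries (an assumption of this sketch).
        ipaddress.ip_network(value, strict=False)
    return "local IPs within: " + ", ".join(values)
```

For example, `describe_nodeport_addresses(["10.0.0.0/8"])` reports a CIDR-restricted setting, while an empty list flags a node that still relies on the listen-everywhere default.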

The ConsistentListFromCache feature, which was promoted to beta in 1.30, has been reverted due to issues. It is now disabled by default again.

Key New Features

Kubernetes 1.31 introduces Coordinated Leader Election as an alpha feature (behind the CoordinatedLeaderElection feature gate). This allows the control plane to use LeaseCandidate objects for leader election, enabling the kube-apiserver to select the best leader based on a defined strategy.

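The kube-proxy nftables mode (--proxy-mode=nftables) is promoted to beta, and its feature gate (NFTablesProxyMode) is now enabled by default. The mode must still be selected explicitly; iptables remains the default proxy mode on Linux. The nftables mode offers improved performance and scalability compared to the traditional iptables mode.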

The kubectl debug command now supports a set of --keep-* flags that control whether probes, labels, annotations, and initContainers are removed from the debug Pod. This allows for more flexibility when debugging Pods.

The WatchList feature, which lets clients obtain a consistent list from the API server by streaming it through a watch request, is promoted to beta. This can improve the performance and efficiency of list operations, especially in large clusters.

Important Bug Fixes

Fixed a bug where kubelet would fail to restart containers when fields other than the container image were changed in the Pod spec. This ensures that Pod updates are applied correctly.

Fixed an issue where Windows nodes were not implementing memory pressure eviction. This improves resource management on Windows nodes.

Fixed a bug that could cause Pods to get stuck in a pending state if they were rejected by pre-enqueue scheduler plugins. This ensures that Pods are scheduled efficiently.

Fixed a regression in kube-apiserver where watch events were not delivered correctly when watching a single namespace using the deprecated /api/v1/watch/namespaces/$name endpoint. This restores the expected behavior for this deprecated endpoint.

Fixed a bug where the Service LoadBalancer controller did not correctly consider the status.loadBalancer.ingress[].ipMode field of a Service, potentially leading to incorrect Service status updates.