- Bug
- Resolution: Can't Do
- Normal
- None
- 4.15.z
- Important
- No
- 1
- 253 - Core Packages
- 1
- False
Description of problem:
I tried to update my SNO (single-node OpenShift) cluster from 4.15.5 to 4.15.11. However, the master MachineConfigPool became degraded, which prevented the upgrade from completing.
Version-Release number of selected component (if applicable):
4.15.5 -> 4.15.11
How reproducible:
Unknown. This is a customer lab environment, so I have not tried to reproduce it. If I had to reinstall, I would likely install 4.15.11 directly.
Steps to Reproduce:
1. Install custom patched kernel (for example: sudo rpm-ostree override replace kernel{,-core,-modules,-modules-extra}-5.14.0-284.59.1.rstat_blkio.el9_2.x86_64.rpm)
2. Do cluster upgrade: oc adm upgrade --to=4.15.11
3. Wait for upgrade to show error.
Actual results:
$ oc adm upgrade
Failing=True:
  Reason: ClusterOperatorDegraded
  Message: Cluster operator machine-config is degraded
info: An upgrade is in progress. Unable to apply 4.15.11: wait has exceeded 40 minutes for these operators: machine-config
Upgradeable=False
  Reason: DegradedPool
  Message: Cluster operator machine-config should not be upgraded between minor versions: One or more machine config pools are degraded, please see `oc get mcp` for further details and resolve before upgrading
Upstream is unset, so the cluster will use an appropriate default.
Channel: candidate-4.15 (available channels: candidate-4.15, candidate-4.16)
No updates available. You may still upgrade to a specific release image with --to-image or wait for new updates to be available

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.15.5    True        True          11h     Unable to apply 4.15.11: wait has exceeded 40 minutes for these operators: machine-config

$ oc get clusteroperator
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.15.11   True        False         False      11h
cloud-controller-manager                   4.15.11   True        False         False      32d
config-operator                            4.15.11   True        False         False      32d
dns                                        4.15.11   True        False         False      11h
etcd                                       4.15.11   True        False         False      32d
image-registry                             4.15.11   True        False         False      11h
ingress                                    4.15.11   True        False         False      32d
kube-apiserver                             4.15.11   True        False         False      32d
kube-controller-manager                    4.15.11   True        False         False      32d
kube-scheduler                             4.15.11   True        False         False      32d
kube-storage-version-migrator              4.15.11   True        False         False      6d19h
machine-approver                           4.15.11   True        False         False      32d
machine-config                             4.15.5    True        True          True       32d     Unable to apply 4.15.11: error during syncRequiredMachineConfigPools: [context deadline exceeded, failed to update clusteroperator: [client rate limiter Wait returned an error: context deadline exceeded, error MachineConfigPool master is not ready, retrying. Status: (pool degraded: true total: 1, ready 0, updated: 0, unavailable: 0)]]
marketplace                                4.15.11   True        False         False      32d
monitoring                                 4.15.11   True        False         False      11h
network                                    4.15.11   True        False         False      32d
node-tuning                                4.15.11   True        False         False      11h
openshift-apiserver                        4.15.11   True        False         False      11h
openshift-controller-manager               4.15.11   True        False         False      11h
operator-lifecycle-manager                 4.15.11   True        False         False      32d
operator-lifecycle-manager-catalog         4.15.11   True        False         False      32d
operator-lifecycle-manager-packageserver   4.15.11   True        False         False      11h
service-ca                                 4.15.11   True        False         False      32d
storage                                    4.15.11   True        False         False      32d
Expected results:
For the upgrade to complete.
Additional info:
$ oc get po -nopenshift-machine-config-operator
NAME                                         READY   STATUS                 RESTARTS   AGE
kube-rbac-proxy-crio-pstacn1-sut             0/1     CreateContainerError   0          9h
machine-config-controller-5b49c86f4b-p7lmd   2/2     Running                0          11h
machine-config-daemon-f79j6                  2/2     Running                0          11h
machine-config-operator-dbb55f546-6qpcx      2/2     Running                0          11h
machine-config-server-jcmp9                  1/1     Running                0          11h

$ oc logs machine-config-controller-5b49c86f4b-p7lmd
E0429 14:34:03.804410 1 render_controller.go:439] Error syncing Generated MCFG: %!w(*errors.StatusError=&{{{ } { <nil>} Failure Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again Conflict 0xc0029ccf60 409}})
E0429 14:34:03.806224 1 render_controller.go:461] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0429 14:34:03.806232 1 render_controller.go:378] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
E0429 14:34:03.847677 1 render_controller.go:439] Error syncing Generated MCFG: %!w(*errors.StatusError=&{{{ } { <nil>} Failure Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "master": the object has been modified; please apply your changes to the latest version and try again Conflict 0xc00086a600 409}})
E0429 14:34:03.849612 1 render_controller.go:461] Error updating MachineConfigPool master: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "master": the object has been modified; please apply your changes to the latest version and try again
I0429 14:34:03.849621 1 render_controller.go:378] Error syncing machineconfigpool master: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "master": the object has been modified; please apply your changes to the latest version and try again
I0429 14:34:08.767117 1 status.go:207] Pool worker: All nodes are updated with MachineConfig rendered-worker-70802875d6e4d855ddae1368c075d6bb
I0429 14:34:08.786413 1 status.go:224] Degraded Machine: pstacn1-sut and Degraded Reason: unexpected on-disk state validating against rendered-master-203f42c9596377c3f3ae47f1927c75a3: error running rpm-ostree kargs: exit status 1
Job for rpm-ostreed.service failed because the control process exited with error code.
See "systemctl status rpm-ostreed.service" and "journalctl -xeu rpm-ostreed.service" for details.
error: Loading sysroot: exit status: 1

$ sudo rpm-ostree status
Job for rpm-ostreed.service failed because the control process exited with error code.
See "systemctl status rpm-ostreed.service" and "journalctl -xeu rpm-ostreed.service" for details.
× rpm-ostreed.service - rpm-ostree System Management Daemon
     Loaded: loaded (/usr/lib/systemd/system/rpm-ostreed.service; static)
    Drop-In: /run/systemd/system/rpm-ostreed.service.d
             └─bug2111817.conf
             /etc/systemd/system/rpm-ostreed.service.d
             └─mco-controlplane-nice.conf
     Active: failed (Result: exit-code) since Mon 2024-04-29 14:42:13 UTC; 5ms ago
       Docs: man:rpm-ostree(1)
    Process: 1595530 ExecStart=rpm-ostree start-daemon (code=exited, status=217/USER)
   Main PID: 1595530 (code=exited, status=217/USER)
        CPU: 0
Apr 29 14:42:13 pstacn1-sut systemd[1]: Starting rpm-ostree System Management Daemon...
Apr 29 14:42:13 pstacn1-sut systemd[1]: rpm-ostreed.service: Main process exited, code=exited, status=217/USER
Apr 29 14:42:13 pstacn1-sut systemd[1]: rpm-ostreed.service: Failed with result 'exit-code'.
Apr 29 14:42:13 pstacn1-sut systemd[1]: Failed to start rpm-ostree System Management Daemon.
error: Loading sysroot: exit status: 1

$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-203f42c9596377c3f3ae47f1927c75a3   False     True       True       1              0                   0                     1                      32d
worker   rendered-worker-70802875d6e4d855ddae1368c075d6bb   True      False      False      0              0                   0                     0                      32d

$ oc get mc
NAME                                                GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                           6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
00-worker                                           6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
01-master-container-runtime                         6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
01-master-cpu-partitioning                                                                     3.2.0             32d
01-master-kubelet                                   6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
01-worker-container-runtime                         6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
01-worker-cpu-partitioning                                                                     3.2.0             32d
01-worker-kubelet                                   6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
02-master-workload-partitioning                                                                3.2.0             32d
15-master-hosts.yaml                                                                           3.2.0             32d
30-master-dnsmasq.yaml                                                                         3.2.0             32d
50-nto-master                                                                                                    26d
50-performance-n1-master                                                                       3.2.0             26d
97-master-generated-kubelet                         6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             11h
97-worker-generated-kubelet                         6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             11h
98-master-generated-kubelet                         6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
98-worker-generated-kubelet                         6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
99-master-etc-udev-rulesd-renic                                                                3.2.0             20d
99-master-generated-kubelet                         6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             26d
99-master-generated-registries                      6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
99-master-oneshot-script-service                                                               3.2.0             26d
99-master-ssh                                                                                  3.2.0             32d
99-worker-generated-registries                      6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             32d
99-worker-ssh                                                                                  3.2.0             32d
container-mount-namespace-and-kubelet-conf-master                                              3.2.0             32d
rendered-master-203f42c9596377c3f3ae47f1927c75a3    8437f354d88926efbf447472c640f27cc3764741   3.4.0             11h
rendered-master-286647e15c358a693c22faee520476c5    8437f354d88926efbf447472c640f27cc3764741   3.4.0             20d
rendered-master-56d89121965318828618a96b80dc948e    8437f354d88926efbf447472c640f27cc3764741   3.4.0             32d
rendered-master-5f7b95676f8e6789152ae7b533a12f40    8437f354d88926efbf447472c640f27cc3764741   3.4.0             26d
rendered-master-68f4353024a25c7106de5bace3e7791f    6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             11h
rendered-master-826411b944246a00333200aaa4a04924    8437f354d88926efbf447472c640f27cc3764741   3.4.0             6d21h
rendered-master-8d03c8220f3a19b505d4119606d31945    8437f354d88926efbf447472c640f27cc3764741   3.4.0             6d22h
rendered-master-92677ee03ce3852593a6567a4755f412    8437f354d88926efbf447472c640f27cc3764741   3.4.0             32d
rendered-master-a3dec86f75a0bbdbdbbfc3b7169af7b0    8437f354d88926efbf447472c640f27cc3764741   3.4.0             26d
rendered-master-af809de8e1c649425a147296bd9fcee1    8437f354d88926efbf447472c640f27cc3764741   3.4.0             6d17h
rendered-master-b0b73e8b86c7434a1e29d063e1ddca5c    8437f354d88926efbf447472c640f27cc3764741   3.4.0             32d
rendered-master-c7018a68ddd834546f657a54af7526f8    8437f354d88926efbf447472c640f27cc3764741   3.4.0             6d19h
rendered-master-d0e79cf4e402385678b23f12397fc47f    8437f354d88926efbf447472c640f27cc3764741   3.4.0             26d
rendered-master-d3ec6bec2b6493bd844f8c36c55bcca6    8437f354d88926efbf447472c640f27cc3764741   3.4.0             32d
rendered-master-d914d1c82db6323a363e784cbe8f9bbe    8437f354d88926efbf447472c640f27cc3764741   3.4.0             26d
rendered-master-ec50afe000ad68a263ebc6a5624facf5    8437f354d88926efbf447472c640f27cc3764741   3.4.0             6d21h
rendered-master-f75e5fe130ea5715ff2583ef2903f144    8437f354d88926efbf447472c640f27cc3764741   3.4.0             26d
rendered-worker-0d265722591f6031ea09a19f9122222f    8437f354d88926efbf447472c640f27cc3764741   3.4.0             32d
rendered-worker-3f927333966bd29f4033cd08c984cb04    8437f354d88926efbf447472c640f27cc3764741   3.4.0             26d
rendered-worker-4dcba50d0390a39a45178ac4216e191c    8437f354d88926efbf447472c640f27cc3764741   3.4.0             32d
rendered-worker-6f9f51bcc0e7d8e856c2479992d36992    8437f354d88926efbf447472c640f27cc3764741   3.4.0             32d
rendered-worker-70802875d6e4d855ddae1368c075d6bb    6e28938baecfe677ff6b69f46e2c889c1a7a0bb5   3.4.0             11h
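
For triage, the degraded pool in the `oc get mcp` output above can be picked out mechanically. The following is only a minimal sketch, not part of the original report: `degraded_pools` is a hypothetical helper name, and it assumes the default tabular output of `oc get mcp` with a populated CONFIG column (an empty column would shift the whitespace-separated fields).

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print the names of
# MachineConfigPools whose DEGRADED column is "True".
# Reads the default tabular output of `oc get mcp` on stdin.
# Assumption: every column is non-empty, so fields split cleanly on whitespace.
degraded_pools() {
  awk '
    NR == 1 {
      # Find the DEGRADED column by header name instead of a fixed position.
      for (i = 1; i <= NF; i++) if ($i == "DEGRADED") col = i
      next
    }
    # Print the pool name (first field) when the DEGRADED value is True.
    col && $col == "True" { print $1 }
  '
}
```

On a live cluster this would be used as `oc get mcp | degraded_pools`. An empty result before running `oc adm upgrade` would suggest that no pool is already degraded, although it says nothing about whether rpm-ostreed on the node is healthy.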