OCPBUGS-11757

csi-snapshot-controller-operator can't reconcile during upgrade


    • Important
    • No
    • Rejected
    • False
    • Can't upgrade from 4.11.20 to 4.12.10

      Description of problem:

      An upgrade from OCP 4.11.20 to 4.12.10 is stuck because csi-snapshot-controller-operator fails to reconcile.
      
      ===
      -bash-4.2$ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.11.20   True        True          20h     Working towards 4.12.10: 663 of 830 done (79% complete), waiting on csi-snapshot-controller
      ===
      -bash-4.2$ oc get pods
      NAME                                                READY   STATUS                       RESTARTS   AGE
      cluster-storage-operator-85f4dd454c-pd7f7           1/1     Running                      0          92s
      csi-snapshot-controller-55dd59f7cc-45j8l            1/1     Running                      0          36d
      csi-snapshot-controller-55dd59f7cc-g2pfp            1/1     Running                      0          36d
      csi-snapshot-controller-operator-6d74b68689-l5b86   0/1     CreateContainerConfigError   0          14s
      csi-snapshot-webhook-5d49775645-45s96               1/1     Running                      0          36d
      csi-snapshot-webhook-5d49775645-p8m44               1/1     Running                      0          36d
      ===
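When the pod list is long, a small filter can pick out the failing pod. The following is a minimal sketch, not part of the original report: `unhealthy_pods` is a hypothetical helper name, and it assumes the default `oc get pods` table layout with STATUS in the third column.

```shell
# Hypothetical helper: reads `oc get pods` table output on stdin and prints
# NAME and STATUS for every pod that is neither Running nor Completed.
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Example use against a live cluster (not executed here):
#   oc get pods | unhealthy_pods
#   oc describe pod <pod-name>   # the Events section usually states the cause
```

For the listing above, the only pod such a filter would report is csi-snapshot-controller-operator-6d74b68689-l5b86 with status CreateContainerConfigError.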
      As a temporary workaround, I changed runAsNonRoot: false => runAsNonRoot: true.
      However, applying the change produces a warning about runAsNonRoot:
      
      Warning: would violate PodSecurity "restricted:latest": runAsNonRoot != true (pod must not set securityContext.runAsNonRoot=false)
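To confirm which runAsNonRoot value the operator deployment currently carries, its securityContext can be printed directly. This is a sketch under assumptions: the namespace `openshift-cluster-storage-operator` and the deployment name `csi-snapshot-controller-operator` are inferred from the pod names above rather than stated in the report, and `show_security_context` is a hypothetical helper name.

```shell
# Assumed values, not stated in the report; adjust for your cluster.
NS="openshift-cluster-storage-operator"
DEPLOY="csi-snapshot-controller-operator"

# Hypothetical helper: prints the pod-level securityContext on the first line
# and the container-level securityContext entries on the second line.
show_security_context() {
  oc get deployment "$DEPLOY" -n "$NS" \
    -o jsonpath='{.spec.template.spec.securityContext}{"\n"}{.spec.template.spec.containers[*].securityContext}{"\n"}'
}

# Example use against a live cluster (not executed here):
#   show_security_context
```

The "restricted" Pod Security profile named in the warning expects runAsNonRoot to be true, so a value of false at either level would be consistent with the warning text.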
       

      Version-Release number of selected component (if applicable):

      The affected version we observed is 4.12.10.

      How reproducible:

      Upgrade from 4.11.20 to 4.12.10

      Steps to Reproduce:

      1. Install OCP 4.11.20 on an Azure cluster.
      2. Upgrade the cluster to version 4.12.10.
      

      Actual results:

      The upgrade is stuck because csi-snapshot-controller-operator fails to reconcile.

      Expected results:

      The upgrade completes without errors.

      Additional info:

      1. The cluster is running on Azure
      2. Installed using IPI
      3. Storage classes
      -bash-4.2$ oc get sc
      NAME                         PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
      azurefile-csi                file.csi.azure.com         Delete          Immediate              true                   40d
      managed-csi (default)        disk.csi.azure.com         Delete          WaitForFirstConsumer   true                   40d
      managed-premium              kubernetes.io/azure-disk   Delete          WaitForFirstConsumer   true                   37d
      managed-standard (default)   kubernetes.io/azure-disk   Delete          WaitForFirstConsumer   true                   37d
      netapp-trident-nas           csi.trident.netapp.io      Delete          Immediate              false                  15d

            Unassigned Unassigned
            rhn-gps-hhemied Hazem Hemied
            Jitendar Singh Jitendar Singh
            Hazem Hemied