OpenShift Request For Enhancement / RFE-5522

OVN control-plane vs. data-plane skew within a z stream

    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Major
    • Hosted Control Planes, SDN

      1. Proposed title of this feature request

      OVN control-plane vs. data-plane skew within a z stream

      2. What is the nature and description of the request?

      This RFE requests investigation (and possibly persistent periodic testing) of OVN control-plane vs. data-plane skew within a z stream. For example, is a 4.y.0 control plane compatible with a 4.y.z data plane? And is a 4.y.z control plane compatible with a 4.y.0 data plane? If theory and testing show compatibility, the network operator could relax some of its current constraints during patch updates within a z stream (4.y.z to 4.y.z') without exposing the data-plane compute nodes to networking issues.
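As a rough illustration of what periodic skew testing might enumerate, the sketch below lists every control-plane/data-plane combination among a set of patch releases in one z stream. The function names (`parse_version`, `skew_pairs`) and the 4.16.z version numbers are hypothetical and exist only for this example; nothing here reflects an existing test harness.

```python
from itertools import permutations


def parse_version(version: str) -> tuple[int, int, int]:
    """Split an 'x.y.z' release string into integers."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def skew_pairs(releases: list[str]) -> list[tuple[str, str]]:
    """Return every (control_plane, data_plane) pair of distinct releases.

    All releases must share one 4.y minor version, because the question
    here is limited to skew within a single z stream.
    """
    minors = {parse_version(r)[:2] for r in releases}
    if len(minors) > 1:
        raise ValueError(f"releases span several minor versions: {sorted(minors)}")
    # permutations() yields both orderings, so each pair is covered with the
    # control plane ahead of the data plane and the other way around.
    return list(permutations(releases, 2))


# Example version numbers only; they are not taken from the RFE.
for control, data in skew_pairs(["4.16.0", "4.16.3", "4.16.7"]):
    print(f"control-plane {control} / data-plane {data}")
```

Covering both orderings matters because the RFE asks about a newer control plane with an older data plane as well as the reverse.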

      3. Why does the customer need this? (List the business requirements here)

      Hosted-cluster (HyperShift) administrators (Cluster Service Providers) may want to mitigate their management cluster's exposure to CVEs by deploying updated images, regardless of the state of hosted-cluster infrastructure such as compute-node reachability. Testing in OCPBUGS-32382 shows that today, breaking networking between the compute nodes and the control plane can cause the network operator to delay updating the management-cluster-side OVN control plane while it waits for the compute-side OVN data plane to update. Historically it was important for the OVN data plane to update before the OVN control plane, although perhaps this is less true since OVN IC? SDN-4057 also delivered a knob in this space, although it's not clear to me how a custom OVN_CONTROL_PLANE_IMAGE would interact with an in-progress update to a new release image. Would the network operator roll out the requested control-plane images without waiting on the data plane in that case?

      SDN-4057 also looks like a pretty generous grant of power to the admins who write the HostedCluster spec, and I'm curious whether there are any automated skew guards to cover things like "admin requested a 4.y control plane with a 4.(y-1) data plane, and OVN doesn't support that kind of skew". Technically this is out of scope for this RFE, but I'm floating it as adjacent in case it inspires anyone else to spin off work around that angle.
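To make the skew-guard idea concrete, here is a minimal sketch of the kind of check being wondered about. It assumes a policy in which patch-level skew within one 4.y is tolerated and any minor-version skew is rejected; that policy is exactly what this RFE asks to have investigated, so treat it as an assumption and not as established OVN behavior. `check_skew` is a hypothetical helper, not part of the network operator or HyperShift.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split an 'x.y.z' release string into integers."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def check_skew(control_plane: str, data_plane: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a control-plane/data-plane version pair.

    Assumed policy (not confirmed by the RFE): differing minor versions
    are rejected, differing patch versions are allowed.
    """
    cp = parse_version(control_plane)
    dp = parse_version(data_plane)
    if cp[:2] != dp[:2]:
        return False, (
            f"minor-version skew: control plane {control_plane}, "
            f"data plane {data_plane}"
        )
    if cp == dp:
        return True, "no skew"
    return True, f"patch-level skew within {cp[0]}.{cp[1]}"


# Example version numbers only.
print(check_skew("4.16.0", "4.15.9"))
print(check_skew("4.16.0", "4.16.5"))
```

A real guard would also need to know which release each custom image was built from, which a bare image reference does not reveal by itself.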

      Also in this space, RFE-1955 and OCPSTRAT-975 are tiptoeing into rollbacks within a z-stream, in which case a cluster could update from 4.y.0 to 4.y.z and then roll back to 4.y.0. That would introduce the same sort of OVN control-plane vs. data-plane skew if the network operator has the data-plane going first.

      4. List any affected packages or components.

      OVN/network-operator. Possibly also HyperShift, if the HostedControlPlane controller ends up needing changes.

            azaalouk Adel Zaalouk
            trking W. Trevor King