Type: Bug
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.14
Component/s: HyperShift
Labels:

Severity: Critical
Regression: No
Sprint: Hypershift Sprint 243
sprint_count: 1
Release Blocker: Rejected
Blocked: False
Blocked Reason: None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

Installed an agent-based hosted cluster in a disconnected environment. The HostedCluster's conditions show that it is Available and Progressing, but it is also Degraded:

    Last Transition Time:  2023-09-28T20:11:35Z
    Message:               [certified-operators-catalog deployment has 1 unavailable replicas, community-operators-catalog deployment has 1 unavailable replicas, redhat-marketplace-catalog deployment has 1 unavailable replicas, redhat-operators-catalog deployment has 1 unavailable replicas]
    Observed Generation:   2
    Reason:                UnavailableReplicas
    Status:                True
    Type:                  Degraded
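
The Degraded message packs every affected deployment into one bracketed string. As a minimal sketch, a small shell filter can split it into one deployment name per line. The function name `unavailable_deployments` is hypothetical, and the pattern assumes the message keeps the "<name> deployment has N unavailable replicas" wording shown above.

```shell
# Hypothetical helper (not part of oc or HyperShift): read a HostedCluster
# condition message on stdin and print the name of each deployment that is
# reported with unavailable replicas, one per line.
unavailable_deployments() {
  grep -oE '[a-z0-9-]+ deployment has [0-9]+ unavailable replicas' |
    awk '{ print $1 }'
}
```

Piping the Message line from the condition above into `unavailable_deployments` would print the four catalog deployment names, one per line.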



The pods of these deployments are stuck in ImagePullBackOff, for example:

$ oc describe po -n clusters-hosted-0 certified-operators-catalog-df7997697-5sv67
Name:                 certified-operators-catalog-df7997697-5sv67
Namespace:            clusters-hosted-0
Priority:             100000000
Priority Class Name:  hypershift-control-plane
Service Account:      default
Node:                 master-0-2/192.168.123.70
Start Time:           Thu, 28 Sep 2023 16:13:55 -0400
Labels:               app=certified-operators-catalog
                      hypershift.openshift.io/control-plane-component=certified-operators-catalog
                      hypershift.openshift.io/hosted-control-plane=clusters-hosted-0
                      olm.catalogSource=certified-operators
                      pod-template-hash=df7997697
Annotations:          alpha.image.policy.openshift.io/resolve-names: *
                      hypershift.openshift.io/release-image:
                        registry.ocp-edge-cluster-0.qe.lab.redhat.com:5000/ocp/release@sha256:9cdd3d0a1bbe04aecbe19e9f0416114835d317a3e96926884fc49ce899e46306
                      k8s.ovn.org/pod-networks:
                        {"default":{"ip_addresses":["10.130.0.168/23"],"mac_address":"0a:58:0a:82:00:a8","gateway_ips":["10.130.0.1"],"routes":[{"dest":"10.128.0....
                      k8s.v1.cni.cncf.io/network-status:
                        [{
                            "name": "ovn-kubernetes",
                            "interface": "eth0",
                            "ips": [
                                "10.130.0.168"
                            ],
                            "mac": "0a:58:0a:82:00:a8",
                            "default": true,
                            "dns": {}
                        }]
                      openshift.io/scc: restricted-v2
                      seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:               Pending
SeccompProfile:       RuntimeDefault
IP:                   10.130.0.168
IPs:
  IP:           10.130.0.168
Controlled By:  ReplicaSet/certified-operators-catalog-df7997697
Containers:
  registry:
    Container ID:   
    Image:          from:imagestream
    Image ID:       
    Port:           50051/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        10m
      memory:     160Mi
    Liveness:     exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:    exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
    Startup:      exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
    Environment:  <none>
    Mounts:       <none>
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:            <none>
QoS Class:          Burstable
Node-Selectors:     <none>
Tolerations:        hypershift.openshift.io/cluster=clusters-hosted-0:NoSchedule
                    hypershift.openshift.io/control-plane=true:NoSchedule
                    node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                    node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                    node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Warning  Failed   55m (x47 over 11h)      kubelet  Failed to pull image "from:imagestream": rpc error: code = DeadlineExceeded desc = pinging container registry registry-1.docker.io: Get "https://registry-1.docker.io/v2/": dial tcp 52.1.184.176:443: i/o timeout
  Warning  Failed   34m (x34 over 11h)      kubelet  Failed to pull image "from:imagestream": rpc error: code = DeadlineExceeded desc = pinging container registry registry-1.docker.io: Get "https://registry-1.docker.io/v2/": dial tcp 18.215.138.58:443: i/o timeout
  Warning  Failed   19m (x12 over 3h25m)    kubelet  Failed to pull image "from:imagestream": rpc error: code = DeadlineExceeded desc = pinging container registry registry-1.docker.io: Get "https://registry-1.docker.io/v2/": dial tcp 34.194.164.123:443: i/o timeout
  Normal   BackOff  5m27s (x2234 over 11h)  kubelet  Back-off pulling image "from:imagestream"
  Normal   Pulling  32s (x105 over 12h)     kubelet  Pulling image "from:imagestream"
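
The events show the kubelet trying to reach registry-1.docker.io, which suggests that the unqualified image reference `from:imagestream` was never replaced with a mirrored image and was treated as a Docker Hub short name; that reading is an inference from the output above, not something the report confirms. As a rough sketch, the filter below summarizes which registries the pull attempts tried to contact. The function name `failed_registries` is hypothetical, and the pattern assumes the "pinging container registry <host>:" wording seen in these events.

```shell
# Hypothetical helper: read `oc describe pod` output on stdin and print each
# distinct registry host named in a "pinging container registry" error.
failed_registries() {
  sed -n 's/.*pinging container registry \([^:]*\):.*/\1/p' | sort -u
}

# Assumed usage against the pod from this report (needs cluster access):
#   oc describe po -n clusters-hosted-0 certified-operators-catalog-df7997697-5sv67 | failed_registries
```

In a disconnected environment, any host printed here that is not the local mirror registry points at an image reference that still needs to be redirected.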

Version-Release number of selected component (if applicable):

4.14.0-rc.2

How reproducible:

100%

Steps to Reproduce:

1. Install MCE 2.4 and the HyperShift operator on a 4.14 IPv4 disconnected hub cluster
2. Install a 4.14.0-rc.2 agent-based hosted cluster

Actual results:

The HostedCluster is degraded because the certified-operators-catalog, community-operators-catalog, redhat-marketplace-catalog, and redhat-operators-catalog pods are in ImagePullBackOff.
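
To confirm which pods are affected, one option is to filter the pod list of the hosted control plane namespace by status. This is only a sketch: the function name `pods_in_pull_backoff` is hypothetical, and it assumes the default `oc get pods --no-headers` column layout, where the third column is STATUS.

```shell
# Hypothetical helper: read `oc get pods --no-headers` output on stdin and
# print the name of every pod whose STATUS column shows an image pull failure.
pods_in_pull_backoff() {
  awk '$3 == "ImagePullBackOff" || $3 == "ErrImagePull" { print $1 }'
}

# Assumed usage against the namespace from this report (needs cluster access):
#   oc get pods -n clusters-hosted-0 --no-headers | pods_in_pull_backoff
```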

Expected results:

The catalog pods are Ready and the HostedCluster is not Degraded.

Additional info:

Mentioned on: Merge request - Fix for OCPBUGS-19929 in hypershift disconnected

Assignee: Juan Manuel Parrilla Madrid
Reporter: Elsa Passaro
QA Contact: Liangquan Li
Contributors: Lubov Shilin, Shelly Miron
Votes: 0
Watchers: 5
Created: 2023/09/29 8:26 AM
Updated: 2023/10/09 10:36 AM
Resolved: 2023/10/03 10:41 AM
