  1. OpenShift Bugs
  2. OCPBUGS-33034

SSH to OCP nodes asking for password after cluster upgrade to RHOCP 4.14



      Description of problem:

      The customer upgraded their OCP cluster from version 4.12.40 to version 4.13.37, and then from version 4.13.37 to version 4.14.16. The upgrade completed successfully.
      
      When the customer tries to SSH to the OCP nodes (master and worker) from the bastion machine after the upgrade, they are prompted for a password.
      ~~~
      [root@kcsgmocphanb01 .ssh]# ssh core@kcsgmocpmanb03.gmvayu1x.ktbcs
      core@kcsgmocpmanb03.gmvayu1x.ktbcs's password:
      ~~~
      
      This bug is expected to answer the following questions:
      
      - In OpenShift 4.13 and later, the location of the SSH keys changed, as described in the release notes - https://access.redhat.com/bounce/?externalURL=https%3A%2F%2Fdocs.openshift.com%2Fcontainer-platform%2F4.13%2Frelease_notes%2Focp-4-13-release-notes.html%23ocp-4-13-rhcos-ssh-key-location
      Even so, to make SSH to the nodes work, the customer had to copy the public key from "/home/core/.ssh/authorized_keys.d/ignition" to "/home/core/.ssh/authorized_keys". Why?
      
      - Why is the "10-disable-ssh-key-dir.conf" file not present in the "/etc/ssh/sshd_config.d/" directory on the node?
      
      - We checked the configuration files on multiple nodes and confirmed that the only line missing from '/etc/ssh/sshd_config' is 'Include /etc/ssh/sshd_config.d/*.conf'. Why was this line removed, or never added, during the upgrade?
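
The checks behind the last two questions can be repeated on any node with a small script. The following is only a sketch: the function names `has_dropin_include` and `list_dropins` are hypothetical helpers, not part of RHCOS or OpenShift, and the match assumes the `Include` keyword is written with the capitalization quoted in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of RHCOS or OpenShift): succeed when an
# sshd_config file contains an active Include line for the drop-in directory.
# Assumption: the keyword is spelled "Include" exactly as quoted in the report.
has_dropin_include() {
    local config_file="$1"
    # Match an uncommented "Include /etc/ssh/sshd_config.d/*.conf" line,
    # allowing leading whitespace and any spacing after the keyword.
    grep -Eq '^[[:space:]]*Include[[:space:]]+/etc/ssh/sshd_config\.d/\*\.conf[[:space:]]*$' "$config_file"
}

# Hypothetical helper: print the names of the *.conf drop-in files in a
# directory, or a notice when there are none.
list_dropins() {
    local dropin_dir="$1"
    local found=0
    local f
    for f in "$dropin_dir"/*.conf; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$f" ] || continue
        found=1
        printf '%s\n' "${f##*/}"
    done
    [ "$found" -eq 1 ] || printf 'no drop-in files in %s\n' "$dropin_dir"
}
```

On a node this would be called as `has_dropin_include /etc/ssh/sshd_config && echo present || echo missing` and `list_dropins /etc/ssh/sshd_config.d`. Running `sshd -T` as root prints the effective configuration, which may help confirm which `AuthorizedKeysFile` and `AuthorizedKeysCommand` settings are actually in force.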

      Actual results:

      SSH to the node prompts for a password. SSH to the OCP nodes does not work after the upgrade.

      Expected results:

      SSH to the node should not ask for a password. SSH to the OCP nodes should work after the upgrade.

      Additional info:

      Note:
      - The customer does not have the Compliance Operator installed in the cluster.
      - No MachineConfig has been created to change sshd_config.
      - The customer states that they have not changed any file manually and do not have any automation scripts or cron jobs.
      
      Workarounds tried so far:
      
      - The line "Include /etc/ssh/sshd_config.d/*.conf" is missing from the file /etc/ssh/sshd_config.
      When this line was added to the file manually on the node, the sshd service failed. The added line was removed and the sshd service restarted to make it active again.
      
      - Generated a new key pair with a more secure algorithm: `$ ssh-keygen -t ed25519`. SSH with the new private key still prompted for a password on the OCP nodes.
      
      - Ran this command on the client machine: `$ sudo update-crypto-policies --set DEFAULT`. It did not help.
      
      - Copied the private key to another machine on the same network (one of the master nodes) and tried SSH from there to another master node.
      It again asked for a password.
      
      - Checked the sshd status with `$ systemctl status sshd` on the OCP nodes where SSH was asking for a password. The service was active, but the output contained this error message:
      `ssh-rsa algorithm is disabled`
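
Since the sshd status mentions `ssh-rsa`, it may be worth recording which key types are actually stored on the node. The sketch below uses a hypothetical helper, `list_key_types`, and assumes that entries do not begin with an options field (such as `from="..."`), so that the first field of each line is the key type.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the key type of each entry in an
# authorized_keys-style file or a .pub file.
# Assumption: entries do not start with an options field, so the first
# whitespace-separated field is the key type (ssh-rsa, ssh-ed25519, ...).
list_key_types() {
    local key_file="$1"
    # Skip blank lines and comment lines, then print the first field.
    awk '!/^[ \t]*(#|$)/ { print $1 }' "$key_file"
}
```

For example, `list_key_types /home/core/.ssh/authorized_keys.d/ignition` on a node would show whether the key provisioned there is an RSA key or the newer ed25519 key.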
      
      ===========
      Workaround that helped in this case:
      
      The customer copied "/home/core/.ssh/authorized_keys.d/ignition" to "/home/core/.ssh/authorized_keys" on the master node:
      
      sh-4.4# cp -r authorized_keys.d/ignition /home/core/.ssh/authorized_keys
      [root@kcsgmocphanb01 .ssh]# ssh core@kcsgmocpmanb01.gmvayu1x.ktbcs
      Red Hat Enterprise Linux CoreOS 414.92.202403051622-0
      Part of OpenShift 4.14, RHCOS is a Kubernetes native operating system
      managed by the Machine Config Operator (`clusteroperator/machine-config`).
      WARNING: Direct SSH access to machines is not recommended; instead,
      make configuration changes via `machineconfig` objects:
      https://docs.openshift.com/container-platform/4.14/architecture/architecture-rhcos.html
      ---
      Last login: Fri Apr 26 10:44:46 2024 from 100.127.173.111
      ===========
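
To see on which other nodes the same workaround would change anything, the two key files can be compared before copying. This is a minimal sketch with a hypothetical helper, `missing_keys`; it compares whole lines verbatim, so a key whose comment field differs between the two files is reported as missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every line of the source key file that does
# not appear verbatim in the destination key file.
missing_keys() {
    local src="$1" dst="$2"
    # If the destination does not exist, every source line is missing.
    if [ ! -f "$dst" ]; then
        cat "$src"
        return 0
    fi
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # grep exits non-zero when nothing is printed, which is not an error here.
    grep -Fxv -f "$dst" "$src" || true
}
```

On a node, `missing_keys /home/core/.ssh/authorized_keys.d/ignition /home/core/.ssh/authorized_keys` prints nothing when every key from the Ignition file is already present in `authorized_keys`.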

       

            travier@redhat.com Timothée Ravier
            rhn-support-sdharma Suruchi Dharma