OCPBUGS-7480 (OpenShift Bugs)

Setting TuneD profile with realtime kernel results in nmi watchdog errors

      Description of problem:
      Attempting to enable the realtime-virtual-guest TuneD profile on VMware, or any profile that uses the realtime kernel, results in an error when TuneD tries to set kernel.nmi_watchdog=0, and the profile fails to install.

      Version-Release number of selected component (if applicable):
      Tested on OCP 4.10.31 and 4.12.2

      How reproducible:
      Always

      Steps to Reproduce:
      1. Start with a fresh install of OCP on VMware.
      2. Create a file rt-v-guest.yaml:

      apiVersion: tuned.openshift.io/v1
      kind: Tuned
      metadata:
        name: rt-v-guest
        namespace: openshift-cluster-node-tuning-operator
      spec:
        recommend:
        - profile: realtime-virtual-guest
          match:
          - label: node-role.kubernetes.io/worker
          priority: 20
      

      3. Apply it with:

      oc create -f rt-v-guest.yaml
      

      Actual results:
      After the Tuned profile is applied, the TuneD pod log for a worker node shows:

      oc logs tuned-hsxhb -n openshift-cluster-node-tuning-operator
      2023-02-14 16:46:54,014 INFO     tuned.daemon.daemon: starting tuning
      2023-02-14 16:46:54,016 INFO     tuned.plugins.base: instance cpu: assigning devices cpu2, cpu1, cpu0, cpu3
      2023-02-14 16:46:54,017 INFO     tuned.plugins.plugin_cpu: We are running on an x86 GenuineIntel platform
      2023-02-14 16:46:54,020 WARNING  tuned.plugins.plugin_cpu: your CPU doesn't support MSR_IA32_ENERGY_PERF_BIAS, ignoring CPU energy performance bias
      2023-02-14 16:46:54,022 INFO     tuned.plugins.plugin_bootloader: cannot read '/etc/default/grub'
      2023-02-14 16:46:54,026 INFO     tuned.plugins.base: instance net: assigning devices ens192
      2023-02-14 16:46:54,026 INFO     tuned.plugins.plugin_rtentsk: opened SOF_TIMESTAMPING_OPT_TX_SWHW socket
      2023-02-14 16:46:54,028 INFO     tuned.plugins.plugin_cpu: setting new cpu latency 3
      2023-02-14 16:46:54,029 ERROR    tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0': [Errno 524] Unknown error 524
      2023-02-14 16:46:54,029 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
      2023-02-14 16:46:54,308 INFO     tuned.plugins.plugin_bootloader: installing additional boot command line parameters to grub2
      2023-02-14 16:46:54,309 INFO     tuned.plugins.plugin_bootloader: cannot find grub.cfg to patch
      E0214 16:46:54.309755    4183 controller.go:854] unable to sync(daemon/) requeued (4)
      E0214 16:46:54.309913    4183 controller.go:854] unable to sync(daemon/) requeued (5)
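
      To make the failing lines in a log like the one above easier to spot, the output can be filtered down to TuneD ERROR entries and operator error lines (the E<MMDD> prefix). This is a minimal sketch, not part of the original report: the function name tuned_errors is hypothetical, and the pattern assumes the exact log layout shown above. For context, errno 524 is the Linux kernel-internal ENOTSUPP ("operation is not supported"), which has no userspace message text and is therefore printed as "Unknown error 524".

```shell
# Hypothetical helper (name is an assumption): read a TuneD pod log on stdin
# and print only the lines that report errors.
#   - TuneD daemon lines:  "<date> <time> ERROR    <module>: ..."
#   - operator (klog) lines: "E<MMDD> <time> ..."
tuned_errors() {
  grep -E '^[[:space:]]*([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:,]+ ERROR |E[0-9]{4} )'
}

# Example usage against the pod from this report (requires cluster access):
#   oc logs tuned-hsxhb -n openshift-cluster-node-tuning-operator | tuned_errors
```

      Applied to the log above, this would keep the kernel.nmi_watchdog failure and the two "unable to sync" lines and drop the INFO and WARNING entries.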
      

      Expected results:
      Profile is applied and guest restarts with the realtime kernel

      Additional info:
      The goal is to be able to isolate CPUs. Is this properly supported on VMware?

            jmencak Jiri Mencak
            rhn-support-dguthrie David Guthrie
            Sunil Choudhary Sunil Choudhary