  1. OpenShift Virtualization
  2. CNV-20879

[2122119] Virtual machine fails to start with error "Unable to use native AIO: failed to create linux AIO context: Resource temporarily unavailable"


Details

    • CNV Virtualization Sprint 224, CNV Virtualization Sprint 225, CNV Virtualization Sprint 226, CNV Virtualization Sprint 227, CNV Virtualization Sprint 228, CNV Virtualization Sprint 229, CNV Virtualization Sprint 230, CNV Virtualization Sprint 231, CNV Virtualization Sprint 232
    • High

    Description

      Description of problem:

      OpenShift nodes default to fs.aio-max-nr = 65536. This is low: a host running many VMs can hit the limit, and VM startup then fails with "Resource temporarily unavailable" because io_setup fails with EAGAIN.

      On other virtualization platforms such as RHV and OpenStack, libvirtd raises this limit to 1048576 through a custom sysctl.conf [1], which makes the limit hard to hit. Because libvirtd is containerized in OpenShift Virtualization, this configuration is not applied.

      In my tests, every VM that has disks with aio=native increments fs.aio-nr by 1024 (it looks like the value was raised from 128 to 1024 [2]), and I can see io_setup called with nr_events=1024 in the strace of qemu-kvm. With the default of 65536, it is therefore easy to hit the limit on a host with many VMs.

      [1] https://github.com/libvirt/libvirt/commit/5298551e07a9839c046e0987b325e03f8ba801e5
      [2] https://github.com/qemu/qemu/commit/2558cb8dd4150512bc8ae6d505cdcd10d0cc46bb
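
To make the scale of the problem concrete, the figures reported above can be turned into a rough per-host ceiling. This is only a sketch: it assumes each VM adds exactly 1024 to fs.aio-nr (as observed in the tests above) and that nothing else on the host consumes AIO events. The function and constant names are illustrative, not part of any product.

```python
# Figures taken from the description above.
AIO_MAX_NR_DEFAULT = 65536     # default fs.aio-max-nr on OpenShift nodes
AIO_MAX_NR_LIBVIRT = 1048576   # value set by libvirtd's sysctl.conf on other platforms
EVENTS_PER_VM = 1024           # fs.aio-nr increment observed per VM (assumption: constant)


def max_vms(aio_max_nr: int, events_per_vm: int = EVENTS_PER_VM) -> int:
    """Upper bound on VMs that fit under aio_max_nr, ignoring other AIO users."""
    return aio_max_nr // events_per_vm


print(max_vms(AIO_MAX_NR_DEFAULT))  # 64
print(max_vms(AIO_MAX_NR_LIBVIRT))  # 1024
```

Under these assumptions the default allows at most 64 such VMs per host, while the value used on other platforms allows 1024.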

      Version-Release number of selected component (if applicable):

      OpenShift Virtualization 4.10.4

      How reproducible:

      100%

      Steps to Reproduce:

      1. Start a VM with a disk that uses aio=native. Preallocated file disks and block disks use aio=native by default.
      2. Observe that fs.aio-nr is incremented by 1024 for every new VM.
      3. With many VMs running on the host, fs.aio-nr reaches fs.aio-max-nr.
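
While following the steps above, the two counters can be watched on the node. The sketch below reads them from /proc/sys/fs, the standard Linux location for the fs.* sysctls, and estimates how many more VMs could start. The helper names are hypothetical, and the 1024-events-per-VM default is the increment observed in this report, not a guaranteed constant.

```python
from pathlib import Path

PROC_FS = Path("/proc/sys/fs")  # standard Linux location of the fs.* sysctls


def read_aio_usage(base: Path = PROC_FS) -> tuple[int, int]:
    """Return (fs.aio-nr, fs.aio-max-nr) as integers."""
    used = int((base / "aio-nr").read_text().strip())
    limit = int((base / "aio-max-nr").read_text().strip())
    return used, limit


def remaining_vm_starts(used: int, limit: int, events_per_vm: int = 1024) -> int:
    """Estimate how many more VMs can start, assuming each adds events_per_vm."""
    return max(limit - used, 0) // events_per_vm


# Only attempt the live read on a host that exposes these files.
if __name__ == "__main__" and (PROC_FS / "aio-nr").exists():
    used, limit = read_aio_usage()
    print(f"fs.aio-nr={used} fs.aio-max-nr={limit} "
          f"headroom={remaining_vm_starts(used, limit)} VMs")
```

When the estimated headroom reaches 0, the next VM start would be expected to fail with the error quoted in this report.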

      Actual results:

      Virtual machine fails to start with error "Unable to use native AIO: failed to create linux AIO context: Resource temporarily unavailable"

      Expected results:

      Raise the default fs.aio-max-nr limit so that a host can run more VMs.

      Additional info:


          People

            ibezukh Igor Bezukh
            rhn-support-nashok Nijin Ashok
            Kedar Bidarkar Kedar Bidarkar