Live Migration with changing ROM sizes

February 20, 2018
qemu migration option-roms

a.k.a. size matters

Background on QEMU ROM

Virtual PCI devices support Option ROMs just like physical PCI devices. Option ROMs for PCI devices are used to enable additional functionality, such as PXE boot. The PCI spec implies that their size is a power of two. In QEMU the size allocated for such a ROM is derived from the size of the file backing it, rounded up to a power of two.

As an example, for the virtio network card the virtio-net-pci driver defines efi-virtio.rom as its default ROM file in hw/virtio/virtio-pci.c:2361. On a Xenial system, for example, this file is provided by the package ipxe-qemu at:

-rw-r--r-- 1 root root 237K Nov 22 18:15 /usr/lib/ipxe/qemu/efi-virtio.rom

             +------------------------------+
             | Virtual System               |
             |                              |
             |                              |
             |                              |
             |  +-----------+               |
             |  | Rom 2**x  |               |
             |  | 256k      |               |
             |  +---^-------+               |
             +------------------------------+
                    | file load defines size
                    |
             +------+
             | file |
             | 237k |
             +------+

This space is visible to the guest as a PCI mapping, so it must not change at runtime.

Implications on Migration

All of the above works fine as long as sizes of the ROM do not change. But remember that size is defined by loading from a file, so on a migration the target loads its local file. Things become complex if that file has a different size. For example, let’s look at the bionic version of the rom:

-rw-r--r-- 1 root root 294K Feb 5 14:09 /usr/lib/ipxe/qemu/efi-virtio.rom

When migrating between two virtual hosts, the target host allocates the smallest power-of-two size that fits the target host’s version of the ROM file. From above, the file size is 294k, which is larger than 256k (2 << 7), so the next power of two is 2 << 8, which results in a ROM space of 512k. The source guest, however, arrives with a 256k ROM, and the migration fails like this:

qemu-system-x86_64: Length mismatch: 0000:00:03.0/virtio-net-pci.rom: 0x40000 in != 0x80000: Invalid argument
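The two hex values in that error message can be reproduced with a little arithmetic. The following is a minimal sketch, not QEMU code: the helper name rom_region_size is made up for illustration, and it assumes the rounding rule described above (smallest power of two that fits the file). The byte counts are approximations taken from the "237K" and "294K" listings, since ls rounds the real sizes.

```python
KIB = 1024


def rom_region_size(file_size: int) -> int:
    """Return the smallest power of two that is >= file_size.

    Hypothetical helper for illustration; it mirrors the rounding rule
    described in the text, not an actual QEMU function.
    """
    if file_size <= 0:
        raise ValueError("file size must be positive")
    # (n - 1).bit_length() is the exponent of the next power of two >= n.
    return 1 << (file_size - 1).bit_length()


# Approximate sizes from the two directory listings (assumption: K = KiB).
xenial_rom = rom_region_size(237 * KIB)
bionic_rom = rom_region_size(294 * KIB)

print(f"xenial: {xenial_rom:#x}, bionic: {bionic_rom:#x}")
print("sizes match:", xenial_rom == bionic_rom)
```

Under these assumptions the Xenial file lands in a 0x40000 (256k) region and the Bionic file in a 0x80000 (512k) region, which are the two lengths the error message complains about.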

 +-------------------------+                 +-------------------------+
 | Virtual System          |                 | Virtual System          |
 |                         |                 |                         |
 |                         |                 |                         |
 |                         |                 |                         |
 |  +-----------+          |   Migration     |  +---------------------+|
 |  | Rom 2**x  |          | +----------->   |  | Rom 2**x            ||
 |  | 256k      |          |   breaks        |  | 512k                ||
 |  +---^-------+          |   as rom        |  +--^------------------+|
 +-------------------------+   size is       +-------------------------+
    file load defines size     guest visible    file load defines size
        |                      and not            |
 +------+                      allowed          +-----+------+
 | file |                      to change        | file       |
 | 237k |                                       | 294k       |
 +------+                                       +------------+

The content, on the other hand, is safe: the actual content of the ROM is migrated over. So even a file with different content on the target won’t break the migration.

Who shall fix it

This is a tricky question. To some extent the ROMs come from different projects than QEMU. So QEMU can’t perfectly map versions to ROMs, as the ROM files may change independently of QEMU.

You might say, “But hey, they bundle ROMs in the release tarballs”. That is correct, but won’t help due to differing distribution packaging policies. Some distributions have restrictions on bundling pre-built ROMs. Furthermore, different distributions might want to add or enable/disable different features in those ROMs. For example, Ubuntu has had https enabled for quite a while, which gives our ROMs a different size compared to Debian’s.

With all that in mind we can agree that upstream can’t handle it as it isn’t fully under their control. The actual projects of the various ROMs have even less control where and how they are bundled. QEMU upstream can provide some generic mechanisms to handle those cases nicely though.

                   +---------------+
                   | QEMU bundling |
                   |               |
                   | Versions      |
                   +-------+-------+
+--------------+           |           +------------------------+
| Rom projects |           |           | Distributions          |
|              +-----------------------+ might                  |
| code         |           |           | bundle other Versions  |
+--------------+           |           +-----------+------------+
                           |                       |
                           |                       |
+--------------+           |           +-----------+------------+
| Distribution |           |           | Distributions          |
| toolchain    +-----------------------+ enable/disable features|
| on build     |           |           | modifying code         |
+--------------+           |           +------------------------+
                   +-------v-------+
                   | Size of rom   |
                   |               |
                   | in a release  |
                   +-------+-------+
                           |
                +----------v----------+
                | Distribution decides|
                | which QEMU version  |
                | will it be aligned  |
                | with?               |
                +---------------------+

Eventually this is something that falls back on the Distributions as part of the integration of those projects and the compatibility for upgrades, migrations and similar.

Changing sizes but retaining migratability

When I first understood all the implications the runway was too short (close to Ubuntu 17.10 being released). I reverted my ipxe changes and took a look at how this has been handled so far. To my surprise, it seems that people mostly just hope it doesn’t happen. There isn’t a very clear solution to it.

First of all I added a build-time check that prevents any fix or change from accidentally changing the sizes. But at least for the next release I needed a solution that would not force us to keep an outdated ipxe.
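The post does not show the check itself, so the following is only a sketch of what such a guard could look like, not the check that went into the package. The function names and the idea of passing an expected-size table are assumptions for illustration; the rounding rule is the power-of-two one described earlier.

```python
import os


def rom_region_size(file_size: int) -> int:
    """Smallest power of two >= file_size (assumed rounding rule)."""
    return 1 << (max(file_size, 1) - 1).bit_length()


def check_rom_sizes(rom_dir: str, expected: dict) -> list:
    """Compare each ROM's rounded region size against an expected value.

    `expected` maps a ROM file name to the region size it must keep.
    Returns a list of human-readable problems; empty means all sizes match.
    Hypothetical helper, not part of any real packaging tool.
    """
    problems = []
    for name, want in sorted(expected.items()):
        path = os.path.join(rom_dir, name)
        got = rom_region_size(os.path.getsize(path))
        if got != want:
            problems.append(f"{name}: region size {got:#x}, expected {want:#x}")
    return problems
```

A package build could call check_rom_sizes on the freshly built ROM directory with a table such as {"efi-virtio.rom": 0x40000} and abort if the returned list is not empty. Checking the rounded size rather than the exact byte count lets the ROM content change freely as long as it stays within the same power-of-two bucket.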

I discussed with several people and the obvious options were all unacceptable:

I reached out further, and after a while the upstream discussion revealed that a few people had solved similar cases by mapping older machine types to files of a different size. As no better solution was found, I proposed such a change in sync with Debian. It hasn’t been picked up yet, but the discussions it triggered were good.

Proposed Solution for QEMU 2.11

Eventually for Ubuntu 18.04 I needed a solution and implemented the file mapping on the machine types. This solution consists of the following pieces:

That way on a migration QEMU will allocate based on the compat file. The size will match, and the content will get transferred. Migration is working just fine now.

 +-------------------------+                 +-------------------------+
 | Virtual System          |                 | Virtual System          |
 |                         |                 |                         |
 |                         |                 |                         |
 |                         |                 |                         |
 |  +-----------+          |   Migration     |  +---------------------+|
 |  | Rom 2**x  |          | +----------->   |  | Rom 2**x            ||
 |  | 256k      |          |   works         |  | 256k                ||
 |  +---^-------+          |                 |  +--^------------------+|
 +-------------------------+   but at the    +-------------------------+
    file load defines size     same time        file load defines size
        |                      new guests         |
 +------+                      will use the     +-----+------+
 | file |                      new ROMs         | compat-file|<-incoming machine
 | 237k |                                       | 237k       |  type selects
 +------+                                       +------------+
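The selection step in the diagram can be modelled in a few lines. This is a conceptual sketch only: the real change lives in QEMU's machine-type compat definitions, which are not reproduced here, and the machine type names and the compat file path below are illustrative assumptions rather than values taken from the actual patch.

```python
# Default ROM file for new guests (file name taken from the text above).
DEFAULT_ROM = "efi-virtio.rom"

# Hypothetical mapping: machine types that predate the size change keep
# pointing at a compat copy of the old, smaller ROM file.
COMPAT_ROMS = {
    "pc-i440fx-xenial": "compat/efi-virtio.rom",  # illustrative name and path
}


def romfile_for(machine_type: str) -> str:
    """Pick the ROM file a guest of the given machine type should load."""
    return COMPAT_ROMS.get(machine_type, DEFAULT_ROM)


# A guest migrating in with the old machine type loads the compat file,
# so the allocated ROM size matches what the source host announced.
print(romfile_for("pc-i440fx-xenial"))
# A freshly started guest on a newer machine type gets the new ROM.
print(romfile_for("pc-i440fx-bionic"))
```

The point of keying on the machine type is that it already travels with the guest definition, so the target host needs no extra information to pick a file of the right size.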

Even with all the effort to have it working, please never forget that in general it is recommended to upgrade the machine type when you can.

By the way - to help you on this there is an experimental snap called virt-machine-type.

Further thoughts

Even with the current solution I think I should at least move the mapping up to the non-arch-dependent compat.h. I have had some ideas, but no time yet to think any of them through or to write RFC code for them. Here they are for now:

Given enough time, I think something like that would be a better solution. But until then, mapping via the machine types seems to be the best we can do to provide clean migrations to users.