oVirt Engine 3.4.4 Interoperability with Fedora 21

By Paul Heinlein | Dec 19, 2014

At work, we maintain a cluster of a dozen hosts dedicated to running virtual machines. The cluster is managed by oVirt version 3.4.4. Version 3.5 is current, but for a variety of reasons we’re sticking with 3.4.4 for a while yet.

The cluster nodes and the oVirt Engine, essentially the cluster controller, currently run Fedora 19. The standard Fedora life cycle is that “Release X is supported until one month after the release of Release X+2.” Since Fedora 21 was released earlier this month, Fedora 19 is near the end of its life cycle.

So I thought it was time to try using Fedora 21 on a cluster node, the nodes being easier to update than the server running the Engine.

Getting Fedora 21 to work with Engine 3.4.4 took some work! The short version of the process is outlined below.

The Fedora 21 installation went very smoothly. I used a kickstart configuration with a pretty minimal package set. The network interfaces changed names, so it took a little trial and error to get the various interfaces and bridges named correctly, but I’d expected that problem and had planned accordingly.

I instructed the Engine to add the Fedora 21 host to the cluster. The process involves, among other things, the Engine telling the host which packages it needs to install to work in the oVirt environment. Again, that went fine. At the end of the process, the Engine reboots the newly added host.

Here’s where I encountered the first problem. Via a console connection, I could tell that the Fedora 21 host rebooted cleanly, but the Engine couldn’t connect to it. The logs showed a problem with the vdsm-no-mac-spoofing filter, which led me to Red Hat Bugzilla ticket 1138807. The workaround suggested in the ticket did the trick for me:

/usr/bin/systemctl stop vdsmd
/usr/bin/virsh -c remote://my.host.com nwfilter-undefine vdsm-no-mac-spoofing
/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
/usr/bin/virsh -c remote://my.host.com nwfilter-undefine vdsm-no-mac-spoofing
/usr/bin/systemctl start vdsmd

Once that procedure was complete, the Engine could see the host and control it via the power-management settings.

I created a test VM in the cluster. After the installation, I edited its profile to specify that it run only on the Fedora 21 host. The VM booted just fine, but not on the Fedora 21 host. The logs reported that a libvirt parameter called min_guarantee (concerning RAM availability) wasn’t supported in the software on Fedora 21. Another log search led to a different Bugzilla ticket, 1138340. Ticket comments suggest that the problem has been patched in Engine 3.4.4, but that wasn’t evident in our environment.

Fortunately, oVirt provides a hook mechanism to alter VM definitions. In this case, I had to write a hook to remove the min_guarantee element from the XML description of a VM, which the Engine passes to the host on which the VM will run. Hooks can be simple shell scripts, but I decided to take advantage of oVirt’s Python API to craft a more robust solution.

#!/usr/bin/python
#
# hook to remove the min_guarantee Element from the VM domain
# XML; newer versions of libvirt barf completely when they
# encounter a min_guarantee requirement
#
# install as /usr/libexec/vdsm/hooks/before_vm_start/10_remove_memtune
#
# ======================================================================
  
import hooking
# note: hooking.exit_hook takes a text message and one of three
# return codes: 0 (success), 1 (non-fatal error), 2 (fatal error,
# the default). i use exit_hook() to ensure hook actions get
# logged on exit.
  
# fetch the VM's XML definition
domxml = hooking.read_domxml()
  
# locate the memtune Element, exiting (with some verbosity)
# if it's not found; the exit should not be fatal however, just
# an acknowledgement that no memtune was found
try:
  memtune = domxml.getElementsByTagName('memtune')[0]
except IndexError:
  msg = 'No Element memtune found in domain definition'
  hooking.exit_hook( msg, 1 )
  
# locate the min_guarantee Element within memtune, exiting
# (non-fatally) with an acknowledgement that it wasn't found
try:
  minguarantee = memtune.getElementsByTagName("min_guarantee")[0]
except IndexError:
  msg = 'No Element min_guarantee found in memtune definition'
  hooking.exit_hook( msg, 1 )
  
# if the min_guarantee Element is present but cannot be removed,
# that's really bad. exit and let vdsm know that things went really
# awry.
try:
  memtune.removeChild(minguarantee)
except ValueError:
  msg = 'Unable to remove min_guarantee Element. Huh?'
  hooking.exit_hook( msg )
  
hooking.write_domxml(domxml)
hooking.exit_hook( "min_guarantee removed successfully", 0 )
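
The DOM handling in the hook can be tried out away from a VDSM host. What follows is only a minimal sketch, not part of the installed hook. It assumes that hooking.read_domxml() hands back an xml.dom.minidom document, so the standard library alone is enough to exercise the same calls. The sample XML is a made-up, heavily trimmed domain definition, and strip_min_guarantee is a hypothetical helper name.

```python
from xml.dom import minidom

# Hypothetical, heavily trimmed domain XML for illustration only; a real
# definition sent by the Engine carries many more elements.
SAMPLE_DOMXML = """<domain type='kvm'>
  <name>testvm</name>
  <memory unit='KiB'>1048576</memory>
  <memtune>
    <min_guarantee unit='KiB'>524288</min_guarantee>
  </memtune>
</domain>"""


def strip_min_guarantee(domxml):
    """Remove min_guarantee Elements found under memtune.

    Returns the number of Elements removed, so a caller can decide
    which message and return code to use.
    """
    removed = 0
    for memtune in domxml.getElementsByTagName('memtune'):
        # copy the matches to a plain list before altering the tree
        for node in list(memtune.getElementsByTagName('min_guarantee')):
            node.parentNode.removeChild(node)
            removed += 1
    return removed


if __name__ == '__main__':
    doc = minidom.parseString(SAMPLE_DOMXML)
    print('removed: %d' % strip_min_guarantee(doc))
    print(doc.toxml())
```

Returning a count rather than exiting keeps the function easy to test; a real hook would still need to call hooking.write_domxml() and hooking.exit_hook() afterwards, as the script above does.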

Once I installed that hook, I was able to launch the new VM on the Fedora 21 host. I still can’t get VMs to migrate either to or from the Fedora 21 host, but I suppose that’s another blog post for another day.