Latest revision as of 11:12, 21 September 2011
Virtual Machine Lock Manager
Summary
The virtual machine lock manager is a daemon which will ensure that a virtual machine's disk image cannot be written to by two QEMU/KVM processes at the same time. It provides protection against starting the same virtual machine twice, or adding the same disk to two different virtual machines.
Owner
- Name: Daniel Berrange
- Email: berrange-at-redhat-dot-com
Current status
- Targeted release: Fedora 16
- Last updated: 21-09-2011
- Percentage of completion: 100%
Detailed Description
Virtual machines running via the QEMU/KVM platform do not currently acquire any kind of lock when starting up. This means it is possible for the same virtual machine to be accidentally started more than once, or for the same disk image to be accidentally added to two different virtual machines. The result of such a mistake is likely to be catastrophic destruction of the virtual machine's filesystem.
The virtual machine lock manager is a framework embedded in the libvirtd daemon that allows for pluggable locking mechanisms. The first available plugin, introduced in F16, integrates with the 'sanlock' program. This will protect against adding the same disk to two different virtual machines, and protect against libvirtd bugs where it might "forget" about a previously running virtual machine. If the administrator mounts a suitable shared filesystem (eg, NFS) in /var/lib/libvirt/lockd then the lock manager protection will be extended to all hosts sharing that filesystem.
Later Fedora releases will introduce alternative lock manager implementations.
Benefit to Fedora
Hosts running virtual machines for QEMU/KVM will have much stronger protection against administrator host/cluster configuration mistakes. This will reduce the risk that a virtual machine's disk image will become corrupted as a result.
Scope
The changes are confined to the libvirt and sanlock packages
- The new 'sanlock' RPM is introduced to Fedora
- The new 'libvirt-lock-sanlock' sub-RPM is introduced to the libvirt.spec file
- The /etc/libvirt/qemu.conf file will gain a configuration parameter to set the lock manager implementation
- A new /etc/libvirt/qemu-sanlock.conf file is introduced for sanlock lock manager configuration
How To Test
There are no special hardware requirements for testing this feature, beyond those already required for running QEMU/KVM virtual machines.
General host setup
Install libvirt, KVM, etc. as per normal practice. Additionally, use yum to install the 'libvirt-lock-sanlock' and 'sanlock' RPMs, along with the 'augeas' RPM that provides the 'augtool' command.
The sanlock plugin requires a directory in which it will store leases. For single host protection, this directory can be a local filesystem, but for cross-host protection it needs to be a network filesystem like NFS, or cluster filesystem like GFS. By convention the directory should be '/var/lib/libvirt/sanlock'.
Each host that shares the same filesystem for leases needs to be allocated a *unique* host ID between 1 and 512.
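The range rule above is easy to get wrong when host IDs are assigned by hand or by script. The following is a minimal sketch of a check for it; the function name valid_host_id is hypothetical and not part of libvirt or sanlock. It only checks the 1 to 512 range and cannot tell whether the ID is unique among the hosts sharing the lease filesystem.

```shell
# Hypothetical helper (not part of libvirt or sanlock):
# succeeds only if $1 is a decimal integer in the range 1..512.
valid_host_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty input or any non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 512 ]
}
```

For example, `valid_host_id 3 && echo ok` would print "ok", while an ID of 0 or 513 is rejected.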
With this in mind the basic configuration for sanlock can be done with the following augeas commands:
$ augtool
augtool> set /files/etc/libvirt/qemu.conf/lock_manager "sanlock"
augtool> set /files/etc/libvirt/qemu-sanlock.conf/host_id 1
augtool> set /files/etc/libvirt/qemu-sanlock.conf/auto_disk_leases 1
augtool> set /files/etc/libvirt/qemu-sanlock.conf/disk_lease_dir "/var/lib/libvirt/sanlock"
augtool> save
Saved 1 file(s)
augtool> quit
Obviously, change the 'host_id' line to give a unique value for the host.
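When setting up several hosts, it may be convenient to generate the augtool commands rather than type them interactively. The sketch below prints the same four settings shown above, with the host ID taken from the first argument. The function name sanlock_augtool_commands is hypothetical, and the sketch assumes that augtool will read commands from standard input when they are piped to it.

```shell
# Hypothetical helper: print the augtool commands from this page,
# substituting the host ID given as $1. Nothing is written to disk here.
sanlock_augtool_commands() {
    cat <<EOF
set /files/etc/libvirt/qemu.conf/lock_manager "sanlock"
set /files/etc/libvirt/qemu-sanlock.conf/host_id $1
set /files/etc/libvirt/qemu-sanlock.conf/auto_disk_leases 1
set /files/etc/libvirt/qemu-sanlock.conf/disk_lease_dir "/var/lib/libvirt/sanlock"
save
EOF
}
```

On the second host of a pair, this could be used as `sanlock_augtool_commands 2 | augtool`, run as root. Reviewing the printed commands before piping them is a cheap way to confirm the host ID is the intended one.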
By default sanlock uses a software watchdog to ensure that the host is automatically hard rebooted if something goes wrong. An unexpected hard reboot is disruptive during testing, so disable the sanlock watchdog and then start the sanlock daemon:
$ echo 'SANLOCKOPTS="-w 0"' > /etc/sysconfig/sanlock
$ service sanlock start
Single host testing
- Follow the 'General host setup' instructions
- Restart the libvirtd daemon
- Provision two virtual machines
- Create a third disk image (eg dd if=/dev/zero of=/var/lib/libvirt/images/extra.img bs=1M count=100)
- Add the following XML to the configuration of both virtual machines
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/extra.img'/>
  <target dev='vdb' bus='virtio'/>
</disk>
- Start the first virtual machine
- Attempt to start the second virtual machine
The last step should fail, with a message that the disk image is already in use.
- Stop the first virtual machine
- Attempt to start the second virtual machine
The second VM should now successfully run
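Because the interesting step in this test is a command that is supposed to fail, a small wrapper can make a scripted run easier to read. The following is only a sketch: the helper name expect_failure and the guest names guest1 and guest2 are hypothetical, and the virsh calls appear in comments so that the block itself has no side effects.

```shell
# Hypothetical helper: succeed only when the wrapped command fails.
expect_failure() {
    if "$@"; then
        echo "UNEXPECTED SUCCESS: $*" >&2
        return 1
    fi
    echo "failed as expected: $*"
}

# Possible use for the single host test (guest names are assumptions):
#   virsh start guest1
#   expect_failure virsh start guest2   # disk image already in use
#   virsh destroy guest1
#   virsh start guest2                  # should now succeed
```

The wrapper reports only whether the command failed, so the error message from virsh should still be read to confirm that the failure was caused by the disk lock and not by something unrelated.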
Dual host testing
- Follow the 'General host setup' instructions, on both hosts
- Mount an NFS volume at /var/lib/libvirt/sanlock on both hosts
- Restart the libvirtd daemon on both hosts
- Provision a virtual machine
- Copy the virtual machine configuration to the second host
virsh dumpxml myguest > myguest.xml
virsh -c qemu+ssh://otherhost/system define myguest.xml
- Start the virtual machine on the first host
- Attempt to start the virtual machine on the second host
The last step should fail, with a message that the disk image is already in use.
- Stop the virtual machine on the first host
- Attempt to start the virtual machine on the second host
The VM should now successfully run on the second host.
Migration testing
- As per "Dual host testing"
- Attempt to migrate the running VM from the first host to the second host
Libvirtd failure testing
- As per "Single host testing"
- Start the first virtual machine
- Stop the libvirtd daemon, without stopping the VM
- Delete the files /var/run/libvirt/qemu/myguest.{pid,xml} (this orphans the VM from libvirtd)
- Start the libvirtd daemon
- Attempt to start the first virtual machine again
The last step should fail, with a message that the disk image is already in use.
- Find the orphaned QEMU process and manually kill it
- Attempt to start the first virtual machine again
The VM should now once again run successfully
User Experience
End users should see no difference in behaviour of QEMU/KVM virtualization during normal operation.
They will be prevented from making certain configuration/operational mistakes which would otherwise result in the same disk image being run twice
Dependencies
The feature is confined to the 'libvirt' package
Contingency Plan
The use of 'sanlock' is an explicit administrator 'opt in', thus no contingency plan is required. The user can simply run without a lock manager, in which case the behaviour will be identical to previous Fedora releases.
Documentation
The primary upstream documentation is at
- http://libvirt.org/locking.html
Release Notes
- The QEMU/KVM virtualization driver in libvirt includes an optional lock manager plugin to enforce exclusive access to the virtual machine disk images on a single host. This prevents multiple guests being started with the same disk image, unless the <shareable/> flag is set for the disk
- If a shared filesystem (eg NFS) is mounted at /var/lib/libvirt/lockd, the protection extends across multiple hosts in the network
- If configuring locking across multiple hosts it is important to ensure that all disk image paths are globally unique across all hosts sharing the same NFS mount, and that block devices use the stable unique names under /dev/disk/by-path/ and not the unstable /dev/sdNN names
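The last note about block device names can be checked mechanically before a guest is shared between hosts. The sketch below flags source paths that use the unstable /dev/sdNN style of name; the function name uses_unstable_dev_name is hypothetical. It does not verify that file paths are globally unique across hosts, which still has to be checked by the administrator.

```shell
# Hypothetical helper: succeed if $1 is an unstable /dev/sdNN style name.
uses_unstable_dev_name() {
    case "$1" in
        /dev/sd*) return 0 ;;   # kernel-assigned name, may differ per host or boot
        *)        return 1 ;;   # anything else, including /dev/disk/by-path/...
    esac
}
```

For example, `uses_unstable_dev_name /dev/sdb1 && echo "use /dev/disk/by-path/ instead"` would print the warning, while a path under /dev/disk/by-path/ would not.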