Fedora 12 includes a number of improvements in virtualization. New tools let system administrators easily perform tasks that were nearly impossible until now: reconfiguring a virtual machine off-line, adding new hardware to a VM without restarting it, migrating VMs to another host without restarting them, and many other exotic features. Let's hear what the developers have to say about these new options.
Featured interviewees
- Chris Wright (KVM Huge Page Backed Memory)
- John Cooper (KVM Huge Page Backed Memory)
- Mark McLoughlin (KVM Stable Guest ABI and KVM NIC Hotplug)
- Kevin Wolf (KVM qcow2 Performance)
- David Lutterkort (Network Interface Management)
- Daniel Berrange (VirtPrivileges)
- Glauber Costa (VirtgPXE)
- Dave Allan (VirtStorageManagement)
- Richard Jones (libguestfs)
Raw transcript
http://meetbot.fedoraproject.org/fedora-mktg/2009-10-22/fedora-mktg.2009-10-22-15.00.log.txt
guestfish and friends (libguestfs and libvirt)
mchua: Why don't we start with everyone introducing themselves briefly, and giving a sentence or two about what they do, and what virt features they worked on for F12?
<Intro> rwmjones: I'm a software engineer at Red Hat, and I am working on http://libguestfs.org/. libguestfs is a set of tools which you can use to examine and modify virtual machine images from outside (i.e. from the host), so for example if you had an unbootable guest, you could try to fix it by doing: virt-edit myguest /boot/grub/grub.conf
mchua: what would sysadmins have to do to fix that before libguestfs arrived?
<A> rwmjones: that's really tricky ... it was sort of possible using tools like kpartx and loopback mounts, but it was dangerous stuff, hard, and you had to be root. Now there are no root commands needed, and it's organized as nice little command line tools for each task with proper manual pages. I'd point people to the home page -- http://libguestfs.org/ -- to see lots of examples and documentation.
mchua: How do libguestfs capabilities in Fedora compare with how a sysadmin might do the same thing on other, non-Linux (or linux-but-on-another-distribution) platforms? Are there other similar tools?
<A> rwmjones: we've worked with Guido Gunther from Debian on getting parts of libguestfs packaged up for Debian. On Windows, Microsoft offer something called DiscUtils.Net which is similar but not nearly as powerful. So I'm confident Fedora is well ahead of everyone here.
mchua: Do you want to talk about the guestfish interface a bit?
<A> rwmjones: mchua, sure ... guestfish is one of the ways to get access to the libguestfs features, for use from shell scripts. The basic usage is to do:
guestfish -i yourguest # where yourguest is some guest name known by libvirt
and that gives you a shell where you can list files in the guest, edit them, look in directories, find out what LVs the guest has (or create new ones) ... literally 200 commands! That's all documented here: http://libguestfs.org/guestfish.1.html
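Since guestfish is meant for use from scripts, here is a minimal sketch of driving it non-interactively from Python by feeding commands on standard input. The helper names are hypothetical, and the argument order simply mirrors the `guestfish -i --ro Debian5x64` session quoted later in this interview; other guestfish releases may expect different options, so check the manual page before relying on it.

```python
import subprocess


def guestfish_argv(guest, read_only=True):
    """Build the command line for an inspected guestfish session."""
    argv = ["guestfish", "-i"]
    if read_only:
        argv.append("--ro")  # open the guest's disks read-only
    argv.append(guest)  # a guest name known to libvirt
    return argv


def guestfish_script(commands):
    """Join guestfish commands into the text fed to guestfish on stdin."""
    return "\n".join(commands) + "\n"


def run_guestfish(guest, commands, read_only=True):
    """Run the commands against the guest and return guestfish's output.

    Only works on a host with guestfish installed and a guest of that name.
    """
    result = subprocess.run(
        guestfish_argv(guest, read_only),
        input=guestfish_script(commands),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a suitable host, `run_guestfish("Debian5x64", ["cat /etc/debian_version"])` would be the scripted equivalent of typing that command at the guestfish prompt.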
mchua: Wow. That documentation is gorgeous.
<A> rwmjones: and if you run out of ideas, we have some "recipes" you can try out with guestfish: http://libguestfs.org/recipes.html
markmc: mchua, we've certainly all been put to shame by rwmjones docs :)
lutter: the power of OCaml ;)
Virtual upgrades to your Virtual Machine
<Intro> markmc: I'm an engineer at Red Hat, joined from Sun nearly 6 years ago. Previously worked on GNOME desktop related stuff, but have been working on virtualization for the past few years. For Fedora 12, I worked on the NIC Hotplug and Stable Guest ABI features, along with packaging, bug triaging and general shepherding of all the other virt bits. I work upstream on both qemu and libvirt, but a lot of my time is taken up by Fedora work these days.
Okay, the NIC hotplug feature - the ability to add a new virtual NIC while the guest is running - was a pretty obviously missing feature from our KVM support previously. The problem we had with implementing it is that libvirt is responsible for configuring the virtual NIC and passes a file descriptor to the qemu process when it starts it.
That's much harder to do when the guest is already running. So, most of the work involved some scary UNIX voodoo to allow passing that file descriptor between two running processes. As for use cases, people often want to add and remove hardware from their guests without re-starting them. You might want to add a guest to a new network, for example.
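The interview does not say which mechanism was used, but the standard Unix way to hand an open file descriptor to another running process is SCM_RIGHTS ancillary data on a Unix domain socket. The sketch below assumes that mechanism and is not libvirt's or qemu's actual code: a temporary file stands in for the configured network device, and both ends of the socket live in one process to keep the example self-contained.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, fd, label=b"nic0"):
    """Send one descriptor; the kernel duplicates it into the receiver."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                  array.array("i", [fd]).tobytes())]
    sock.sendmsg([label], ancillary)


def recv_fd(sock, maxlen=64):
    """Receive a label plus one descriptor sent with send_fd."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = sock.recvmsg(
        maxlen, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return msg, fds[0]


def demo():
    """The 'manager' end passes an open file to the 'emulator' end."""
    manager_end, emulator_end = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        with tempfile.TemporaryFile() as stand_in:  # stands in for the NIC fd
            stand_in.write(b"hello from the configured device")
            stand_in.flush()
            send_fd(manager_end, stand_in.fileno())
            label, received = recv_fd(emulator_end)
        # The sender's copy is closed now; the received copy still works.
        try:
            os.lseek(received, 0, os.SEEK_SET)
            payload = os.read(received, 64)
        finally:
            os.close(received)
    finally:
        manager_end.close()
        emulator_end.close()
    return label, payload
```

The point of the design is that the receiver never needs the privileges required to open or configure the device; it only receives an already-open descriptor.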
Now, the Stable Guest ABI feature is really quite boring, but is about preparing KVM so that we can maintain compatibility across new releases. The idea is that if you are running a Fedora 12 KVM host and you install a new host with Fedora 13, you might like to migrate your running guests from the Fedora 12 host to the Fedora 13 host, without re-starting them.
Now, as we add new features to qemu in Fedora 13, we might end up 'upgrading' the virtual machine's hardware. We might, for example, emulate a new chipset by default or add a new default NIC. The Stable Guest ABI feature means that when you migrate to the Fedora 13 host, the hardware emulated by qemu will remain the same for that guest.
As you can imagine, if you change around the hardware under a running guest, the guest may get seriously confused.
But it's not just about live migration - if you upgrade your host and restart your guest, not all guest OSes will like it if you've changed around the hardware.
Windows for example, with significant enough changes to the hardware, will require you to re-validate your license.
We want to avoid that happening when you upgrade your Fedora host.
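One way to picture the Stable Guest ABI idea is a toy model in which each guest records the concrete machine model it was created with, so a newer host keeps emulating the old hardware for existing guests while new guests get the new default. Everything here is illustrative: the release keys, the version strings and the function names are assumptions made for the sketch, not the real qemu or libvirt interfaces.

```python
# Illustrative only: the concrete machine model that the host's generic
# default points at on each release. The version strings are assumptions.
DEFAULT_MACHINE = {"fedora-12": "pc-0.11", "fedora-13": "pc-0.12"}

# Illustrative only: the machine models each host release can still emulate.
SUPPORTED = {
    "fedora-12": {"pc-0.11"},
    "fedora-13": {"pc-0.11", "pc-0.12"},
}


def define_guest(name, host_release):
    """Store the concrete machine model in the guest config at creation."""
    return {"name": name, "machine": DEFAULT_MACHINE[host_release]}


def start_guest(guest, host_release, supported=SUPPORTED):
    """Return the machine model to emulate for this guest on this host."""
    machine = guest["machine"]
    if machine not in supported[host_release]:
        raise RuntimeError(f"{host_release} cannot emulate {machine}")
    return machine
```

In this model a guest defined on the older host keeps its original machine model after moving to the newer host, which is the behaviour the feature is described as guaranteeing.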
Network scripts: complex no more
<Intro> lutter: David Lutterkort, software engineer at Red Hat, worked on http://fedorahosted.org/netcf (for the Network Interface Mgmt feature), in the past worked on ovirt and some of the virt-install tools. besides that, work some on http://deltacloud.org/, and http://augeas.net/
Network Interface Mgmt lets sysadmins set up fairly complex network configurations (e.g. a bridge with a bond enslaved) through a simple description of the config, using the libvirt API; in the past, that required intimate knowledge of ifcfg-* files and a lot of nailbiting. Having an API also means that such setups can be done by programs (e.g., centralized virt mgmt software or virt-manager).
mchua: Awesome. If I'm understanding you right, this means that now sysadmins can automate complex custom network configurations for VMs?
<A> lutter: Complex network configs on the host, generally ... a common request is 'how do I share a physical NIC between various VM's'; in the past, you had to manually go and edit ifcfg-* files. libvirt now has an API and XML description to make that setup much easier. The backend for the libvirt interface API is netcf, which is independent of virtualization, so you could use that to set up network configs in your VM's.
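To give a feel for what such an XML description might look like, here is a small sketch that builds a bridge definition enslaving one physical NIC. The element and attribute names follow the interface XML format as best remembered and should be checked against the netcf and libvirt documentation; the device names br0 and eth0 and the function name are placeholders.

```python
import xml.etree.ElementTree as ET


def bridge_xml(bridge, nic):
    """Describe a DHCP-configured bridge that enslaves one ethernet NIC.

    Element names are an assumption about the interface XML schema;
    verify them against the netcf/libvirt docs before use.
    """
    iface = ET.Element("interface", {"type": "bridge", "name": bridge})
    ET.SubElement(iface, "start", {"mode": "onboot"})
    protocol = ET.SubElement(iface, "protocol", {"family": "ipv4"})
    ET.SubElement(protocol, "dhcp")
    br = ET.SubElement(iface, "bridge", {"stp": "off"})
    ET.SubElement(br, "interface", {"type": "ethernet", "name": nic})
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    print(bridge_xml("br0", "eth0"))
```

A description like this is what a program would hand to the libvirt interface API (the virInterface* calls mentioned below) instead of editing ifcfg-* files by hand.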
mchua: Ahhh, okay - thanks for the clarification. How does this compare to how people would set up host network configs on other platforms?
<A> lutter: Right now this is exposed in the libvirt API; we're working (well, Cole Robinson is working) on exposing that in virt-manager so that people can say 'use this physical NIC for all my VM's' with one click. On other platforms, you either have to manually edit the network configs, which generally is only really possible for humans, not programs, or rely on the very dodgy, never-quite-right Xen networking scripts.
mchua: lutter, Is there a place where our readers can go to find out more about how to use the libvirt API? How do folks try these features out?
<A> lutter: There's a small amount of docs on the netcf site (I have to add more) and libvirt.org has API docs for the various virInterface* calls
mchua: lutter, I see instructions on how to test at https://fedoraproject.org/wiki/Features/Network_Interface_Management#How_To_Test
<A> lutter: There's also a blog post somebody else wrote on netcf http://linux-kvm.com/content/netcf-silver-bullet-network-configuration
mchua: lutter, Is there a place folks should be watching to see things go up as the F12 GA date approaches?
<A> lutter: besides bz ? ;)
mchua: *grin* what components should we be keeping track of?
<A> lutter: I don't know of a good central place where this gets summarized, though FWN has been pretty good at reporting on virt features. Besides that, watching the individual projects is everybody's best bet; libvirt, libguestfs, virt-install and virt-manager are the most important ones from a user's POV.
Is it all Sysadmin?
mchua: lutter, The user typically being a sysadmin?
<A> lutter: virt-manager is definitely for end users, not just sysadmins; virt-install somewhere in the middle, the others get fairly technical
mchua: lutter, What would be a use-case for an end-user using virt-manager? (I'm guessing there will be users reading this interview who may not have tried out virt stuff before, but who might read this and go "ooh, hey..." and try it out.)
<A> lutter: Try out rawhide w/o the risk of breaking your current system. Of course, that goes for any $OS ... in general, virt-manager is a graphical user interface to most/all virt features.
mchua: lutter, Ok - imagine I'm a new Fedora user, I've just installed F12, love it, want to get a preview of rawhide so I can see what's coming for F13. What do I need to install/run to get rawhide running in a VM? (If that process is quick and painless enough to put in a few "try this!" lines mid-interview.) I realize this is a pretty basic question, but I'd like to get virt used by as many folks as possible so that hopefully we'll have some of those folks going deeper and trying out the tools you've made.
<A> lutter: lemme dig around
rwmjones: lvcreate -n F13Rawhide -L 10G vg_yourhost; virt-install -v -n F13Rawhide --accelerate -r 512 -f /dev/vg_yourhost/F13Rawhide -c /tmp/Fedora-13-netinst.iso
markmc: rwmjones, hmm, no - I'd point people at virt-manager.
mchua, go to Applications -> System Tools -> Virtual Machine Manager
rwmjones: yeah virt-manager will be easier ...
markmc: mchua, (well, first install the 'Virtualization' group in Add/Remove Software), then click on New VM, choose a name for the guest, choose network install
mchua: lutter, ^^ (I think we've got it, no worries)
markmc: mchua, and then add a URL like http://download.fedoraproject.org/pub/fedora/linux/releases/12/Fedora/x86_64/os/
After that, the instructions in the wizard should be fairly self explanatory.
lutter: mchua, yeah, what markmc said
markmc: mchua, was that interview you did for f11 published anywhere? It would be good to link to it from https://fedoraproject.org/wiki/Virtualization/History
markmc: mchua, found it: http://jaboutboul.blogspot.com/2009/05/fedora-11-virtualization-reality.html
Some history about PXE
mchua: lutter, rwmjones, markmc, in a moment, I'd like to pull back and have the three of you talk with each other about how virt in Fedora has progressed in the past few releases.
<A> markmc: one sec - I'll cover the gPXE and qcow2 features, since the feature owners aren't here (in this case).
Okay, the gPXE feature is about replacing the boot ROMs used by qemu for PXE booting with newer versions. Basically, etherboot was the name of the project previously, but it's now called gPXE.
It's important that we made the switch to gPXE because all future upstream development (new features, bug fixes) will go into gPXE instead of etherboot.
The qcow2 performance feature was about taking a cold hard look at the qcow2 file format and fixing any major bottlenecks. Basically, we see qcow2 as a very useful format for virtual machine images. For example, the size of a qcow2 file is determined by the amount of disk space used by the guest, not the entire size of the virtual disk we're presenting to the guest, so the images should be smaller on disk, even if you copy them between hosts. Also, qcow2 supports a "copy on write" feature whereby you can base multiple guest images on the one base image. So you can reduce disk space further by installing one guest image and creating multiple qcow2 images backed by the first image, and yet the guests can still write to their disks! So, in summary, we want more people to use qcow2, but they couldn't because the performance was poor. Kevin Wolf put serious effort in upstream to iron out those kinks and obtain a serious speedup. Figures are in a table on the feature page.
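The copy-on-write idea can be shown with a toy in-memory model: an overlay image stores only the clusters a guest has written and falls back to its backing image for everything else. This is a sketch of the concept only, with hypothetical class and method names; it says nothing about the real qcow2 on-disk layout.

```python
class CowImage:
    """Toy disk image: a sparse cluster map with an optional backing image."""

    def __init__(self, backing=None):
        self.backing = backing  # shared base image, or None
        self.clusters = {}      # only clusters written to THIS image

    def read(self, index):
        """Read a cluster, falling through to the backing image if needed."""
        if index in self.clusters:
            return self.clusters[index]
        if self.backing is not None:
            return self.backing.read(index)
        return b"\0"  # never-written clusters read back as zeros

    def write(self, index, data):
        """Writes always land in this image; the backing image is untouched."""
        self.clusters[index] = data

    def allocated(self):
        """Number of clusters this image actually stores."""
        return len(self.clusters)
```

Two guests created as overlays on one installed base image each store only their own changes, which is where the disk space saving described above comes from.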
mchua: markmc, to backtrack a bit, why the switch from etherboot? - From what I've read, it sounds like the switch was actually requested by the etherboot upstream, in part.
<A> markmc: Yes, the etherboot project is no more; it is deprecated in favor of gPXE, but they're not completely identical, so there was some significant work involved ... done by Glauber Costa (our Brazilian joker) and Matt Domsch from Dell (AFAIR)
mchua: markmc, is gPXE being used by other OSes and distros too?
<A> markmc: yeah, it was Matt Domsch. It may be used by other distros, I'm not 100% sure about that. I'd imagine we're slightly ahead of the curve on this - upstream qemu is still using etherboot images.
Some history about virt-manager
rwmjones: mchua, I would say that in Fedora 6, which is where I really started off with Fedora, it was quite primitive and unfriendly, although we did have virt-manager, which has always been a nice tool.
mchua: rwmjones, what was the F6 virt experience like?
rwmjones: mchua, here's a guestfish example ... making a backup of /home from a Debian guest:
# guestfish -i --ro Debian5x64
Welcome to guestfish, the libguestfs filesystem interactive shell for
editing virtual machine filesystems.
Type: 'help' for help with commands
      'quit' to quit the shell
><fs> cat /etc/debian_version
squeeze/sid
><fs> tgz-out /home home.tar.gz
rwmjones: mchua, Fedora 6 -> 12 .. it's a story of everything improving dramatically. It's not really that there are big new features, e.g. we had virt-manager back in 6, but modern virt-manager is just far better. And I've been trying to work on making it better for sysadmins who want to automate things, hence libguestfs is very shell-script / automation-friendly.
mchua: rwmjones, so one area of improvement between F6 virt and F12 virt is "F12 virt is far more automatable and shell-script friendly."
rwmjones: mchua, yeah, I'd say that's true.
mchua: rwmjones, "It's not really that there are big new features... but [features are] just far better" - so you can do the same things, more or less, just much faster (in terms of sysadmin-headache-time needed)?
rwmjones: mchua, well, there are a lot of big new features behind the scenes (KVM, KSM, virtio ...). It's not clear how apparent they'll be to end users, but it will just all work better and faster. There's a story behind virt-df (http://libguestfs.org/virt-df.1.html). When I used to manage a bunch of virtual machines at my previous job, it was the tool that I wanted. It didn't exist, so at Red Hat, I wrote it.
markmc: mchua, the big change between F6 and F12 is that we've switched from Xen to KVM, but because all our work is based on the libvirt abstraction layer, the tools used in F6 for using Xen should be familiar to people using KVM in F-12. We've also put a significant emphasis on improving security over the last number of releases. danpb has more details on the security efforts in his F-11 interview, and he'll also have more details wrt. the VirtPrivileges feature.
rwmjones: mchua, yeah ... someone on F6 who was using virt-manager or "virsh list" will be using exactly the same commands in F12, even though the hypervisor is completely different.
lutter: mchua, libvirt, and therefore the whole virt tool stack, now manages a much broader area of virt related aspects, not just VM lifecycles.
markmc: mchua, lutter has a good point - we now have tools for e.g. managing networking and storage. We also have much better support for remotely managing virtualization hosts, e.g. you can point virt-manager at a host, create a guest on that host, create storage for the guest, configure the network etc.
lutter: mchua, the tools are now a pretty solid basis for datacenter virt management software like ovirt and RHEV-M.
markmc: mchua, wrt. Fedora virt changing over the years, we're also pushing very hard to adopt new virtualization hardware features introduced by vendors. So, for example, in F-11 we introduced VT-d support and in F-12 we're introducing SR-IOV support. KVM itself is based on Intel and AMD hardware virtualization, and there is also EPT/NPT support. So yeah, we're definitely leading the field in terms of shipping support for new hardware features, e.g. AFAIK no-one else (not even other hypervisor vendors) is yet shipping SR-IOV support.
lutter: yeah, Fedora is very likely the first place where you see a lot of new hardware virt features supported in OSS, mostly since so many upstream maintainers/developers for virt-related stuff work at RH and generally push their work to Fedora 'by default' .. spin that any way you want to avoid a distro war ;)
mchua: All while maintaining a consistent, familiar interface - as rwmjones pointed out, folks using virt-manager and virsh on F6 are still using the same commands. Though now they also have the option to use additional tools like guestfish to script the process (so, alternative-but-even-easier interface).
lutter: mchua, we also added the capability to deploy and build appliances (through virt-install/virt-image and the thincrust project).
Current draft BASE
mchua: Why don't we start with everyone introducing themselves briefly, and giving a sentence or two about what they do, and what virt features they worked on for F12? rwmjones: I'm a software engineer at Red Hat, and I am working on http://libguestfs.org/. libguestfs is a set of tools which you can use to examine and modify virtual machine images from outside (ie. from the host), so for example if you had an unbootable guest, you could try to fix it by doing: virt-edit myguest /boot/grub/grub.conf mchua: what would sysadmins have to do to fix that before libguestfs arrived? rwmjones: that's really tricky ... it was sort of possible using tools like kpartx and loopback mounts, but it was dangerous stuff, hard and you had to be root. now there's no root commands needed, and it's organized as nice little command line tools for each task with proper manual pages. I'd point people to the home page -- http://libguestfs.org/ -- to see lots of examples, and documentation. mchua: How do libguestfs capabilities in Fedora compare with how a sysadmin might do the same thing on other, non-Linux (or linux-but-on-another-distribution) platforms? Are there other similar tools? rwmjones: we've worked with Guido Gunther from Debian on getting a parts of libguestfs packaged up for Debian. On Windows, Microsoft offer something called DiscUtils.Net which is similar but not nearly as powerful. So I'm confident Fedora is well ahead of everyone here. 15:16:34 <mchua> rwmjones: Do you want to talk about the guestfish interface a bit? 15:17:35 <rwmjones> mchua, sure ... guestfish is one of the ways to get access to the libguestfs features, for use from shell scripts. The basic usage is to do: 15:17:51 <rwmjones> guestfish -i yourguest # where yourguest is some guest name known by libvirt 15:18:22 <rwmjones> and that gives you a shell where you can list files in the guest, edit them, look in directories, find out what LVs the guest has (or create new ones) ... 
literally 200 commands 15:18:42 <rwmjones> that's all documented here: http://libguestfs.org/guestfish.1.html 15:19:09 <mchua> rwmjones: Wow. That documentation is gorgeous. 15:19:12 <rwmjones> and if you run out of ideas, we have some "recipes" you can try out with guestfish: http://libguestfs.org/recipes.html 15:20:03 <markmc> mchua, we've certainly all been put to shame by rwmjones docs :) 15:20:32 <lutter> the pwoer of OCaml ;) ============== markmc: I'm an engineer at Red Hat, joined from Sun nearly 6 years ago. Previously worked on GNOME desktop related stuff, but have been working on virtualization for the past few years. For Fedora 12, I worked on the NIC Hotplug and Stable Guest ABI features, along with packaging, bug triaging and general shepherding of all the other virt bits. I work upstream on both qemu and libvirt, but at lot of my time is taken up by Fedora work these days. Okay, the NIC hotplug feature - the ability to add a new virtual NIC while the guest is running - was a pretty obviously missing feature from our KVM support previously. The problem we had with implementing it, is that libvirt is responsible for configuring the virtual NIC and passes a file descriptor to the qemu process when it starts it. That's much harder to do when the guest is already running. So, most of the work involved some scary UNIX voodoo to allow passing that file descriptor between two running processes. As for use cases, people often want to add and remove hardware from their guests without re-starting them. You might want to add a guest to a new network, for example. Now, the Stable Guest ABI feature is really quite boring, but is about preparing KVM so that we can maintain compatibility across new releases. The idea is that if you are running a Fedora 12 KVM host and you install a new host with Fedora 13, you might like to migrate your running guests from the Fedora 12 host to the Fedora 13 host, without re-starting them. 
Now, as we add new features to qemu in Fedora 13, we might end up 'upgrading' the virtual machine's hardware. We might, for example, emulate a new chipset by default or add a new default NIC. The Stable Guest ABI feature means that when you migrate to the Fedora 13 host, the hardware emulated by qemu will remain the same for that guest. 15:16:22 <markmc> As you can imagine, if you change around the hardware under a running guest, the guest may get seriously confused. 15:17:04 <markmc> But it's not just about live migration - if you upgrade your host and restart your guest, not all guest OSes will like if you've changed around the hardware. 15:17:29 <markmc> Windows for example, with significant enough changes to the hardware, will require you to re-validate your license. 15:17:51 <markmc> We want to avoid that happening when you upgrade your Fedora host. ============= lutter: David Lutterkort, software engineer at Red Hat, worked on http://fedorahosted.org/netcf (for the Network Interface Mgmt feature), in the past worked on ovirt and some of the virt-install tools. besides that, work some on http://deltacloud.org/, and http://augeas.net/ Network Interface Mgmt lets sysadmins set up fairly complex network configurations (e.g. a bridge with a bond enslaved) through a simple description of the config, using the libvirt API; in the past, that required initimate knowledge of ifcfg-* files and a lot of nailbiting. Having an API also means that such setups can be done by programs (e.g., centralized virt mgmt software or virt-manager) mchua: Awesome. If I'm understanding you right, this means that now sysadmins can automate complex custom network configurations for VMs? lutter: complex network configs on the host, generally ... a common request is 'how do I share a physical NIC between various VM's'; in the past, you had to manually go and edit ifcfg-* files. libvirt now has an API and XML description to make that setup much easier. 
The backend for the libvirt interface API is netcf, which is independent of virtualization, so you could use that to setup network configs in your VM's mchua: Ahhh, okay - thanks for the clarification. How does this compare to how people would set up host network configs on other platforms? lutter: right now this is exposed in the libvirt API; we're working (well, Cole Robinson is working) on exposing that in virt-manager so that people can say 'use this physical NIC for all my VM's' with one click 15:17:18 <lutter> mchua: there you either have to manually edit the network configs, which generally is only really possible for humans, not programs, or rely on the very dodgy, never-quite-right Xen networking scripts 15:21:01 <mchua> lutter: Is there a place where our readers can go to find out more about how to use the libvirt API? How do folks try these features out? 15:22:02 <lutter> mchua: there's a small amount of docs on the netcf site (I have to add more) and libvirt.org has API docs for the various virInterface* calls 15:22:03 <mchua> lutter: I see instructions on how to test at https://fedoraproject.org/wiki/Features/Network_Interface_Management#How_To_Test 15:22:59 <lutter> mchua: there's also a blog post somebody else wrote on netcf http://linux-kvm.com/content/netcf-silver-bullet-network-configuration 15:23:40 <lutter> mchua: besides bz ? ;) 15:22:28 <mchua> lutter: Is there a place folks should be watching to see things go up as the F12 GA date approaches? 15:24:01 <mchua> lutter: *grin* what components should we be keeping track of? 15:24:30 <lutter> mchua: I don't know of a good central place where this gets summarized, though FWN has been pretty good reporting about virt features. Besides that, watching the individual projects is everybody's best bet 15:25:38 <lutter> mchua: libvirt, libguestfs, virt-install, virt-manager are the most important ones from a user's POV 15:25:38 * mchua nods 15:25:51 <mchua> lutter: the user typically being a sysadmin? 
15:26:30 <lutter> mchua: virt-manager is definitely for end users, not just sysadmins; virt-install somewhere in the middle, the others get fairly technical 15:27:47 <mchua> lutter: What would be a use-case for an end-user using virt-manager? (I'm guessing there will be users reading this interview who may not have tried out virt stuff before, but who might read this and go "ooh, hey..." and try it out.) 15:28:45 <lutter> mchua: try out rawhide w/o the risk of breaking your current system 15:29:37 <lutter> mchua: of course, that goes for any $OS ... in general, virt-manager is a graphical user interface to most/all virt features 15:31:59 <mchua> lutter: Ok - imagine I'm a new Fedora user, I've just installed F12, love it, want to get a preview of rawhide so I can see what's coming for F13. What do I need to install/run to get rawhide running in a VM? (If that process is quick and painless enough to put in a few "try this!" lines mid-interview.) 15:32:39 <mchua> lutter: (I realize this is a pretty basic question, but I'd like to get virt used by as many folks as possible so that hopefully we'll have some of those folks going deeper and trying out the tools you've made) 15:33:09 <lutter> mchua: lemme dig around 15:34:08 <rwmjones> mchua: lvcreate -n F13Rawhide -L 10G vg_yourhost; virt-install -v -n F13Rawhide --accelerate -r 512 -f /dev/vg_yourhost/F13Rawhide -c /tmp/Fedora-13-netinst.iso 15:34:39 <markmc> rwmjones, hmm, no - I'd point people at virt-manager 15:35:01 <markmc> mchua, go to Applications -> System Tools -> Virtual Machine Manager 15:35:10 <rwmjones> yeah virt-manager will be easier ... 
15:35:44 <markmc> mchua, (well, first install the 'Virtualization' group in Add/Remove Software) 15:35:53 <markmc> mchua, then click on New VM 15:36:10 <markmc> mchua, choose a name for the guest, choose network install 15:36:26 <mchua> lutter: ^^ (I think we've got it, no worries) 15:36:55 <markmc> mchua, and then add a URL like http://download.fedoraproject.org/pub/fedora/linux/releases/12/Fedora/x86_64/os/ 15:37:12 <markmc> mchua, after that, the instructions in the wizard should be fairly self explanatory 15:37:19 <lutter> mchua: yeah, what markmc said 15:37:22 * mchua will make a video for the "how to try out virt" procedure in the next week or two ============== 15:12:22 <markmc> mchua, was that interview you did for f11 published anywhere? would be good to link to it from https://fedoraproject.org/wiki/Virtualization/History 15:21:07 <markmc> mchua, found it : http://jaboutboul.blogspot.com/2009/05/fedora-11-virtualization-reality.html ================ 15:24:42 <mchua> lutter, rwmjones, markmc: in a moment, I'd like to pull back and have the three of you talk with each other about how virt in Fedora has progressed in the past few releases. 15:25:16 <markmc> mchua, one sec - I'll cover gpxe and qcow2 featurehs 15:25:25 <markmc> mchua, the feature owners aren't here 15:25:55 <mchua> (in this case) 15:25:57 <markmc> okay, the gPXE feature is about replacing the boot ROMs used by qemu for PXE booting with newer versions, basically 15:26:14 <markmc> etherboot was the name of the project previously, but it's now called gPXE 15:27:31 <markmc> It's important that we made the switch to gPXE because all future upstream development (new features, bug fixes) will go into gPXE instead of etherboot. 15:28:28 <markmc> the qcow2 performance feature was about taking a cold hard look at the qcow2 file format and fixing an major bottlenecks 15:28:33 <markmc> basically, we see qcow2 as a very useful format for virtual machine images 15:29:15 <markmc> e.g. 
the size of qcow2 files is determined by the amount of disk space used by the guest, not the entire size of the virtual disk we're presenting to the guest 15:29:30 <markmc> i.e. the images should be smaller on disk, even if you copy them between hosts 15:29:45 <markmc> also, qcow2 supports a "copy on write" feature 15:30:01 <markmc> whereby you can base multiple guest images from the one base image 15:30:32 <markmc> so you can reduce disk space further by installing one guest image, creating multiple qcow2 images backed by the first image 15:30:40 <markmc> and yet, the guest can still write to their disks 15:30:59 <markmc> so, in summary, we want more people to use qcow2, but they couldn't because the performance was poor 15:31:23 <markmc> Kevin Wolf put serious effort in upstream to iron out those kinks and obtain a serious speedup 15:31:29 <markmc> figures are in a table on the feature page 15:30:01 <mchua> markmc: (to backtrack a bit) why the switch from etherboot? (From what I've read, it sounds like the switch was actually requested by the etherboot upstream, in part.) 15:31:59 <markmc> mchua, yes, the etherboot project is no more; it is deprecated in favor of gPXE 15:32:45 <markmc> mchua, but they're not completely identical, so there was some significant work involved ... done by Glauber Costa (our Brazilian joker) and Matt Domsch from Dell (AFAIR) 15:33:01 <mchua> markmc: is gPXE being used by other OSes and distros too? 15:33:06 <markmc> yeah, it was Matt Domsch 15:33:29 <markmc> mchua, it may be used by other distros, I'm not 100% sure about that 15:33:51 <markmc> mchua, I'd imagine we're slightly ahead of the curve on this - upstream qemu is still using etherboot images 15:25:24 <rwmjones> mchua, I would say that in Fedora 6 which is where I really started off with Fedora, it was quite primitive and unfriendly, although we did have virt-manager which has always been a nice tool 15:26:53 <mchua> rwmjones: What was the F6 virt experience like? 
15:27:22 <rwmjones> mchua, here's a guestfish example ... making a backup of /home from a Debian guest: 15:27:30 <rwmjones> # guestfish -i --ro Debian5x64 15:27:31 <rwmjones> Welcome to guestfish, the libguestfs filesystem interactive shell for 15:27:31 <rwmjones> editing virtual machine filesystems. 15:27:31 <rwmjones> Type: 'help' for help with commands 15:27:31 <rwmjones> 'quit' to quit the shell 15:27:31 <rwmjones> ><fs> cat /etc/debian_version 15:27:33 <rwmjones> squeeze/sid 15:27:35 <rwmjones> ><fs> tgz-out /home home.tar.gz 15:29:40 <rwmjones> mchua, Fedora 6 -> 12 .. it's a story of everything improving dramatically. It's not really that there are big new features eg. we have virt-manager back in 6, but modern virt-manager is just far better. 15:30:27 <rwmjones> and I've been trying to work on making it better for sysadmins who want to automate things, hence libguestfs is very shell-script / automation-friendly 15:33:57 <mchua> rwmjones: So one area of improvement between F6 virt and F12 virt is "F12 virt is far more automatable and shell-script friendly." 15:34:19 <rwmjones> mchua, yeah I'd say that's true 15:34:35 <mchua> rwmjones: "It's not really that there are big new features... but [features are] just far better" - so you can do the same things, more or less, just much faster (in terms of sysadmin-headache-time needed)? 15:36:10 <rwmjones> mchua, well there are a lot of big new features behind the scenes (KVM, KSM, virtio ...). It's not clear how apparent they'll be to end users, but it will just all work better and faster. 15:37:43 <rwmjones> mchua, there's a story behind virt-df (http://libguestfs.org/virt-df.1.html). When I used to manage a bunch of virtual machines at my previous job, it was the tool that I wanted. It didn't exist, so at Red Hat, I wrote it. 
15:37:57 <markmc> mchua, the big change between F6 and F12 is that we've switched from Xen to KVM
15:38:34 <markmc> mchua, but because all our work is based on the libvirt abstraction layer, the tools used in F6 for using Xen should be familiar to people using KVM in F-12
15:38:58 <markmc> mchua, we've also put a significant emphasis on improving security over the last number of releases
15:39:13 <markmc> mchua, danpb has more details on the security efforts in his F-11 interview
15:39:13 <rwmjones> mchua, yeah ... someone on F6 who was using virt-manager or "virsh list", will be using exactly the same commands in F12, even though the hypervisor is completely different
15:39:23 * mchua nods
15:39:27 <markmc> mchua, and he'll also have more details wrt. the VirtPrivileges feature
15:40:19 <lutter> mchua: libvirt, and therefore the whole virt tool stack now manages a much broader area of virt related aspects, not just VM lifecycles
15:41:08 <markmc> mchua, lutter has a good point - we now have tools for e.g. managing networking and storage
15:41:14 <markmc> mchua, we also have much better support for remotely managing virtualization hosts
15:41:41 <markmc> mchua, e.g. you can point virt-manager at a host, create a guest on that host, create storage for the guest, configure the network etc.
15:42:12 <lutter> mchua: the tools are now a pretty solid basis for datacenter virt management software like ovirt and RHEV-M
15:42:18 <markmc> mchua, wrt.
fedora virt changing over the years, we're also pushing very hard to adopt new virtualization hardware features introduced by vendors
15:42:40 <markmc> mchua, so, for example, in F-11 we introduced VT-d support and in F-12 we're introducing SR-IOV support
15:43:01 <markmc> mchua, and KVM itself is based on Intel and AMD hardware virtualization
15:43:09 <markmc> mchua, also EPT/NPT support
15:43:38 <markmc> mchua, so yeah, we're definitely leading the field in terms of shipping support for new hardware features
15:44:00 <markmc> mchua, e.g. AFAIK no-one else (not even other hypervisor vendors) is yet shipping SR-IOV support
15:44:06 <markmc> ...
15:44:34 <lutter> yeah, Fedora is very likely the first place where you see a lot of new hardware virt features supported in OSS, mostly since so many upstream maintainers/developers for virt-related stuff work at RH and generally push their work to Fedora 'by default' .. spin that any way you want to avoid a distro war ;)
15:45:13 <mchua> All while maintaining a consistent, familiar interface - as rwmjones pointed out, folks using virt-manager and virsh on F6 are still using the same commands. Though now they also have the option to use additional tools like guestfish to script the process (so, alternative-but-even-easier interface).
15:45:40 <lutter> mchua: we also added the capability to deploy and build appliances (through virt-install/virt-image and the thincrust project)
============
15:46:13 <mchua> rwmjones_, markmc, lutter: two last questions I wanted to toss out: (1) what's coming up for virt in f13 and the future? and (2) what do you folks do for fun when you're not hacking on virt stuff?
15:46:16 <markmc> mchua, oh, "Cloud"
15:46:27 <markmc> mchua, none of us said that yet, how silly of us
15:46:31 <markmc> cloud, cloud, cloud
15:46:35 * markmc gets it in a few times
15:46:37 <markmc> for good effect
15:46:40 <rwmjones_> mchua, yeah the outlook is cloudy
15:46:56 * mchua chuckles
15:47:16 <lutter> haha ..
yeah ... everybody watch deltacloud.org
15:47:21 <markmc> mchua, fedora based cloud project : http://deltacloud.org/
15:47:42 <markmc> mchua, https://fedoraproject.org/wiki/Category:F13_Virt_Features
15:47:52 * mchua grins. We'll keep an eye on cloud for F13.
15:47:56 * rwmjones_ trolls OCaml features to C programmers
15:48:05 <markmc> mchua, VHostNet is maybe the most exciting there so far
15:48:05 <lutter> mchua: and virt datacenter mgmt along the lines of ovirt
15:48:11 <markmc> mchua, we'll be adding more feature pages as time goes on
15:48:30 <markmc> mchua, VHostNet is about handling virtio networking in the kernel, rather than in the qemu process
15:48:48 <markmc> mchua, so network traffic goes straight from the guest to the kernel out to the network
15:48:58 <markmc> mchua, without ever being diverted through the qemu process
15:49:17 <markmc> mchua, Red Hat's Michael Tsirkin is busy getting that feature into the 2.6.33 kernel
15:49:28 * lutter blames markmc and rwmjones and a bunch of other people
15:49:53 <markmc> mchua, also, the VNCResourceTunnel means we'll get sound from guests again, which would be nice :)
15:50:15 <markmc> mchua, not sure how to spin that so it doesn't sound like "uh, we suck at audio, we're going to try a little harder" though :)
15:48:55 <mchua> rwmjones_, lutter, markmc: Whoa. Documentation and project webpages and a list of F13 features and *everything.* You folks are awesome.
15:50:34 <lutter> mchua: there's a much bigger group within RH working on all these virt features .. might be worth a mention; it's far from being just us 3 or 5
15:51:11 <rwmjones_> mchua, chris lalancette too
15:51:12 <markmc> mchua, lutter's dead right - there's a huge list of people working upstream on KVM and libvirt etc.
15:51:34 <markmc> mchua, I might send you a full list later, rather than forget people now
15:51:36 <lutter> markmc: might be worth underscoring how many of them are at RH
15:52:08 <markmc> lutter, oh, I meant "a huge list of Red Hat" people
15:52:19 <lutter> mchua: off the top of my head, Daniel Veillard, Matt Booth and Laine Stump should be on that list .. also a long list of qemu/kvm/kernel hackers that markmc has a better overview of
15:52:48 <lutter> mchua: Cole Robinson (virt-install and virt-manager)
15:52:49 <markmc> mchua, Avi Kivity, Gerd Hoffmann, Christoph Hellwig, ...
15:52:59 <markmc> dammit, I'm just going to forget people if I try and list here
15:53:14 <lutter> yeah, that's the danger with these lists
15:53:14 <rwmjones_> yeah, not forgetting the $100M investment in qumranet, now Red Hat
15:53:35 * mchua nods. Not going to treat these lists as complete, just as potential starting spots to find out more
15:54:11 <lutter> mchua: if you want to plug virtual appliances, Bryan Kearney, Joey Boggs and David Huff are to blame for thincrust
15:54:34 <mchua> lutter, markmc, rwmjones: Thanks - this is all awesome stuff. I know y'all have a ton of work to do, and really appreciate you taking the time to come and fill us in.
======
Mel: Last question - when you're not hacking on virt stuff, what do you do for fun?
Richard: I troll OCaml features to C programmers ...
Mel: *grins* Got that.
Richard: And cook the best pizza of anyone I know
David: Hacking on non-virt stuff ? ;) I have two little kids that take up most of my free time.
Mark: I live in Dublin, Ireland with my wife. Close to the sea and mountains, so I race sailing dinghies, run, hike and generally try and avoid computers as much as possible.
Mel: Sounds like y'all have the good life.
Mark: Mel, introduce yourself too by the way! We haven't met.
Mel: I'm a new Red Hatter on the Community Architecture team, running Fedora Marketing.
This is also my first Fedora release, and I had to look up "Marketing" on Wikipedia after Jack and Max asked me to step in... long story.
Mark: Cool stuff, welcome to Red Hat!
Mel: Thanks! I think we're pretty much set, unless there's anything you folks want to chime in on.
===========
Chris arrives.
Mark: Chris, Mel. Mel, Chris.
Chris: Hi.
Mark: Chris, I said you'd cover KSM, huge pages and SR-IOV ...okay ?
Chris: Ok, works for me.
Mel: If you want to start by introducing yourself and giving an overview of those features...
Mark: Richard, David, and I just had a case of verbal spew for the past hour.
Chris: Fun!
David: Talk amongst yourselves. *grins*
Chris: My name is Chris Wright. I'm a kernel hacker working at Red Hat on virtualization, specifically KVM. We're continually improving the virtualization infrastructure and Fedora 12 has a nice long list of virtualization specific features as a result.
Mel: For those reading along who aren't familiar with KVM, that's http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine.
Chris: Right, thanks. Our goals are to improve the efficiency of KVM so that there is very little cost associated with running an OS in a virtual machine compared with bare metal, and to improve the density that we can achieve when consolidating multiple guest OS's to a single physical host.
16:09:43 <danpb_ltop> mchua: hello, what do you want to talk about ?
16:09:55 <mchua> hey, danpb_ltop!
16:10:11 <mchua> danpb_ltop: the game plan is to do a round of introductions on who you are and what you're working on
16:10:23 <cdub> One of the features we added for F12 is called KSM, which is about improving density, i.e. the number of VMs we can run on a single host
16:10:35 <mchua> danpb_ltop: and then cdub was going to talk about huge page backed memory, KSM, and SR-IOV
16:10:53 <mchua> danpb_ltop: which leaves VirtPrivileges, VirtStorage and libvirt TCK for you to explain
16:10:57 <mchua> danpb_ltop: sound good?
16:11:14 <cdub> And the other 2 I planned to talk about, hugepages and SR-IOV, are about improving the efficiency of the VM.
16:11:26 * mchua nods.
16:11:49 <mchua> cdub: let's start with KSM.
16:12:19 <cdub> mchua: alright, KSM...really cool feature that addresses one of the bottlenecks in virtualization.
16:12:42 <mchua> #link https://fedoraproject.org/wiki/Features/KSM
16:13:07 <cdub> A modern computer has lots of cores, but memory is still relatively expensive
16:13:29 <mchua> That's Kernel SamePage Merging (or the Korean Service Medal, or the Kothagudem School of Mines, but we're talking about the first one here. ;)
16:13:44 <danpb_ltop> mchua: ok
16:13:45 <cdub> So you can run out of memory for all the virtual machines you may want to run on a box, despite the fact that you've got CPU power to spare
16:13:47 * mchua nods and sits back to listen
16:14:21 <mchua> danpb_ltop: I reckon we'll just go through the features in order, so feel free to chime in on the KSM/hugepages/SR-IOV discussion, and then when we switch to your 3, you get to drive.
16:14:44 <cdub> mchua: heh, right. The acronym started life with a different translation, but same underlying meaning -- used to mean Kernel Shared Memory ;-)
16:16:00 <cdub> KSM, at its core, simply scans regions of physical memory, looking for duplicate contents.
16:17:11 <cdub> And when it finds 2 pages of memory with identical contents, it collapses them to a single page.
16:17:52 <cdub> So this has the effect of compressing memory utilization.
16:18:16 <cdub> Now, when you consider this in a virtual machine context, you can see how this can be really useful.
16:18:57 <cdub> If the virtual machines are running similar OSes, they'll contain some of the same memory just for the kernels and programs running in the OS.
16:19:43 <cdub> So when we launch a VM, we register the memory associated with the VM to KSM, and let KSM scan away in the background.
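Editor's aside: cdub's description of KSM above (scan memory, find pages with identical contents, collapse them to one) can be illustrated with a small user-space model. This is a hedged Python sketch and not how the kernel implements it; the hashing is just a convenient stand-in for comparing page contents, and `merge_identical_pages` plus the sample guests are hypothetical.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; the small page size cdub mentions later


def merge_identical_pages(pages):
    """Return (unique_pages, pages_saved) for a list of page-sized bytes.

    Pages with identical contents are counted once, mimicking the
    effect (not the mechanism) of collapsing them to a single page.
    """
    seen = {}
    for page in pages:
        digest = hashlib.sha256(page).digest()
        seen.setdefault(digest, page)
    unique = len(seen)
    return unique, len(pages) - unique


# Two toy guests: both hold the same "kernel" page and some zeroed pages.
kernel_page = b"K" * PAGE_SIZE
zero_page = b"\0" * PAGE_SIZE
guest_a = [kernel_page, zero_page, zero_page, b"A" * PAGE_SIZE]
guest_b = [kernel_page, zero_page, b"B" * PAGE_SIZE]

result = merge_identical_pages(guest_a + guest_b)
```

Here seven pages across the two guests reduce to four distinct pages, so three pages' worth of memory would be freed in this toy scenario.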
16:20:10 <mchua> cdub: Do you have a rough idea of the range of how much memory (%?) this might typically save?
16:20:15 <cdub> You can actually watch as your free memory shrinks when you start a new VM, and then slowly grows back as KSM finds pages to merge.
16:20:17 <mchua> (for a particular type of use case, etc.)
16:20:30 <mchua> Ooh, I should get some screen captures of that.
16:20:50 <cdub> mchua: good question, it's very workload dependent, I don't have a number right off the top of my head.
16:20:52 <mchua> #action mchua get screencaps of free-memory-over-time for KSM section
16:21:29 <cdub> mchua: but one thing to keep in mind that's interesting here, is that some OSes opportunistically write zeros to their free memory
16:21:47 <mchua> cdub: Ok. Is there a way to find out? (I'm happy to do the legwork needed to get that comparison if there's a quick description of what I should do / look to compare.)
16:21:59 <cdub> this has the effect from the KVM point of view of making the hypervisor believe that the memory is in use (it's been written to)
16:22:31 <cdub> but it's actually free, awaiting an allocation. With KSM, we'll find thousands of these pages, and collapse them to a single page.
16:22:32 <mchua> cdub: Interesting. So it can't tell the free memory is free.
16:22:44 <mchua> Nifty.
16:23:11 <mchua> cdub: should we switch gears and talk about huge pages and SR-IOV for a bit?
16:23:16 <cdub> mchua: right, the hypervisor doesn't have all the information, so it can only tell when a page is used, not unused
16:23:37 <mchua> cdub: Oh - wait, before we do that... is there anything like KSM outside Fedora?
16:23:48 <mchua> similar tools in other OSes, etc?
16:23:59 <cdub> mchua: sure, one last thing on KSM...there are statistics you can view in /sys/kernel/mm/ksm/
16:24:05 * mchua nods
16:24:28 <cdub> mchua: actually, good question, reminds me of something I wanted to point out...
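Editor's aside: cdub mentions above that KSM statistics live in /sys/kernel/mm/ksm/. A hedged Python sketch for reading them follows. The transcript does not name the individual files; `pages_shared` and `pages_sharing` are taken from the kernel's KSM documentation, and the helper names `read_ksm_stats` and `estimate_saved_bytes` are hypothetical.

```python
import os


def read_ksm_stats(ksm_dir="/sys/kernel/mm/ksm"):
    """Read every integer-valued file in the KSM sysfs directory."""
    stats = {}
    for name in sorted(os.listdir(ksm_dir)):
        path = os.path.join(ksm_dir, name)
        try:
            with open(path) as handle:
                text = handle.read().strip()
        except OSError:
            continue  # unreadable entry or subdirectory: skip it
        if text.isdigit():
            stats[name] = int(text)
    return stats


def estimate_saved_bytes(stats, page_size=None):
    """Rough saving estimate: pages_sharing counts the extra references
    that now point at an already-shared page (assumption based on the
    kernel documentation, not on the transcript)."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return stats.get("pages_sharing", 0) * page_size
```

On a host with KSM available, `estimate_saved_bytes(read_ksm_stats())` gives a rough figure to track over time, which is one way to answer mchua's question about how much memory is saved for a given workload.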
16:24:50 <cdub> mchua: yes, there's at least one other OS that has a feature like this, ESX (VMware).
16:25:00 <cdub> mchua: but in Fedora, KSM is not exclusive to VMs
16:25:15 <cdub> mchua: KSM works w/ any program that registers its memory as "mergeable"
16:25:52 <cdub> mchua: so even regular programs running in Fedora could benefit, and some number crunchers at CERN have used this to improve their own application's memory usage
16:26:18 <cdub> mchua: so that they can run more apps, and do more number crunching w/ the same hardware
16:26:54 <mchua> Nice!
16:27:03 <cdub> mchua: so that's KSM, shall we move on to hugepages and SR-IOV?
16:27:12 <mchua> cdub: I was just about to suggest the same.
16:27:16 <cdub> cool
16:27:28 <mchua> danpb_ltop: Feel free to chime in any time; we'll get to your features in just a moment. :)
16:27:30 <cdub> ok, moving from density to efficiency...
16:28:20 <cdub> hugepages, another feature added to F12, give the user the ability to run a VM backed by huge pages
16:28:38 <cdub> normally, when we run a VM, we simply malloc() the memory for the guest OS.
16:29:15 <cdub> this means that the memory for the guest will be allocated in page sized chunks, 4k to be specific
16:30:12 <cdub> the hypervisor needs to be able to translate between the guest's view of physical memory and the host's view of physical memory
16:31:10 <cdub> when the host is using large pages, like 2M pages, or huge pages, to allocate guest memory, those translations become cheaper (there are fewer to do)
16:31:46 <cdub> and, if the guest actually wants to use huge pages (database servers like to do this, as do some Java workloads)
16:32:01 <cdub> the translation cost goes down further
16:32:29 <cdub> ultimately, with huge pages we've seen double digit percentage improvement for some of those workloads
16:33:33 <cdub> in fact, w/out backing the guest's memory w/ huge pages on the host, when a guest asked for a huge page...we lied.
so this is really nice to fix
16:34:09 <mchua> "we lied" --> "we said 'here is a huge page' that wasn't actually a huge page"?
16:34:32 <cdub> exactly, which is not very nice since the reason the guest asked for a huge page was to improve its own performance
16:35:48 * mchua nods.
16:36:06 <mchua> SR-IOV real quick, and then on to danpb_ltop's features?
16:36:15 <cdub> alright
16:36:35 <cdub> SR-IOV is another attempt to improve guest VM efficiency
16:36:52 <cdub> the I/O path for a guest is traditionally tough to virtualize
16:37:28 <mchua> danpb_ltop, cdub: I'd also like to have the two of you talk together a bit on how virt has progressed overall between F6 and F12, where virt is headed for F13 and beyond, and then some non-virt stuff about yourselves so our readers get to know you a little better, just so you know what's coming up.
16:37:35 * mchua listens to SR-IOV
16:39:07 <cdub> a typical VM has a NIC and a storage device that may be either emulated devices (this is the most expensive, but least likely to require any new drivers in the OS), like an emulated realtek NIC, or a virtual device which requires a special device driver, but doesn't need to emulate anything, it knows that it's just a virtual path to the hypervisor's I/O subsystem
16:40:20 <cdub> in both of those cases, there is a fair amount of CPU involved in processing the I/O request. the emulated device has the most overhead, but even a virtual device (like KVM's virtio devices) has to copy data around, and causes expensive exits from the VM to the hypervisor
16:41:39 <cdub> SR-IOV is an attempt by the industry to move virtualization out of the CPU (hardware virtualization extensions that allow KVM to work at all), past the chipset (like an IOMMU that allows memory isolation when a guest is talking directly to a physical device), and into the I/O devices
16:42:27 <mchua> So this is something that's happening outside of just Fedora, too.
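Editor's aside, going back to cdub's huge page explanation: the "fewer translations to do" point can be made concrete with simple arithmetic. This is a rough Python sketch; the 4 GiB guest size is a hypothetical example, and the number of page mappings is only a proxy for translation cost, not a performance measurement.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(guest_bytes, page_bytes):
    """Number of pages required to back guest_bytes (rounded up)."""
    return -(-guest_bytes // page_bytes)


guest_memory = 4 * GIB  # hypothetical guest size, not from the interview

small_pages = pages_needed(guest_memory, 4 * KIB)  # 4k pages
huge_pages = pages_needed(guest_memory, 2 * MIB)   # 2M huge pages
ratio = small_pages // huge_pages
```

For this example the host tracks 1,048,576 small pages versus 2,048 huge pages, a factor of 512 fewer mappings to translate.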
16:42:33 <cdub> So, it requires newer hardware, specifically, the CPU, IOMMU and an SR-IOV capable card (there aren't a lot of these on the market yet, so Fedora is really on the leading edge)
16:42:52 <cdub> right, SR-IOV is a PCI standard
16:43:20 <cdub> An SR-IOV capable card allows you to effectively virtualize the I/O hardware
16:44:02 <cdub> so rather than having a single physical E1000 NIC that you must share with each VM via some indirection (the emulated or virtual device I mentioned above)
16:44:35 <cdub> you get a single physical NIC that you can allocate multiple virtual instances of
16:45:14 <cdub> So now you can allocate some resource from the SR-IOV NIC (called a Virtual Function, or VF)
16:45:23 <cdub> that shows up as a PCI device, just like real hardware
16:45:51 <cdub> and w/ the existing ability in Fedora to do PCI device assignment to a guest, you can assign that VF directly to the guest
16:46:26 <cdub> that means the guest is communicating directly to the hardware, it's running the same device driver you'd run on the hypervisor
16:46:35 * mchua nods
16:46:36 <cdub> and this really shortens the I/O path.
16:46:55 <cdub> With this you can effectively achieve bare metal I/O performance from a guest
16:47:13 <cdub> IOW, the I/O bottleneck is removed
16:48:10 <mchua> Nice.
16:48:22 <cdub> Yeah, KVM in F12 is looking really good
16:48:31 <mchua> danpb_ltop: ready to go?
16:49:13 <mchua> cdub, danpb_ltop: I don't want to keep you folks waiting, so what we might do is have the two of you talk together about F6-->F12 virt improvements and what's coming down the virt pipeline for future releases together
16:49:44 <cdub> mchua: I'm fine idling waiting for danpb_ltop to finish too
16:49:44 <cdub> mchua: either way
16:49:44 <mchua> cdub: and have you at some point throw in a "when I'm not hacking on virt stuff, I...
<do these other things for fun>"
16:50:09 <mchua> and then cdub can run off, and I'll go through danpb_ltop's intro and those 3 features with him
16:50:26 <mchua> cdub: while we're waiting, what do you do for fun aside from virt hackin'?
16:50:33 <mchua> cdub: (and how did you get started doing it?)
16:51:23 <cdub> mchua: well, I got started hacking on virt stuff because I was interested in security and isolation. The thing that I found really exciting was that this was an area where hardware was rapidly evolving
16:51:51 <cdub> mchua: so, as a kernel hacker, it was really fun to work on software that's adapting to these new hardware features.
16:53:07 <danpb_ltop> wow, well F6 to F12 is a huge amount of time - almost 3 years worth of Fedora releases
16:53:18 <cdub> mchua: I was convinced that the virt stuff which I originally learned about in the context of Trusted Computing, had some other more useful benefits...that was probably 6 years ago, wow
16:54:12 <danpb_ltop> Way back in F6 all focus was on Xen and making it easy to manage, this was the first release where we had apps like virt-manager available
16:54:59 <cdub> mchua: as for fun...I've got two small children...so, sleep! ;-) Heh, that or hangin' w/ my kids, or riding my bike
16:55:27 <danpb_ltop> and the first to introduce graphical installation for guests, so it really set the foundations for future work
16:55:35 <mchua> cdub: Mmm, biking. :)
16:55:46 * mchua switches gears to F6-->F12 convo
16:55:55 <cdub> mchua: markmc recently told me my bike has bling! never been associated w/ bling before ;-)
16:56:04 * mchua listens to danpb_ltop, waits for cdub to chime in on F6-->F12 too.
16:56:09 <danpb_ltop> in Fedora 7, we added a very early release of KVM, along with support in libvirt + virt-manager for KVM+QEMU
16:56:25 <danpb_ltop> but Xen was still the primary virt platform at that time
16:57:03 <danpb_ltop> Fedora 8 focused on stepping up the security capabilities of the management toolchain
16:57:12 <cdub> And it was a lot of effort to forward port Xen all along that path
16:57:34 <danpb_ltop> by introducing support for securely using libvirt from a remote host using TLS/SSL or SSH tunnel, with similar capability added to the VNC server
16:58:17 <danpb_ltop> yes, as chris says there was always a massive effort going on in the background from F5 right through F8 on forward porting the old Xen kernel trees to something up to date
16:59:13 <danpb_ltop> by F8 though, KVM was really gaining ground and was genuinely usable so many Fedora users had switched from Xen to KVM already at that time
17:00:02 <danpb_ltop> in Fedora 9, we finally stopped trying to forward port the old Xen kernel trees, and switched to only support paravirt-ops based kernels from LKML upstream
17:00:30 <danpb_ltop> this meant dropping support for Xen as a virtualization host platform, so from Fedora 9 onwards we only supported Xen as a guest
17:00:54 <cdub> That was a tough decision to make. But reality was the forward porting just got too hard
17:01:05 <danpb_ltop> fortunately KVM was in great shape by then, and hardware virtualization support was pretty widely available
17:02:05 <danpb_ltop> Fedora 9 also introduced more security features, such as support for SASL which allows use of Kerberos authentication for libvirt and PolicyKit for local desktop authentication
17:02:36 <cdub> Right, that was the key change.
You could find new laptops, desktops, servers all w/ hardware virt support
17:03:13 <danpb_ltop> libvirt work to provide APIs for managing storage allowed Fedora 10 to introduce full remote provisioning in virt-manager
17:04:14 <danpb_ltop> and then finally F11 we added SASL support to the VNC server, comparable to that we'd done for libvirt in F10
17:04:45 * mchua enjoying this whirlwind tour through the ages
17:04:45 <danpb_ltop> so both libvirt & VNC can now integrate with pretty much any commonly found authentication services
17:05:43 <danpb_ltop> the most important feature in F11 though was the introduction of sVirt
17:06:18 <danpb_ltop> which is integration between libvirt and SELinux to provide security protection between virtual machines running on the same host
17:06:35 <danpb_ltop> (previously SELinux had merely protected the host from VMs, but not VMs from each other)
17:07:06 <cdub> Yeah, also a nice way to show the benefit of KVM being a part of Linux
17:08:20 <danpb_ltop> yep, it avoids having to reinvent all these concepts in a separate hypervisor
17:09:01 <mchua> cdub,danpb_ltop: And I think that brings us to F12 and the features we've just covered / are about to cover. Sweet.
17:09:03 <danpb_ltop> there are so many important & useful features available in Linux we want to take advantage of that you really don't want to start having to re-invent them all
17:09:10 * mchua nods
17:09:26 <danpb_ltop> all this stuff is summarized in our history page https://fedoraproject.org/wiki/Virtualization/History
17:09:34 <mchua> danpb_ltop, cdub: can you talk a little bit about where you see virt work headed in the future, for Fedora N where N > 12?
17:09:43 * mchua notes that and will link to virt history wiki page heavily
17:10:10 <mchua> danpb_ltop, cdub: cloud sounded like a big thing to watch.
17:10:13 <mchua> deltacloud, specifically
17:10:21 <cdub> It's nice to look back and see how in a few years we've gone from no virt support in Fedora to what you see in F12
17:11:03 <cdub> mchua: one thing to underscore in that F6->F12 is the importance of libvirt
17:11:04 <danpb_ltop> mchua: you can think of deltacloud as doing for clouds, what libvirt did for hypervisors
17:11:13 <cdub> exactly
17:11:26 <danpb_ltop> libvirt made it possible to write applications against a simple, standard, stable API regardless of the underlying hypervisor technology
17:11:47 <cdub> so the success of libvirt in isolating tools from the underlying hypervisor is just what deltacloud is for cloud management
17:12:09 <danpb_ltop> the fact that we now have libvirt support for Xen, KVM, QEMU, OpenVZ, VMware ESX, VMware GSX, LXC (native containers), IBM Power Hypervisor & OpenNebula
17:12:52 <danpb_ltop> shows just how much people like the idea of libvirt - all of those drivers except for Xen & KVM were started by libvirt community members
17:13:25 <cdub> libvirt made it possible to move from F5 w/ paravirt only Xen to Xen w/ HVM to KVM, w/out having to keep rewriting the management tools (they did evolve, like danpb_ltop mentioned ;-)
17:13:29 <danpb_ltop> deltacloud is aiming to do the same for cloud providers so you can write one app targeting any service and avoid being locked into proprietary cloud mgmt APIs
17:14:05 <danpb_ltop> oh, add VirtualBox to that list for libvirt - mustn't forget one !
17:14:16 <cdub> mchua: idea being you can manage multiple clouds from single tool, and even support moving from one cloud to another
17:14:48 <mchua> Nice!
17:15:05 <cdub> mchua: of course, we'll keep working on the infrastructure too
17:15:33 <cdub> mchua: continually improving the efficiency of the hypervisor, the manageability of the hypervisor, etc.
17:15:41 <danpb_ltop> there's still plenty of work to be done for non-cloud related virt of course
17:16:08 <danpb_ltop> in the management tools we really want to polish the desktop virt usage scenario
17:16:34 <danpb_ltop> we've tended to focus more on server virt, so there's some things that aren't so nice to use for the single desktop case
17:16:57 <danpb_ltop> you can see the start of this with the major design overhaul of virt-manager UI in F12
17:17:33 <danpb_ltop> there's always more work to be done with security features too
17:17:55 <cdub> another thing we should see is better work on the remote desktop
17:18:07 <danpb_ltop> previously introduced sVirt allows us to protect VMs from each other, but all VMs still had more or less the same policy rules
17:18:29 <danpb_ltop> we want to start making this more tunable so you can easily customize policy for individual VMs
17:18:50 <danpb_ltop> for example, if running a Windows desktop, you might give it a policy that blocks all network traffic on port 25
17:18:56 <danpb_ltop> to prevent it being turned into a spam botnet
17:19:38 <danpb_ltop> or just want to restrict what VMs on a host are allowed to communicate with each other
17:20:02 <cdub> the whole way we manage VM networking is being reviewed as well
17:20:51 <danpb_ltop> fine grained access control over the libvirt APIs is also another thing we'd like to do
17:21:00 <cdub> just seeing new patches on the libvirt dev list to try and create new APIs for managing the rules surrounding a VM's network interface
17:21:24 <danpb_ltop> so you can determine who can manage each VM & what operations they can perform, etc
17:22:02 <danpb_ltop> anyway, shall we get back to the F12 features
17:22:49 <mchua> danpb_ltop: Yep.
17:23:03 <danpb_ltop> (05:10:53 PM) mchua: danpb_ltop: which leaves VirtPrivileges, VirtStorage and libvirt TCK for you to explain
17:23:10 <danpb_ltop> so taking them in that order
17:23:10 <mchua> cdub: I think I have everything I need from you - you're welcome to stick around, of course! but if you have to run, we're good. ;)
17:23:15 <mchua> danpb_ltop: thanks!
17:23:24 <cdub> mchua: cool, thanks
17:23:48 <danpb_ltop> VirtPrivileges is yet another feature focusing on security (you've noticed that's a common theme in virt work :-)
17:24:20 <danpb_ltop> libvirt has two modes of running virtual machines
17:24:56 <danpb_ltop> what we call our 'system' instance, is a per-host instance that runs with maximum privileges for accessing storage / networking / etc
17:25:10 <danpb_ltop> this was primarily intended for server virtualization scenarios
17:25:37 <danpb_ltop> and then what we call our 'session' instance, is a per-user instance that runs with the same privileges as the user connecting to it
17:26:11 <danpb_ltop> this was intended for desktop virtualization, although it has not been really used much yet because it is hard to provide useful networking connectivity with it
17:26:39 <danpb_ltop> For the VirtPrivileges feature we wanted to improve security by reducing the privileges of the QEMU/KVM process
17:26:53 <danpb_ltop> but without sacrificing the functionality available
17:27:12 <danpb_ltop> so we now have QEMU running as a dedicated 'qemu' user account and group, instead of 'root'
17:27:29 <danpb_ltop> and libvirt manages permissions on resources that are assigned to QEMU, such as its disks
17:27:52 <danpb_ltop> one of the hard things was being able to maintain full network connectivity
17:28:08 <danpb_ltop> so we had to work with QEMU developers to provide a new way to hotplug network cards
17:28:38 <danpb_ltop> where libvirt sets up a "TAP" device and then passes it across to an already running QEMU process with a little UNIX black magic
17:28:58 <danpb_ltop>
so this all improved security of the libvirt "system" instance
17:29:12 <danpb_ltop> to make the 'session' instance more useful, we also changed the KVM setup so that
17:29:29 <danpb_ltop> any user on the system can access /dev/kvm and thus run hardware accelerated virtual machines
17:30:34 <danpb_ltop> once we figure out how to provide better network connectivity to unprivileged virtual machines the 'session' instance of libvirt will finally be useful for desktop virt and address a lot of long standing bugs/RFEs people have had
17:31:06 <danpb_ltop> Moving onto the 'VirtStorage' feature
17:31:23 <danpb_ltop> quite a few releases back we introduced storage management APIs into libvirt
17:31:30 <mchua> Does VirtPrivileges intersect with any of the other virt feature work being done with F12? I know there's been some network interface dev going on, etc.
17:31:47 <danpb_ltop> at the time we supported local disks, LVM, file based storage, iSCSI in the storage APIs
17:32:08 <danpb_ltop> mchua: yes the network interface dev work was related to it, allowing us to hotplug network interfaces to running VMs
17:32:38 <danpb_ltop> the VirtStorage feature, extends our existing storage APIs to now support SCSI FibreChannel adapters
17:33:02 <danpb_ltop> so you can discover what SCSI adapters you have, and what LUNs they are exporting to the host
17:33:34 <danpb_ltop> there is some fairly new technology called "NPIV" which allows one physical SCSI host adapter to be used
17:33:48 <danpb_ltop> to create many virtual host adapters, each with their own set of LUNs
17:33:48 <rwmjones> mchua, quick question, where do you publish these interviews when you've edited them together?
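Editor's aside, returning to the TAP hand-off danpb_ltop described for VirtPrivileges (libvirt sets up the device, then passes it to an already running, unprivileged QEMU process). The transcript does not say which mechanism is used. One standard UNIX way to hand an open file descriptor to another process is an SCM_RIGHTS message over a UNIX socket, and the Python sketch below demonstrates only that general technique, with a pipe standing in for the TAP device. The helper names are hypothetical.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one open file descriptor over a connected UNIX socket."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary data must accompany the ancillary data.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize)
    )
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo():
    """Pass the write end of a pipe across a socketpair and use it."""
    privileged, unprivileged = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM
    )
    read_end, write_end = os.pipe()  # stand-in for a TAP device fd
    try:
        send_fd(privileged, write_end)
        received = recv_fd(unprivileged)
        os.write(received, b"hello")
        os.close(received)
        return os.read(read_end, 5)
    finally:
        os.close(read_end)
        os.close(write_end)
        privileged.close()
        unprivileged.close()
```

The point of the design is that the receiver never needs permission to open the resource itself; it only receives an already-open descriptor from a more privileged process.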
17:34:09 <danpb_ltop> so work was also done to allow libvirt to create / delete virtual host adapters when NPIV is supported
17:34:35 <danpb_ltop> the idea behind NPIV is that you might have one virtual SCSI host adapter associated with each VM
17:35:01 <danpb_ltop> and so instead of having to expose all SCSI LUNs to all hosts
17:35:06 <mchua> rwmjones: I'm going to be doing the editing on the Fedora wiki and it'll temporarily live there, but we'll also publish it on Fedora Insight once that goes live.
17:35:17 <danpb_ltop> you only need to expose the virtual SCSI host adapter to the host on which the VM is currently running
17:35:42 <mchua> rwmjones: https://fedoraproject.org/wiki/Fedora_Insight, the publictest is http://publictest6.fedoraproject.org/zikula/ and it's almost ready to go staging --> production.
17:35:43 <danpb_ltop> this makes management of storage much more flexible, and secure
17:36:08 <rwmjones> mchua, cool - I'd not heard of that site before, but it looks excellent
17:36:32 <danpb_ltop> finally the libvirt TCK
17:37:54 <mchua> rwmjones: it's just a centralized place to publish all the Fedora marketing materials we already generate (but currently scatter across multiple blogs / wiki pages / etc)
17:38:06 <danpb_ltop> readers may or may not be aware of the Java TCK which is a huge test suite that people who write Java JRE/JDKs have to run & pass to ensure compliance with the Java specification
17:38:08 <rwmjones> that's a very very good idea
17:38:37 <danpb_ltop> with libvirt we've had some ups & downs on the quality front and as we gained support for more & more APIs & hypervisors
17:38:53 * mchua pulls up supporting materials to explain TCK
17:38:55 <danpb_ltop> it was becoming increasingly hard to ensure the new libvirt releases were of the quality people expect
17:38:57 <mchua> #link http://jcp.org/en/resources/tdk
17:39:08 <mchua> #link http://en.wikipedia.org/wiki/Technology_Compatibility_Kit
17:39:15 <mchua> (the wikipedia article == more
useful resource) 17:39:26 <danpb_ltop> so we decided to build what we call the 'libvirt TCK' (libvirt Technology Compatability Kit) 17:39:43 <danpb_ltop> the idea being that we write a huge set of tests covering all aspects of libvirt APIs 17:40:03 <danpb_ltop> which we can then run against each hypervisor libvirt supports to ensure everything is working as it is expected to 17:40:31 <danpb_ltop> this not only finds bugs in libvirt, but also helps identify bugs in new releases of the underlying hypervisor/virtualization platform 17:40:44 <danpb_ltop> or in the way an OS distributor built / packaged them 17:40:59 <mchua> danpb_ltop: how is that kind of QA being carried out now (or before the libvirt TCK came around?) 17:41:12 <danpb_ltop> this is quite a new bit of work and we've only got a handful of test cases built into it so far 17:41:40 <danpb_ltop> but it has already allowed us to identify & fix alot of bugs before releasing which would have otherwise caused regressions for users 17:42:21 <danpb_ltop> mchua: well there's testing by upstream libvirt developers, testing by OS packagers / distributors and testing by end users (eg in a Fedora test day, or even of the final releases) 17:42:40 <danpb_ltop> the libvirt TCK is primarily targetted at upstream libvirt, and OS distributors 17:43:06 <danpb_ltop> upstream libvirt community wants to make sure they don't release something which stupid bugs in it 17:43:16 * mchua nods 17:43:25 <danpb_ltop> and OS distributors want to make sure they've built & packaged everything, and then when they update to the latest KVM / Xen / whatever 17:43:38 <danpb_ltop> that they are not going to cause regressions in libvirt or applications using libvirt 17:43:52 <danpb_ltop> above all we want to catch as many bugs as possible before they get to end-users 17:45:16 <danpb_ltop> its got fairly minimal testing coverage for F12, but come F13 we want to have all important core functionality automatically tested 17:45:35 <mchua> danpb_ltop: 
What can our readers to to help out with this testing? 17:45:44 <mchua> (or to try out any of these features and send feedback, really?) 17:46:45 <danpb_ltop> well had a Virtualization Test Day a few weeks back now, but if interested in doing testing 17:47:01 <danpb_ltop> keep an eye out for future test days during the course of Fedora 13 development 17:47:18 <danpb_ltop> joining the fedora-virt mailing list is a good way to get involved in Fedora virtualization work 17:47:57 <danpb_ltop> or if they have development experiance, then the various upstream communities always have plenty of need for help 17:49:27 * mchua nods - thanks! 17:49:57 <mchua> danpb_ltop: I think we're almost done - anything else on those three features (or any others) you'd like to call out/explain/plug? 17:52:35 <rwmjones> danpb_ltop, V2V? 17:54:51 <mchua> danpb_ltop: I'd also like to get a sentence or two of introduction from you (since we missed that at the beginning) and a couple things you do for fun outside of virt-hackin' 17:54:58 <mchua> and then we'll be done. 17:55:07 <mchua> danpb_ltop: Thanks for being so patient - I know this took longer than expected. 17:55:56 <danpb_ltop> rwmjones: there's no V2V stuff in F12 AFAIK 17:56:27 <rwmjones> didn't matt add it? anyhow, doesn't matter 17:58:37 <danpb_ltop> mchua: I've worked on Red Hat for quite a long time now, must be more than 7 years, with the last 3 focusing on virtualization 17:58:58 <danpb_ltop> i originally got involved in the virtualization team by writing the virt-manager application 17:59:21 <danpb_ltop> but since then Cole Robinson has taken the lead on that development, and I'm spending most time on lower level areas 17:59:51 <danpb_ltop> probably 80% libvirt, and the rest related things like QEMU / KVM / VNC 18:00:13 <mchua> danpb_ltop: Why do dev work for those things in Fedora? 
18:00:15 <danpb_ltop> its a good mixture of upstream work, Fedora work and RHEL work
18:00:42 <danpb_ltop> well upstream libvirt is on an approx monthly schedule
18:01:43 <danpb_ltop> Fedora has a short 6 month schedule, which means Fedora is a great place to get early exposure to real users
18:02:16 <danpb_ltop> a new libvirt release ends up in Fedora rawhide almost always the same day
18:02:46 <danpb_ltop> and Fedora stable releases are only a couple of releases behind latest
18:03:18 <danpb_ltop> it works out well for us as libvirt community developers, and for users who always want the latest stuff
18:04:11 <mchua> Sweet. And then eventually that work finds its way over to RHEL?
18:05:50 <danpb_ltop> yep, periodically it works its way into RHEL, but on a much longer timescale
18:06:13 <danpb_ltop> since RHEL has more prolonged testing / quality control cycles before release than you'd get with Fedora
18:07:08 * mchua references http://www.youtube.com/watch?v=xu81frqUtlc, Paul's video
18:07:22 <mchua> danpb_ltop: Thanks. And what do you do when you're not hacking on virt?
18:07:57 <danpb_ltop> err, sleep
18:08:16 <mchua> Sleep is good stuff.
18:08:32 <danpb_ltop> nah, seriously i spend quite alot of time on photography
18:08:40 * mchua notes "sleep" seems to be one of the first "what do I do in my free time" responses from virt hackers...
18:08:53 <stickster> mchua: They all have guest instances working while they snooze.
18:09:28 <mchua> oh, nice! seems like we've got a pretty good set of hobbyist photographers at RH
18:09:34 <mchua> danpb_ltop, dwa, mizmo, etc
18:10:15 <mchua> danpb_ltop: is there a gallery you'd like to share with folks, just for fun?
18:10:21 <mchua> (totally optional)
18:11:07 <mchua> stickster: btw, I'm going to be cleaning this up en route to Toronto, do you want it tomorrow or sometime over the weekend or Monday or at some later date?
18:11:29 <mchua> stickster: I have *way* more than enough info now to make multiple marketing shiny things from this
18:12:10 <stickster> mchua: Monday would be fine -- I'll be mostly out of commission FAD-ing this weekend
18:12:24 <stickster> mchua: Might want to let f-mktg-l know, Kara will see it there too
18:12:33 <danpb_ltop> hah, they can google for it !
18:12:46 <mchua> danpb_ltop: maybe I will ;)
18:12:58 <mchua> danpb_ltop: anything else? otherwise I think we're done - thanks for all your time (and patience!)
18:13:08 * mchua waves at mthompson
18:15:14 <danpb_ltop> think that's all
18:15:55 <mchua> danpb_ltop: We're all set, then. Thanks for your time!
18:16:00 * mchua will try to find a way to streamline this process in the future - I think yours was the one that went most over, because of the staggered scheduling.
18:16:22 * rbergeron yawns
18:16:29 * mchua closes out logs
18:16:31 <mchua> #endmeeting
Interview Highlights: Virtualization Improvements in Fedora 12
Red Hat's Mel Chua recently did a series of interviews on Fedora 12's virtualization improvements with members of the virtualization team. Here are some of the highlights from those discussions.
...talking about libguestfs
...on virtual upgrades to your Virtual Machine
...on reducing complexity in network scripts
...talking about the typical user
...discussing the history of PXE
...the history of virt-manager
Many thanks go out to the members of the virt team for participating in this interview, including Richard Jones (aka: rwmjones), David Lutterkort (aka: lutter), and Mark McLoughlin (aka: markmc), as well as Mel Chua (aka: mchua) for arranging the interview. The transcript of the full interview is available on the Fedora Project's website.
If you want to find more information about the projects discussed in this interview, there are a number of resources available.
And of course, if you want to find out more about the Fedora Project and give it a whirl, everything you need to get started is available at www.fedoraproject.org.
--Rbergero 16:15, 7 November 2009 (UTC)