From Fedora Project Wiki
(add kvm device failover feature)
 
(Retarget to Fedora 20 as requested by Feature owner)
 
(17 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{admon/important | Comments and Explanations | The page source contains comments providing guidance to fill out each section.  They are invisible when viewing this page.  To read it, choose the "edit" link.<br/> '''Copy the source to a ''new page'' before making changes!  DO NOT EDIT THIS TEMPLATE FOR YOUR FEATURE.'''}}
= Virt Device Failover =
 
{{admon/important | Set a Page Watch| Make sure you click ''watch'' on your new page so that you are notified of changes to it by others, including the Feature Wrangler}}
 
{{admon/note | All sections of this template are required for review by FESCo.  If any sections are empty it will not be reviewed }}
 
 
<!-- All fields on this form are required to be accepted by FESCo.
We also request that you maintain the same order of sections so that all of the feature pages are uniform.  -->
 
<!-- The actual name of your feature page should look something like: Features/Your_Feature_Name.  This keeps all features in the same namespace -->
 
= KVM Device Failover <!-- The name of your feature --> =


== Summary ==
== Summary ==
<!-- A sentence or two summarizing what this feature is and what it will do.  This information is used for the overall feature summary page for each release. -->
Support for transparent failover between an assigned and an emulated device, allows enabling the migration and overcommit dynamically, while still gaining the performance benefits of device assignment and without disrupting the guest operation.
 
Support for transparent failover between an assigned and an emulated device,
allows enabling the migration and overcommit dynamically, while
still gaining the performance benefits of device assignment
and without disrupting the guest operation.


== Owner ==
== Owner ==
<!--This should link to your home wiki page so we know who you are-->
* Name: [[User:Mst|Michael S. Tsirkin]]
* Name: [[User:MST| Michael S. Tsirkin]]
* Email: mst@redhat.com
* Name: [[User:Ghammer| Gal Hammer]]
* Name: [[User:Ghammer|Gal Hammer]]
 
* Email: ghammer@redhat.com
<!-- Include you email address that you can be reached should people want to contact you about helping with your feature, status is requested, or  technical issues need to be resolved-->
* Name: [[User:crobinso|Cole Robinson]]
* Email: mst@redhat.com <your email address so we can contact you, invite you to meetings, etc.>
* Email: crobinso@redhat.com
* Email: ghammer@redhat.com <your email address so we can contact you, invite you to meetings, etc.>
* Name: [[User:laine|Laine Stump]]
* Email: laine@redhat.com


== Current status ==
== Current status ==
* Targeted release: [[Fedora 19]]  
* Targeted release: [[Releases/20|Fedora 20]]  
* Last updated: (Jan 29)
* Last updated: 2013-03-15
* Percentage of completion: 50%
* Percentage of completion: 50%
<!-- CHANGE THE "FedoraVersion" TEMPLATES ABOVE TO PLAIN NUMBERS WHEN YOU COMPLETE YOUR PAGE. -->


== Detailed Description ==
== Detailed Description ==
<!-- Expand on the summary, if appropriate.  A couple sentences suffices to explain the goal, but the more details you can provide the better. -->
For virtual machines, device assignment is the best
For virtual machines, device assignment is the best
option for performance. However, when a device is assigned to a VM, both
option for performance. However, when a device is assigned to a VM, both
Line 55: Line 35:
while device configuration is preserved. Once e.g. migration
while device configuration is preserved. Once e.g. migration
completes, the reverse switch can take place.
completes, the reverse switch can take place.
Thus the device is controlled by:
* before migration: device specific driver loaded in guest
* during migration: driver loaded in host,  virtio or emulated device driver loaded in guest
* after migration: device specific driver loaded in guest


At the kernel level, for networking, this can be done by  and creating
At the kernel level, for networking, this can be done by  and creating
a bond in a failover configuration, and for storage, using multipath,
team (or a bond) in a failover configuration, and for storage, using multipath,
on top of both the assigned and the emulated device.
on top of both the assigned and the emulated device.


== Benefit to Fedora ==
== Benefit to Fedora ==
<!-- What is the benefit to the platform?  If this is a major capability update, what has changed?  If this is a new feature, what capabilities does it bring? Why will Fedora become a better distribution or project because of this feature?-->
Complex virt setups now have less operational caveats, which makes things simpler for users.


== Scope ==
== Scope ==
<!-- What work do the developers have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
Work left to do:
Work left to do:
* kvm needs to be extended to notify the guest that the two devices are setup in a fallback configuration
* kvm needs to be extended to notify the guest that the two devices are setup in a fallback configuration<br>In particular, add support for sending dbus commands to qemu-ga.<br>Need to configure security policy appropriately to allow control of what's allowed, cleanly.
* For networking, udev and/or network manager needs to be extended to detect this and setup bonding
* For networking, NetworkManager in fedora will support bonding:<br>https://fedoraproject.org/wiki/Features/NetworkManagerBonding <br>and teaming<br>https://fedoraproject.org/wiki/Features/NetworkManagerTeaming <br>NM can be controlled using dbus.
* For storage, need to setup device-mapper-multipath to autodetect this configuration
* For storage, need to setup device-mapper-multipath to autodetect this configuration
* libvirt has to be extended to specify this configuration
* libvirt has to be extended to specify this configuration
* libvirt has to be extended to request failover, and ack on guest ack of the failover
* libvirt has to be extended to request failover, and ack on guest ack of the failover
* above covers linux guests
* above covers linux guests<br>if possible, guest agent for windows should be extended to add this support in windows guests as well
  if possible, guest agent for windows should be extended to add this
  support in windows guests as well




== How To Test ==
== How To Test ==
<!-- This does not need to be a full-fledged document.  Describe the dimensions of tests that this feature is expected to pass when it is done.  If it needs to be tested with different hardware or software configurations, indicate them.  The more specific you can be, the better the community testing can be.
Remember that you are writing this how to for interested testers to use to check out your feature - documenting what you do for testing is OK, but it's much better to document what *I* can do to test your feature.
A good "how to test" should answer these four questions:
0. What special hardware / data / etc. is needed (if any)?
1. How do I prepare my system to test this feature? What packages
need to be installed, config files edited, etc.?
2. What specific actions do I perform to check that the feature is
working like it's supposed to?
3. What are the expected results of those actions?
-->
Two systems with device assignment (IOMMU) support are required to
Two systems with device assignment (IOMMU) support are required to
test this feature.
test this feature.
To test the feature, specify an assigned device,
To test the feature, specify an assigned device,
start guest and migrate.
start guest and migrate.
XXX: Explicit test steps here for test day




== User Experience ==
== User Experience ==
<!-- If this feature is noticeable by its target audience, how will their experiences change as a result?  Describe what they will see or notice. -->
User will see that they can specify an assigned network or storage
User will see that they can specify an assigned network or storage
device and still migrate the guest seamlessly.
device and still migrate the guest seamlessly.


== Dependencies ==
== Dependencies ==
<!-- What other packages (RPMs) depend on this package?  Are there changes outside the developers' control on which completion of this feature depends?  In other words, completion of another feature owned by someone else and might cause you to not be able to finish on time or that you would need to coordinate?  Other upstream projects like the kernel (if this is not a kernel feature)? -->
For networking,
For networking,
https://fedoraproject.org/wiki/Features/NetworkManagerBonding
https://fedoraproject.org/wiki/Features/NetworkManagerBonding


== Contingency Plan ==
== Contingency Plan ==
<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "None necessary, revert to previous release behaviour."  Or it might not.  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
None necessary, revert to previous release behaviour.
None necessary, revert to previous release behaviour.


== Documentation ==
== Documentation ==
<!-- Is there upstream documentation on this feature, or notes you have written yourself?  Link to that material here so other interested developers can get involved. -->
Links to related upstream documentation:
*
 
Not yet.
http://www.linux-kvm.org/page/Hotadd_pci_devices
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/DM_Multipath/index.html
http://unixfoo.blogspot.com/2007/10/yet-to-add.html


== Release Notes ==
== Release Notes ==
<!-- The Fedora Release Notes inform end-users about what is new in the release. Examples of past release notes are here: http://docs.fedoraproject.org/release-notes/ -->
* KVM guests with assigned host devices can now be migrated across hosts. The assigned device will be replaced during migration with an emulated device in a transparent manner.
<!-- The release notes also help users know how to deal with platform changes such as ABIs/APIs, configuration or data file formats, or upgrade concerns.  If there are any such changes involved in this feature, indicate them here.  You can also link to upstream documentation if it satisfies this need.  This information forms the basis of the release notes edited by the documentation team and shipped with the release. -->
*


== Comments and Discussion ==
== Comments and Discussion ==
* See [[Talk:Features/KVM_Device_Failover]] <!-- This adds a link to the "discussion" tab associated with your page.  This provides the ability to have ongoing comments or conversation without bogging down the main feature page -->
* See [[Talk:Features/KVM_Device_Failover]]
 


[[Category:FeatureReadyForWrangler]]
[[Category:FeatureReadyForWrangler]]
<!-- When your feature page is completed and ready for review -->
[[Category:Virtualization]]
<!-- remove Category:FeaturePageIncomplete and change it to Category:FeatureReadyForWrangler -->
<!-- After review, the feature wrangler will move your page to Category:FeatureReadyForFesco... if it still needs more work it will move back to Category:FeaturePageIncomplete-->
<!-- A pretty picture of the page category usage is at: https://fedoraproject.org/wiki/Features/Policy/Process -->

Latest revision as of 09:54, 15 March 2013

Virt Device Failover

Summary

Support for transparent failover between an assigned and an emulated device allows migration and memory overcommit to be enabled dynamically, while retaining the performance benefits of device assignment and without disrupting guest operation.

Owner

  • Name: Michael S. Tsirkin
  • Email: mst@redhat.com
  • Name: Gal Hammer
  • Email: ghammer@redhat.com
  • Name: Cole Robinson
  • Email: crobinso@redhat.com
  • Name: Laine Stump
  • Email: laine@redhat.com

Current status

  • Targeted release: Fedora 20
  • Last updated: 2013-03-15
  • Percentage of completion: 50%

Detailed Description

For virtual machines, device assignment is the best option for performance. However, when a device is assigned to a VM, both migration and memory overcommit are currently disabled.

This feature aims to remove the trade-off between performance and features by switching to an emulated device in a way that is almost transparent to users, for configurations where both host and guest are Fedora.

The Fedora guest should detect that the emulated device serves as a failover for the assigned device. When requested by the hypervisor, it will stop and eject the assigned device, switching to the failover device. After this point, migration and memory overcommit are possible, while device configuration is preserved. Once the operation (for example, migration) completes, the reverse switch can take place.

Thus the device is controlled by:

  • before migration: device-specific driver loaded in guest
  • during migration: driver loaded in host; virtio or emulated device driver loaded in guest
  • after migration: device-specific driver loaded in guest
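
The three phases listed above can be modelled as a small lookup. This is a minimal Python sketch for illustration only, not part of the feature; the `Phase` enum and the `controlling_drivers` function are hypothetical names.

```python
from enum import Enum


class Phase(Enum):
    """Migration phases as listed above (hypothetical names)."""
    BEFORE_MIGRATION = "before"
    DURING_MIGRATION = "during"
    AFTER_MIGRATION = "after"


def controlling_drivers(phase: Phase) -> dict:
    """Return which driver controls the device on each side in a phase."""
    if phase is Phase.DURING_MIGRATION:
        # The assigned device has been ejected from the guest: the host
        # driver owns the hardware and the guest uses the failover device.
        return {
            "host": "host driver",
            "guest": "virtio or emulated device driver",
        }
    # Before and after migration the device is assigned to the guest,
    # so only the guest's device-specific driver is involved.
    return {"host": None, "guest": "device-specific driver"}
```

The point of the sketch is that the guest-visible configuration stays the same while only the driver behind it changes between phases.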

At the kernel level, for networking, this can be done by creating a team (or a bond) in a failover configuration on top of both the assigned and the emulated device; for storage, multipath can be used in the same way.
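
As an illustration of the networking case, a team in a failover configuration could be described with a teamd configuration using the `activebackup` runner. The Python sketch below only generates such a configuration as JSON; the interface names (`ens1f0`, `eth0`, `team0`) and the priority values are assumptions, and whether this configuration alone handles ejection of the assigned device is not established by this page.

```python
import json


def failover_team_config(assigned: str, emulated: str, team: str = "team0") -> str:
    """Build a teamd JSON configuration for an active-backup team."""
    config = {
        "device": team,
        # activebackup keeps one port active and the others on standby.
        "runner": {"name": "activebackup"},
        # ethtool link watching detects when a port loses its link.
        "link_watch": {"name": "ethtool"},
        "ports": {
            # The port with the higher prio is preferred, so the assigned
            # device carries traffic whenever it is present.
            assigned: {"prio": 100},
            emulated: {"prio": -10},
        },
    }
    return json.dumps(config, indent=2)


# Placeholder interface names, for illustration only.
print(failover_team_config("ens1f0", "eth0"))
```

Giving the assigned device the higher priority means traffic returns to it after the reverse switch, while the emulated device serves as the standby path.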

Benefit to Fedora

Complex virt setups now have fewer operational caveats, which makes things simpler for users.

Scope

Work left to do:

  • KVM needs to be extended to notify the guest that the two devices are set up in a fallback configuration.
    In particular, add support for sending D-Bus commands to qemu-ga.
    The security policy needs to be configured so that what is allowed can be controlled cleanly.
  • For networking, NetworkManager in Fedora will support bonding:
    https://fedoraproject.org/wiki/Features/NetworkManagerBonding
    and teaming:
    https://fedoraproject.org/wiki/Features/NetworkManagerTeaming
    NetworkManager can be controlled using D-Bus.
  • For storage, device-mapper-multipath needs to be set up to autodetect this configuration.
  • libvirt has to be extended to specify this configuration.
  • libvirt has to be extended to request failover, and to acknowledge it once the guest has acknowledged the failover.
  • The above covers Linux guests.
    If possible, the guest agent for Windows should be extended to add this support in Windows guests as well.
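
The request/acknowledge exchange between libvirt and the guest agent might look roughly like the following Python sketch. qemu-ga exchanges JSON messages of the form `{"execute": ...}` and replies with `{"return": ...}` or `{"error": ...}`; the command name `guest-device-failover` and its arguments are hypothetical and do not exist in qemu-ga.

```python
import json

# HYPOTHETICAL command name: qemu-ga has no such command. It is used
# here only to illustrate the request/acknowledge exchange.
FAILOVER_COMMAND = "guest-device-failover"


def build_failover_request(device: str, to_emulated: bool) -> str:
    """Encode a failover request in the JSON framing used by qemu-ga."""
    request = {
        "execute": FAILOVER_COMMAND,
        # Argument names are assumptions made for this sketch.
        "arguments": {"device": device, "to-emulated": to_emulated},
    }
    return json.dumps(request)


def guest_acknowledged(response: str) -> bool:
    """Return True if the guest replied with a result rather than an error."""
    reply = json.loads(response)
    return "return" in reply and "error" not in reply
```

In this sketch the management layer would proceed with migration only after `guest_acknowledged` returns True for the guest's reply.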


How To Test

Two systems with device assignment (IOMMU) support are required to test this feature. To test the feature, specify an assigned device, start the guest, and migrate it.
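
For the "specify an assigned device" step, libvirt describes an assigned PCI device with a `<hostdev>` element in the domain XML. The Python sketch below builds such an element; the PCI address 0000:03:00.0 is a placeholder, and the helper name `hostdev_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def hostdev_xml(domain: int, bus: int, slot: int, function: int) -> str:
    """Build a libvirt <hostdev> element for an assigned PCI device."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    # libvirt expects the PCI address components as hexadecimal strings.
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain:04x}",
        bus=f"0x{bus:02x}",
        slot=f"0x{slot:02x}",
        function=f"0x{function:x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


# Placeholder address 0000:03:00.0, not a real device.
print(hostdev_xml(0, 3, 0, 0))
```

The resulting fragment would go inside the `<devices>` section of the guest definition on the source host; how the paired emulated device is declared is part of the libvirt work listed under Scope and is not shown here.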

XXX: Explicit test steps here for test day


User Experience

Users will see that they can specify an assigned network or storage device and still migrate the guest seamlessly.

Dependencies

For networking, this feature depends on NetworkManager bonding support: https://fedoraproject.org/wiki/Features/NetworkManagerBonding

Contingency Plan

None necessary; revert to previous release behaviour.

Documentation

Links to related upstream documentation:

  • http://www.linux-kvm.org/page/Hotadd_pci_devices
  • https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/DM_Multipath/index.html
  • http://unixfoo.blogspot.com/2007/10/yet-to-add.html

Release Notes

  • KVM guests with assigned host devices can now be migrated across hosts. The assigned device will be replaced during migration with an emulated device in a transparent manner.

Comments and Discussion

  • See Talk:Features/KVM_Device_Failover