From Fedora Project Wiki

Enable kernel acceleration for kvm networking

Summary

Enable kernel acceleration for kvm networking

Owner

Current status

  • Targeted release: Fedora 13
  • Last updated: 2010-01-26
  • Percentage of completion: 100%
    • code complete, changes in response to upstream review are being incorporated

Detailed Description

vhost net moves the task of converting virtio descriptors to skbs and back from qemu userspace into the kernel driver. It is activated by using the -netdev option (instead of -net) and adding the vhost=on flag.

Benefit to Fedora

Using a kernel module reduces latency and improves packets per second for small packets.

Scope

All of the work is done upstream, in the kernel and qemu. Guest code is already upstream. Host/qemu work is in progress. For Fedora 13, we will likely have to backport some of it.

Milestones Reached

- Guest Kernel:

 MSI-X support in virtio net

- Host Kernel:

 iosignalfd, irqfd, eventfd polling
 finalize kernel/user interface    
 socket polling                    
 virtio transport with copy from/to user
 TX credits using destructor (or: poll device status)
 TSO/GSO                                             
 profile and tune                                    

- qemu:

 MSI-X support in virtio net
 connect to kernel backend with MSI-X     
 PCI interrupts emulation                 
 TSO/GSO        
 profile and tune

In progress

 finalize qemu command line
 qemu: migration

Code posted, but won't be upstream in time and probably not important enough to backport

 raw sockets support in qemu, promisc mode

Delayed, will likely not make it by F13 ship date

 mergeable buffers
 programming MAC/vlan filtering

Test Plan

Guest:

  • WHQL networking tests

Networking:

  • Various MTU sizes
  • Broadcasts, multicasts
  • Ethtool
  • Latency tests
  • Bandwidth tests
  • UDP testing
  • Guest to guest communication
  • More types of protocol testing
  • Guest vlans
  • Combinations of multiple vNICs on the guests
  • With/without {IP|TCP|UDP} offload

Virtualization:

  • Live migration

Kernel side:

  • Load/unload driver

User Experience

Users should see faster networking, at least in cases of SR-IOV or a dedicated per-guest network device.

Dependencies

  • Kernel acceleration is implemented in the kernel rpm and depends on changes in qemu-kvm to work correctly.

Contingency Plan

  • If it turns out to be unstable, we do not turn it on by default.

Documentation

  • vhost net is activated by adding vhost=on to the -netdev option.
  • For non-vhost:

 -net nic,model=virtio,netdev=foo -netdev tap,id=foo,ifname=msttap0,script=/home/mst/ifup,downscript=no
  • For vhost:

 -net nic,model=virtio,netdev=foo -netdev tap,id=foo,vhost=on,vhostfd=<fd>,ifname=msttap0,script=/home/mst/ifup,downscript=no
  • In other words, vhost=on is added to enable vhost, and vhostfd= passes qemu the file descriptor for the vhost device.
  • To check that vhost is enabled:

1. Run lsmod and check that the vhost_net module has a reference count > 1

(this might happen even when vhost was disabled in the end, so please also do one of the following)


2. Look up your device in the guest (assuming it is the only virtio device, virtio0):

    cat /sys/bus/virtio/devices/virtio0/features

Bit 15 (mergeable rx buffers, which vhost net does not support) should be 0.

3. Run netperf from/to the host and run top on the host; you will see a vhost thread running on the host.
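Checks 1 and 2 above can be scripted. The following is a minimal sketch, not part of the feature itself. It assumes the sysfs features file is a string of '0'/'1' characters with bit 0 first, and that the third field of each /proc/modules line is the reference count that lsmod displays. The function names are hypothetical, and the constant name VIRTIO_NET_F_MRG_RXBUF is borrowed from the virtio headers for bit 15.

```python
# Bit 15 is the mergeable rx buffers feature (the bit named in check 2).
VIRTIO_NET_F_MRG_RXBUF = 15


def feature_bit(features, bit):
    """Return True if `bit` is set in a sysfs virtio features string.

    Assumes the string holds one '0' or '1' character per feature bit,
    with bit 0 first, optionally followed by a newline.
    """
    features = features.strip()
    if bit >= len(features):
        return False
    return features[bit] == "1"


def module_refcount(proc_modules, name):
    """Return the reference count of module `name`, or None if not loaded.

    `proc_modules` is the text of /proc/modules; each line is assumed to be
    "name size refcount ...", the same data lsmod prints.
    """
    for line in proc_modules.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2])
    return None
```

In the guest, pass the contents of /sys/bus/virtio/devices/virtio0/features to feature_bit with VIRTIO_NET_F_MRG_RXBUF; a False result matches the "bit 15 should be 0" check. On the host, pass the contents of /proc/modules to module_refcount with "vhost_net".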

  • Migration implications:

vhost net does not support mergeable rx buffers, so to make migration to a non-vhost device safe, you must disable mergeable rx buffers on the non-vhost side. You can do this by adding the following option to the virtio net device:

mrg_rxbuf=off
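The non-vhost and vhost command lines shown earlier in this section differ only in the vhost=on and vhostfd= options. As an illustration, the sketch below assembles both variants from the same inputs. The function name and its parameters are hypothetical, and the option order simply mirrors the examples above.

```python
def tap_netdev_args(netdev_id, ifname, script, vhost=False, vhostfd=None):
    """Build the '-net nic ... -netdev tap,...' argument list.

    With vhost=True, 'vhost=on' is added to the -netdev option; if a file
    descriptor number is given, 'vhostfd=<fd>' is added as well.
    """
    opts = ["tap", "id=%s" % netdev_id]
    if vhost:
        opts.append("vhost=on")
        if vhostfd is not None:
            opts.append("vhostfd=%d" % vhostfd)
    opts += ["ifname=%s" % ifname, "script=%s" % script, "downscript=no"]
    return [
        "-net", "nic,model=virtio,netdev=%s" % netdev_id,
        "-netdev", ",".join(opts),
    ]
```

For example, tap_netdev_args("foo", "msttap0", "/home/mst/ifup", vhost=True, vhostfd=5) yields the vhost variant with vhostfd=5. The descriptor number is only meaningful if that descriptor is actually open in the qemu process, so whatever launches qemu has to open the vhost device and let qemu inherit it.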

Release Notes

  • KVM networking is enhanced by kernel acceleration

This was shown to reduce latency by a factor of 5 and to improve bandwidth from 90% of native to 95% of native on some systems [1].

Comments and Discussion