From Fedora Project Wiki
 
== Summary ==
== Summary ==


Enable [[Virtualization|VM hosts]] to discover new SAN storage, issue NPIV operations and do basic configuration of multipath devices.
Enable [[Virtualization|VM hosts]] to discover new SAN storage and issue NPIV operations.


== Owner ==
== Owner ==


* Name: [[dallan|Dave Allan]]
* Name: [[dallan|Dave Allan]]
* Email: dallan at redhat dot com


== Current status ==
== Current status ==


* Targeted release: [[Releases/11|Fedora 11]]
* Targeted release: [[Releases/12|Fedora 12]]
* Last updated: 2009-02-25
* Last updated: 2009-08-05
* Percentage of completion: 5%
* Percentage of completion: 100%
 
=== TODO ===
 
* Implement storage discovery
* Implement NPIV operations
* Implement multipath configuration
 
=== Completed ===
 
* none


== Detailed Description ==
== Detailed Description ==
=== Background ===
=== Background ===


Guest virtual machines can currently use SAN storage and multipath devices, but administrators must do the storage configuration manually using separate tools from libvirt.  This feature will permit administrators to discover storage and present it to virtual machines using libvirt.
Guest virtual machines can currently use SAN storage, but
administrators must do the storage configuration manually using
separate tools from libvirt.  This feature will permit administrators
to discover storage and present it to virtual machines using libvirt.


Datacenter operations are usually split along functional lines: the
Datacenter operations are usually split along functional lines: the
facilities management team, the server administration team, the SAN
server administration team, the SAN administration team, and others
administration team, and others (network, etc.) not relevant to this
(network, etc.) not relevant to this discussion.  Within the server
discussion.  Within the server admin group, there are often sub-groups
admin group, there are often sub-groups for each OS. When a new
for each OS.
application is deployed the SAN admins provision the storage and
 
notify whichever OS team is responsible for the server to proceed with
When a new application is deployed, whoever is driving the deployment
the OS install.
typically makes three parallel requests, one to the facilities team to
get the hardware racked and wired, one to the server admins for the OS
installation, and one to the SAN admins for storage provisioning.  The
server request blocks until the hardware is available and the storage
allocation request completes.  When the storage is available, the SAN
admins notify whichever OS team is responsible for the server to
proceed with the OS install.


Although it's inaccurate and more than a little dangerous, the common
There may be more or less information transmitted from the SAN admins
message after storage provisioning is, "I've provisioned the LUNs, and
to the server admins when the storage becomes available.  The minimal
you should be able to see them now.  I gave you LUNs 27, 28, 29 and
message is something along the lines of, "I've provisioned the LUNs,
and you should be able to see them now.  The LUNs are 27, 28, 29 and
53."  The server admin may not know what targets or hosts the new
53."  The server admin may not know what targets or hosts the new
storage is accessible through, but a rescan of all host adapters will
storage is accessible through, but a rescan of all host adapters will
show up new logical units with numbers 27, 28, 29 and 53 on some
show up new logical units of the size requested with numbers 27, 28,
target on some host, and the server admin assumes that's the new
29 and 53 on some target on some host, and the server admin assumes,
storage.
usually reasonably, that these logical units are the new storage.
 
The functionality described here does not attempt to address the many
possible failures that can result from the communication between the
storage admins and the server admins described above.  What we are
doing is providing a framework to make the described process work
within its limitations.  Perhaps more importantly, we create a
foundation for work that will make it possible for admins, using only
libvirt's APIs, to identify storage precisely and validate that the
storage intended is actually used.
 
When the new server is virtual, the entire process becomes an exercise
in software configuration.  Instead of making a request of the
facilities team to buy, rack and wire a new box, whoever is driving
the process requests creation of a new VM.  With the right management
tools, the OS team who will be installing the server can be given
rights to create VMs and the number of teams involved can be reduced
to two: the OS admins and the SAN admins.  That level of organization
requires that the tool used to manage the VMs be capable of
discovering and presenting SAN storage to VMs.


While libvirt is currently capable of using SAN storage, it lacks the
While libvirt is currently capable of using SAN storage, it lacks the
ability to trigger scans for new storage, create virtual host adapters
ability to trigger scans for new storage and create virtual host
using NPIV and manage multipath devices.  The OS admin team that
adapters using NPIV.  The OS admin team that manages the VM host must
manages the VM host must get involved to get the VM host to recognize
get involved to get the VM host to recognize the new storage.  Giving
the new storage.  Giving libvirt the ability to manage storage allows
libvirt the ability to manage storage allows the OS admin team
the OS admin team responsible for the guest OS to complete the VM
responsible for the guest OS to complete the VM build out itself.
build out itself.


=== Implementation ===
=== Implementation ===


The libvirt APIs already permit storage discovery and pool creation. These functions will be extended to discover storage on a per-SCSI-host basis and multipath devices.  The pool create and destroy functions will be extended to understand multipath and NPIV.
The libvirt APIs already permit storage discovery and pool creation.
These functions will be extended to discover and rescan storage on a
per-SCSI-host basis.  The node device APIs will be extended to create
and destroy virtual adapters using NPIV.


== Benefit to Fedora ==
== Benefit to Fedora ==


As described above, changes are required in libvirt.  Eventually the tools using libvirt will need to be updated to take advantage of the new features, but that is not within the scope of this work.
As described above, changes are required in libvirt.  Eventually the tools using libvirt will need to be updated to take advantage of the new features, but that is not within the scope of this work.
=== TODO ===
=== Completed ===
* SCSI storage discovery and rescan are complete.
* The creation and destruction of virtual HBAs using NPIV is complete.


== How To Test ==
== How To Test ==


# Use virsh to discover and configure SAN storage.
Use cases
# Use virsh to issue NPIV operations.
 
# Use virsh to configure multipath devices.
=== Discover SAN storage ===
# Assign SAN and multipath storage to VMs.
 
Provision a new logical unit on iSCSI or fibre channel storage.  Use virsh to trigger a scan for it, and confirm that it appears correctly.
 
To discover logical units on a particular HBA, create a pool for that
HBA using:
 
<code>virsh pool-create hbapool.xml</code>
 
where hbapool.xml contains:
 
<pre>
<pool type="scsi">
  <name>host6</name>
  <source>
    <adapter name="host6"/>
  </source>
  <target>
    <path>/dev/disk/by-id</path>
  </target>
</pool>
</pre>
 
Confirm that all the appropriate logical units are visible as volumes
with:
 
<code>virsh vol-list host6</code>
 
After creating the pool, add a new logical unit on a target that's
visible on that host and refresh the pool with:
 
<code>virsh pool-refresh host6</code>
 
and confirm that the new storage is visible.  Note that the refresh
code only scans for new LUs on existing targets and does not issue a
LIP to discover new targets as that would be disruptive to I/O.
 
=== Create and destroy a virtual HBA with NPIV ===
 
Issue an NPIV create call and confirm that the VM host has instantiated a new host adapter and that any storage zoned to it is usable.
 
To create virtual HBAs using libvirt, it is of course necessary to
have an NPIV-capable HBA and switch.  You can confirm that you have
both by manually creating a virtual HBA with an echo into sysfs.
 
The file you echo into may be in one of two places, depending on which
kernel version you have.  On recent kernels it's in:
 
<code>/sys/class/fc_host/hostN</code>
 
on older kernels it's in:
 
<code>/sys/class/scsi_host/hostN</code>
 
Note also that the example WWN given below is bogus.  If you try to
use it, the kernel will reject it.  You should pick a WWN that makes
sense for your SAN.
 
For example on a recent kernel:
 
<code>echo '1111222233334444:5555666677778888' > /sys/class/fc_host/host5/vport_create</code>
 
where '1111222233334444:5555666677778888' is the WWPN:WWNN and host5
is the physical HBA you want to use to create the virtual HBA.  If the
create is successful, you'll get a new HBA in the system with the next
available host number.
 
You can then destroy the test virtual HBA with:
 
<code>echo '1111222233334444:5555666677778888' > /sys/class/fc_host/host5/vport_delete</code>
 
Testing the libvirt API
 
The libvirt API implementation is intended to be used by client
applications, but the functionality can be tested with virsh.
 
Creating a new virtual adapter using virsh is a two-step process.
First, find the node device name of the HBA that's going to be used to
create the virtual adapter.  You can get a list of all the HBAs on
your system with:
 
<code>virsh nodedev-list --cap=scsi_host</code>
 
For example:
 
<pre>
# virsh nodedev-list --cap=scsi_host
pci_10df_fe00_0_scsi_host
pci_10df_fe00_0_scsi_host_0
pci_10df_fe00_scsi_host
pci_10df_fe00_scsi_host_0
pci_10df_fe00_scsi_host_0_scsi_host
pci_10df_fe00_scsi_host_0_scsi_host_0
</pre>
 
Dump the XML for each HBA until you find the host number of the
physical HBA you want to use:
 
<pre>
# virsh nodedev-dumpxml pci_10df_fe00_scsi_host
<device>
  <name>pci_10df_fe00_scsi_host</name>
  <parent>pci_10df_fe00</parent>
  <capability type='scsi_host'>
    <host>5</host>
    <capability type='fc_host'>
      <wwnn>20000000c9848140</wwnn>
      <wwpn>10000000c9848140</wwpn>
    </capability>
    <capability type='vport_ops' />
  </capability>
</device>
</pre>
 
HBAs that are capable of creating virtual adapters will have a
capability type='vport_ops'.
 
Once you know the node device name of the parent HBA, create a file
containing XML describing the virtual HBA you want to create:
 
<pre>
<device>
  <parent>pci_10df_fe00_0_scsi_host</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
      <wwpn>1111222233334444</wwpn>
      <wwnn>5555666677778888</wwnn>
    </capability>
  </capability>
</device>
</pre>
 
The parent element is the name of the parent HBA as listed by virsh
nodedev-list.  wwpn and wwnn are, as you would expect, the WWPN and
WWNN for the virtual HBA to be created.  Libvirt does not do any
validation of the WWPN/WWNN; invalid WWNs are rejected by the kernel
and libvirt reports the failure. The error reported by the kernel is
somewhat misleading, however:
 
<pre>
# virsh nodedev-create badwwn.xml
error: Failed to create node device from badwwn.xml
error: Write of '1111222233334444:5555666677778888' to '/sys/class/fc_host/host6/vport_create' during vport create/delete failed: No such file or directory
</pre>
 
To create the new virtual HBA, feed the file to virsh:
 
<code>virsh nodedev-create new.xml</code>
 
If the operation succeeds, you'll get a message similar to:
 
<pre>
# virsh nodedev-create new.xml
Node device pci_10df_fe00_0_scsi_host_0_scsi_host created from new.xml
</pre>
 
and you will see the new HBA in the OS.  The create command output
gives you the node device name of the newly created device. 
 
To destroy the device, use virsh nodedev-destroy:
 
<pre>
# virsh nodedev-destroy pci_10df_fe00_0_scsi_host_0_scsi_host
Destroyed node device 'pci_10df_fe00_0_scsi_host_0_scsi_host'
</pre>
 
and you will see the HBA disappear from the OS.
 
 


== User Experience ==
== User Experience ==


== Release Notes ==
== Release Notes ==
 
This functionality adds the ability in libvirt to discover storage on a per-SCSI-host basis and issue NPIV operations.  This enables administrators to discover, configure and provision storage for virtual machines without having to use multiple tools.
Fedora 11 adds the ability in libvirt to discover storage on a per-SCSI-host basis, issue NPIV operations and configure multipath devices.  This enables administrators to discover, configure and provision storage for virtual machines without having to use multiple tools.


== Comments and Discussion ==
== Comments and Discussion ==


<!--
* See [[Talk:Features/Shared Network Interface]]


[[Category:Virtualization|Shared Network Interface]]
* See [[Talk:Features/VirtStorageManagement]]
[[Category:F11 Virt Features|Shared Network Interface]]


[[Category:FeaturePageIncomplete]]
[[Category:Virtualization|Virtualized Storage Management]]
Category:FeatureReadyForWrangler -->
[[Category:F12 Virt Features|Virtualized Storage Management]]
[[Category:FeatureAcceptedF12]]

Latest revision as of 18:10, 5 August 2009

Summary

Enable VM hosts to discover new SAN storage and issue NPIV operations.

Owner

Current status

  • Targeted release: Fedora 12
  • Last updated: 2009-08-05
  • Percentage of completion: 100%

Detailed Description

Background

Guest virtual machines can currently use SAN storage, but administrators must do the storage configuration manually using separate tools from libvirt. This feature will permit administrators to discover storage and present it to virtual machines using libvirt.

Datacenter operations are usually split along functional lines: the server administration team, the SAN administration team, and others (network, etc.) not relevant to this discussion. Within the server admin group, there are often sub-groups for each OS. When a new application is deployed the SAN admins provision the storage and notify whichever OS team is responsible for the server to proceed with the OS install.

There may be more or less information transmitted from the SAN admins to the server admins when the storage becomes available. The minimal message is something along the lines of, "I've provisioned the LUNs, and you should be able to see them now. The LUNs are 27, 28, 29 and 53." The server admin may not know what targets or hosts the new storage is accessible through, but a rescan of all host adapters will show up new logical units of the size requested with numbers 27, 28, 29 and 53 on some target on some host, and the server admin assumes, usually reasonably, that these logical units are the new storage.

While libvirt is currently capable of using SAN storage, it lacks the ability to trigger scans for new storage and create virtual host adapters using NPIV. The OS admin team that manages the VM host must get involved to get the VM host to recognize the new storage. Giving libvirt the ability to manage storage allows the OS admin team responsible for the guest OS to complete the VM build out itself.

Implementation

The libvirt APIs already permit storage discovery and pool creation. These functions will be extended to discover and rescan storage on a per-SCSI-host basis. The node device APIs will be extended to create and destroy virtual adapters using NPIV.

Benefit to Fedora

Administrators will be able to provision storage for VMs from the single set of tools that they are already using to manage the VMs.

Scope

As described above, changes are required in libvirt. Eventually the tools using libvirt will need to be updated to take advantage of the new features, but that is not within the scope of this work.

TODO

Completed

  • SCSI storage discovery and rescan are complete.
  • The creation and destruction of virtual HBAs using NPIV is complete.

How To Test

Use cases

Discover SAN storage

Provision a new logical unit on iSCSI or fibre channel storage. Use virsh to trigger a scan for it, and confirm that it appears correctly.

To discover logical units on a particular HBA, create a pool for that HBA using:

virsh pool-create hbapool.xml

where hbapool.xml contains:

<pool type="scsi">
  <name>host6</name>
  <source>
    <adapter name="host6"/>
  </source>
  <target>
    <path>/dev/disk/by-id</path>
  </target>
</pool>

Confirm that all the appropriate logical units are visible as volumes with:

virsh vol-list host6

After creating the pool, add a new logical unit on a target that's visible on that host and refresh the pool with:

virsh pool-refresh host6

and confirm that the new storage is visible. Note that the refresh code only scans for new LUs on existing targets and does not issue a LIP to discover new targets as that would be disruptive to I/O.
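The pool workflow above can be sketched as a few shell functions. This is only an illustration: the function names are hypothetical, the pool is named after the adapter as in the example, and the two virsh wrappers need a running libvirtd and a real HBA.

```shell
# Print pool XML for one SCSI host adapter, e.g. "host6".
# The pool name simply reuses the adapter name, as in the example above.
scsi_pool_xml() {
    adapter="$1"
    cat <<EOF
<pool type="scsi">
  <name>$adapter</name>
  <source>
    <adapter name="$adapter"/>
  </source>
  <target>
    <path>/dev/disk/by-id</path>
  </target>
</pool>
EOF
}

# Create the pool from a temporary XML file and list its volumes.
create_scsi_pool() {
    xml=$(mktemp) || return 1
    scsi_pool_xml "$1" > "$xml"
    virsh pool-create "$xml" && virsh vol-list "$1"
    status=$?
    rm -f "$xml"
    return "$status"
}

# After a new logical unit has been provisioned, rescan and list again.
rescan_scsi_pool() {
    virsh pool-refresh "$1" && virsh vol-list "$1"
}
```

For example, run create_scsi_pool host6, provision the new logical unit, then run rescan_scsi_pool host6 and compare the two volume listings.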

Create and destroy a virtual HBA with NPIV

Issue an NPIV create call and confirm that the VM host has instantiated a new host adapter and that any storage zoned to it is usable.

To create virtual HBAs using libvirt, it is of course necessary to have an NPIV-capable HBA and switch. You can confirm that you have both by manually creating a virtual HBA with an echo into sysfs.

The file you echo into may be in one of two places, depending on which kernel version you have. On recent kernels it's in:

/sys/class/fc_host/hostN

on older kernels it's in:

/sys/class/scsi_host/hostN

Note also that the example WWN given below is bogus. If you try to use it, the kernel will reject it. You should pick a WWN that makes sense for your SAN.

For example on a recent kernel:

echo '1111222233334444:5555666677778888' > /sys/class/fc_host/host5/vport_create

where '1111222233334444:5555666677778888' is the WWPN:WWNN and host5 is the physical HBA you want to use to create the virtual HBA. If the create is successful, you'll get a new HBA in the system with the next available host number.

You can then destroy the test virtual HBA with:

echo '1111222233334444:5555666677778888' > /sys/class/fc_host/host5/vport_delete
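Because the vport files may live under either sysfs class, a small helper can locate them before the echo. The sketch below is an assumption-laden illustration: the function names and the SYSFS_ROOT variable are hypothetical, and SYSFS_ROOT exists only so the lookup can be tried against a scratch directory.

```shell
# Hypothetical override for trying the lookup outside of /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the directory that holds vport_create for a host such as "host5".
# The fc_host location (recent kernels) is checked before scsi_host.
vport_dir() {
    host="$1"
    for class in fc_host scsi_host; do
        dir="$SYSFS_ROOT/class/$class/$host"
        if [ -e "$dir/vport_create" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Create and then delete a test virtual HBA.
# $1 is the physical host, $2 is the WWPN:WWNN pair.
vport_roundtrip() {
    dir=$(vport_dir "$1") || { echo "no vport_create for $1" >&2; return 1; }
    echo "$2" > "$dir/vport_create" || return 1
    echo "$2" > "$dir/vport_delete"
}
```

As root, vport_roundtrip host5 followed by a WWPN:WWNN pair that is valid for your SAN exercises both operations on whichever path the running kernel provides.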

Testing the libvirt API

The libvirt API implementation is intended to be used by client applications, but the functionality can be tested with virsh.

Creating a new virtual adapter using virsh is a two-step process. First, find the node device name of the HBA that's going to be used to create the virtual adapter. You can get a list of all the HBAs on your system with:

virsh nodedev-list --cap=scsi_host

For example:

# virsh nodedev-list --cap=scsi_host
pci_10df_fe00_0_scsi_host
pci_10df_fe00_0_scsi_host_0
pci_10df_fe00_scsi_host
pci_10df_fe00_scsi_host_0
pci_10df_fe00_scsi_host_0_scsi_host
pci_10df_fe00_scsi_host_0_scsi_host_0

Dump the XML for each HBA until you find the host number of the physical HBA you want to use:

# virsh nodedev-dumpxml pci_10df_fe00_scsi_host
<device>
  <name>pci_10df_fe00_scsi_host</name>
  <parent>pci_10df_fe00</parent>
  <capability type='scsi_host'>
    <host>5</host>
    <capability type='fc_host'>
      <wwnn>20000000c9848140</wwnn>
      <wwpn>10000000c9848140</wwpn>
    </capability>
    <capability type='vport_ops' />
  </capability>
</device>

HBAs that are capable of creating virtual adapters will have a capability type='vport_ops'.
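Rather than dumping each device by hand, the search can be scripted. The following is a rough sketch with hypothetical function names; it assumes the dumped XML uses the single-quoted attributes and one-element-per-line layout shown in the example above, and it matches text with grep and sed instead of parsing XML properly.

```shell
# Read node device XML on stdin; succeed if it advertises vport_ops.
has_vport_ops() {
    grep -q "type='vport_ops'"
}

# Read node device XML on stdin; print the SCSI host number.
host_number() {
    sed -n 's:.*<host>\([0-9][0-9]*\)</host>.*:\1:p'
}

# Print "device-name hostN" for every HBA that can create virtual adapters.
# Needs a running libvirtd.
list_npiv_hbas() {
    for dev in $(virsh nodedev-list --cap=scsi_host); do
        xml=$(virsh nodedev-dumpxml "$dev") || continue
        if printf '%s\n' "$xml" | has_vport_ops; then
            printf '%s host%s\n' "$dev" "$(printf '%s\n' "$xml" | host_number)"
        fi
    done
}
```

Calling list_npiv_hbas prints one line per candidate parent HBA, which gives both the node device name for the XML and the host number for the sysfs path.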

Once you know the node device name of the parent HBA, create a file containing XML describing the virtual HBA you want to create:

<device>
  <parent>pci_10df_fe00_0_scsi_host</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
      <wwpn>1111222233334444</wwpn>
      <wwnn>5555666677778888</wwnn>
    </capability>
  </capability>
</device>

The parent element is the name of the parent HBA as listed by virsh nodedev-list. wwpn and wwnn are, as you would expect, the WWPN and WWNN for the virtual HBA to be created. Libvirt does not do any validation of the WWPN/WWNN; invalid WWNs are rejected by the kernel and libvirt reports the failure. The error reported by the kernel is somewhat misleading, however:

# virsh nodedev-create badwwn.xml
error: Failed to create node device from badwwn.xml
error: Write of '1111222233334444:5555666677778888' to '/sys/class/fc_host/host6/vport_create' during vport create/delete failed: No such file or directory
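Since a rejected WWN surfaces only as that misleading error, a format check before calling virsh can catch simple typos. This is a minimal sketch with hypothetical function names, assuming WWNs are written as 16 hexadecimal digits as in the examples on this page. It checks the format only; whether the kernel and the SAN accept a given WWN is a separate matter.

```shell
# Succeed if $1 is exactly 16 hexadecimal digits.
is_wwn() {
    case "$1" in
        *[!0-9A-Fa-f]*) return 1 ;;
    esac
    [ "${#1}" -eq 16 ]
}

# Print the device XML for a virtual HBA.
# $1 is the parent node device name, $2 the WWPN, $3 the WWNN.
vhba_xml() {
    parent="$1" wwpn="$2" wwnn="$3"
    if ! is_wwn "$wwpn" || ! is_wwn "$wwnn"; then
        echo "WWPN and WWNN must each be 16 hex digits" >&2
        return 1
    fi
    cat <<EOF
<device>
  <parent>$parent</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
      <wwpn>$wwpn</wwpn>
      <wwnn>$wwnn</wwnn>
    </capability>
  </capability>
</device>
EOF
}
```

Redirecting the output of vhba_xml to a file produces input in the same shape as the XML example above.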

To create the new virtual HBA, feed the file to virsh:

virsh nodedev-create new.xml

If the operation succeeds, you'll get a message similar to:

# virsh nodedev-create new.xml
Node device pci_10df_fe00_0_scsi_host_0_scsi_host created from new.xml

and you will see the new HBA in the OS. The create command output gives you the node device name of the newly created device.

To destroy the device, use virsh nodedev-destroy:

# virsh nodedev-destroy pci_10df_fe00_0_scsi_host_0_scsi_host
Destroyed node device 'pci_10df_fe00_0_scsi_host_0_scsi_host'

and you will see the HBA disappear from the OS.
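The create and destroy steps can be chained in a test script by picking the device name out of the create message. The sketch below uses hypothetical function names and assumes the message keeps the "Node device NAME created from FILE" wording shown above; scraping human-readable output is fragile, so treat it as a test aid only.

```shell
# Read virsh nodedev-create output on stdin; print the new device name.
created_name() {
    sed -n 's/^Node device \(.*\) created from .*$/\1/p'
}

# Create a virtual HBA from an XML file, then destroy it again.
# Needs a running libvirtd and an NPIV-capable parent HBA.
vhba_roundtrip() {
    name=$(virsh nodedev-create "$1" | created_name)
    [ -n "$name" ] || return 1
    virsh nodedev-destroy "$name"
}
```

Running vhba_roundtrip new.xml should make the HBA appear and then disappear, matching the manual steps above.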


User Experience

See the previous section.

Dependencies

None, outside of the implementation efforts detailed above.

Contingency Plan

This functionality is independent of all other features. If it is not ready, administrators can continue to configure storage manually.

Documentation

Release Notes

This functionality adds the ability in libvirt to discover storage on a per-SCSI-host basis and issue NPIV operations. This enables administrators to discover, configure and provision storage for virtual machines without having to use multiple tools.

Comments and Discussion