Introduction / Problem Description
LVM is currently being used by other software projects. These projects interface with LVM by calling the LVM command line, either by invoking a shell or by calling the string-based liblvmcmd. These interface methods are problematic for the following reasons:
- Interacting with LVM requires creating and parsing command line strings
- Error handling is problematic
- High level CLI functionality may not meet the needs of all consumers
- CLI is complex, often leading to improper use or confusion
- The command line utilities have to be called with a system call type interface, which can lead to problems. Strict string analysis and type checking are needed, together with escaping of values.
- Return values have to be parsed to get the wanted information. This requires running the commands in the C locale so that the output can be parsed reliably.
- If an error occurs in a command line tool that an application calls through a system call type interface, the application only knows what happened if it parses the error messages. Because the call is made with LANG=C, the error message is in English and might not be useful for a user with a different language. The application might also need to understand what went wrong in order to propose a solution, or to process LVM commands without user interaction (think of a kickstart install).
- Changes in the format of informational output or error messages will lead to parsing problems.
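To illustrate the parsing burden described in the list above, here is a minimal Python sketch of what a consumer has to do today to read `pvs` output. The field list matches the anaconda usage in Appendix A; the helper name `parse_pvs_output`, the `|` separator, and the sample text are assumptions made for illustration, not captured tool output.

```python
# Fields assumed to be requested with:
#   pvs --noheadings --units b --nosuffix --separator '|' \
#       --options pv_name,vg_name,dev_size
# run with LANG=C so that the output is stable enough to parse.
FIELDS = ("pv_name", "vg_name", "dev_size")


def parse_pvs_output(text, sep="|"):
    """Turn separator-delimited pvs output into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        values = line.split(sep)
        if len(values) != len(FIELDS):
            # Any change in the output format ends up here.
            raise ValueError(f"unexpected column count in line: {line!r}")
        row = dict(zip(FIELDS, values))
        row["dev_size"] = int(row["dev_size"])  # bytes, because of --units b --nosuffix
        rows.append(row)
    return rows


# Illustrative sample only; the second PV is not part of any VG.
SAMPLE = """
  /dev/sda2|vg00|107374182400
  /dev/sdb1||53687091200
"""

pvs = parse_pvs_output(SAMPLE)
```

Every consumer ends up carrying some variant of this code, and each variant breaks in its own way when a column is added or a message changes.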
libLVM proposes to create a real API for use by application programs to overcome at least some of the aforementioned limitations.
libLVM features will be driven by the needs of the known primary consumers of the API, which are anaconda, system-config-storage, libvirt, and others as detailed in Appendix A. Note that some requests from existing LVM consumers in the context of libLVM are really requests for LVM design changes or functionality. While important to meeting the needs of libLVM consumers, many of these issues are orthogonal to libLVM and so will be listed separately, and will most likely not appear in the initial release.
One of the main drivers of libLVM is the anaconda storage rewrite, and specifically, system-config-storage. Much of the contents of the initial release of libLVM centers around supporting this effort.
Architecture / Requirements
The architecture of libLVM is currently evolving along an object-based design. A CLI architecture was considered along with an object design, with pros / cons of each approach outlined. As of 12/3/2008, it was clear that most current stakeholders prefer the object-based design.
The key objectives of the architecture are:
- Object based. This means handles to PV, VG, and LV objects are returned to the caller, and a get/set paradigm on the objects is used.
- Thread-safe.
- Fine-grained error handling, but minimize error code maintenance. Current direction is to try to use errno values, and not define libLVM-specific error codes. (This would mean APIs would return '0' for success.)
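The objectives above can be pictured with a small Python model of a handle that uses get/set accessors and errno-style return values. This is only a sketch of the idea: the class `VGHandle`, its method names, and the validation rule are hypothetical and are not taken from the libLVM API.

```python
import errno


class VGHandle:
    """Hypothetical stand-in for a VG handle returned to the caller."""

    def __init__(self, name, extent_size):
        self._name = name
        self._extent_size = extent_size

    def get_name(self):
        return self._name

    def get_extent_size(self):
        return self._extent_size

    def set_extent_size(self, new_size):
        """Return 0 on success or an errno value on failure.

        The check below is illustrative only; a real implementation
        would apply LVM's own rules for extent sizes.
        """
        if not isinstance(new_size, int) or new_size <= 0:
            return errno.EINVAL
        self._extent_size = new_size
        return 0


vg = VGHandle("vg00", 4 * 1024 * 1024)
rc_ok = vg.set_extent_size(8 * 1024 * 1024)   # accepted, returns 0
rc_bad = vg.set_extent_size(-1)               # rejected, returns errno.EINVAL
```

Reusing errno values means callers can rely on existing tools such as `os.strerror()` or C's `strerror()` instead of a libLVM-specific error table.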
An initial architecture was outlined in an API Proposal.
The current API is defined in the lvm2app.h header file.
libLVM code is being slowly integrated into the upstream LVM2 project (See liblvm for more details.)
libLVM Release content
The functionality of libLVM will be divided into releases, with the first release targeting the most widely used functionality of existing consumers. The first release of libLVM is divided into "must have" and "nice to have" functionality.
The "must have" functionality includes the equivalent of the following CLI commands:
- pvs: pv_name,vg_name,pv_size,pv_free,pv_attr,pv_fmt,pv_uuid,vg_extent_size,dev_size,pe_start
- vgs: vg_name,vg_attr,vg_size,vg_extent_size,vg_free_count,max_lv,max_pv,vg_uuid
- lvs: lv_name,vg_name,stripes,stripesize,lv_attr(activation),lv_uuid,devices,origin,snap_percent,seg_start,seg_size,vg_extent_size,lv_size,vg_free_count,vg_attr
- "pvcreate -ff -y -f pvname"
- "pvremove -ff -y pvname"
- "vgcreate -v -An -s pesize vgname pvname"
- "vgremove -f vgname"
- "lvcreate -v -L lvsize -n lvname -An vgname"
- "lvremove -f -v"
- "vgscan -v"
- "vgextend vgname pvname"
- "vgreduce vgname pvname"
- --config option with devices filters for various commands
- lvchange -an path; lvchange -ay path
- lvresize -An -L lvsize -v lvname
- lvreduce -f -L size lvpath
- "vgmknodes -v"
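The "--config option with devices filters" item in the list above refers to the way consumers currently restrict LVM to a known set of devices. A minimal Python sketch of building such a string follows. The filter syntax (accept patterns followed by a reject-all pattern) is the one used in lvm.conf; the helper name `build_device_filter_config` and the restriction to a conservative character set are assumptions made to keep the example free of escaping problems.

```python
import re

# Only allow characters with no special meaning in a regex or in the
# config string; anything else is rejected rather than escaped.
_SAFE_CHARS = re.compile(r"[A-Za-z0-9_/-]+")


def build_device_filter_config(devices):
    """Return a value for --config that accepts only the given devices."""
    patterns = []
    for dev in devices:
        if not dev.startswith("/") or not _SAFE_CHARS.fullmatch(dev):
            raise ValueError(f"unsupported device path: {dev!r}")
        patterns.append(f'"a|^{dev}$|"')   # accept exactly this path
    patterns.append('"r|.*|"')             # reject everything else
    return "devices { filter = [ " + ", ".join(patterns) + " ] }"


config = build_device_filter_config(["/dev/sda2", "/dev/sdb1"])
# Argument vector for an exec-style call, avoiding shell quoting entirely.
argv = ["vgscan", "-v", "--config", config]
```

Passing the arguments as a list rather than a single shell string sidesteps one layer of quoting, but the caller still has to assemble a configuration string by hand, which is the kind of work an API would remove.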
The remaining "nice to have" LVM functionality is listed below. This functionality was found in some consumers but is prioritized to a later release of libLVM.
- lvcreate --snapshot --name lvname --size lvsize origin_path
- vgcreate -c clustered vgname
- vgchange -c clustered vgname
- lvextend -L lvsize path
- /etc/lvm/lvm.conf: get/set locking_type
- "vgchange -ay -an -v" (Note: This can be achieved with a combination of "must have" functionality - scan vgs, list LVs, lvactivate -ay -an)
- any other LVM cmdline
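The note on "vgchange -ay -an" above says it can be composed from "must have" primitives. The following Python sketch shows that composition against an in-memory stand-in. The three callables (`list_vg_names`, `list_lv_names`, `set_lv_active`) and the `FAKE` data are hypothetical placeholders for whatever the library ends up providing.

```python
def change_all_lvs(list_vg_names, list_lv_names, set_lv_active, active):
    """Activate or deactivate every LV in every VG.

    Returns (rc, changed): rc is 0 on success or the first non-zero
    code returned by set_lv_active; changed lists the LVs already
    processed, so the caller can see how far the operation got.
    """
    changed = []
    for vg in list_vg_names():
        for lv in list_lv_names(vg):
            rc = set_lv_active(vg, lv, active)
            if rc != 0:
                return rc, changed
            changed.append(f"{vg}/{lv}")
    return 0, changed


# In-memory stand-in for the system state: VG name -> {LV name: active?}
FAKE = {"vg00": {"root": False, "swap": False}, "vg01": {"data": False}}


def _fake_set_active(vg, lv, active):
    FAKE[vg][lv] = active
    return 0


rc, changed = change_all_lvs(
    lambda: sorted(FAKE),
    lambda vg: sorted(FAKE[vg]),
    _fake_set_active,
    True,
)
```

One difference from the single CLI command is visible here: a failure part way through leaves some LVs changed, so the caller needs to know which ones.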
In addition to the above functionality used by existing consumers, the following LVM RFEs were noted in discussions with existing LVM CLI consumers as items of interest. The first release of libLVM may or may not contain any of these items as they are open to debate, and they may not be possible in the timeframe of the first libLVM release.
- Allow duplicate volumes to be activated for virtualized guest image manipulation (http://bugzilla.redhat.com/show_bug.cgi?id=207470)
- Provide an interface for efficient scanning of disks for LVM metadata. LVM should take as input one or more devices to scan and not try to figure out the set of devices as it does today in its dev-cache subsystem. In this new mode of operation, with LVM's dev-cache disabled, it will be the application's responsibility to handle any errors or incomplete information that results from limiting LVM to a set of devices. The following BZs relate to the scanning problem: (5.3 bz ON_QA) http://bugzilla.redhat.com/show_bug.cgi?id=461771, http://bugzilla.redhat.com/show_bug.cgi?id=277271, http://bugzilla.redhat.com/show_bug.cgi?id=464877, http://bugzilla.redhat.com/show_bug.cgi?id=464724
- Cloning volumes: http://bugzilla.redhat.com/show_bug.cgi?id=409031
Deliverables
Deliverables will be a shared object library (liblvm.so) with matching header file (lvm.h). Documentation will be included and may be in the form of header file comments, coding examples, man pages, and/or web pages.
Schedule
The schedule for the first release of liblvm has moved from Fedora 11 to Fedora 12. See the Fedora 12 schedule and the F12 liblvm feature page for the latest information.
We will target the "must have" features for this first release, with an emphasis on the definition of the LVM objects and their attributes, querying the system for LVM information and providing the equivalent of the pvs, vgs, and lvs commands, and creation / deletion of objects.
Milestones in the project are as follows:
Project Plan
- 11/18/2008 - 11/26/2008: Draft project plan
- 12/02/2008 - 12/02/2008: Draft functional spec due
- 11/26/2008 - 12/12/2008: Plan review; update project plan and functional spec
- 12/12/2008 - 12/16/2008: Updated plan due with final functional spec
- 12/16/2008 - 12/22/2008: Final plan approval; monthly status review
- 01/01/2009 - 09/22/2009: libLVM development
- 03/01/2009: Milestone - initial skeleton liblvm build
- 07/28/2009: Milestone - liblvm alpha build push into rawhide
- 08/24/2009: Milestone - liblvm beta build push into rawhide
Implementation tasks will focus on a specific portion of libLVM and will be broken into increments of no more than 2 weeks. Tasks will begin after plan approval, tentatively starting 1/5/2009.
Implementation Tasks
- 12/07/2008 - 12/13/2008: Initial library build and initialization w/unit test infrastructure
- 12/14/2008 - 12/20/2008: configuration (/etc/lvm/lvm.conf) API
- 12/21/2008 - 01/03/2009: Cleanup / misc (vacation for most people)
- 01/04/2009 - 01/10/2009: reporting commands cleanup
- 01/11/2009 - 01/17/2009: vg_read review/test; udev integration discussion; init code cleanup patch (is_long_lived)
- 01/18/2009 - 01/24/2009: vgs initial implementation; vg_read review
- 01/20/2009 - 01/20/2009: Milestone; F11 Alpha
- 01/25/2009 - 01/31/2009: vgs, vg_read, unlock_vg, init_locking
- 02/01/2009 - 02/07/2009: vgs, pvs, orphan locking
- 02/15/2009 - 02/21/2009: Unit test existing functionality
- 05/01/2009 - 05/14/2009: Finish vg_read patch review and initial object attribute implementation
- 05/01/2009 - 07/27/2009: vgcreate / vgremove; implicit pvcreate; lvcreate (linear) / lvremove; lvchange -ay -an; vgextend / vgreduce
- 07/28/2009 - 08/12/2009: cleanup / address feedback / work on remaining "must have" functionality
- 08/13/2009 - 08/19/2009: pvresize / lvresize
- 08/20/2009 - 08/31/2009: cleanup / address feedback / work on remaining "must have" functionality
- 09/01/2009 - 09/15/2009: Final unit testing
Responsibilities
The following people are identified as having a significant role in libLVM.
- Alasdair Kergon (agk@redhat.com): libLVM design review, signoff
- Thomas Woerner (twoerner@redhat.com): system-config-storage requirements, libLVM coding, unit testing, design
- Petr Rockai (prockai@redhat.com): libLVM coding, unit testing, design
- Dave Wysochanski (dwysocha@redhat.com): libLVM coding, unit testing, design
- Dave Lehman (dlehman@redhat.com): anaconda storage project requirements, anaconda signoff
- Peter Jones (pjones@redhat.com): anaconda requirements input
- David Zeuthen (davidz@redhat.com): Device-kit disks requirements, signoff
- Tom Coughlan (coughlan@redhat.com): libLVM planning, milestone signoff
Outstanding Issues
- Object model performance implications with large number (1000) of volumes requiring lots of transactions (Heinz / Thomas have discussed - needs more discussions, with specific operations and scenarios outlined)
- Translation of LVM error messages (twoerner to investigate transifex for translation of error messages)
- Require 'force' parameter to API commands
- Using cmd->mem dm_pool for memory allocation of handles prevents them from being individually freed.
Creating a vg handle with read permission then later needing write permission requires a new API or the ability to free handles. Currently we cannot free objects unless we free the whole command structure. Alternatives seem to be fixing the memory freeing and using a repeated vg_read, vg_close sequence, or providing an API that converts read access to write.
NOTE: I believe Milan has fixed this with his vg_release patches - thanks Milan!
- Should the API deprecate use of PVs or will they be necessary for future LVM work?
- Signal handling for liblvm calls. How do we handle application signal handlers? Should we install our own signal handler in each liblvm call?
- Return code for 'set / change' APIs that attempt to set a value already committed (is it an error or not to set the same value stored?)
Risk Analysis / Mitigation
The key risks are:
- Refactoring of existing LVM tool code. We will mitigate this risk with upstream LVM nightly tests.
- Object-based locking. The object design of the API will break up CLI operations into smaller functional chunks, and locks will be tied to handles which may change the frequency and duration of locking. This may have specific risks to clustered LVM. Mitigation should include some form of clustered LVM regression tests done on a monthly basis during the key development period (either upstream nightly tests or RHTS).
Appendix A: Per consumer LVM functionality usage/needs
Device-kit-disks / udev (email discussions)
- dm/LVM tools must export information about dm/LVM devices in <KEY>=<value> format, to be imported into udev database (perhaps addition to 'vol_id' or separate 'lvm_id' prog?), BZ438604 - Add env-style reporting to devmapper + dmsetup
- at least basic device information in sysfs
- no device nodes/links created in /dev, if udev is active
- proper userspace events from the kernel if something changes
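The <KEY>=<value> export format requested above is attractive because it is trivial for a consumer to read. A minimal Python sketch of such a reader follows. The key names and values in `SAMPLE` are hypothetical and do not come from real tool output; shell-style single quoting of values is also an assumption.

```python
import shlex


def parse_env_style(line):
    """Parse one line of KEY='value' pairs into a dict.

    shlex handles the shell-style quoting, so values may contain spaces.
    """
    result = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"not a KEY=value pair: {token!r}")
        result[key] = value
    return result


# Hypothetical sample line; the key names are placeholders.
SAMPLE = "DM_NAME='vg00-root' DM_UUID='LVM-abc123' DM_OPEN='1'"

info = parse_env_style(SAMPLE)
```

Unlike column-based output, this format tolerates new keys being added and does not depend on field order.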
Anaconda (mostly code extraction)
- "pvs --noheadings --units b --nosuffix --options pv_name,vg_name,dev_size"
- "pvremove -ff -y -v pvname"
- "pvcreate -ff -y -v pvname"
- "pvcreate -ff -y -v node"
- "vgs --noheadings --units b --nosuffix --options vg_name,vg_size,vg_extent_size,vg_free"
- "vgcreate -v -An -s pesize vgname"
- "vgremove -v vgname"
- "vgscan -v"
- "vgmknodes -v"
- "vgchange -ay -v"
- "vgchange -an -v"
- "lvs --noheadings --units b --nosuffix --separator --options vg_name,lv_name,attr"
- "lvdisplay -C --units b vg_name,lv_name,lv_size,origin"
- "lvcreate -v -L lvsize -n lvname -An vgname"
- "lvremove -f -v"
- "lvresize -An -L lvsize -v"
system-config-storage
- All functionality as listed for anaconda.
- Allow duplicate volumes to be activated for virtualized guest image manipulation.
- https://bugzilla.redhat.com/show_bug.cgi?id=207470
- Provide an interface for efficient scanning of disks for LVM metadata
(Needs more detail / bz)
libvirt (mostly code extraction)
- "vgchange -ay", "vgchange -an"
- lvs --separator , --noheadings --units b --unbuffered --nosuffix --options "lv_name,uuid,devices,seg_size,vg_extent_size" VGNAME
- pvs --noheadings -o pv_name,vg_name
- vgcreate
- pvcreate
- vgs --separator : --noheadings --units b --unbuffered --nosuffix --options "vg_size,vg_free" VGNAME
- vgremove -f
- pvremove
- lvcreate --name LVNAME -L SIZE
- lvremove -f
- vgscan
- Cloning volumes: https://bugzilla.redhat.com/show_bug.cgi?id=409031
conga (rhel5 code extraction)
- pvs --options pv_name,vg_name,pv_size,pv_free,pv_attr,pv_fmt,pv_uuid,vg_extent_size
- lvs --units b --options lv_name,vg_name,stripes,stripesize,lv_attr,lv_uuid,devices,origin,snap_percent,seg_start,seg_size,vg_extent_size,lv_size,vg_free_count,vg_attr
- pvdisplay -c
- vgs -o vg_name,vg_attr,vg_size,vg_extent_size,vg_free_count,max_lv,max_pv,vg_uuid
- lvdisplay -c --units b
- lvs -o lv_name,vg_name,origin
- pvcreate -y -f path
- pvremove -y -f path
- vgcreate --physicalextentsize pesize -c clustered vgname
- vgremove vgname
- vgextend vgname pv_path
- vgreduce vgname pv_path
- vgchange -c clustered vgname
- lvcreate --name lvname --size lvsize vgname
- lvcreate --snapshot --name lvname --size lvsize origin_path
- lvchange -an path; lvchange -ay path
- lvremove --force path
- lvreduce -f -L size lvpath
- lvextend -L lvsize path
- /etc/lvm/lvm.conf: get/set locking_type
Appendix B: Miscellaneous functionality or use cases
lvm-devel requests
- Ability to display storage underlying one LV, equivalent to "lvdisplay --maps"; e.g. lvm_lv_get_pe_layout(lv_t lv, pe_list *list); reference: http://www.redhat.com/archives/lvm-devel/2009-March/msg00049.html
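As a rough picture of what a PE layout list could hold, the Python sketch below models an LV as a list of linear segments, similar in spirit to what "lvdisplay --maps" reports (a logical extent range mapped to a physical volume and a physical extent range). The `Segment` type, its field names, and `locate_extent` are hypothetical; striped or mirrored segments are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One linear segment of an LV (hypothetical layout record)."""
    le_start: int   # first logical extent covered by this segment
    le_count: int   # number of extents in the segment
    pv_name: str    # physical volume holding the segment
    pe_start: int   # first physical extent on that PV


def locate_extent(segments, le):
    """Return (pv_name, physical_extent) for logical extent `le`."""
    for seg in segments:
        if seg.le_start <= le < seg.le_start + seg.le_count:
            return seg.pv_name, seg.pe_start + (le - seg.le_start)
    raise ValueError(f"logical extent {le} is outside the LV")


# Illustrative layout: 100 extents on one PV followed by 50 on another.
LAYOUT = [
    Segment(le_start=0, le_count=100, pv_name="/dev/sda2", pe_start=0),
    Segment(le_start=100, le_count=50, pv_name="/dev/sdb1", pe_start=200),
]

location = locate_extent(LAYOUT, 120)
```

With these sample values, logical extent 120 falls 20 extents into the second segment, so it maps to physical extent 220 on /dev/sdb1.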