From Fedora Project Wiki
= Cluster =
= Updated Cluster Stack with Enhanced Failover Support =


== Summary ==


Update the corosync/openais/cluster/clvm stack to the latest stable releases.
This is a significant update to the clustering stack - both for high availability and load balancing.
The new upstream versions bring a new and more powerful cluster stack to the end user.


== Owner ==
* Name: [[User:fabbione| FabioMassimoDiNitto]]
* email: fdinitto@redhat.com
* Name: [[User:lon|LonHohberger]]
* email: lhh@redhat.com
* Name: [[User:fabbione| FabioMassimoDiNitto]]
* email: fabbione@redhat.com


== Current status ==
* Targeted release: [[Releases/{{FedoraVersion||next}} | {{FedoraVersion|long|next}} ]]  
* Last updated: 9 Jul 2009
* Last updated: 26 Mar 2012
* Percentage of completion: 99%
* Percentage of completion: 100%


== Detailed Description ==
Upstream has been developing a new version of the cluster stack which is currently shipped with Fedora 16.  The major features are:
* Improved quorum subsystem which is integrated into the Corosync Cluster Engine (95% complete)
* A new command-line interface for administration of both the Corosync Cluster Engine and the Pacemaker Cluster Resource Manager as well as monitoring (50% complete)
* Convergence on a single set of resource agents from the Linux clustering community
* Enhanced fencing support provided by stonith-ng, which is part of the Pacemaker project.
* Separation of GFS2-specific utilities into a separate project (100%)
* Separation of DLM-specific utilities into a separate project (90%)


Upstream has been developing the new version of the stack with several goals in mind, including better scalability, higher performance, increased reliability, and more hardware support. It also lays the groundwork for better integration with other cluster solutions (such as heartbeat and pacemaker) by sharing the same core components.
Note that this transition includes deprecations:
* The cluster package, including CMAN, DLM utilities, GFS2 utilities, fenced, and all associated libraries.
* The rgmanager package.  All users are advised to use Pacemaker for their failover needs.
* The openais package.  This provided some of the SA Forum AIS APIs, which were rarely used.
* The heartbeat package. Upstream development has stopped and equivalent or superior functionality is provided by the [http://corosync.org Corosync] package.
* The piranha package.  Superior functionality is provided by the [http://keepalived.org Keepalived] package.
* The ricci package.  Similar agent-style communication can be provided by Matahari or other projects.
* The luci package.  This will be replaced by Sunzi, although it is not clear this part will make Fedora 17.
* The clustermon package.  This was part of the ricci/luci administration stack and is no longer required.
* Several agents in the resource-agents package which previously relied on examination of cluster.conf at run-time.


== Benefit to Fedora ==
 
The increased reliability and versatility of the cluster components included in Fedora 17 allow administrators to deploy Fedora in environments where greater availability and clustered file systems are required.
 
<!-- What is the benefit to the platform?  If this is a major capability update, what has changed?  If this is a new feature, what capabilities does it bring? Why will Fedora become a better distribution or project because of this feature?-->
Removal of rgmanager matches the capabilities offered by several other Linux distributions and allows the Fedora community to consolidate efforts on a single failover stack.


== Scope ==
<!-- What work do the developers have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
This change impacts several utilities and other applications, including:
* The DLM, which used AIS Checkpoints to replicate data about POSIX locks (100% complete)
* Clustered LVM, which used CMAN APIs, will need to be ported to corosync. (?% complete)
* Fence-virt, which used CMAN APIs and AIS Checkpoints, will need to be ported to corosync. (0% complete)
* Pacemaker, which used CMAN APIs, will need to be ported to corosync (100% complete)
* Clustered mirroring, which uses AIS Checkpoints, will need to be ported to corosync (?% complete)
* Qpid, which used CMAN APIs, will need to be ported to corosync (?% complete)


== How To Test ==
3. What are the expected results of those actions?
3. What are the expected results of those actions?
-->
-->
The initial use cases required for high availability are documented in the [http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Clusters_from_Scratch/ Clusters from Scratch] manual.
Information about how to run GFS2 in standalone mode will be provided as development nears completion.


== User Experience ==
<!-- If this feature is noticeable by its target audience, how will their experiences change as a result?  Describe what they will see or notice. -->
Some differences will be noticeable by users:
* Users of the rgmanager package will need to learn the Pacemaker package, which is a more dynamic, complex, full-featured cluster resource manager.
* Users who relied on cluster.conf (whether using ccs, luci, or hand-editing) will need to learn:
** The crm and/or pcs commands and become familiar with the CIB, which is an XML database of the failover resource configuration combined with some run-time state information.
** The layout, syntax, and use of corosync.conf
* Users who relied on the clustat command will need to learn the following commands:
** cibadmin
** crm_mon
** crm_resource
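
For readers moving from cluster.conf to corosync.conf, the nested `section { key: value }` layout can be illustrated with a small parser. This is only a sketch under stated assumptions: the sample text, the addresses, and the helper names `parse_conf` and `sections` are hypothetical and are not part of corosync, and the real parser accepts more syntax than is handled here.

```python
# Hypothetical sample in the corosync.conf style; all values are illustrative.
SAMPLE_CONF = """
totem {
    version: 2
    cluster_name: demo
}
quorum {
    provider: corosync_votequorum
}
nodelist {
    node {
        ring0_addr: 192.0.2.10
        nodeid: 1
    }
    node {
        ring0_addr: 192.0.2.11
        nodeid: 2
    }
}
"""


def parse_conf(text):
    """Parse 'name { ... }' sections and 'key: value' pairs.

    Returns a list of (name, value) pairs, where value is a string for a
    plain key or a nested list for a section. A list (not a dict) is used
    so that repeated sections such as 'node' are preserved.
    """
    root = []
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.endswith("{"):
            section = (line[:-1].strip(), [])
            stack[-1].append(section)
            stack.append(section[1])
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced closing brace")
            stack.pop()
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"expected 'key: value', got {line!r}")
            stack[-1].append((key.strip(), value.strip()))
    if len(stack) != 1:
        raise ValueError("unclosed section")
    return root


def sections(tree, name):
    """Return the bodies of every section called `name` at this level."""
    return [v for k, v in tree if k == name and isinstance(v, list)]


conf = parse_conf(SAMPLE_CONF)
nodes = sections(sections(conf, "nodelist")[0], "node")
print([dict(n)["ring0_addr"] for n in nodes])
```

The list-of-pairs representation is a deliberate choice in this sketch: a plain dictionary would silently drop all but the last `node` block.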


== Dependencies ==
<!-- What other packages (RPMs) depend on this package?  Are there changes outside the developers' control on which completion of this feature depends?  In other words, completion of another feature owned by someone else and might cause you to not be able to finish on time or that you would need to coordinate?  Other upstream projects like the kernel (if this is not a kernel feature)? -->
 
* corosync (2.0)
* pacemaker
* cluster-glue
* dlm (in review currently)
* libqb
* pcs (in review currently)
 
All packagers and upstreams have been informed up front and worked together to achieve this goal.


== Contingency Plan ==
<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "None necessary, revert to previous release behaviour."  Or it might not.  If your feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
 
None necessary


== Documentation ==


* The Corosync Cluster Engine
* [http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Clusters_from_Scratch/ Clusters from Scratch]
** Provides a plug-in based cluster engine using the virtual synchrony communications model.
* [http://corosync.org/doku.php?id=faq Corosync FAQ]
*** Well considered plugin model and plugin API
*** Ultra high performance messaging, up to 300k messages/sec to a group of 32 nodes for service engine developers
*** Provides most services for service engine developers
*** Standard on many Linux distributions for portable application development
*** Works with mixed 32/64bit user applications, 32/64 bit big and little endian support
*** Full IPv4 or IPv6 support
** Default plug-in service engines and C APIs:
*** Closed Process Group Communication C API for cluster communications
*** Extended Virtual Synchrony passthrough C API for cluster communications at a lower level
*** Runtime Configuration Database C API for cluster configuration
*** Configuration C API for runtime cluster operations
*** Quorum engine C API for providing information related to quorum
** Reusable C Libraries or headers tuned for high performance and quality
*** Totem Single Ring and Redundant Ring Multicast Protocol library
*** Shared memory IPC library with sync and async communications models usable by other projects
*** logsys flight recorder which allows logging and tracing of complex applications and records state in core files or at user command library
*** 64 bit handle to data block mapping with handle verification header
 
* The openais Standards Based Cluster Framework which provides an implementation of the Service Availability Forum Application Interface Specification to provide high availability through application clustering:
** Packaging and design changes
*** All core features from openais related to clustering merged into The Corosync Cluster Engine.
*** openais modified to work as plugins to the Corosync Cluster Engine
** Provides implementation of various Service Availability Forum AIS Specifications as corosync service engines and C APIs:
*** Cluster Membership Service B.01.01
*** Checkpoint Service B.01.01
*** Event Service B.01.01
*** Message Service B.01.01
*** Distributed Lock Service B.01.01
*** Timer Service A.01.01
*** Experimental Availability Management Framework B.01.01
 
* cluster is now based on both corosync and openais and offers:
** pluggable configuration mechanism:
*** XML (default)
**** Configuration schema updated and moved from Conga to cluster
*** LDAP
*** corosync/openais file format
** Cluster manager (cman):
*** Now runs as part of corosync
*** Provides quorum to all corosync subsystems
*** Enhanced configuration-free running
*** Better handling of configuration updates
*** Quorum disk (optional) now supports mixed-endian clusters
** fence / fence agents:
*** Improved daemon logging options
*** New operation 'list' that prints aliases with port numbers
*** Support for new devices and firmware: LPAR HMC v3, Cisco MDS, interfaces MIB (ifmib)
*** Fence agents produce resource-agent style metadata
*** Support for 'unfence' operation on boot
** rgmanager:
*** Better handling of configuration updates
*** Uses same logging configuration as the rest of the cluster stack
** clvmd:
*** Run-time switchable between cman or corosync/dlm cluster interfaces


== Release Notes ==
<!-- The Fedora Release Notes inform end-users about what is new in the release. Examples of past release notes are here: http://docs.fedoraproject.org/release-notes/ -->
* The update to version 2.0 (needle) of the [http://www.corosync.org/doku.php Corosync Cluster Engine] offers:
<!-- The release notes also help users know how to deal with platform changes such as ABIs/APIs, configuration or data file formats, or upgrade concerns. If there are any such changes involved in this feature, indicate them here. You can also link to upstream documentation if it satisfies this need. This information forms the basis of the release notes edited by the documentation team and shipped with the release. -->
** API stability guarantees for the duration of Corosync 2.0's lifetime
** High performance, cluster-wide messaging.
*** The performance of corosync 2.0 is nearly 10 times that of corosync 1.x for smaller (<64kb) message sizes
*** The CPU utilization of corosync 2.0 is approximately 1/5th of corosync 1.x
** Improved quorum subsystem
** The confdb API has been replaced with the cmap API for configuration management.
* The update to version 1.1.7 of the Pacemaker Cluster Resource Manager offers:
** Improved fencing subsystem which can be used in either GFS2 standalone environments or resource-driven clusters (which may or may not include GFS2)
** Improved logging via the libqb framework


== Comments and Discussion ==
* See [[Talk:Features/Cluster]]


[[Category:FeaturePageIncomplete]]
[[Category:FeatureAcceptedF17]]

Latest revision as of 19:21, 27 February 2013

Updated Cluster Stack with Enhanced Failover Support

Summary

This is a significant update to the clustering stack - both for high availability and load balancing.

Owner

Current status

  • Targeted release: Fedora 17
  • Last updated: 26 Mar 2012
  • Percentage of completion: 100%

Detailed Description

Upstream has been developing a new version of the cluster stack which is currently shipped with Fedora 16. The major features are:

  • Improved quorum subsystem which is integrated into the Corosync Cluster Engine (95% complete)
  • A new command-line interface for administration of both the Corosync Cluster Engine and the Pacemaker Cluster Resource Manager as well as monitoring (50% complete)
  • Convergence on a single set of resource agents from the Linux clustering community
  • Enhanced fencing support provided by stonith-ng, which is part of the Pacemaker project.
  • Separation of GFS2-specific utilities into a separate project (100%)
  • Separation of DLM-specific utilities into a separate project (90%)
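
To make the role of the quorum subsystem listed above more concrete, the following is a minimal sketch of a simple majority rule. It assumes one vote per node and a strict-majority threshold; the function names are hypothetical, and the quorum service in Corosync offers further options that are not modeled here.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    if expected_votes < 1:
        raise ValueError("expected_votes must be positive")
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True when the visible partition holds a strict majority of votes."""
    return votes_present >= quorum_threshold(expected_votes)


# A five-node cluster that loses two nodes still has a majority;
# a four-node cluster split evenly does not.
print(is_quorate(3, 5), is_quorate(2, 4))
```

Under this rule at most one partition of a split cluster can be quorate, which is what lets the resource manager decide safely where services may run.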

Note that this transition includes deprecations:

  • The cluster package, including CMAN, DLM utilities, GFS2 utilities, fenced, and all associated libraries.
  • The rgmanager package. All users are advised to use Pacemaker for their failover needs.
  • The openais package. This provided some of the SA Forum AIS APIs, which were rarely used.
  • The heartbeat package. Upstream development has stopped and equivalent or superior functionality is provided by the Corosync package.
  • The piranha package. Superior functionality is provided by the Keepalived package.
  • The ricci package. Similar agent-style communication can be provided by Matahari or other projects.
  • The luci package. This will be replaced by Sunzi, although it is not clear this part will make Fedora 17.
  • The clustermon package. This was part of the ricci/luci administration stack and is no longer required.
  • Several agents in the resource-agents package which previously relied on examination of cluster.conf at run-time.

Benefit to Fedora

The increased reliability and versatility of the cluster components included in Fedora 17 allow administrators to deploy Fedora in environments where greater availability and clustered file systems are required. Removal of rgmanager matches the capabilities offered by several other Linux distributions and allows the Fedora community to consolidate efforts on a single failover stack.

Scope

This change impacts several utilities and other applications, including:

  • The DLM, which used AIS Checkpoints to replicate data about POSIX locks (100% complete)
  • Clustered LVM, which used CMAN APIs, will need to be ported to corosync. (?% complete)
  • Fence-virt, which used CMAN APIs and AIS Checkpoints, will need to be ported to corosync. (0% complete)
  • Pacemaker, which used CMAN APIs, will need to be ported to corosync (100% complete)
  • Clustered mirroring, which uses AIS Checkpoints, will need to be ported to corosync (?% complete)
  • Qpid, which used CMAN APIs, will need to be ported to corosync (?% complete)

How To Test

The initial use cases required for high availability are documented in the Clusters from Scratch manual.

Information about how to run GFS2 in standalone mode will be provided as development nears completion.

User Experience

Some differences will be noticeable by users:

  • Users of the rgmanager package will need to learn the Pacemaker package, which is a more dynamic, complex, full-featured cluster resource manager.
  • Users who relied on cluster.conf (whether using ccs, luci, or hand-editing) will need to learn:
    • The crm and/or pcs commands and become familiar with the CIB, which is an XML database of the failover resource configuration combined with some run-time state information.
    • The layout, syntax, and use of corosync.conf
  • Users who relied on the clustat command will need to learn the following commands:
    • cibadmin
    • crm_mon
    • crm_resource
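
Because the CIB is an XML document, it can be inspected with ordinary XML tooling once it has been dumped (for example with `cibadmin --query`). The sketch below parses a heavily trimmed, hypothetical CIB fragment and lists its primitive resources; the resource names and the helper `list_primitives` are illustrative assumptions, and a real CIB contains many more sections and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB fragment for illustration only.
SAMPLE_CIB = """
<cib>
  <configuration>
    <resources>
      <primitive id="ClusterIP" class="ocf" provider="heartbeat" type="IPaddr2"/>
      <primitive id="WebSite" class="ocf" provider="heartbeat" type="apache"/>
    </resources>
  </configuration>
</cib>
"""


def list_primitives(cib_xml):
    """Return (resource id, 'class:provider:type') for each primitive."""
    root = ET.fromstring(cib_xml.strip())
    result = []
    for prim in root.iter("primitive"):
        # Some resource classes have no provider, so skip missing parts.
        parts = (prim.get("class"), prim.get("provider"), prim.get("type"))
        agent = ":".join(p for p in parts if p)
        result.append((prim.get("id"), agent))
    return result


for res_id, agent in list_primitives(SAMPLE_CIB):
    print(res_id, agent)
```

This only reads the configuration side of the CIB; run-time state, which the CIB also carries, is ignored in this sketch.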

Dependencies

  • corosync (2.0)
  • pacemaker
  • cluster-glue
  • dlm (in review currently)
  • libqb
  • pcs (in review currently)

All packagers and upstreams have been informed up front and worked together to achieve this goal.

Contingency Plan

None necessary

Documentation

Release Notes

  • The update to version 2.0 (needle) of the Corosync Cluster Engine offers:
    • API stability guarantees for the duration of Corosync 2.0's lifetime
    • High performance, cluster-wide messaging.
      • The performance of corosync 2.0 is nearly 10 times that of corosync 1.x for smaller (<64kb) message sizes
      • The CPU utilization of corosync 2.0 is approximately 1/5th of corosync 1.x
    • Improved quorum subsystem
    • The confdb API has been replaced with the cmap API for configuration management.
  • The update to version 1.1.7 of the Pacemaker Cluster Resource Manager offers:
    • Improved fencing subsystem which can be used in either GFS2 standalone environments or resource-driven clusters (which may or may not include GFS2)
    • Improved logging via the libqb framework

Comments and Discussion