HA Cluster Infrastructure
The HA Cluster Infrastructure in F11 includes major changes and new features, including hundreds of bug fixes and a major boost in performance.
New features from upstream
- The Corosync Cluster Engine
- Provides a plug-in based cluster engine using the virtual synchrony communications model.
- Well-considered plug-in model and plug-in API
- Ultra high performance messaging, up to 300k messages/sec to a group of 32 nodes for service engine developers
- Provides most services for service engine developers
- Standard on many Linux distributions for portable application development
- Works with mixed 32/64-bit user applications, with 32/64-bit big- and little-endian support
- Full IPv4 or IPv6 support
- Default plug-in service engines and C APIs:
- Closed Process Group Communication C API for cluster communications
- Extended Virtual Synchrony passthrough C API for cluster communications at a lower level
- Runtime Configuration Database C API for cluster configuration
- Configuration C API for runtime cluster operations
- Quorum engine C API for providing information related to quorum
- Reusable C Libraries or headers tuned for high performance and quality
- Totem Single Ring and Redundant Ring Multicast Protocol library
- Shared memory IPC library with sync and async communications models usable by other projects
- logsys flight recorder library, which allows logging and tracing of complex applications and records state in core files or on user command
- 64 bit handle to data block mapping with handle verification header
- The openais Standards Based Cluster Framework which provides an implementation of the Service Availability Forum Application Interface Specification to provide high availability through application clustering:
- Packaging and design changes
- All core features from openais related to clustering merged into the Corosync Cluster Engine
- openais modified to work as plugins to the Corosync Cluster Engine
- Provides implementation of various Service Availability Forum AIS Specifications as corosync service engines and C APIs:
- Cluster Membership Service B.01.01
- Checkpoint Service B.01.01
- Event Service B.01.01
- Message Service B.01.01
- Distributed Lock Service B.01.01
- Timer Service A.01.01
- Experimental Availability Management Framework B.01.01
- cluster is now based on both corosync and openais and offers:
- pluggable configuration mechanism:
- XML (default)
- Configuration schema updated and moved from Conga to cluster
- LDAP
- corosync/openais file format
- Cluster manager (cman):
- Now runs as part of corosync
- Provides quorum to all corosync subsystems
- Enhanced configuration-free running
- Better handling of configuration updates
- Quorum disk (optional) now supports mixed-endian clusters
- fence / fence agents:
- Improved daemon logging options
- New operation 'list' that prints aliases with port numbers
- Support for new devices and firmware: LPAR HMC v3, Cisco MDS, interfaces MIB (ifmib)
- Fence agents produce resource-agent style metadata
- Support for 'unfence' operation on boot
- rgmanager:
- Better handling of configuration updates
- Uses same logging configuration as the rest of the cluster stack
- clvmd:
- Run-time switchable between cman or corosync/dlm cluster interfaces
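The quorum items in the list above (the quorum engine C API and cman providing quorum to all corosync subsystems) can be illustrated with a minimal sketch in C. It assumes a simple-majority rule, where a partition is quorate only when it holds more than half of the expected votes. The function names are hypothetical and are not part of the cman or corosync quorum API.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not a cman/corosync call): number of votes a
 * partition must hold to be quorate under a simple-majority rule.
 */
static unsigned int quorum_threshold(unsigned int expected_votes)
{
    return expected_votes / 2 + 1;
}

/*
 * Hypothetical helper: true when the votes currently present in this
 * partition reach the majority threshold.
 */
static bool is_quorate(unsigned int votes_present, unsigned int expected_votes)
{
    return votes_present >= quorum_threshold(expected_votes);
}

/* Small self-check showing how the two helpers behave. */
static void quorum_self_check(void)
{
    /* Five one-vote nodes: three votes are needed. */
    assert(quorum_threshold(5) == 3);
    assert(!is_quorate(2, 5));
    assert(is_quorate(3, 5));

    /* Four one-vote nodes split evenly: neither half is quorate. */
    assert(!is_quorate(2, 4));
}
```

Requiring a strict majority means that at most one partition of a split cluster can be quorate at any time, which is why an even two-way split leaves both halves without quorum.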
Packaging changes
A lot of effort has gone into cleaning up the packages and making them as complete, intuitive, and modular as possible, so that external projects can reuse most of the infrastructure without having to pull in the whole stack.
With the new package reorganization, users will find it easier to update their clusters. The new fence-agents and resource-agents packages spare users from restarting cluster nodes for simple script updates.