From Fedora Project Wiki
(New page: <!-- All fields on this form are required to be accepted by FESCo. We also request that you maintain the same order of sections so that all of the feature pages are uniform. --> <!-- Th...)
 
(added selinux advisory note)
 
(20 intermediate revisions by 5 users not shown)
Line 4: Line 4:
<!-- The actual name of your feature page should look something like: Features/YourFeatureName.  This keeps all features in the same namespace -->
<!-- The actual name of your feature page should look something like: Features/YourFeatureName.  This keeps all features in the same namespace -->


= Feature Name =
= GFS2 =
A stable version of the GFS2 cluster filesystem


== Summary ==
== Summary ==
<!-- A sentence or two summarizing what this feature is and what it will do.  This information is used for the overall feature summary page for each release. -->
<!-- A sentence or two summarizing what this feature is and what it will do.  This information is used for the overall feature summary page for each release. -->


A cluster filesystem allowing simultaneous access to shared storage from multiple nodes, designed for SAN environments. It is also possible to use GFS2 as a single node (local) filesystem by selecting the lock_nolock lock "module".
A cluster filesystem allowing simultaneous access to shared storage from multiple nodes, designed for SAN environments. It is also possible to use GFS2 as a single node (local) filesystem by selecting the "lock_nolock" locking protocol.


== Owner ==
== Owner ==
Line 21: Line 20:


== Current status ==
== Current status ==
* Targeted release: [[Releases/{{FedoraVersion||next}} | {{FedoraVersion|long|next}} ]]  
* Targeted release: [[Releases/{{FedoraVersion||next}} | {{FedoraVersion|long|next}} ]]
* Last updated: (04/02/2009)
* Last updated: (30 Mar 2009)
* Percentage of completion: 90%
* Percentage of completion: 100%


<!-- CHANGE THE "FedoraVersion" TEMPLATES ABOVE TO PLAIN NUMBERS WHEN YOU COMPLETE YOUR PAGE. -->
The previously pending patches have now gone upstream for 2.6.30. We thus have all the most important components of this feature in place.


== Detailed Description ==
== Detailed Description ==
Line 70: Line 69:
== User Experience ==
== User Experience ==
<!-- If this feature is noticeable by its target audience, how will their experiences change as a result?  Describe what they will see or notice. -->
<!-- If this feature is noticeable by its target audience, how will their experiences change as a result?  Describe what they will see or notice. -->
The GFS2 filesystem allows sharing of a filesystem across multiple nodes in an HA environment.


== Dependencies ==
== Dependencies ==
<!-- What other packages (RPMs) depend on this package?  Are there changes outside the developers' control on which completion of this feature depends?  In other words, completion of another feature owned by someone else and might cause you to not be able to finish on time or that you would need to coordinate?  Other upstream projects like the kernel (if this is not a kernel feature)? -->
<!-- What other packages (RPMs) depend on this package?  Are there changes outside the developers' control on which completion of this feature depends?  In other words, completion of another feature owned by someone else and might cause you to not be able to finish on time or that you would need to coordinate?  Other upstream projects like the kernel (if this is not a kernel feature)? -->


This feature depends on the cman package, which is already part of Fedora.
This feature depends on the cman package, the corosync package and the dlm kernel module, which are already part of Fedora.


== Contingency Plan ==
== Contingency Plan ==
<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "None necessary, revert to previous release behaviour."  Or it might not.  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "None necessary, revert to previous release behaviour."  Or it might not.  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->


If this is not ready in time, we can simply push back the date at which we consider GFS2 stable. There are no other packages at the moment which depend on this feature.
If this is not ready in time, we can simply push back the date at which we consider GFS2 stable. There are no other packages at the moment which depend on this feature. Bearing in mind that this is almost complete, it is fairly unlikely that we will have to do this.
 


== Documentation ==
== Documentation ==
<!-- Is there upstream documentation on this feature, or notes you have written yourself?  Link to that material here so other interested developers can get involved. -->
<!-- Is there upstream documentation on this feature, or notes you have written yourself?  Link to that material here so other interested developers can get involved. -->
* [http://en.wikipedia.org/wiki/Red_Hat_Cluster_Suite Cluster Suite] on Wikipedia
* [http://en.wikipedia.org/wiki/Global_File_System GFS/GFS2] on Wikipedia
* [http://sources.redhat.com/cluster/wiki/ Cluster Wiki] page
* [http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/gfs2.txt;h=593004b6bbabaeee282b1041835f580cf12baa2e;hb=HEAD GFS2 kernel documentation] (a very basic introduction)
* [http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/gfs2-glocks.txt;h=4dae9a3840bf9953ed5f12be7aae10625d6a5b99;hb=HEAD GFS2 Glock] documentation (if things go wrong, this explains what you need to know in order to find out why)


== Release Notes ==
== Release Notes ==
Line 88: Line 96:
<!-- The Fedora Release Notes inform end-users about what is new in the release.  Examples of past release notes are here: http://docs.fedoraproject.org/release-notes/ -->
<!-- The Fedora Release Notes inform end-users about what is new in the release.  Examples of past release notes are here: http://docs.fedoraproject.org/release-notes/ -->
<!-- The release notes also help users know how to deal with platform changes such as ABIs/APIs, configuration or data file formats, or upgrade concerns.  If there are any such changes involved in this feature, indicate them here.  You can also link to upstream documentation if it satisfies this need.  This information forms the basis of the release notes edited by the documentation team and shipped with the release. -->
<!-- The release notes also help users know how to deal with platform changes such as ABIs/APIs, configuration or data file formats, or upgrade concerns.  If there are any such changes involved in this feature, indicate them here.  You can also link to upstream documentation if it satisfies this need.  This information forms the basis of the release notes edited by the documentation team and shipped with the release. -->
{{ admon/warning | SELinux | As of April 2011, using GFS2 together with SELinux is not advised, since the combination is not fully supported (see [https://bugzilla.redhat.com/show_bug.cgi?id=698205 Bug #698205]). }}
There are a few local file system operations that are not supported, or that are slightly different on GFS2. Here are the main things to watch out for:
* The flock() system call is not interruptible ([https://bugzilla.redhat.com/show_bug.cgi?id=421321 Bug #421321]); this may be fixed before release.
* The fcntl() F_GETLK call returns a pid which may or may not be on the current node. The current interface has no way to indicate the node on which the process exists, so beware if your application uses F_GETLK to obtain a pid to send signals to.
* Leases are not supported with lock_dlm, but they are supported with lock_nolock.
* Locking is based upon a single lock per inode. Applications which either write to a single file from multiple nodes or which insert/remove lots of files from a single directory will be slow. This is the single most frequently asked question regarding GFS/GFS2 performance and often occurs in relation to email/imap spool directories. The answer in each case is to break up the single large spool into separate directories, and to try to keep each set of files "local" to one node, as far as possible. Likewise, don't try to mmap() a file and use it as distributed shared memory: it will work, but it will be so slow that it makes no sense to do so.
* If you've used previous releases of GFS/GFS2 you might be wondering where the "lock modules" have got to. The answer is that they have been merged into the main GFS2 module, so you no longer need to load them separately.  The mount options have remained the same though. (N.B. The final part of this is still in the -nmw git tree, but it will be merged in the next kernel.org merge window).
* fallocate is not supported, but is on the TODO list ([https://bugzilla.redhat.com/show_bug.cgi?id=455572 Bug #455572]).
* XIP is not supported, but is also on the TODO list ([https://bugzilla.redhat.com/show_bug.cgi?id=455570 Bug #455570]).
* FIEMAP is supported, but currently only for regular files and not for xattrs (again the xattr extension is on the TODO list).
* The internal glock state of GFS2 is accessible via debugfs.
* dnotify will work on a "same node" basis, but its use with GFS2 is not recommended.
* inotify will work on a "same node" basis, but we don't currently recommend its use.


== Comments and Discussion ==
== Comments and Discussion ==
Line 96: Line 120:
----
----


[[Category:FeaturePageIncomplete]]
[[Category:FeatureAcceptedF11]]
 
<!-- When your feature page is completed and ready for review -->
<!-- When your feature page is completed and ready for review -->
<!-- remove Category:FeaturePageIncomplete and change it to Category:FeatureReadyForWrangler -->
<!-- remove Category:FeaturePageIncomplete and change it to Category:FeatureReadyForWrangler -->
<!-- After review, the feature wrangler will move your page to Category:FeatureReadyForFesco... if it still needs more work it will move back to Category:FeaturePageIncomplete-->
<!-- After review, the feature wrangler will move your page to Category:FeatureReadyForFesco... if it still needs more work it will move back to Category:FeaturePageIncomplete-->
<!-- A pretty picture of the page category usage is at: https://fedoraproject.org/wiki/Features/Policy/Process -->
<!-- A pretty picture of the page category usage is at: https://fedoraproject.org/wiki/Features/Policy/Process -->

Latest revision as of 21:25, 27 April 2011


GFS2

Summary

A cluster filesystem allowing simultaneous access to shared storage from multiple nodes, designed for SAN environments. It is also possible to use GFS2 as a single node (local) filesystem by selecting the "lock_nolock" locking protocol.

Owner

  • email: <swhiteho@redhat.com>
  • mailing list: <cluster-devel@redhat.com>

Current status

  • Targeted release: Fedora 11
  • Last updated: (30 Mar 2009)
  • Percentage of completion: 100%

The previously pending patches have now gone upstream for 2.6.30. We thus have all the most important components of this feature in place.

Detailed Description

GFS2 is part of the upstream kernel, but is still listed as experimental. The plan is that this will become stable before the release of F-11. Also the gfs2-utils package is part of Fedora already, and again we hope to declare this stable before F-11.

Benefit to Fedora

The main benefit is a stable cluster filesystem which works seamlessly with the Red Hat cluster infrastructure.

Scope

Most of the remaining work now is testing and bug fixing.

How To Test

Read the docs, create a filesystem, run an application on it, check to see whether there are any problems/bugs and if so report them via the usual bugzilla process.

We will also be running the Red Hat QE tests, some performance tests and basically anything else that we can get our hands on, in order to cover as many cases as possible. Any filesystem test suite would be a good thing to test with, whether for performance or correctness. We also want to see lots of testing with real applications: Apache, Samba, NFS (over GFS2), exim, sendmail, yourfavouriteapplicationhere, etc. Basically anything that uses the filesystem.

You don't need any special hardware to do single node tests - you can create a filesystem in a single file and mount it loopback. For multiple node tests you will need some shared storage (iSCSI, FC, or some other kind of SAN) plus a method of fencing failed nodes (this can be done manually if you don't have any fencing hardware, but power switches and/or remote access controllers are recommended).
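
As a rough illustration of the single node case, the sketch below only builds the command lines for a loopback test and prints them; it does not run anything. The function name, the image path, the mount point and the 512 MB size are assumptions made for this example, and the commands themselves need root privileges and the gfs2-utils package. Note that mkfs.gfs2 may ask for confirmation before writing to the file.

```python
from typing import List


def loopback_test_commands(image: str, mountpoint: str, size_mb: int = 512) -> List[List[str]]:
    """Return the commands for a single-node GFS2 test on a loopback file.

    Nothing is executed here; the caller decides whether to run them.
    All paths and the default size are illustrative assumptions.
    """
    return [
        # Create a sparse backing file of the requested size.
        ["truncate", "-s", f"{size_mb}M", image],
        # lock_nolock selects single-node locking; one journal is enough for one node.
        ["mkfs.gfs2", "-p", "lock_nolock", "-j", "1", image],
        ["mkdir", "-p", mountpoint],
        # The loop option asks mount to attach the file to a loop device first.
        ["mount", "-t", "gfs2", "-o", "loop", image, mountpoint],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed and run by hand as root.
    for argv in loopback_test_commands("/tmp/gfs2-test.img", "/mnt/gfs2-test"):
        print(" ".join(argv))
```

Printing rather than executing keeps the sketch safe to try on any machine; the printed commands can then be checked against the gfs2-utils documentation before being run.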

If everything is working correctly, the results should be exactly the same as you'd expect running the application on a local filesystem. One point to watch though is that many applications are not written to run in a clustered environment, so if you are expecting multiple copies of an application to share the same set of data files, then please check that the application does support this mode of operation first. Usually it will require some method for inter-node communication at the application level.


User Experience

The GFS2 filesystem allows sharing of a filesystem across multiple nodes in an HA environment.

Dependencies

This feature depends on the cman package, the corosync package and the dlm kernel module, which are already part of Fedora.

Contingency Plan

If this is not ready in time, we can simply push back the date at which we consider GFS2 stable. There are no other packages at the moment which depend on this feature. Bearing in mind that this is almost complete, it is fairly unlikely that we will have to do this.


Documentation

Release Notes

SELinux
As of April 2011, using GFS2 together with SELinux is not advised, since the combination is not fully supported (see Bug #698205).

There are a few local file system operations that are not supported, or that are slightly different on GFS2. Here are the main things to watch out for:

  • The flock() system call is not interruptible (Bug #421321); this may be fixed before release.
  • The fcntl() F_GETLK call returns a pid which may or may not be on the current node. The current interface has no way to indicate the node on which the process exists, so beware if your application uses F_GETLK to obtain a pid to send signals to.
  • Leases are not supported with lock_dlm, but they are supported with lock_nolock.
  • Locking is based upon a single lock per inode. Applications which either write to a single file from multiple nodes or which insert/remove lots of files from a single directory will be slow. This is the single most frequently asked question regarding GFS/GFS2 performance and often occurs in relation to email/imap spool directories. The answer in each case is to break up the single large spool into separate directories, and to try to keep each set of files "local" to one node, as far as possible. Likewise, don't try to mmap() a file and use it as distributed shared memory: it will work, but it will be so slow that it makes no sense to do so.
  • If you've used previous releases of GFS/GFS2 you might be wondering where the "lock modules" have got to. The answer is that they have been merged into the main GFS2 module, so you no longer need to load them separately. The mount options have remained the same though. (N.B. The final part of this is still in the -nmw git tree, but it will be merged in the next kernel.org merge window).
  • fallocate is not supported, but is on the TODO list (Bug #455572).
  • XIP is not supported, but is also on the TODO list (Bug #455570).
  • FIEMAP is supported, but currently only for regular files and not for xattrs (again the xattr extension is on the TODO list).
  • The internal glock state of GFS2 is accessible via debugfs.
  • dnotify will work on a "same node" basis, but its use with GFS2 is not recommended.
  • inotify will work on a "same node" basis, but we don't currently recommend its use.
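
The advice above about splitting up a single large spool can be sketched as follows. This is one possible layout under stated assumptions, not something GFS2 provides: the function names, the bucket count of 64 and the node names are all hypothetical. Each mailbox is hashed into one of a fixed number of subdirectories, and each subdirectory is assigned to one node, so that inserts and removals in a given directory tend to come from a single node.

```python
import hashlib
import os
from typing import List


def bucket_for(mailbox: str, buckets: int = 64) -> int:
    """Hash a mailbox name to a stable bucket number in range(buckets)."""
    digest = hashlib.sha1(mailbox.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % buckets


def spool_dir(spool_root: str, mailbox: str, buckets: int = 64) -> str:
    """Return the per-bucket directory path for a mailbox under spool_root."""
    return os.path.join(spool_root, f"{bucket_for(mailbox, buckets):02x}", mailbox)


def node_for_bucket(bucket: int, nodes: List[str]) -> str:
    """Assign a whole bucket directory to one node, keeping its files 'local'."""
    return nodes[bucket % len(nodes)]


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]  # hypothetical cluster members
    for user in ["alice", "bob", "carol"]:
        bucket = bucket_for(user)
        print(user, spool_dir("/srv/spool", user), node_for_bucket(bucket, nodes))
```

Assigning ownership per bucket rather than per mailbox means that each directory inode, and so its single lock, is normally touched by only one node. How deliveries are actually routed to the owning node is left to the application.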

Comments and Discussion