From Fedora Project Wiki

Open Sharedroot

Summary

The Open Sharedroot project makes it possible to boot multiple Linux systems from the same root filesystem, providing a filesystem-based single system image cluster.

Owner

Current status

  • Targeted release: 12
  • Last updated: 2009-07-27
  • Percentage of completion: 100%

Detailed Description

The purpose of this feature is to provide an open and flexible filesystem-based single system image (SSI) Linux cluster.

Currently the following filesystems can be used with Fedora 11:

  • NFSv3, NFSv4
  • GFS2
  • Ocfs2
  • Ext3 as local filesystem

It consists of three software components and some small changes to the init process.

1. The initrd (comoonics-bootimage) that boots such a system. Booting from a cluster filesystem or NFS in a sharedroot configuration is more complex than a normal boot, so a new initrd concept is needed.

2. The cluster tools, which provide access to cluster functionality such as querying the cluster for the number of nodes and for its configuration. They are organized under the software component comoonics-clustersuite.

3. The management tools for building a cdsl (context dependent symbolic link) structure and for managing cdsl files. They are organized under the software component comoonics-cdsls. The cdsl concept is based on bind mounts.
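
The path layout behind a cdsl can be sketched in a few lines of shell. The sketch assumes the layout used in the test steps on this page: per-node trees under /cluster/cdsl/<nodeid>, with the tree of the booting node bind-mounted on /cdsl.local. The helper names cdsl_node_path and cdsl_local_path are hypothetical and are not part of comoonics-cdsls.

```shell
# Sketch only: maps a host-dependent path to its locations on the shared root.
# Assumed layout (taken from the test steps on this page):
#   /cluster/cdsl/<nodeid>/   per-node tree stored on the shared root
#   /cdsl.local/              bind mount of the booting node's own tree

# $1 = node id, $2 = absolute path such as /var
cdsl_node_path() {
    printf '/cluster/cdsl/%s%s\n' "$1" "$2"
}

# $1 = absolute path; the same file as seen through the bind mount
cdsl_local_path() {
    printf '/cdsl.local%s\n' "$1"
}
```

For example, `cdsl_node_path 2 /var` prints /cluster/cdsl/2/var, the copy of /var that belongs to node 2.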

The changes to the init process have already been filed in Bugzilla:

Benefit to Fedora

Multiple nodes can boot from the same root filesystem, which enables Fedora to act as a filesystem-based single system image cluster.

Scope

Apart from the small changes that have to be accepted for the init process, everything else already works on FC11, RHEL5 and RHEL4, so only the migration to FC12 remains.

How To Test

Test environment

We assume a preinstalled FC11 KVM machine (called installnode), installed as needed.

We assume that the cluster is set up as a two-node cluster on a libvirt/KVM based machine. Libvirt is installed with its standard configuration: the NATed network is called *default* and uses the network 192.168.122.0/24 (the libvirt default).

An NFS share, /mnt/virtual/nfsosr/fc11, is exported on the KVM host machine as follows:

 /mnt/virtual/nfsosr/fc11 192.168.122.0/255.255.255.0(rw,fsid=0,no_subtree_check,sync,no_root_squash)
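
Before going further it can help to confirm that the export is present with no_root_squash, since root on the nodes needs unrestricted access to its own root filesystem. The following is a minimal sketch; has_root_export is a hypothetical helper that only reads an exports file and does not talk to the NFS server.

```shell
# Sketch only: succeeds if the given directory is listed in the exports
# file with the no_root_squash option.
# $1 = exports file (for example /etc/exports), $2 = exported directory
has_root_export() {
    awk -v dir="$2" '
        $1 == dir && /no_root_squash/ { found = 1 }
        END { exit !found }
    ' "$1"
}
```

On the KVM host this could be called as `has_root_export /etc/exports /mnt/virtual/nfsosr/fc11`. After editing /etc/exports, `exportfs -ra` re-exports all entries.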

The libvirt configuration files for the two VMs can be found at (node1, node2):

Prerequisites

  • None except a libvirt configuration when using a virtualized libvirt based cluster.
  • A running DHCP/TFTP/PXE infrastructure for autobooting the cluster.

Install OSR packages

First install dracut, version 0.7 or later (see dracut).

Install the following rpms:

Create a cluster configuration file

Create a cluster configuration file /etc/cluster/cluster.conf with the com_info tags.

Note that the following cluster configuration still needs a valid fencing configuration for the cluster to work properly:

   <?xml version="1.0"?>
   <cluster config_version="1" name="axqad109" type="gfs">
     <clusternodes>
       <clusternode name="node1" nodeid="1" votes="1">
         <com_info>
           <rootvolume fstype="nfs" name="192.168.122.1:/mnt/virtual/nfsosr/fc11"/>
           <eth name="eth0" ip="192.168.122.171" mac="00:0C:29:C9:C6:F5" mask="255.255.255.0" gateway="192.168.122.1"/>
         </com_info>
       </clusternode>
       <clusternode name="node2" nodeid="2" votes="2">
         <com_info>
           <rootvolume fstype="nfs" name="192.168.122.1:/mnt/virtual/nfsosr/fc11"/>
           <!-- Note: this MAC address is identical to node1's. Replace it with the MAC address of node2's own eth0. -->
           <eth name="eth0" ip="192.168.122.172" mac="00:0C:29:C9:C6:F5" mask="255.255.255.0" gateway="192.168.122.1"/>
         </com_info>
       </clusternode>
     </clusternodes>
   </cluster>
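
Because every node needs the MAC address of its own interface in its eth element, a quick check for copy-and-paste duplicates is worthwhile. The following is a minimal sketch; duplicate_macs is a hypothetical helper that matches the mac attribute textually and does not parse the XML.

```shell
# Sketch only: prints every mac="..." attribute that occurs more than once
# in the given cluster configuration file. No output means no duplicates.
# $1 = path to cluster.conf
duplicate_macs() {
    grep -o 'mac="[^"]*"' "$1" | sort | uniq -d
}
```

Applied to the example configuration as listed, `duplicate_macs /etc/cluster/cluster.conf` would report the address shared by node1 and node2. Well-formedness of the file can be checked separately with `xmllint --noout /etc/cluster/cluster.conf`.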

Create the shared root filesystem

The shared root filesystem must be a mountable NFS export that is shared by all nodes.

On installnode, mount the NFS export '/mnt/virtual/nfsosr/fc11' on '/mnt/newroot':

 mkdir -p /mnt/newroot
 mount -t nfs 192.168.122.1:/mnt/virtual/nfsosr/fc11 /mnt/newroot/

Copy all data from the locally installed FC11 root filesystem to the shared root filesystem:

 # -x keeps cp on the root filesystem, so the kernel files are copied explicitly
 cp -ax / /mnt/newroot/
 cp /boot/*$(uname -r)* /mnt/newroot/boot/


Create the mount point directories if they do not exist yet:

 mkdir -p /mnt/newroot/proc
 mkdir -p /mnt/newroot/sys

Create a new cdsl infrastructure on the shared root filesystem:

 com-mkcdslinfrastructure -r /mnt/newroot

Bind-mount the cdsl tree of node 1 as the local cdsl infrastructure:

 mount --bind /mnt/newroot/cluster/cdsl/1/ /mnt/newroot/cdsl.local/

Bind-mount the other filesystems needed for the chroot, then enter it:

 mount --bind /proc /mnt/newroot/proc
 mount --bind /sys /mnt/newroot/sys
 mount --bind /dev /mnt/newroot/dev
 chroot /mnt/newroot

Make '/var' host-dependent:

 com-mkcdsl -a /var

Make '/var/lib' shared again:

 com-mkcdsl -s /var/lib

Make '/etc/sysconfig/network' host-dependent:

 com-mkcdsl -a /etc/sysconfig/network

Edit the host-dependent network files and change the hostnames:

 vi /cluster/cdsl/?/etc/sysconfig/network
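
The same change can be scripted instead of being made by hand in vi. The following is a minimal sketch, assuming each per-node file uses the usual HOSTNAME= key of /etc/sysconfig/network and that GNU sed is available; set_hostname is a hypothetical helper.

```shell
# Sketch only: sets HOSTNAME= in one per-node network file, replacing an
# existing entry or appending one if none is present.
# $1 = path to a network file, $2 = new hostname
set_hostname() {
    if grep -q '^HOSTNAME=' "$1"; then
        sed -i "s/^HOSTNAME=.*/HOSTNAME=$2/" "$1"
    else
        printf 'HOSTNAME=%s\n' "$2" >> "$1"
    fi
}
```

Inside the chroot this could be called once per node, for example `set_hostname /cluster/cdsl/1/etc/sysconfig/network node1` and `set_hostname /cluster/cdsl/2/etc/sysconfig/network node2`, matching the node ids and names in cluster.conf.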

Create '/etc/mtab' as a link to '/proc/mounts':

 cd /etc/
 rm -f mtab
 ln -s /proc/mounts mtab

Remove the cluster network configuration:

 rm -f /etc/sysconfig/network-scripts/ifcfg-eth0

Modify '/etc/fstab' inside the chroot ('/mnt/newroot/etc/fstab' as seen from installnode) as follows:

 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 tmpfs                   /dev/shm                tmpfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
 sysfs                   /sys                    sysfs   defaults        0 0

Disable SELinux:

 [root@install-node3 comoonics]# cat /etc/sysconfig/selinux 
 # This file controls the state of SELinux on the system.
 # SELINUX= can take one of these three values:
 #       enforcing - SELinux security policy is enforced.
 #       permissive - SELinux prints warnings instead of enforcing.
 #       disabled - No SELinux policy is loaded.
 SELINUX=disabled
 # SELINUXTYPE= can take one of these two values:
 #       targeted - Targeted processes are protected,
 #       mls - Multi Level Security protection.
 SELINUXTYPE=targeted
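
The setting shown above can also be applied non-interactively. The following is a minimal sketch, assuming GNU sed; disable_selinux is a hypothetical helper that rewrites only the SELINUX= line and leaves comments and SELINUXTYPE= untouched.

```shell
# Sketch only: sets SELINUX=disabled in the given SELinux config file.
# The pattern is anchored, so "SELINUXTYPE=" and comment lines do not match.
# $1 = SELinux config file
disable_selinux() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}
```

On many Fedora installations /etc/sysconfig/selinux is a symbolic link to /etc/selinux/config. If that is the case, pass the link target to the function, because `sed -i` would otherwise replace the link with a regular file.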

Disable NetworkManager:

 chkconfig NetworkManager off

Create boot configuration

Create a boot configuration based on PXE or any other boot method.
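
For PXE, a boot entry could look roughly like the output of the sketch below. It assumes PXELINUX and the kernel and initrd file names used elsewhere on this page (vmlinuz-<version> and initrd_sr-<version>.img) copied into the TFTP directory. The label osr and the helper pxe_stanza are hypothetical, and any additional kernel parameters the OSR initrd may need are not covered on this page and are left out.

```shell
# Sketch only: prints a minimal PXELINUX entry for the shared root kernel.
# $1 = kernel version, e.g. the output of `uname -r` on installnode
pxe_stanza() {
    printf 'DEFAULT osr\n'
    printf 'LABEL osr\n'
    printf '  KERNEL vmlinuz-%s\n' "$1"
    printf '  APPEND initrd=initrd_sr-%s.img\n' "$1"
}
```

The output would typically be written to a file below the pxelinux.cfg directory of the TFTP server, for example with `pxe_stanza "$(uname -r)" > pxelinux.cfg/default`.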

Create Shared Root initrd

On installnode, create the shared root initrd in the shared boot filesystem:

 dracut -f -a "osr osr-cluster" /boot/initrd_sr-$(uname -r).img $(uname -r)

Clean up

On installnode, exit the chroot and unmount the filesystems:

 umount /mnt/newroot/cdsl.local
 umount /mnt/newroot/dev
 umount /mnt/newroot/proc
 umount /mnt/newroot/sys
 umount /mnt/newroot
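
If some of these mounts were never made, the plain umount calls print errors. The following is a minimal sketch of a more forgiving cleanup that reads the mount table first; is_mounted and cleanup_newroot are hypothetical helpers, and the mount points are the ones used on this page.

```shell
# Sketch only: succeeds if the given directory is listed as a mount point.
# $1 = mount point without trailing slash
# $2 = mount table to read (defaults to /proc/mounts)
is_mounted() {
    awk -v mp="$1" '
        $2 == mp { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Unmounts the chroot helper mounts in the order used above, skipping
# anything that is not currently mounted.
cleanup_newroot() {
    for mp in /mnt/newroot/cdsl.local /mnt/newroot/dev /mnt/newroot/proc \
              /mnt/newroot/sys /mnt/newroot; do
        if is_mounted "$mp"; then
            umount "$mp"
        fi
    done
}
```

Run `cleanup_newroot` as root on installnode after leaving the chroot.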

Boot the nodes

On the host, boot the nodes:

 virsh create <path>/node1.xml
 virsh create <path>/node2.xml
   

You can now use the two nodes as if they were one.

Have fun!


User Experience

This project has been under development for 8 years. We know of several hundred RHEL4/5 and FC11 clusters that have been running in production for years. The concept is also supported by Red Hat on RHEL.

Dependencies

See the Bugzilla entries above. Basically, changes are needed in initscripts and SysVinit (some are already integrated).

Contingency Plan

None necessary.

Documentation

Red Hat Magazine: http://www.redhat.com/magazine/021jul06/features/gfs_update/

Red Hat/ATIX/SAP Whitepaper: http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf

Release Notes

  • Fedora now provides the ability to create filesystem-based single system image clusters. A server with a shareable root filesystem (only NFSv3/NFSv4 so far) can share that root filesystem with multiple other nodes. Host-dependent files and directories can also be managed (see Features/Opensharedroot).

Comments and Discussion