From Fedora Project Wiki
No edit summary
No edit summary
Line 211: Line 211:


==== Why don’t you fix the /usr situation by putting all the relevant binaries in /bin /sbin /lib and /lib64? ====
==== Why don’t you fix the /usr situation by putting all the relevant binaries in /bin /sbin /lib and /lib64? ====
and
Then you would share the toplevel directory hierarchy among all hosts. Hosts would need to mount /etc and /var for host-only versions. Especially /etc/fstab is not accessible, without adding information to the initramfs on how to mount it. For every host-only additional top level directory like /opt and /srv, you would have to have a mountpoint.
 
==== So, why don’t you just mount /usr from the initramfs and leave the files where they are? ====
==== So, why don’t you just mount /usr from the initramfs and leave the files where they are? ====
Ok, so imagine you have a /usr mounted from a network location and you want to update a package. So maybe you mount the master copy of /usr on your master machine and update /usr with your package manager. Then you provide a new copy of the master /usr to the other machines, when they reboot. They all have the new updated /usr now. But what about /sbin /bin /lib and /lib64? They still have the old binaries. No glibc security update for them. So, every machine has to update these directories via rsync or such (rpm will not work with a readonly /usr). This doubles the maintenance to keep both parts of the system in sync.
Ok, so imagine you have a /usr mounted from a network location and you want to update a package. So maybe you mount the master copy of /usr on your master machine and update /usr with your package manager. Then you provide a new copy of the master /usr to the other machines, when they reboot. They all have the new updated /usr now. But what about /sbin /bin /lib and /lib64? They still have the old binaries. No glibc security update for them. So, every machine has to update these directories via rsync or such (rpm will not work with a readonly /usr). This doubles the maintenance to keep both parts of the system in sync.

Revision as of 20:28, 26 January 2012

Move all to /usr

Summary

Provide a simple way of mounting almost the entire installed operating system read-only, atomically snapshot it, or share it between multiple hosts to save maintenance and space. Instead of spreading RPM package content all over the place in the filesystem, and artificially separate /bin from /usr/bin and /lib from /usr/lib, move all content to /usr and provide only symlinks in the root filesystem.

/usr on its own filesystem provides a lot of valuable options in custom setups. For historic reasons, we split-off more and more tools from /usr and put them in /. But, advanced features in today's systems can not really bootup with an empty /usr anymore. More and more fails in subtle ways in such setups.

Instead of moving more tools to /, we today already require /usr to be mounted from inside the initramfs, to be available before the real 'init' starts. The split of the root filesystem an /usr serves no purpose in Linux anymore and only complicates or prevents simple and more flexible setups.

Owner

Current status

  • Targeted release: Fedora 17
  • Last updated: 2011-11-22
  • Percentage of completion: 50%

Detailed Description

There is no way to reliably bring up a modern system with an empty /usr, there are two alternatives to fix it: copy /usr back to the rootfs or use an initramfs which can hide the split-off from the system.

Historically /bin, /sbin, /lib had the purpose to contain the utilities to mount /usr. This role can now be taken by the initramfs. Because the initramfs knows, where to find the root partition (which includes /etc), it can parse /etc/fstab and other configuration files and mount /usr before it finally switches the root partition and executes /usr/bin/init. From this point on init mounts the remaining partitions in /etc/fstab and the system starts as usual.

The long-term plan is to clean up the mess and confusion the current split of / vs. /usr has created. All tools will move back to /usr where they belong, and the rootfs will only contain compat-symlinks into /usr. Almost the entire system installed by packages will reside in /usr. This will split all non-host specific data to /usr. /usr can then be seen as the Unix System Resources partition (/System), which defines the base operating system (e.g. F18 or RHEL-7).

This new /usr could be mounted read-only by default, while the rootfs is read-write and contains only empty mount points, compat-symlinks to /usr and the host-specific data like /etc, /root, /srv. Compared to today's setups, the rootfs will be very small. The new /usr could also easily be shared read-only across several systems, and it would contain almost the entire system. Such setups are more efficient, can optionally provide a lot more security, are more flexible, provide more sane options for custom setups, and are much simpler to setup and maintain.

This leaves us with the following well-defined directories, which compose the base of the system:

  • /usr - installed system; shareable; possibly read-only
  • /etc - config data; non-shareable
  • /var - persistent data; non-shareable;
  • /run - volatile data; non-shareable; mandatory tmpfs filesystem
/
|-- etc
|-- usr
|   |-- bin
|   |-- sbin
|   |-- lib
|   `-- lib64
|-- run
|-- var
|-- bin -> usr/bin
|-- sbin -> usr/sbin
|-- lib -> usr/lib
`-- lib64 -> usr/lib64

Benefit to Fedora

  • Simpler and cleaner overall file system layout, with full compatibility.
  • Clear separation of operating system and host specific resources.
  • Improve compatibility with other Unixes/linux, no confusion about tools install locations, no $PATH fiddling, all possible paths to a binary will always work. All binaries will be available on both /usr and / thus minimizing compatibility problems.
  • Improve compatibility with build systems such as GNU autotools who never have been aware of the /usr split in the first place
  • Minimize difference to other Unixes, such as Solaris, which already did the same move
  • Isolate the vendor-supplied mostly read-only operating system resources from the rest, thus allow snapshotting of the OS, and easy lightweight container OS duplication

Scope

  • The ability to share /usr is especially useful for clusters and virtual machines.
  • The ability to mount /usr read-only (e.g. on read-only media) can add to the security of the machine.
  • The entire /usr can safely be snapshotted during upgrades.

How To Test

  • update a Fedora package with files in /bin, /sbin, /lib or /lib64 via yum

-> see symbolic links in /bin, /sbin, /lib or /lib64 pointing to the file /usr/bin /usr/sbin /usr/lib or /usr/lib64

or

  • install a fresh F17

-> see symbolic toplevel links:

 /lib -> usr/lib
 /lib64 -> usr/lib64
 /sbin -> usr/sbin
 /bin -> usr/bin

Ensure that basic shell operations work.

# /sbin/ifconfig |/bin/grep -i ip

User Experience

  • less toplevel directories

Dependencies

  • initramfs (dracut)
  • changes in selinux policies
  • repackaging of packages with content in /bin, /sbin, /lib*
  • alternatives symlinks?
  • filesystem rpm, toplevel symlinks

Roadmap

  • Provide a shell script to move all content from /bin, /sbin, /lib, /lib64 to /usr.
  • Add check to filesystem.rpm version 3, that refuses to install itself when /bin, /sbin, /lib, /lib64 is a directory. On new installation: create symlinks /bin -> usr/bin, /sbin -> usr/sbin, /lib -> usr/lib, /lib64 -> usr/lib64
  • Change the ~260 RPM packages with files in /bin, /sbin, /lib, /lib64 to install into /usr, and add Conflicts: filesystem < 3.
  • Change the SELinux policies.
  • Make sure dracut is able to mount needed filesystems specifies in /etc/fstab before starting systemd.

Buildsystem Transition

  • Disable mock root cache in buildsystem
  • Step 1: Build without filesystem conflict
 acl
 attr
 db4
 findutils
 gawk
 gettext
 gzip
 nss-softokn
 policycoreutils
 libdb
  • Step 2: Build
 coreutils (without filesystem conflict)
 filesystem
 util-linux
 bash
  • Step 3: Build
 udev
 systemd
  • Step 4: Build
 alsa-utils
 blktool
 davfs2
 ethtool
 fuse
 iputils
 isdn4k-utils
 libselinux
 nano
 nspr
 ncpfs
 plymouth
 psacct
 rp-pppoe
 vim
  • Step 5: Build with filesystem conflict
 acl
 attr
 db4
 findutils
 gawk
 gettext
 gzip
 libdb
 nss-softokn
 policycoreutils
 coreutils
  • Turn root cache back in buildsystem mock

Contingency Plan

  • We do not support to bootup with an empty /usr today, so moving things to /usr and have compat links in the rootfs should be low risk.

Documentation

Release Notes

  • With this release, packages will not install files anymore in the following directories: /bin /sbin /lib /lib64 and /usr/sbin.
  • Fresh installations of this release, will have the following symbolic links in the toplevel directory:
 /bin -> usr/bin
 /sbin -> usr/sbin
 /lib -> usr/lib

and for 64bit architectures

 /lib64 -> usr/lib64

Comments and Discussion

FAQ

What problem are you trying to solve?

We want to make /usr shareable in a sane way.

Additional benefits of this feature are:

  • less clutter across the filesystem
  • if you snapshot /usr before updating, you have snapshotted the OS at once.

What is currently broken with having /usr as a separate partition?

http://www.freedesktop.org/wiki/Software/systemd/separate-usr-is-broken

I don’t have /usr as a separate partition. What changes for me?

Nothing changes in functionality. All the old paths are reachable, because there a compat symlinks in place, which will not go away (at least not in the near future). All your scripts and binaries should work, like they did before. For the upgrade process to work, you will find /sbin, /bin, /lib and /lib64 mostly containing symbolic links. As soon, as these directories only contain symbolic links, the whole directory is replaced by only one symbolic link. These three or four toplevel symbolic links will stay there as long as the linux elf loader ABI is defined with “/lib/ld-linux.so.2” or their architecture specific counterpart like “/lib64/ld-linux-x86-64.so.2”, and as long as scripts use “#!/bin/sh”.

I have /usr as a separate partition. What changes for me?

Not sure, how you managed to do that. In general, having /usr as a separate partition does not really work right now. See http://www.freedesktop.org/wiki/Software/systemd/separate-usr-is-broken. But with this feature implemented, things will now come back to a sane and supported way of having a /usr mount point.

Why don’t you fix the /usr situation by putting all the relevant binaries in /bin /sbin /lib and /lib64?

Then you would share the toplevel directory hierarchy among all hosts. Hosts would need to mount /etc and /var for host-only versions. Especially /etc/fstab is not accessible, without adding information to the initramfs on how to mount it. For every host-only additional top level directory like /opt and /srv, you would have to have a mountpoint.

So, why don’t you just mount /usr from the initramfs and leave the files where they are?

Ok, so imagine you have a /usr mounted from a network location and you want to update a package. So maybe you mount the master copy of /usr on your master machine and update /usr with your package manager. Then you provide a new copy of the master /usr to the other machines, when they reboot. They all have the new updated /usr now. But what about /sbin /bin /lib and /lib64? They still have the old binaries. No glibc security update for them. So, every machine has to update these directories via rsync or such (rpm will not work with a readonly /usr). This doubles the maintenance to keep both parts of the system in sync.

You are doing it wrong! /bin and /sbin are there to rescue a broken /usr!

The most critical filesystem is /boot, because the kernel lives there. So the purpose of having /bin and /sbin for /usr repairing relied on _two_ working filesystems ( / and /boot). If either of them was broken, you were not able to rescue /usr. The role of the rescue system can easily be fulfilled by a rescue initramfs. So having the rescue initramfs in /boot, which contains the fsck utils, is in the same danger of becoming corrupted as the kernel. Now you only have to pull out your rescue CD, if /boot is corrupted and not if / is corrupted.

Then, let’s share /bin /sbin /lib /lib64 and /usr and mount them all from the initramfs!

Now, you get a feeling, that moving everything to /usr might make things easier....

Why don’t you move all /usr contents to / and forget about /usr?

Because this introduces a lot of new toplevel directories, which all have to be mount points then to be shared across other hosts.

Ok, but what about a root filesystem on the network and mounting local filesystems only?

Then you would share the toplevel directory hierarchy among all hosts. Hosts would need to mount /etc and /var for host-only versions. Especially /etc/fstab is not accessible, without adding information to the initramfs on how to mount it. For every host-only additional top level directory like /opt and /srv, you would have to have a mountpoint.