Revision as of 23:46, 18 December 2020
DNF/RPM Copy on Write enablement for all variants
Summary
RPM Copy on Write provides a better experience for Fedora Users as it reduces the amount of I/O and offsets CPU cost of package decompression. RPM Copy on Write uses reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
Owners
- Name: Matthew Almond, Davide Cavalca
- Email: malmond@fb.com, dcavalca@fb.com
Current status
- Changes to rpm: in a fork of the main repo. No PR yet.
- Changes to librepo: in a fork of the main repo. No PR yet.
- New package dnf-plugin-cow:
  - Code is written
  - Github repo needs to be published
  - Plugin needs to be packaged in Fedora
- Targeted release: Fedora 34
- Last updated: tbd
- Tracker bug: tbd
- Release notes tracker: tbd
Detailed description
Installing and upgrading software packages is a standard part of managing the lifecycle of any operating system. For the entire lifecycle of Fedora, all software has been packaged and distributed using the RPM file format. This proposal changes how software is downloaded and installed, leaving the distribution process unmodified.
Current process
- Resolve packaging request into a list of packages and operations
- Download and verify new packages
- Install and/or upgrade packages sequentially using RPM files, decompressing, and writing a copy of the new files to storage.
New process
- Resolve packaging request into a list of packages and operations
- Download and decompress packages into a locally optimized rpm file
- Install and/or upgrade packages sequentially using RPM files, using reference linking (reflinking) to reuse data already on disk.
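The second step of the new process (decompressing while the download is still in flight) can be illustrated with a minimal sketch. This is not the actual implementation: the function name `transcode_stream` is hypothetical, and `zlib` is only a stand-in for rpm's own I/O functions, which the real transcoder is described as using.

```python
import hashlib
import zlib


def transcode_stream(chunks, out):
    """Decompress chunks as they arrive while hashing the original bytes.

    `chunks` is any iterable of compressed byte strings (for example,
    pieces of a download); `out` is a writable binary file object.
    Returns the SHA-256 hex digest of the compressed input as downloaded.
    """
    digest = hashlib.sha256()        # covers the bytes exactly as received
    inflater = zlib.decompressobj()  # stand-in for rpm's I/O layer (assumption)
    for chunk in chunks:
        digest.update(chunk)
        out.write(inflater.decompress(chunk))
    out.write(inflater.flush())      # emit anything still buffered
    return digest.hexdigest()
```

Because each chunk is hashed and decompressed as soon as it arrives, the CPU work overlaps with time otherwise spent waiting on the network.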
The outcome is intended to be the same, but the order of operations is different.
- Decompression happens inline with download. This has a positive effect on resource usage: downloads are typically limited by bandwidth. Decompression and writing the full data into a single file per rpm is essentially free. Additionally: if there is more than one download at a time, a multi-CPU system can be better utilized. All compression types supported in RPM work because this uses the rpm I/O functions.
- RPMs are cached on local storage between downloading and installation time as normal. This allows DNF to defer actual RPM installation to when all the RPMs are available. This is unchanged.
- The file format for RPMs is different with Copy on Write. The headers are identical, but the payload is different. There is also a footer.
  - Files are converted (“transcoded”) locally during download using /usr/bin/rpm2extents (part of rpm codebase). The format is not intended to be “portable” - i.e. copying the files from the cache is not supported.
  - Regular RPMs use a compressed .cpio based payload. In contrast, extent based RPMs contain uncompressed data aligned to the fundamental page size of the architecture, e.g. 4KiB on x86_64. This alignment is required for FICLONERANGE to work. Only files are represented in the payload; other directory entries like symlinks, device nodes etc. are constructed entirely from rpm header information. Files are referenced by their digest, so identical files are de-duplicated.
  - The footer currently has three sections:
    - Table of original (rpm) file digests, used to validate the integrity of the download in dnf.
    - Table of digest → offset used when actually installing files.
    - Signature: 8 bytes at the end of the file, used to differentiate between traditional and extent based RPMs.
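The page alignment and digest-based de-duplication described above can be sketched as follows. This is an illustration under stated assumptions, not the on-disk format produced by rpm2extents: the names `align_up` and `plan_layout` are hypothetical, SHA-256 is assumed as the file digest, and the 4KiB page size is the x86_64 example from the text.

```python
import hashlib

PAGE_SIZE = 4096  # 4KiB, the x86_64 example; other architectures may differ


def align_up(offset, alignment=PAGE_SIZE):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def plan_layout(files, start=0):
    """Map each unique content digest to a page-aligned payload offset.

    `files` is an iterable of (path, content_bytes) pairs. Identical
    content is stored only once, keyed by its digest.
    Returns (digest_to_offset, end_of_payload).
    """
    digest_to_offset = {}
    cursor = align_up(start)
    for _path, content in files:
        key = hashlib.sha256(content).hexdigest()
        if key in digest_to_offset:
            continue  # same content already placed: de-duplicated
        digest_to_offset[key] = cursor
        cursor = align_up(cursor + len(content))
    return digest_to_offset, cursor
```

The resulting dictionary plays the role of the digest → offset table: at install time, a file's digest from the rpm header is enough to locate its data.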
Notes
- The headers are preserved bit for bit during transcoding. This preserves signatures. The signatures cover the main header blob, and the main header blob ensures the integrity of data in two ways:
  - Each file with content has a digest. Originally this was md5, but today it’s usually sha256. In normal RPM this is only used to verify the integrity of files, e.g. rpm -V. With CoW we use this as a content key.
  - There is/are one or two digests (PAYLOADDIGEST and PAYLOADDIGESTALT) covering the payload archive (compressed cpio). The header value is preserved, but transcoded RPMs do not preserve the original structure so RPM’s pre-installation verification (controlled by %_pkgverify_level) will fail. dnf-plugin-cow disables this check in dnf because it verifies the whole file digest which is captured during download/transcoding. The second one is likely used for delta rpm.
- This is untested, and possibly incompatible with delta RPM (drpm). The process for reconstructing an rpm to install from a delta is expensive from both a CPU and I/O perspective, while only providing marginal benefits on download size. It is expected that having delta rpm enabled (which is the default) will be handled gracefully.
- Disk space requirements are expected to be marginally higher than before: all new packages or updates will consume their installed size before installation instead of about half their size (regular rpms with payloads still cost space).
- dnf-plugin-reflink will fall back to simple file copying when the destination path is not on the same filesystem/subvolume. A common example is /boot and/or /boot/efi.
- The system will still work on other filesystem types, but will always fall back to simple copying. This is expected to be slightly slower than not enabling CoW because the source for copying will be the decompressed data.
- For systems that enable transparent filesystem compression: every file will continue to be decompressed from the original rpm, and then transparently re-compressed by the filesystem. There is no effective change here. There is a future project to investigate alternate distribution mechanics to provide parallel versions of file content pre-compressed in a filesystem specific format, reducing both CPU costs and I/O. It is expected that this will result in slightly higher network utilization because filesystem compression is purposely restricted to allow random I/O.
- Current implementation of dnf-plugin-cow is in Python, but it looks possible to implement this in libdnf instead which would make it work in packagekit.
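The "reflink, otherwise fall back to a simple copy" behaviour mentioned in the notes above can be sketched with the FICLONERANGE ioctl from Python. This is only an illustration, not the plugin's code: the function name `clone_or_copy` is hypothetical, the request number is the x86_64 value of `_IOW(0x94, 13, struct file_clone_range)` and may differ on other architectures, and falling back on any ioctl error is a simplification chosen for this sketch.

```python
import fcntl
import struct

# Assumption: x86_64 value of _IOW(0x94, 13, struct file_clone_range).
FICLONERANGE = 0x4020940D


def clone_or_copy(src, src_offset, length, dst):
    """Place `length` bytes from `src` at `src_offset` at the start of `dst`.

    `src` and `dst` are binary file objects (dst opened for writing).
    Tries a reflink first; on any ioctl failure falls back to copying.
    Returns "reflink" or "copy".
    """
    # struct file_clone_range: s64 src_fd, u64 src_offset, u64 src_length,
    # u64 dest_offset
    arg = struct.pack("qQQQ", src.fileno(), src_offset, length, 0)
    try:
        fcntl.ioctl(dst.fileno(), FICLONERANGE, arg)
        return "reflink"
    except OSError:
        pass  # e.g. different filesystem, or no reflink support
    src.seek(src_offset)
    dst.seek(0)
    remaining = length
    while remaining > 0:
        buf = src.read(min(remaining, 1 << 20))
        if not buf:
            raise EOFError("source ended before the requested range")
        dst.write(buf)
        remaining -= len(buf)
    dst.flush()
    return "copy"
```

On a filesystem without reflink support the function always takes the copy path, which matches the note that other filesystem types keep working but gain nothing from CoW.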
Performance Metrics
As a ballpark, the combined file download+install time is about half the previous duration. A lot of rpms are very small, so the difference is difficult to measure; larger RPMs give a much clearer signal.
(Actual numbers/charts will be supplied in Jan 2021)
Terminology
- Copy on Write (CoW) is a broad description of any technology that reduces or eliminates data duplication by sharing the data behind the scenes until one of the references makes changes. This has been a cornerstone technology in memory management in Unix systems. Here we are using it to specifically reference Copy on Write as supported in modern filesystems, e.g. btrfs, xfs and potentially others.
- Reflink is the verb for duplicating stored data on a filesystem. See ioctl_ficlonerange(2) for the specific call we use on Linux.
- Extent (based RPMs) refers to how payload file data is stored within an RPM. Normal RPMs simply contain a compressed CPIO archive. Extent based RPMs contain the raw data uncompressed, which can be referenced with reflink.
Feedback
(pending initial discussion)
Benefit to Fedora
Faster package installs and upgrades
Scope
- Proposal owners:
- Merge changes to rpm, librepo to enable capabilities
- Add dnf-plugin-cow to available packages
- Test days
- Aid with documentation
- Other developers:
- rpm, librepo: review PRs as needed
- Release engineering: tbd
- Policies and guidelines: N/A
- Trademark approval: N/A
Upgrade/compatibility impact
None, RPM with CoW is not enabled by default.
Upgrades with keepcache in dnf.conf will be able to use existing packages, but it will not convert them. This only happens at download time.
If a system is configured to keep packages in the cache (keepcache in dnf.conf) and dnf-plugin-cow is removed then the packages will be unusable. Recommend dnf clean packages to resolve this.
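Since cached packages may be in either format, it can be useful to tell them apart. The format description says extent based RPMs end with an 8-byte signature, so a check could look like the sketch below. The function name is hypothetical, and the actual signature bytes are not given in this document, so the caller has to supply them.

```python
import os

SIGNATURE_LEN = 8  # the text says the signature is the last 8 bytes


def has_trailing_signature(path, signature):
    """Return True if the file at `path` ends with `signature`.

    `signature` must be exactly 8 bytes; its real value is defined by the
    rpm patch set and is NOT known here (caller-supplied assumption).
    """
    if len(signature) != SIGNATURE_LEN:
        raise ValueError("signature must be exactly 8 bytes")
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < SIGNATURE_LEN:
            return False  # too short to carry a signature at all
        f.seek(-SIGNATURE_LEN, os.SEEK_END)
        return f.read(SIGNATURE_LEN) == signature
```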
How to test
Enable RPM with CoW with
$ sudo dnf install dnf-plugin-cow
...
$ sudo dnf install hello
...
$ hello
Hello, world!
There should be no end user visible changes, except timing.
User experience
No anticipated user visible changes in this change proposal. This makes the feature available, but does not enable it by default.
Dependencies
- A copy-on-write filesystem; this Change is primarily targeting btrfs, but RPM with CoW should work with XFS as well (untested)
- Most package install paths and the dnf package cache on the same filesystem / subvolume.
- rpm with Copy on Write patch set: https://github.com/malmond77/rpm/tree/cow
- librepo with transcoding support: https://github.com/malmond77/librepo/tree/transcode_cow
- dnf-plugin-reflink (a new package): https://github.com/facebookincubator/dnf-plugin-cow/
Contingency plan
- Contingency mechanism: will not include PR patches if not merged upstream, skip dnf-plugin-cow
- Contingency deadline: Final freeze
- Blocks release? No
- Blocks product? No
Documentation
Documentation will be available at https://github.com/facebookincubator/dnf-plugin-cow in the coming weeks.
Release Notes
RPM with CoW is not enabled by default. To enable it:
$ sudo dnf install dnf-plugin-cow