From Fedora Project Wiki
Revision as of 08:33, 19 January 2020


Reduce installation media size by improving the compression ratio of SquashFS filesystem

Summary

Improve compression ratio of SquashFS filesystem on the installation media.

Owner

Current status

  • Targeted release: Fedora 32
  • Last updated: 2020-01-19
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

As of Fedora 31, the LiveOS/squashfs.img file on the installation image is compressed with the default mksquashfs block size of 128k. Additionally, Lorax sets the BCJ filter depending on the architecture. Adjusting these parameters can improve the compression ratio and/or reduce CPU usage at build time.

Comparison of different SquashFS compression options and their impact on the time required to generate the SquashFS filesystem

This is simple to achieve. Lorax recently gained support for adjusting the mksquashfs compression options via its configuration file. The file should be altered as follows:

[compression]
bcj = no
args = -b 1M -Xdict-size 1M -no-recovery

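Here -b 1M and -Xdict-size 1M set the block size and the dictionary size respectively, and bcj controls the branch-call-jump (BCJ) filter. The -no-recovery option tells mksquashfs not to write a recovery file.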
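
To make the meaning of these settings concrete, here is a minimal sketch of the direct mksquashfs call they roughly correspond to. It assumes that Lorax appends the args value to an XZ-compressed mksquashfs invocation; the function name and the ROOT_DIR and OUT_IMG placeholders are hypothetical. The script only prints the command and does not run it.

```shell
#!/bin/sh
# Sketch: print (do not run) a mksquashfs command matching the [compression]
# settings. squashfs_cmd, ROOT_DIR and OUT_IMG are hypothetical names used for
# illustration only; they are not taken from Lorax.

squashfs_cmd() {
    root_dir=$1
    out_img=$2
    # -comp xz        : use the XZ compressor
    # -b 1M           : 1 MiB data block size
    # -Xdict-size 1M  : XZ dictionary size (must not exceed the block size)
    # -no-recovery    : do not write a recovery file
    # No -Xbcj option is passed, which matches "bcj = no".
    printf 'mksquashfs %s %s -comp xz -b 1M -Xdict-size 1M -no-recovery\n' \
        "$root_dir" "$out_img"
}

squashfs_cmd "${ROOT_DIR:-rootfs}" "${OUT_IMG:-squashfs.img}"
```

Printing rather than executing keeps the sketch safe to run on a machine without squashfs-tools installed.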

Based on the results above, I'd suggest the following optimal configuration: the XZ algorithm with a block size of 1 MiB and without the BCJ filter (plain xz -b 1M, without -Xbcj x86). On the right, you can see the impact of the compression algorithms on installation time.

Different compression options and their impact on the installation time

As the picture on the right shows, selecting the 'plain xz -b 1M' configuration has minimal impact on installation time and CPU usage during installation. It adds 6.51% (24.94 s) to the installation time in exchange for saving 141.77 MiB on the installation media. I also noticed that, even with maximum compression, the CPU is not fully utilized during installation.
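
The trade-off can be put in perspective with a little arithmetic on the three figures quoted above. The following sketch derives the baseline installation time implied by the percentage, and the extra installation time paid per MiB saved. The function names are hypothetical, and the derived values are only as accurate as the quoted measurements.

```shell
#!/bin/sh
# Figures quoted in the text above.
extra_seconds=24.94   # additional installation time, in seconds
extra_percent=6.51    # the same increase, as a percentage
saved_mib=141.77      # space saved on the installation media, in MiB

# Baseline install time implied by "extra_seconds is extra_percent of it".
baseline_seconds() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.1f\n", s / (p / 100) }'
}

# Extra installation seconds paid for each MiB saved.
seconds_per_mib() {
    awk -v s="$1" -v m="$2" 'BEGIN { printf "%.3f\n", s / m }'
}

echo "implied baseline install time: $(baseline_seconds "$extra_seconds" "$extra_percent") s"
echo "extra install time per MiB saved: $(seconds_per_mib "$extra_seconds" "$saved_mib") s"
```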

Alternative compression algorithms and the resulting filesystem size

Benefit to Fedora

  • Reduction of the installation media size and the cost of storing and distributing Fedora.
  • Reduction of the CPU usage at build time, depending on which compression parameters are chosen.

Scope

  • Proposal owners:

The build environment should support adjusting the Lorax configuration file and passing the --squashfs-only parameter. Lorax is the program that produces the LiveOS/squashfs.img file on the installation media.

One way to enable such customization is to add support in Pungi for passing the -c option to Lorax.

  • Other developers:

The pungi utility should support passing a custom configuration file location to the Lorax utility. This option should apply during the buildInstall phase of pungi.

  • Release engineering: [1]
  • Policies and guidelines: Not required.
  • Trademark approval: N/A (not needed for this Change)

Upgrade/compatibility impact

This change comes at the cost of higher memory usage during installation. By my own estimate this should not be an issue, since decompression should require up to 1 MiB per thread and SquashFS decompression is currently single-threaded.

N/A (not a System Wide Change)

How To Test

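# Create the mount points (root is needed under /mnt).
sudo mkdir -p /mnt/new /mnt/old
# Mount each ISO read-only on its own mount point.
sudo mount -o loop,ro FedoraInstallationOld.iso /mnt/old
sudo mount -o loop,ro FedoraInstallationNew.iso /mnt/new
# List both images to compare their sizes.
ls -l /mnt/{new,old}/LiveOS/squashfs.img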

And then calculate the size difference.
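
The size difference can be computed rather than read off the ls output. This is a minimal sketch, assuming both ISOs are mounted under /mnt/old and /mnt/new; the size_diff helper is a hypothetical name introduced here.

```shell
#!/bin/sh
# size_diff OLD NEW: print how many bytes smaller NEW is than OLD.
size_diff() {
    old_size=$(wc -c < "$1") || return 1
    new_size=$(wc -c < "$2") || return 1
    echo $((old_size - new_size))
}

old_img=/mnt/old/LiveOS/squashfs.img
new_img=/mnt/new/LiveOS/squashfs.img

# Only report when both images are present, so the script is harmless otherwise.
if [ -f "$old_img" ] && [ -f "$new_img" ]; then
    saved=$(size_diff "$old_img" "$new_img")
    echo "saved: $saved bytes ($((saved / 1048576)) MiB)"
fi
```

A positive result means the new image is smaller; a negative one means it grew.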

User Experience

  • Decreasing the installation image size will reduce cost of mirroring and storing Fedora installation images.
  • Decreasing the installation image size will reduce the download time.
  • Increasing the block size in the current configuration, where SquashFS wraps an EXT4 file system, should increase latency when accessing the EXT4 file system. The exact impact is to be evaluated.
  • The latency impact will be reduced if the plain SquashFS option is chosen.

Dependencies

Pungi, the utility that builds the compose, should include the new functionality mentioned above. Alternatively, /etc/lorax/lorax.conf can be altered in the environment where Lorax runs.
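
As a rough illustration of the first route, the sketch below writes the proposed [compression] section to a separate file that could be handed to Lorax with -c, leaving /etc/lorax/lorax.conf untouched. The file name and the write_compression_conf helper are hypothetical. Whether Lorax fills in defaults for settings missing from such a partial file is an assumption that should be checked against the Lorax version in use.

```shell
#!/bin/sh
# write_compression_conf PATH: write the proposed [compression] section to PATH.
# Assumption: Lorax applies its built-in defaults for anything not listed here.
write_compression_conf() {
    cat > "$1" <<'EOF'
[compression]
bcj = no
args = -b 1M -Xdict-size 1M -no-recovery
EOF
}

# Hypothetical output location; override with CONF_FILE=/some/path.
conf_file=${CONF_FILE:-./lorax-squashfs.conf}
write_compression_conf "$conf_file"
echo "wrote $conf_file; pass it to Lorax via its -c option"
```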

Contingency Plan

N/A

Documentation

https://pagure.io/releng/issue/9127.
mksquashfs(1)
lorax(1)
https://docs.pagure.org/pungi
File:Comparison Table SquashFS.ods

Release Notes