From Fedora Project Wiki
Revision as of 16:36, 7 April 2024

Reproducible Package Builds

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Summary

A post-build cleanup is integrated into the RPM build process so that common causes of build irreproducibility in packages are removed, making most Fedora packages reproducible.

Owner

  • Email: dcavalca@fedoraproject.org
  • Email: neil at shrug.pw
  • Email: mhroncok at redhat.com
  • Email: zbyszek at in.waw.pl

Current status

  • Targeted release: Fedora Linux 41
  • Last updated: 2024-04-07
  • devel thread: <will be assigned by the Wrangler>
  • FESCo issue: <will be assigned by the Wrangler>
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

As of 2023 there is an active effort to implement Reproducible builds (https://docs.fedoraproject.org/en-US/reproducible-builds/) in Fedora. Reproducible builds will allow our users to independently verify that the RPMs have not been tampered with (either maliciously or through a hardware or software fault): anyone can do an independent rebuild of a package and confirm that they get identical binaries when building with the same versions of the compiler and other tools. This Change moves us in this direction by removing the common sources of irreproducibility.

add-determinism (https://github.com/keszybz/add-determinism) is a Rust program which, as its name suggests, adds determinism to the files it is given as input: it standardizes metadata contained in binary or source files and clamps embedded timestamps to $SOURCE_DATE_EPOCH. add-determinism is the "Fedora version" of strip-nondeterminism (https://salsa.debian.org/reproducible-builds/strip-nondeterminism) from the Debian project. Because strip-nondeterminism is written in Perl, it is undesirable for use in Fedora: we don't want to pull Perl into the buildroot for every package.
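To make "clamping to $SOURCE_DATE_EPOCH" concrete, here is a minimal Python sketch of the idea applied to timestamps stored inside a zip archive. It is an illustration only and not how add-determinism itself is implemented (that tool is written in Rust). The function name clamp_zip_timestamps and the choice of zip as the file type are assumptions made for this example. The sketch also leaves the file untouched whenever anything goes wrong.

```python
import os
import shutil
import tempfile
import time
import zipfile


def clamp_zip_timestamps(path, epoch):
    """Clamp member timestamps in a zip archive to `epoch` (seconds, UTC).

    Hypothetical helper for illustration. Returns True if the archive was
    rewritten, False if the input was left untouched (not a zip, or an
    error occurred while rewriting).
    """
    if not zipfile.is_zipfile(path):
        return False  # unknown input: do nothing

    # Zip timestamps are (Y, M, D, h, m, s) tuples and cannot predate 1980.
    limit = max(tuple(time.gmtime(epoch)[:6]), (1980, 1, 1, 0, 0, 0))

    # Write the new archive next to the original, then replace atomically.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    os.close(fd)
    try:
        with zipfile.ZipFile(path) as src, zipfile.ZipFile(tmp, "w") as dst:
            for info in src.infolist():
                data = src.read(info)
                # Clamp: only timestamps newer than the limit are changed.
                if info.date_time > limit:
                    info.date_time = limit
                dst.writestr(info, data)
        shutil.copymode(path, tmp)  # keep the original permission bits
        os.replace(tmp, path)
        return True
    except Exception:
        # Any failure leaves the original file as it was.
        if os.path.exists(tmp):
            os.unlink(tmp)
        return False
```

A caller would typically read the limit from the environment, for example clamp_zip_timestamps("some.zip", int(os.environ["SOURCE_DATE_EPOCH"])). Note that clamping only lowers timestamps that are newer than the limit; older timestamps are already reproducible and are kept.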

It's worth noting that this Change does not intend to impose any specific reproducibility requirements on Fedora packages. Once this Change is implemented and we have been through a mass rebuild and can verify that the common causes of irreproducibility have indeed been removed, we can consider further steps. But that will be at least one release later.

This Change adds a small amount of time to the processing of RPMs at the end of a build. Packages containing many or very large files will therefore be slower to process, but the effect is not expected to be noticeable. add-determinism takes steps to ensure it does not interfere with other buildroot post-processors such as mangle-shebangs, python-hardlink, and python-bytecompile. If it does not understand an input file, or encounters any other problem, it defaults to leaving the file unmodified.

A mechanism to opt out will be provided, either to disable the postprocessing step completely or to disable specific "handlers" (i.e. implementations of cleanup for specific file types, for example static archives). See macros.build-reproducibility.

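Related Changes

  • Clamp build mtimes to SOURCE_DATE_EPOCH (https://fedoraproject.org/wiki/Changes/ReproducibleBuildsClampMtimes)
  • RPM 4.20 (https://fedoraproject.org/wiki/Changes/RPM-4.20): this pulls in changes to %autosetup -S git which removed a source of irreproducibility.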

Feedback

Benefit to Fedora

Adding determinism (i.e., removing non-determinism) gives the Fedora community confidence that, given the same source code, build environment, build instructions, and metadata from the build artifacts, any party can recreate copies of the artifacts that are identical except for the signatures and some parts of the metadata.

Reproducibility of builds leads to packages of higher quality. It turns out that quite often those irreproducible bits are caused by an error or sloppiness in the code. In particular, any dependence on architecture in noarch packages is almost always unwanted and/or a bug. Test builds that check reproducibility will expose such instances.

Reproducibility of builds makes it easier to develop packages: when a small change is made and a package is rebuilt (in the same environment), then with a reproducible package, the only difference is directly caused by the change. If the package is different every time it is rebuilt, making a comparison is much harder.
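The comparison described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the two build results have already been unpacked into two directory trees, and the helper names tree_digests and differing_files are made up for this example. Dedicated tools exist for detailed comparisons; this sketch only answers "which files differ?".

```python
import hashlib
import os


def tree_digests(root):
    """Map each file path (relative to `root`) to its SHA-256 digest.

    Symlinks are followed by open(); a broken symlink would raise an error.
    """
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as f:
                digests[rel] = hashlib.sha256(f.read()).hexdigest()
    return digests


def differing_files(tree_a, tree_b):
    """Return the sorted relative paths that differ between two trees.

    A path is reported if its content differs or if it exists in only
    one of the trees.
    """
    a, b = tree_digests(tree_a), tree_digests(tree_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

With a reproducible package, differing_files() on two rebuilds of the same source returns an empty list, and after a small change it lists only the files that the change actually affected.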

Build reproducibility for noarch subpackages solves the problem where package builds on different architectures are different, causing mock to reject the whole build. In particular, this issue occurs for pyc files. This will now be solved without requiring opt-in from individual packages.


Scope

  • Proposal Owners:
    • Integrate add-determinism as a BuildRoot Policy script
    • Add a dependency on marshalparser to python3 (probably conditionalized on rpm-build)
  • Other Developers:
    • Test their packages with the additional phase, report problems
    • Potentially integrate changes to packages to enable reproducibility
  • Release Engineering: Ideally we want this to happen before the mass rebuild, but that is not strictly required.
  • Policies and Guidelines: Fedora Packaging Guidelines should be updated to include information on the add-determinism BuildRoot Policy. User documentation should be amended to explain how to verify reproducibility for a given package, which packages are known to be non-reproducible, and how to opt out.
  • Trademark approval: N/A (not needed for this Change)
  • Alignment with Community Initiatives: All software and requests are consistent with the decision process and similar across other groups in Fedora. The Fedora Reproducibility Working Group began at Flock 2023 in Cork.

Upgrade/compatibility impact

No impact is expected.

How To Test

To test on the level of individual files:

  • install add-determinism
  • call SOURCE_DATE_EPOCH=… add-determinism -v ./path/to/file
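To see whether a file-level run changed anything, one option is to record a checksum before and after the command. The Python sketch below does that for an arbitrary command line, with SOURCE_DATE_EPOCH set in the command's environment. The helper names sha256_of and digest_before_after are hypothetical, and the epoch value is whatever the tester chooses.

```python
import hashlib
import os
import subprocess


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def digest_before_after(path, argv, epoch):
    """Run `argv` with SOURCE_DATE_EPOCH=epoch and return the digests
    of `path` as (before, after).

    Raises subprocess.CalledProcessError if the command fails.
    """
    env = dict(os.environ, SOURCE_DATE_EPOCH=str(epoch))
    before = sha256_of(path)
    subprocess.run(argv, env=env, check=True)
    return before, sha256_of(path)
```

For the command shown above, argv would be ["add-determinism", "-v", "./path/to/file"]. If the two digests differ, the tool modified the file. Running the same call a second time should then leave the digest unchanged, since the file has already been normalized.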

To test package builds:

(This can be done on a normal system or in a mock chroot.)

User Experience

No impact is expected.

Dependencies

Contingency Plan

  • Contingency mechanism:
    • In case of major problems, disable the change in redhat-rpm-config.
    • In case of problems with specific packages, opt out by setting a macro.
  • Contingency deadline: No limit really.
  • Blocks release? No.

Documentation

Release Notes

Fedora package builds are now more deterministic, bringing the distribution closer to the goal of achieving fully reproducible builds for all of its packages.