Latest revision as of 12:31, 13 January 2021
Changes/LTO by default for package builds
Summary
This is a proposal to enable link time optimization (LTO) of packages built with rpmbuild by default. It's an over-simplification, but think of LTO as deferring analysis, optimization and code generation until creation of an executable or dynamic shared object.
This is implemented by adding the option "-flto" to the injected flags in redhat-rpm-config. There will be a simple way for packages to opt out of LTO.
Owner
- Name: Jeff Law
- Email: law@redhat.com
Current status
- Targeted release: Fedora 33
- Last updated: 2021-01-13
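- Tracker bug: #1789115 (https://bugzilla.redhat.com/show_bug.cgi?id=1789115)
- Release notes tracker: #429 (https://pagure.io/fedora-docs/release-notes/issue/429)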
Detailed Description
Programs built with rpmbuild and which honor flags injection via redhat-rpm-config will be built with LTO by default. A simple opt-out mechanism will be provided for packages which use features that are not LTO compatible.
The LTO bytecode itself will not be distributed as it is not stable from one GCC release to the next. This is enforced by stripping the LTO bytecode from any installed .o/.a files. We'll use bits SuSE has already written for redhat-rpm-config to implement this. (An RFE has been filed for verifying this in rpminspect: https://github.com/rpminspect/rpminspect/issues/129)
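As a rough illustration of the stripping step, the sketch below looks for GCC's LTO sections (their names start with ".gnu.lto_" or ".gnu.debuglto_") in installed .o/.a files and removes them with strip. The function names and the root-directory argument are assumptions made for this example; the script that actually lands in redhat-rpm-config may look different.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the actual redhat-rpm-config script.

# Read a section listing (e.g. the output of `readelf -S -W file.o`) on stdin
# and print the names of any GCC LTO sections it mentions, one per line.
list_lto_sections() {
    { grep -oE '\.gnu\.(debug)?lto_[^[:space:]]*' || true; } | sort -u
}

# Succeed if the given object file or archive still carries LTO sections.
has_lto_bytecode() {
    [ -n "$(readelf -S -W "$1" 2>/dev/null | list_lto_sections)" ]
}

# Remove the LTO sections from every .o/.a file below the given directory.
strip_lto_bytecode() {
    find "$1" -type f -name '*.[ao]' | while IFS= read -r f; do
        if has_lto_bytecode "$f"; then
            # -p keeps timestamps; -R accepts section name patterns.
            strip -p -R '.gnu.lto_*' -R '.gnu.debuglto_*' "$f"
        fi
    done
}
```

Something like `strip_lto_bytecode "$RPM_BUILD_ROOT"` at the end of the install phase would then leave only regular object code in the shipped files.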
Minor changes to the %configure macro in redhat-rpm-config are desirable to fix common code idioms used by autoconf-generated scripts which are compromised by the additional optimization enabled by LTO. Minor updates to various packages will be needed to opt out of LTO or fix bugs exposed by LTO.
Bugs for the 3 issues we should address in redhat-rpm-config are here:
https://bugzilla.redhat.com/show_bug.cgi?id=1789099
https://bugzilla.redhat.com/show_bug.cgi?id=1789137
https://bugzilla.redhat.com/show_bug.cgi?id=1789149
Benefit to Fedora
The primary benefits of building with LTO enabled are smaller, faster executables/DSOs. A secondary benefit is LTO allows deeper analysis of package source code at compile time which can improve various GCC diagnostics and thus improve our ability to catch bugs at compile time such as uninitialized objects, buffer overflows, unterminated strings, restrict violations, etc.
This change also brings us back on par with openSUSE, which enabled LTO by default for their free distribution earlier in 2019.
If you're interested in some of the performance data and lower level details:
http://www.ucw.cz/~hubicka/slides/opensuse2018-e.pdf
https://hubicka.blogspot.com/2019/05/gcc-9-link-time-and-inter-procedural.html
And openSUSE's bug tracker for their LTO enablement work:
https://bugzilla.opensuse.org/1133084
Scope
- Proposal owners:
The primary change is to redhat-rpm-config to add LTO to the default compile/link flags as well as a conditional which allows easy opt-out on a package by package basis. Additionally the post-build scripts need to strip the LTO bytecodes from any installed .o/.a files.
Additionally, we know there are many packages with configure scripts that are compromised by LTO. I have tweaks to the %configure macro in redhat-rpm-config which fix the vast majority of these problems with a few simple sed scripts on the generated output. Like the basic support for injecting the LTO flags, this will require coordination with the redhat-rpm-config maintainers. Packages which call configure directly and have compromised tests will need a one line change to their .spec files to fix their configure scripts.
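To give a feel for what such a sed fix could look like, here is a minimal sketch. It assumes the problematic idiom is a function-pointer variable of the form "char (*f) () = name;" that LTO is free to discard, and marks it as used so the link test still references the symbol. Both the helper name and the exact pattern are assumptions for illustration; the patterns in the real %configure macro may differ.

```shell
#!/bin/sh
# Hypothetical helper; the pattern below is illustrative only.

# Rewrite a generated configure script in place so that the test variable
# "char (*f) () = ..." carries __attribute__ ((used)).
fix_configure_for_lto() {
    script=$1
    tmp=$(mktemp) || return 1
    if sed -E 's/^char \(\*f\) \(\) = /__attribute__ ((used)) char (*f) () = /' \
        "$script" > "$tmp"; then
        # Write back through the original file to keep its permissions.
        cat "$tmp" > "$script"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

A package that calls configure directly could apply the same idea to its own script before running it, for example `fix_configure_for_lto ./configure`.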
Some packages will need to opt out of using LTO at this time. The most common cases are packages that use symbol versioning or toplevel ASM statements. While there is a new mechanism to make LTO work with symbol versioning, I don't think any packages have been updated to use that mechanism. This will require a one line change to 50-75 packages (my script to find these is still running).
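A crude way to spot candidates for the symbol versioning case is to search an unpacked source tree for .symver directives. The sketch below does only that; the function name is made up for this example, it is not the script mentioned above, and finding toplevel ASM statements reliably needs more than a grep.

```shell
#!/bin/sh
# Hypothetical scan, for illustration only.

# Print the files below a directory that contain a .symver directive,
# a common sign that the package uses symbol versioning.
find_symver_users() {
    { grep -rlE '\.symver[[:space:]]' "$1" 2>/dev/null || true; } | sort
}
```

Running `find_symver_users path/to/unpacked/source` prints one file per line, or nothing if no directive is found.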
Finally, some packages will fail to build with LTO due to deeper analysis for compile-time diagnostics catching programming mistakes that have gone unnoticed until now. I'll obviously be working with package maintainers on all of these issues.
Note that even though the changes are fairly well localized in redhat-rpm-config and a small number of packages, the real scope of this change is much larger since it affects all packages in the distribution that are compiled with GCC and which honor the flags injection by redhat-rpm-config.
- Other developers:
As I mentioned, I'm happy to contact package owners that need to modify their packages and suggest how their package needs to be fixed. As a multi-decade GCC developer, I'm particularly well suited to describe LTO, its limitations and how LTO impacts the diagnostics from GCC to any package owner that needs additional information.
I'm also capable and available to address any GCC issues that may arise as a result of this change. I don't expect much of the latter as SuSE has already enabled this feature for their distribution and thus weeded out most of the issues.
The highest level of coordination will be with the redhat-rpm-config maintainers.
I will also be coordinating with the GDB team to address debugging issues related to LTO. The most important issue is to ensure that we can pass the GDB testsuite with and without the -flto option being enabled. Failure to meet this goal would be considered a blocking issue for LTO enablement.
I'm also already in contact with SuSE and Debian/Ubuntu engineers to discuss issues with gcc-10 with and without LTO.
We know there are some problems with debugging LTO code. I will be working with the GDB team to identify these issues and fix them either in the debugger or compiler as needed.
I have prototype code for the required redhat-rpm-config changes and I'll coordinate with the redhat-rpm-config maintainer to get them into the desired final form.
I also know every package that fails with LTO enabled. I'm still categorizing those failures. Many will ultimately need to use the opt-out mechanism because they use features that are not compatible with LTO. I expect to have all this ready to go the first work week of the new year. I will coordinate with package owners to either add the opt-out markers or fix issues in the package as needed.
- Release engineering: [ https://pagure.io/releng/issue/9119 ]
Aside from the redhat-rpm-config changes to ensure we do not ship LTO bytecode in .o/.a files, I do not expect any work from releng to be necessary. However, they need to be aware of the change and who to contact in case of issues. The redhat-rpm-config changes necessary to implement stripping of LTO sections/symbols can be found attached to this BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1789099
We will not do a special mass rebuild for this feature. However, a mass rebuild is typically planned for the introduction of a new compiler. We would want to piggyback on that mass rebuild.
- Policies and guidelines:
It would be useful to document how to opt out of LTO in the packaging guidelines.
- Trademark approval: N/A (not needed for this Change)
Upgrade/compatibility impact
Should not affect compatibility. Stripping of the LTO bytecode is critical to ensure there are not long term compatibility issues.
How To Test
In the short term, I'm happy to expose a repository with a gcc-10 snapshot and updated redhat-rpm-config. Developers could then use that repo to pick up gcc-10 and LTO optimizations for testing purposes. I'm already doing this internally for x86_64 and exposing it to the world would be trivial.
Given such a repository, another developer would merely use that repo when building their package. No special hardware is needed. The most useful testing is first to identify FTBFS issues and get them proactively fixed. I'm happy to own that since I'm already doing that for baseline gcc-10 issues as well as gcc-10 + LTO issues.
Doing the same testing on other architectures would definitely be useful. I'd be particularly concerned about large packages on the 32-bit architectures. I wouldn't be surprised if we find some packages need to opt out of LTO because they run out of memory at link/compile time. I'm already in contact with some Debian maintainers who want to do testing around this issue as they're investigating a similar change for Debian.
I'm already building all of Fedora with the weekly gcc-10 snapshots (including LTO builds starting the week of 12/15). This is primarily to proactively find and address issues with the gcc-10 transition, but verification of LTO state pretty much piggybacks for free on the gcc-10 work.
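For a developer who wants to compare a package with and without LTO on a system that has the updated redhat-rpm-config, a sketch along these lines may help. It assumes the injected flags live in the _lto_cflags macro (the one used for opting out); the wrapper function names are invented for this example.

```shell
#!/bin/sh
# Hypothetical wrappers around rpm and rpmbuild, for illustration only.

# Print the LTO flags rpm would inject; prints an empty line if none are set.
show_lto_flags() {
    rpm --eval '%{?_lto_cflags}'
}

# Build a spec file with LTO disabled for this one invocation.
build_without_lto() {
    rpmbuild --define '_lto_cflags %{nil}' -ba "$1"
}
```

Building once with plain `rpmbuild -ba foo.spec` and once with `build_without_lto foo.spec` makes it easier to tell whether a failure comes from LTO or from the new compiler itself.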
User Experience
In theory, the only noticeable difference to users would be smaller, faster binaries and DSOs. However, a developer that uses rpmbuild to build their own code may see their package fail to build if it's got errors or uses certain features that do not work with LTO.
Users who try to debug Fedora-shipped executables could notice differences in the debugging experience.
Dependencies
None expected beyond addressing FTBFS issues and coordination between GCC and GDB teams on any debugging issues we find over the next few weeks.
Contingency Plan
- Contingency mechanism: Revert the LTO flags injection
- Contingency deadline: Beta freeze, but shooting for prior to mass rebuilds starting
- Blocks release? No
- Blocks product? No
Most critically, if we don't address the GDB testsuite issue noted above, our fallback position would be to simply disable the LTO injection globally and re-evaluate for Fedora 33. The same applies if we were to find some show-stopping LTO issue.
Otherwise the plan is to analyze the remaining 100-125 package build failures. These are likely a mixture of configure issues that can't be trivially fixed via %configure, new diagnostics exposed by the deeper analysis from LTO, and other small issues.
Documentation
To opt out: %global _lto_cflags %nil
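The opt-out line would normally sit near the top of the .spec file, before the build sections. As a small aid for auditing, the sketch below checks whether a given spec already contains it; the function name is an assumption for this example, and it only recognizes the simple %global/%define forms with %nil or %{nil}.

```shell
#!/bin/sh
# Hypothetical check, for illustration only.

# Succeed if the spec file disables LTO by emptying the _lto_cflags macro.
spec_opts_out_of_lto() {
    grep -Eq '^[[:space:]]*%(global|define)[[:space:]]+_lto_cflags[[:space:]]+%(\{nil\}|nil)[[:space:]]*$' "$1"
}
```

For example, `spec_opts_out_of_lto foo.spec && echo "LTO disabled"` reports packages that have already opted out.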