From Fedora Project Wiki
(Dropping the proposal after no activity on the FESCo ticket)
 
Latest revision as of 15:26, 4 February 2021

RPM-level auto release and changelog bumping

Summary

redhat-rpm-config will be updated so users of the auto framework get automated release and changelog bumping.

Owner

  • Email: <nicolas.mailhot at laposte.net>

Current status

  • FESCo issue: #2441
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

This is a system-wide change because all packages are built with redhat-rpm-config, but it only concerns packages that opted to use this part of redhat-rpm-config (the auto framework).

The change will make packages that use the %auto_<call> redhat-rpm-config macros auto-bump and auto-changelog at the rpm level, in an infrastructure-independent way. The %auto_<call> framework is proposed in https://fedoraproject.org/wiki/Changes/Patches_in_Forge_macros_-_Auto_macros_-_Detached_rpm_changelogs

In that context, auto-bumping means that an SRPM produced in any compatible build system (that is, any build system that does not inhibit low-level rpmbuild behaviour) will rebuild by default to a release higher than the last build release in the next build system it is imported into, without any manual change to the SRPM source files.

Auto-changelog means that the build event will also be traced in the rpm changelog (again, without any manual change).

Feedback

How is it better than rpmdev-bumpspec?

Unlike rpmdev-bumpspec, the feature is automatic.

It does not require explicit human action. Releases get bumped even if the human forgot a particular release has already been built.

It does not rely on an external tool, nor does it require such a tool to parse a spec file (which can be difficult for heavily automated spec files like the ones that take advantage of %auto_<call> macros).

A rebuild does not touch the spec file at all. That means the spec file changes tracked by your favorite SCM will show only spec logic changes, without drowning them in no-logic-change build events.

Where will it take the changelog from?

The previous changelog state will be taken from a separate rpm-changelog.txt included by default in SRPM sources.

The other information needed to bump reliably will be taken from another key = value buildsys.conf file also included by default in SRPM sources.

How does the code know what is the "last build event" to bump from?

The same process that writes a new state of the changelog file in sources, writes the date that was written in the changelog in a separate key = value file (with the components of the build evr, the last packager id, etc).

That means you can trim the detached changelog file (if you find the list of build events uninteresting) and the SRPM will still remember to bump the next EVR to something above the last build (even if it does not appear in the changelog file).

(That also means I could dispense with writing a parser for the custom timestamp format rpm changelogs use, and save the date in easy to parse RFC 3339/ISO 8601 format)
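A minimal sketch of the idea, in Python rather than rpm macros, and not the actual implementation. The key names (release, date, packager), the sample values, and the assumption that the release is a plain integer are all hypothetical.

```python
from datetime import datetime

# Hypothetical content of the key = value state file; the real key
# names used by the macros may differ.
EXAMPLE_STATE = """\
version = 1.2.3
release = 4
packager = Jane Doe <jane@example.org>
date = 2021-02-04T15:26:00+00:00
"""


def parse_state(text):
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        state[key.strip()] = value.strip()
    return state


def next_release(state):
    """Return a release strictly above the last recorded one.

    Assumes an integer release; a missing key counts as release 0.
    """
    return int(state.get("release", "0")) + 1


state = parse_state(EXAMPLE_STATE)
# The ISO 8601 / RFC 3339 style date needs no custom parser.
last_build = datetime.fromisoformat(state["date"])
print(next_release(state), last_build.isoformat())
```

Because the bump is computed from the state file and not from the changelog, trimming the changelog leaves the result of next_release unchanged.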

How does it affect reproducibility?

The change makes SRPMs non-reproducible by default (reproducibility is the antithesis of auto-bumping).

To reproduce a build, you will need to take the SRPM produced by last build, and rebuild it with

 %reproducible_build = true

(or anything else except false, in your ~/.rpmmacros, or config_opts['macros'], or as rpmbuild --define flags)

It is not possible to reproduce a build from the SRPM as it existed at the start of last build, since that SRPM does not know the date at which the next build occurred.
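The following Python sketch shows one reading of that behaviour; it is an assumption about the logic, not the macro code. The function name, the state keys, and the integer release are hypothetical. With the flag unset the build bumps, and with the flag set to anything except false it reuses the release and date recorded by the last build.

```python
from datetime import datetime, timezone


def build_identity(state, now, reproducible_build="false"):
    """Return the (release, date) pair a build would record.

    state: dict with the last build's 'release' and 'date' (hypothetical keys).
    now: timezone-aware datetime of the current build.
    reproducible_build: value of the macro; anything except "false"
    reuses the recorded state instead of bumping.
    """
    if reproducible_build != "false":
        return int(state["release"]), state["date"]
    return int(state["release"]) + 1, now.isoformat()


# State as saved in the SRPM produced by the last build (made-up values).
recorded = {"release": "5", "date": "2021-02-04T15:26:00+00:00"}
now = datetime(2021, 3, 1, 12, 0, tzinfo=timezone.utc)

print(build_identity(recorded, now))          # default: bump
print(build_identity(recorded, now, "true"))  # reproduce the last build
```

This also illustrates why the SRPM from the start of the last build is not enough: the recorded date only exists once that build has run.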

Why is the change adding an auto_ framework?
Is %autosetup included in the auto_ framework?

  • %autosetup is not part of the new framework; it predates it.
  • All the new %auto_call macros are named %auto_something.

Why is the change separating the changelog from the spec file?
Why is a separate "rpm-changelog.txt" changelog file better than the current changelog inside the .spec?

Without separation, we would lose the benefit of auto-bumping at the SCM level, since no-logic-change rebuilds would still result in a spec file change.

Separation makes automation a lot easier since adding to the changelog is just pre-pending some lines, and does not require any knowledge of rpm syntax. Auto-bumping will add a "* date name evr" line on the next rebuild, so changelog additions can limit themselves to plain text descriptions of new changes at the top of the existing file.

Separation is a requirement for auto-changelog bumping at the rpm level. Once rpmbuild is launched, it cannot modify the processed spec file. Therefore making the changelog modifiable by the build process requires splitting it out of the spec file first.
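To illustrate why a detached file makes this easy, here is a rough Python sketch of the prepending step. It is not the macro implementation: the function name and sample data are made up, and the " - " separator before the EVR follows the usual rpm changelog convention rather than anything stated here.

```python
from datetime import datetime, timezone


def bump_changelog(changelog_text, when, packager, evr):
    """Prepend a '* date name evr' build event line to the changelog text."""
    # rpm changelog dates look like "Thu Feb 04 2021" (default C locale).
    header = f"* {when.strftime('%a %b %d %Y')} {packager} - {evr}\n"
    return header + changelog_text


# Detached changelog before the build: a pending plain-text description
# at the top, followed by the previous build event.
OLD = (
    "- Fix the font family name\n"
    "\n"
    "* Mon Jan 04 2021 Jane Doe <jane@example.org> - 1.0-1\n"
    "- Initial build\n"
)

new = bump_changelog(
    OLD,
    datetime(2021, 2, 4, tzinfo=timezone.utc),
    "Jane Doe <jane@example.org>",
    "1.0-2",
)
print(new)
```

No rpm syntax knowledge is needed beyond the header line: the pending description simply ends up under the new build event.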

How do I let rpm generate the changelog automatically?

This feature is not changelog generation, just changelog bumping on build events. You still need some other method to put non-build events in the changelog.

The bumping is automatic and does not require explicit packager action except launching a build.

Is the old changelog discarded?

The old changelog file is replaced in SRPM sources with a new file containing the new build event lines before the old lines.

An rpmbuild -bs or rpmspec -P will show the future bump, but nothing will be stored in sources before the build reaches the %build phase, to avoid bumping whenever something parses the spec file.

Why is a separate "rpm-changelog.txt" file with manually maintained changelog better than current manually maintained changelog inside .spec?

See next question.

How about using git commit log for changelog instead?

This is a low level rpm change that does not depend on any specific SCM infrastructure, git included. It works directly at the rpm level. So it does not depend on the existence of git and git commits.

However, a git infra can make use of the now detached changelog to feed commit info to the rpm build process.

The %changelog section should be auto-generated from commit messages; I don't want to maintain a separate file with the changelog.

The feature is not about generating changelogs, it’s about bumping them. You can generate the changelog file any way you wish.

How will the changelog be maintained?

The changelog will be maintained any way you wish to maintain it; it’s just a plain text file in package sources.

An infrastructure that uses git can feed git commit events to the detached changelog file, using dumb or elaborate git commit hooks, or any other method it wants to implement. The auto-bump logic does not care: it will use the detached changelog file in the state it exists at the start of the build process.

Because the logic catches all rebuilds, regular manual trimming of the lines that add no value is recommended.
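As an illustration of the "dumb" end of that spectrum, the Python sketch below turns commit subjects into pending changelog lines. It is only an assumption of what such a hook could do; the function name and the sample data are hypothetical, and how the subjects are obtained from git is left out.

```python
def add_commit_entries(changelog_text, commit_subjects):
    """Prepend one '- subject' line per non-empty commit subject."""
    lines = [f"- {s.strip()}\n" for s in commit_subjects if s.strip()]
    # Keep a blank line between the pending lines and the previous entry.
    if lines and changelog_text.startswith("* "):
        lines.append("\n")
    return "".join(lines) + changelog_text


CHANGELOG = (
    "* Mon Jan 04 2021 Jane Doe <jane@example.org> - 1.0-1\n"
    "- Initial build\n"
)

updated = add_commit_entries(CHANGELOG, ["Fix the font family name", "  "])
print(updated)
```

The pending lines carry no date or EVR; under this proposal the next build adds the "* date name evr" line above them.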

What about having one macro called %buildsys_packager instead of two?

This is certainly doable, a good suggestion, and has been done in the last version of the code.

Can you list the relevant %auto macros explicitly somewhere?

Auto release bumping and auto changelog bumping involve registering some processing in the preamble (to compute the next EVR), in %sourcelist (to deal with the source files involved in saving state), in %build (to commit the new data to disk once the build is ongoing), and in %changelog (to get rpmbuild to record the new changelog state in package metadata).

That is, it depends on processing in %auto_pkg, %auto_sources, %auto_build and %auto_changelog. Some of this processing is unchanged from https://fedoraproject.org/wiki/Changes/Patches_in_Forge_macros_-_Auto_macros_-_Detached_rpm_changelogs

The bumping is done by the buildsys subsystem, in practice by %new_package (called by %auto_pkg, directly or via %buildsys_pkg), %buildsys_sources (called by %auto_sources), %buildsys_build (called by %auto_build) and %buildsys_changelog (called by %auto_changelog).

It’s done by the buildsys subsystem because that subsystem is tasked with writing the SRPM header in the new %auto_call framework, so only it knows which of the various (sub)package epochs and versions are the ones that apply to the SRPM.

But see below ↓: reading the diff is probably simpler than reading everything I just wrote.

Can I see the code diff?

The current code state is visible in

It’s one small commit on top of the huge change queued in:

That PR can still evolve based on feedback and testing. Therefore, I can’t promise that the auto-bumping logic will always apply directly, just that it will look more or less this way after rebasing. I do not rebase it on every change to the other PR.

This is very young code, so there are probably lots of easy things to tidy up in there. However, it works.

Can I see some spec file examples?

You can take any spec file in https://copr.fedorainfracloud.org/coprs/nim/refactoring-forge-patches-auto-call-bump-changelog-fonts/builds/

except for the two macro packages those use. (Those are fonts packages, so the fonts macros are required, but autobumping is done at the generic redhat-rpm-config level, not at the fonts macro level)

However you won’t find any bumping specific changes in there. The spec files are strictly identical to those that were posted in: https://fedoraproject.org/wiki/Changes/Patches_in_Forge_macros_-_Auto_macros_-_Detached_rpm_changelogs

The feature does not require additional spec changes.

Why does the feature require a complex %auto_call framework?

The %auto_call framework was not coded for the feature but to solve other Fedora packaging problems: https://fedoraproject.org/wiki/Changes/Patches_in_Forge_macros_-_Auto_macros_-_Detached_rpm_changelogs

Once it was coded, adding autobumping was a trivial addition.

Can it be used without the %auto_call framework?

Unfortunately, no. The implementation is simple, reliable and easy because the %auto_call framework already splits the release value into easy-to-process parts, puts them in variables that rpm can manipulate, provides the entry points the feature uses at various stages of the spec file, and detaches the changelog file.

An independent implementation would need to reproduce all this work first.

Anything that lets the packager fill in Release: directly in the spec file cannot bump at the RPM level without RPM changes, since the act of reading the Release tag (and making it available as %{release} for further processing) sets the Release value in stone.

Are there other things to tidy up?

A production implementation would probably split %{dist} into %{distcore} and %{distprefix} (the .gitdatehash things we stuff in Releases and in rpm changelogs, as opposed to the fcX part we want to appear in Release but not in changelogs). I know where the offending code is in fedora-release and the split is trivial to implement, but there’s no point in worrying about this level of detail before the core of the feature is approved (or not).

Right now the implementation inhibits %dist in bulk (decorations included) in changelogs.

Is this related to Pierre/Pingou's work on the same topic that was deployed to koji staging?

It’s a different implementation, at the rpm level, that does not tie bumping to Fedora infra (koji included). However, it is probably complementary to what pingou did on the changelog feeding front.

IMHO the design mistake so far was to conflate bumping and non-build event changelog filling. You need to do both of course, but recording a build event should be driven by the lowest common denominator (rpmbuild), with koji/infra scraping rpmbuild results as usual and exposing them to users.

I don’t want to prepare magic bumping SRPMs, I have enough work for now

The proposal does not involve preparing a special SRPM out of band that is then fed to koji. The SRPM containing the bumped changelog and last build info is a result of the build process, alongside the binary packages.

If you want to propagate bumps, you do need to import the build results in whatever you use for your next build (typically, importing the SRPM produced by a build) but you can also choose to kill this branch of history and scratch the produced SRPM. That’s your packager choice.

I think the Change Page does not mention that Koji will be committing anything to the dist-git.

This feature does not need koji to commit anything to dist-git. While that would be nice to have, the back-commit can be done by the human who scheduled the build, or by fedpkg, or whatever.

That also means that till this back commit is done whoever scheduled the build can decide to scratch it all as a dead evolutionary branch. And you can do "I feel lucky" tests and forget about them if they turn out bad.

But yes, in a model where bumping is infrastructure-independent and done at the rpm level, the various infrastructures still need to pick up the results of rpm builds and do whatever they want to do with them. rpmbuild can create rpms; it cannot record things in organization-specific systems.

See also ↓.

Builders currently do not have commit access to git and I'm not sure if we want them to considering they have git installed (so they can clone) as well as access to all the packages in dist-git from a networking point of view (again so they can clone).

So if we were to give the builders commit access to dist-git, an attacker could easily commit to any other packages, potentially from something as easy as a scratch-build.

From a pure high-level view, the thing in our infra that gates builds and decides whether they are official or scratched is bodhi.

So if you wanted to push Fedora release logic to its ultimate conclusion, the thing that should be in charge of committing the new release/changelog build state to package history in git is bodhi, not koji. And you can put security related checks there, since deciding to push things to users requires security related checks anyway (that probably also involves branching while a bodhi update is in flight and not approved yet).

However, that’s if you wanted to push the model to its ultimate conclusion and have something nice, solid, automated, and future-proof.

If you don’t want to touch bodhi, and if you do not want koji to commit to git, you can just:

  • make the koji client return the URL that will contain the SRPM at the end of the build process if it succeeds,
  • have the person or script that called the koji client (and has, presumably, write access to the corresponding packages) consult the build results later,
  • and have this person or script decide whether to commit the build result to history or not.

That’s the REST way of doing things. It’s a cop-out because you push hard commit decisions to the client, but it’s a perfectly valid approach. The commit decision exists with or without my change; it’s just that people have (successfully) convinced themselves git is magic and git makes release decisions go away.

You could also try to filter source files to limit the back commit to specific files. But really, if you don’t trust your build process to modify files in a secure way, you should not distribute the produced RPMs in the first place.

rpmautospec relies on git tags to store the build info, could it be considered here?

Why did you bother with it in the first place?

That does not solve any of the hard problems, because:

  • rpm builds are both producers and consumers (in the rpm changelog) of build info. Feeding build info from an external system like git involves unhealthy dependency loops and taking bets in git that a build will occur and succeed at a specific date. Build info should not depend on anything but the build process itself.
  • using git tags as reference ties our builds into Fedora git infra, breaking import/export packaging workflows.
  • spec munging will probably fail except on very simple spec files. Relying on spec munging opens an arms race between the people that invent more advanced packaging patterns in spec files, and the tooling people that try to parse and munge the result. The spec author knows their own spec data structures; keeping spec modifications within the spec build process is the most reliable and future-proof solution.
  • conversely, external spec templating adds yet another layer to our packaging process, one that will need long-term maintenance. Historically, external rpm templating solutions (and there have been a *lot* of those) fail to adapt to rpm changes and end up producing poor lowest-common-denominator spec files. This change does not add any software to the Fedora packaging stack; it’s just a bunch of macros.

Why does it fail in mock?

The feature is currently hitting two mock limitations:

1. first, mock collects the SRPM at the start, not the end, of a build. So the SRPM produced by mock does not contain the bumped state. That’s not a difficult problem to solve: it just involves calling the SRPM generation method at the end of the build (either by default or as an optional thing).

2. second, you need to pass packager info to mock to get the correct name and mail in the changelog. Again, this is not a show-stopper, just some work to do on option passing to mock (you can pass the info using existing mock config files but that’s a bit awkward).

Mock issue: https://github.com/rpm-software-management/mock/issues/599

The hardest thing is to make the feature work at the rpm level, propagating it to upper levels is comparatively easier. All the upper levels are designed to consume what rpmbuild outputs.

You should talk to people first!

The feature did not exist, even as an idea, two days before the change was posted. It started as an experiment, “how hard could it be”, that was wildly successful.

The change is proposed a full cycle before its target release to give people time to think about it and decide if and how they wish to use it.

Why did you bother with it in the first place!

From a purely architectural POV, I’m convinced the proposed approach is the correct approach. Anything else proposed so far involves:

  • tying a low-level event like "build occurred at date XXX" to high-level Fedora infra (making our workflow non-portable and incompatible with downstreams and third parties),
  • taking bets in git that a build will occur and succeed (before it actually occurs and succeeds; in real life builds fail for various reasons), and
  • attempting to munge the spec file behind the packager’s back (less and less likely to work as spec files become more automated and dynamic).



Benefit to Fedora

Autobumping removes a huge packager chore and makes time-stamping in changelogs more reliable.

Scope

  • Proposal owners: The feature is coded and works at the rpm level. Unfortunately, mock filters away the SRPMs containing the bump state, so it does not work in upper layers.
  • Other developers: The feature requires buy-in by mock developers (and probably koji developers) to lift the restrictions that block it above the rpm level. Also, it requires a mechanism to pass the user name and email that will be used in bumped changelogs (defining %buildsys_packager in ~/.rpmmacros is sufficient at the rpm level).
  • Policies and guidelines: maybe eventually if things work out on the technical level
  • Trademark approval: N/A (not needed for this Change)

Upgrade/compatibility impact

This is a pure build tooling update: it changes how things are built, not what is built.

How To Test

A redhat-rpm-config package with the changes and some example packages are available in

 https://copr.fedorainfracloud.org/coprs/nim/refactoring-forge-patches-auto-call-bump-changelog-fonts/builds/

Since the mock/copr layer is currently blocking the feature, you need to install the redhat-rpm-config and forge macro packages available in this repo locally. Afterwards you can take any of the example packages in the repo and rebuild them with rpmbuild -ba to your heart’s content, and see the releases bump and the changelogs being updated accordingly.

To get beautiful changelogs, you also need to add

%buildsys_packager  Your name <Your email>

in ~/.rpmmacros

User Experience

N/A Packager experience change only

Dependencies

The change is a spin-off of

https://fedoraproject.org/wiki/Changes/Patches_in_Forge_macros_-_Auto_macros_-_Detached_rpm_changelogs

Therefore, it depends on the success of that other change and will probably need rebasing if the code in this other change evolves during the redhat-rpm-config merge.

It also depends on mock/copr/koji buy-in and changes, which may add their own requirements.

Contingency Plan

There is no contingency plan because the change will either happen or not happen at all.

Documentation

There is as much documentation as for the average redhat-rpm-config change (i.e. comments in the macro files themselves).

Release Notes

N/A Packager productivity change only