From Fedora Project Wiki
Latest revision as of 15:51, 7 November 2022

Garbage Collection in Koji

By garbage collection, we mean removing builds from the system that are no longer needed. While it would be nice to keep everything forever, we have a finite amount of space and the rpms pile up quickly. While the current implementation focuses on removing builds, we may in the future perform other sorts of cleaning.

The intent of garbage collection is to be as careful as possible and only remove builds that truly are no longer needed. The process happens in stages and a notification email is sent to the build owner when a build is marked for deletion.


Stages

The three stages of garbage collection are:

  • pruning: obsoleted builds are untagged according to policies
  • trashing: untagged/unreferenced builds are placed in the trashcan
  • deleting: builds in the trashcan are deleted

You might envision a build's "life-cycle" as follows:

tagged -> untagged -> in_trashcan -> deleted

Pruning

In the pruning step, unneeded builds are automatically removed from certain tags according to a set of policies. These policies are robust and allow rules based on tag, package, age (within tag), order (within tag), and signatures. More details on this are presented below.

Because of the policies, the pruning step can be made to behave in different ways. However, the overall intent is to remove old builds from tags. Generally policies will be along the lines of "keep the latest three builds of each package in such-and-such tag."

Please note that pruning does not delete builds; it only untags them (though this may eventually lead to their deletion via the later stages). Since a build can carry several tags, it may be untagged from one tag but remain in another.
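The "keep the latest three builds" style of policy, and the fact that untagging happens per tag, can be sketched roughly as follows. This is only an illustrative model: the `prune_tag` function, its `keep` parameter, and the sample tag contents are assumptions made for this example and are not part of Koji.

```python
from collections import defaultdict

def prune_tag(tagged, keep=3):
    """Return the builds that would be untagged from one tag.

    `tagged` is a list of (package, nvr) pairs ordered from most recently
    tagged to least recently tagged (a hypothetical layout for this sketch).
    """
    kept = defaultdict(int)  # package -> number of builds kept so far
    to_untag = []
    for package, nvr in tagged:
        if kept[package] < keep:
            kept[package] += 1
        else:
            to_untag.append(nvr)
    return to_untag

# Hypothetical tag contents; the same build may appear in several tags.
tags = {
    "f37-updates-candidate": [
        ("foo", "foo-1.4-1"), ("foo", "foo-1.3-1"),
        ("foo", "foo-1.2-1"), ("foo", "foo-1.1-1"),
    ],
    "f37-updates": [("foo", "foo-1.1-1")],
}

untagged = {tag: prune_tag(builds) for tag, builds in tags.items()}

# What remains in each tag after pruning.
still_tagged = {
    tag: [nvr for _, nvr in builds if nvr not in untagged[tag]]
    for tag, builds in tags.items()
}
```

In this sample, foo-1.1-1 is untagged from the candidate tag but remains in f37-updates, so it is not yet a candidate for the trashcan.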

Trashing

Trashing a build simply means tagging it with the special 'trashcan' tag. We say that such a build is 'marked for deletion' or 'has been placed in the trashcan.' The point of this step is to provide a safety net and give the build a chance to be salvaged if need be. The garbage collector will only place a build in the trashcan if it satisfies these basic requirements:

  • the build is untagged, and has been untagged for at least 5 days (actual delay configurable)
  • the build is not signed with a protected key
  • the build has not been used in the buildroot of another completed build
  • the build has not been used in any buildroot for 5 days (same configurable delay as above)

If a build satisfies these conditions, then it will be placed in the special trashcan tag and an email notification is sent to the build owner.
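The four requirements above can be read as a single eligibility check. The sketch below is a simplified model, not the collector's actual code: the `BuildState` fields and the `eligible_for_trashcan` function are hypothetical names, and the 5-day delay is the default mentioned above.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional

DELAY = timedelta(days=5)  # configurable in the real collector

@dataclass
class BuildState:
    tags: set                               # tags the build currently carries
    untagged_since: Optional[datetime]      # when the last tag was removed
    has_protected_sig: bool                 # signed with a protected key
    used_by_completed_build: bool           # in the buildroot of a completed build
    last_buildroot_use: Optional[datetime]  # most recent use in any buildroot

def eligible_for_trashcan(b, now, delay=DELAY):
    if b.tags:
        return False  # still tagged somewhere
    if b.untagged_since is None or now - b.untagged_since < delay:
        return False  # not untagged for long enough
    if b.has_protected_sig:
        return False
    if b.used_by_completed_build:
        return False
    if b.last_buildroot_use is not None and now - b.last_buildroot_use < delay:
        return False  # used in a buildroot too recently
    return True

now = datetime(2022, 11, 5)
stale = BuildState(set(), datetime(2022, 10, 29), False, False, None)
recent = replace(stale, untagged_since=datetime(2022, 11, 3))
signed = replace(stale, has_protected_sig=True)
```

Here `stale` has been untagged for seven days and passes every check, while `recent` (untagged two days ago) and `signed` (protected key) do not.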

Deleting

In the deleting step, each build in the trashcan is examined. If it is still eligible for deletion and has been in the trashcan longer than the grace period (4 weeks, configurable), it is deleted. If it is not eligible (e.g. it has been tagged elsewhere or has somehow acquired a protected signature), then it is removed from the trashcan tag (this process is called salvage).

When a build is actually deleted, the files are removed from disk and some (but not all) of the data about the build is removed from the database. The residual data is quite small, though. In particular, the build entry and rpm entries are still present. This prevents reuse of the NVR (name-version-release) or NVRAs (name-version-release-arch).
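The decision made for each build in the deleting step can be sketched as a three-way outcome. This is a simplified illustration: the `deletion_step` function and its parameters are hypothetical, eligibility is reduced to the two examples given above (another tag or a protected signature), and the "wait" outcome is inferred from the grace period rather than stated explicitly.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(weeks=4)  # configurable in the real collector

def deletion_step(other_tags, has_protected_sig, trashed_at, now,
                  grace=GRACE_PERIOD):
    """Return 'salvage', 'delete', or 'wait' for a build in the trashcan."""
    if other_tags or has_protected_sig:
        return "salvage"  # no longer eligible: remove the trashcan tag
    if now - trashed_at > grace:
        return "delete"   # eligible and past the grace period
    return "wait"         # eligible, but still within the grace period

trashed_at = datetime(2022, 11, 5, 7, 47, 8)
early = deletion_step(set(), False, trashed_at, datetime(2022, 11, 20))
late = deletion_step(set(), False, trashed_at, datetime(2022, 12, 10))
rescued = deletion_step({"f37-updates-candidate"}, False, trashed_at,
                        datetime(2022, 12, 10))
```

With these sample dates, `early` is "wait", `late` is "delete", and `rescued` is "salvage" because the build has acquired another tag.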

Restoring/Protecting Builds

Deleted builds cannot be restored. However, the GC process can easily be interrupted at any of the earlier stages by ensuring that the build is appropriately tagged.

If a build has been mistakenly marked for deletion, i.e. it has been placed in the trashcan by the garbage collector, then you can interrupt the GC process by tagging it elsewhere. When the GC sees that the build has acquired a new tag, it will remove the trashcan tag and leave it alone.

Knowing which tag to use can sometimes be tricky. If the build is still a candidate for release, then use the appropriate candidate tag. It might be useful to look at the tagging history of the build. For example:

$ koji list-history --build golang-uber-ratelimit-0.2.0-1.fc37 -s tag_listing
Fri Sep  2 03:42:02 2022 golang-uber-ratelimit-0.2.0-1.fc37 tagged into f37-updates-candidate by mikelo2
Sat Oct 29 23:39:52 2022 golang-uber-ratelimit-0.2.0-1.fc37 untagged from f37-updates-candidate by oscar
Sat Nov  5 07:47:08 2022 golang-uber-ratelimit-0.2.0-1.fc37 tagged into trashcan by oscar [still active]

In this example, the build was retagged into f37-updates-candidate, the tag it was last untagged from, with the following command.

$ koji tag-build f37-updates-candidate golang-uber-ratelimit-0.2.0-1.fc37

If you are not sure where to tag a build, then ask release engineering.
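Reading the tagging history by eye works for one build; for several, a small script can point at the last tag a build carried before the trashcan. The sketch below assumes the `koji list-history` output looks exactly like the sample shown above; the `HISTORY_LINE` pattern and the `last_real_tag` helper are hypothetical and only suggest a tag to consider, not necessarily the right one.

```python
import re

# Matches lines such as:
#   "<timestamp> <nvr> tagged into <tag> by <user>"
#   "<timestamp> <nvr> untagged from <tag> by <user>"
HISTORY_LINE = re.compile(
    r"^(?P<when>.+?) (?P<nvr>\S+) (?P<verb>tagged into|untagged from) "
    r"(?P<tag>\S+) by (?P<user>\S+)"
)

def last_real_tag(history_text, trashcan="trashcan"):
    """Return the most recent non-trashcan tag mentioned in the history."""
    last = None
    for line in history_text.splitlines():
        m = HISTORY_LINE.match(line.strip())
        if m and m.group("tag") != trashcan:
            last = m.group("tag")
    return last

sample = """\
Fri Sep  2 03:42:02 2022 golang-uber-ratelimit-0.2.0-1.fc37 tagged into f37-updates-candidate by mikelo2
Sat Oct 29 23:39:52 2022 golang-uber-ratelimit-0.2.0-1.fc37 untagged from f37-updates-candidate by oscar
Sat Nov  5 07:47:08 2022 golang-uber-ratelimit-0.2.0-1.fc37 tagged into trashcan by oscar [still active]
"""

suggestion = last_real_tag(sample)
```

For the sample history, `suggestion` is f37-updates-candidate. Whether that tag is still appropriate is a judgment call, so check with release engineering when in doubt.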


More about pruning policies

The pruning policy is a series of rules. During pruning, the garbage collector goes through each tag in the system and considers its contents. For each build within the tag, it goes through the pruning rules until it finds one that matches. If it finds one, it takes that rule's action for the build.

In the policy configuration, each line is a rule, and the first matching rule wins. The format is:

test <args> [ && test <args> && ...]  :: action
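To make the rule format concrete, here is a minimal sketch that splits one rule line into its tests and its action. The `parse_rule` function is hypothetical and only mirrors the format shown above; it is not Koji's own parser and does not validate test names or arguments.

```python
def parse_rule(line):
    """Split 'test <args> [&& test <args> ...] :: action' into parts."""
    tests_part, sep, action = line.partition("::")
    if not sep:
        raise ValueError(f"missing '::' in rule: {line!r}")
    tests = []
    for chunk in tests_part.split("&&"):
        words = chunk.split()
        if not words:
            raise ValueError(f"empty test in rule: {line!r}")
        # First word is the test name, the rest are its arguments.
        tests.append((words[0], words[1:]))
    return tests, action.strip()

tests, action = parse_rule("tag *-testing *-candidate && order >= 2 :: untag")
# tests  -> [("tag", ["*-testing", "*-candidate"]), ("order", [">=", "2"])]
# action -> "untag"
```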

The available tests are:

  • tag: the name of the tag must match one of the patterns
  • package: the name of the package must match one of the patterns
  • age: a comparison against the length of time since the build was tagged. This is not the same as the age of the build.
  • sig: true if any of the build's component rpms are signed with a matching key
  • order: a comparison against the order number of the build within a given tag. The order number is the number of more recently tagged builds for the same package within the tag. For example, the latest build of glibc in dist-f8 has order number 0, the next latest has order number 1, and so on. Note that the 'skip' action modifies this: the build is kept, but is not counted for ordering.

Note that the tests are not being applied to just a build, but to a build within a tag. If a build is multiply tagged, it will be checked against these policies for each tag and may be kept in some but untagged in others.

The available actions are:

  • keep: do not untag the build from this tag
  • untag: untag the build from this tag
  • skip: like keep, but do not count the build for ordering

Note that, regardless of any policies, locked tags are left alone.
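The interaction between first-match-wins evaluation, order numbers, and the skip action can be sketched as follows. This is a simplified model with hypothetical names (`evaluate_tag`, `is_fresh`, `old_candidate`); rules are written as plain Python functions rather than parsed policy lines, and treating "no rule matched" as keep is an assumption of this sketch.

```python
from fnmatch import fnmatch

def evaluate_tag(tag, builds, rules):
    """Apply first-match-wins rules to the builds in one tag.

    `builds` is ordered from most to least recently tagged; each entry is a
    dict with 'package', 'nvr', and 'age_days' (time since tagging).
    `rules` is a list of (predicate, action) pairs.
    """
    counted = {}  # package -> newer builds counted for ordering so far
    result = {}
    for b in builds:
        order = counted.get(b["package"], 0)
        action = "keep"  # assumption: an unmatched build stays tagged
        for predicate, rule_action in rules:
            if predicate(tag, b, order):
                action = rule_action
                break  # first matching rule wins
        if action != "skip":
            counted[b["package"]] = order + 1  # skipped builds are not counted
        result[b["nvr"]] = (order, action)
    return result

def is_fresh(tag, build, order):
    return build["age_days"] < 1

def old_candidate(tag, build, order):
    in_scope = any(fnmatch(tag, p) for p in ("*-testing", "*-candidate"))
    return in_scope and order >= 2

rules = [(is_fresh, "skip"), (old_candidate, "untag")]
builds = [
    {"package": "glibc", "nvr": "glibc-2.36-4", "age_days": 0.5},
    {"package": "glibc", "nvr": "glibc-2.36-3", "age_days": 10},
    {"package": "glibc", "nvr": "glibc-2.36-2", "age_days": 30},
    {"package": "glibc", "nvr": "glibc-2.36-1", "age_days": 60},
]
result = evaluate_tag("f37-updates-candidate", builds, rules)
```

In this sample the newest build is skipped, so the next build also gets order number 0. As a result glibc-2.36-2 has order 1 and is kept; without the skip it would have had order 2 and been untagged.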

At present, the pruning policy is:

tag *-updates :: keep
age < 1 day :: skip
sig fedora-gold :: skip
sig fedora-test && age < 12 weeks :: keep

tag *-testing *-candidate && order >= 2 :: untag
tag *-testing *-candidate && order > 0 && age > 6 weeks :: untag
tag *-candidate && age > 60 weeks :: untag

order > 2 :: untag

If you think the policy needs to be adjusted, please notify a koji admin.