From Fedora Project Wiki
Revision as of 19:53, 22 December 2024


Packit as a dist-git CI

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Summary

This Change aims to replace the current dist-git CI solutions (Fedora CI and Zuul) with one based on Packit and to deprecate the current solutions.

This change does not affect the tests being run or the test execution service (Testing Farm); it affects only the mechanism used to trigger the tests and report their results.

This is strictly unrelated to the opt-in Packit-provided workflows (i.e. it does not require using Packit for release syncing). A slightly different branding might be used to avoid confusion.


Owner



Current status

  • Targeted release: Fedora Linux 43 (technically not tied to a specific release)
  • Last updated: 2024-12-22
  • [Announced]
  • [<will be assigned by the Wrangler> Discussion thread]
  • FESCo issue: <will be assigned by the Wrangler>
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

Packit has a long-standing, feature-rich integration with Testing Farm as well as automation around dist-git pull requests, so the Packit team is willing to help with the current, non-ideal situation of dist-git CI. There do not appear to be enough resources to properly support and improve the current solutions (Fedora CI and Zuul). This proposal does not affect Fedora CI Bodhi testing, but that might be considered as a next step.

Fedora CI maintenance has been a long-term concern: the dist-git integration has no clear owner, and there were plans to move it to the Testing Farm team. The Testing Farm team would prefer not to maintain these integrations and would rather have a single integrator for both upstream and downstream.

Since Packit Fedora automation heavily relies on working CI on dist-git pull requests, we need to make sure it works reliably.

Most of the code is already there in Packit’s codebase. The work is mainly about plumbing everything together and making sure it works as expected.

We welcome any suggestions, but these are the phases we are considering:

  • phase 0: (current state)
    • opt-in (by Packit service configuration)
    • Scratch build being run (example PR)
  • phase 1:
    • opt-in (publicly available)
    • Scratch build being run
    • time: ~ month
  • phase 2:
  • phase 3:
    • opt-in (publicly available)
    • scratch build + installability check + user-defined TMT tests (as separate results)
    • time: ~ month
  • final phase ~ Fedora CI replacement:
    • by-default
    • Fedora CI is deprecated
    • Separate Packit deployment instance for CI; running in the Fedora Infrastructure.
    • opt-out in favour of Zuul
    • scratch build + installability check + user-defined TMT tests (as separate results)
    • time: ~ enough time for people to try before defaulting
  • next steps
    • Everything currently run by Fedora Zuul on dist-git pull requests is run via Packit (opt-in).
    • Deprecate dist-git tenant on Fedora Zuul instance.
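
The phases above can be read as a small rollout table: which checks run, and whether a package gets them by default or only after opting in. The following is a minimal sketch of that reading, not Packit's implementation. The names `Phase`, `PHASES` and `checks_to_run` are hypothetical, and phase 2 is omitted because the list above does not state its contents.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Phase:
    """One rollout phase; all names here are illustrative, not Packit's."""
    name: str
    by_default: bool          # False means maintainers must opt in
    checks: Tuple[str, ...]   # each check is reported as a separate result


FULL = ("scratch build", "installability check", "user-defined TMT tests")

# Phase 2 is left out on purpose: the list above does not state its contents.
PHASES = {
    p.name: p
    for p in (
        Phase("phase 0", by_default=False, checks=("scratch build",)),
        Phase("phase 1", by_default=False, checks=("scratch build",)),
        Phase("phase 3", by_default=False, checks=FULL),
        Phase("final phase", by_default=True, checks=FULL),
    )
}


def checks_to_run(phase_name: str, opted_in: bool = False,
                  opted_out: bool = False) -> Tuple[str, ...]:
    """Return the checks triggered for one package in the given phase."""
    phase = PHASES[phase_name]
    if phase.by_default:
        # Final phase: runs for everyone unless the package opted out
        # (in the proposal, in favour of Zuul).
        return () if opted_out else phase.checks
    # Earlier phases: nothing runs unless the package opted in.
    return phase.checks if opted_in else ()
```

For example, `checks_to_run("phase 1", opted_in=True)` yields only the scratch build, while `checks_to_run("final phase")` yields all three checks without any opt-in.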

Risks:

  • Deployment to Fedora infrastructure
    • Despite Packit requiring only OpenShift for its deployment, there can still be problems found when deploying to another cluster.
    • Some work is required to be done by the Fedora Infrastructure team that we can’t influence.
  • Performance
    • In the final phase, when the CI runs by default, the load might increase significantly. Although the number of packages actively using dist-git pull requests is not huge, we still need to be prepared.
  • dist-git git forge switch
    • Despite Packit being written in a git-forge-independent way, there can still be some issues related to it (be it on Packit's side or not).
    • (Forgejo-support in the underlying forge-unifying library is being researched and work can start soon.)

Support:

  • For transparency, the Packit team is a Red Hat team under the same manager as the CPT team (=Copr and Mock) and in the same group as Testing Farm and other community-related projects.
  • The team works and plans in public (see the Packit Team Kanban Board) and welcomes any external contributor to collaborate.
  • To have more Fedora control over the suggested service, the team wants to deploy it to the Fedora Infrastructure OpenShift cluster and give the Fedora Infrastructure team access to it so the Packit team can’t become a bottleneck.
  • In case of interest, we welcome anyone to join us in this effort.
  • We are active in the #packit@fedora.im Matrix channel, but a separate channel can be used to improve communication.

A couple of links to better understand the current situation:

Feedback

  • Confusion with the current Packit workflows
    • Rebranding might be needed.
  • There are concerns about fragmentation across Fedora services.
    • Packit’s goal is not to implement everything but rather integrate existing services and provide the best user experience. This can lead to the reduction of services users need to interact with.
    • Packit is already in the ecosystem so it’s about using it for more use cases.
  • STI is currently not supported by Packit.
    • Technically it can be introduced (because Testing Farm supports it), but since it has been obsoleted for some time and there is a Change Proposal to not support it, this might not be needed at all. (We can use this as an opportunity to move away.)
  • What to do about test/pipeline definition repositories? Who should be responsible for it and how to better collaborate on these?
    • As part of this change, we would also like to clarify the responsibilities around shared plans and clearly state who is responsible for which part. Packit can maintain these repositories but does not need to.
  • Should Packit aim to run Zuul plans for everyone? (The plans that are not run by Fedora CI.)
    • Ideally, the choice of checks should not be part of this change, but we can run the installability check by default and let people opt in to the checks currently run by Zuul. (And make it possible to run these by default if there is a broader agreement.)
  • Should there be an opt-out mechanism for the final solution?
    • If needed, we can allow this, but ideally there should be a broader discussion and agreement on whether packages can opt out of default jobs.
  • Why not pick current solutions?
    • This is covered in more detail in the Benefit to Fedora section, but the main difference is less about technology and more about having a team dedicated to maintaining and further developing the solution.
  • What will be the effect on Koji? Won't it increase the load because of the duplicate scratch builds?
    • In the end, there will be only a single dist-git CI, so there won't be any duplicates.
    • During the opt-in phases, duplicates are possible.
      • We can ask people to opt out of Zuul if they opt in to the Packit-based CI. For Fedora CI, we would welcome any suggestions, since we are not aware of a way to opt out of it.
      • We can also try to share scratch builds with the other solutions, but since it's only for a limited time, this might not be worth the effort.

Benefit to Fedora

  • Single CI system to run and maintain.
    • Both for the user and resource efficiency (e.g. single scratch build, single service to maintain).
  • Reliable and actively maintained+developed CI
    • Most importantly, the Packit team offers not only the implementation but also people to support the service. (We consider this to be the main issue with the current solutions.)
    • We sadly hear regular complaints about the stability of both existing solutions and about issues that have remained unfixed for a long time. (This is mainly related to the previous point: people do not have enough time for it.)
    • Packit has a history of being reliable (current set of SLOs, dist-git specific ones might be defined) and has a dedicated team.
    • We also developed multiple mechanisms to make the testing process more reliable (e.g. auto-retry on infrastructure issues and so-called babysitting tasks to check for results if the result message got lost).
  • Feature-rich CI experience
    • CI integration does not need to be only about triggering the test job and reporting its result. It can also allow rerunning a particular part of the test suite or influencing the test job workflow based on rules (e.g. based on labels). People can decide what to run and whether to run it automatically or manually. This is especially useful for user-defined tests.
  • Same UX for upstream and downstream
    • This would allow people to have the very same test experience and definitions both upstream and downstream.
    • Packit handles not only tests. We want to unify the experience for the whole pipeline from upstream through dist-git and Koji to Bodhi.
  • Git-forge-agnostic implementation thanks to a unified git-forge API (Pagure and GitLab are available now; Forgejo is being researched and work can start soon).
    • This work can be coordinated with the git-forge swap for a smoother transition and to save time migrating the existing solutions.
  • Possible future enhancements when this work is done:
    • Copr builds (available in upstream)
    • OpenScanHub checks (available in upstream)
    • Reverse-dependency builds/tests (via Koschei, research done)
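
The reliability mechanisms mentioned above (auto-retry on infrastructure issues and babysitting tasks that check for results when a result message got lost) can be illustrated roughly as follows. This is a simplified sketch under stated assumptions, not Packit's code: `InfrastructureError`, `run_with_retry` and `babysit` are hypothetical names, and the submit/fetch callables stand in for whatever calls the real service makes.

```python
import time
from typing import Callable, Optional


class InfrastructureError(Exception):
    """Hypothetical marker for transient infrastructure failures."""


def run_with_retry(submit: Callable[[], str], max_attempts: int = 3,
                   delay: float = 0.0) -> str:
    """Resubmit a job when it fails for infrastructure reasons only.

    Genuine test failures are not retried; they are expected to be
    reported as results rather than raised as InfrastructureError.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except InfrastructureError:
            if attempt == max_attempts:
                raise
            time.sleep(delay * attempt)  # simple linear back-off
    raise ValueError("max_attempts must be at least 1")


def babysit(fetch_result: Callable[[], Optional[str]], max_polls: int = 5,
            interval: float = 0.0) -> Optional[str]:
    """Poll for a result in case the result message never arrived."""
    for _ in range(max_polls):
        result = fetch_result()
        if result is not None:
            return result
        time.sleep(interval)
    return None  # still unknown; the caller decides how to report this
```

The two helpers are complementary: the retry covers failures that are visible at submission time, while the polling covers the case where the job finished but the notification about it was lost.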

Scope

  • Proposal owners:
    • Implement the solution following the phases above and communicate the milestones publicly.
    • Update Fedora documentation.
    • Agree on the ownership of shared test plans.
  • Other developers:

No functional change that might affect maintainers. Only UX differences.

  • Release engineering: -
    • Coordination with release engineering is not required. A mass rebuild is not necessary for this change.
  • Policies and guidelines: documentation needs to be updated
  • Trademark approval: N/A (not needed for this Change)
  • Alignment with the Fedora Strategy:

This proposal aligns with Fedora’s strategy of improving contributor experience and easing workflows, supporting a more efficient and user-friendly ecosystem.

Upgrade/compatibility impact

Early Testing (Optional)

Do you require 'QA Blueprint' support? N

How To Test

During phases 1-3, one can opt in to this solution and try it for real. (Most probably via global configuration, similar to the Zuul approach. The team has a research task in progress on this topic. Suggestions are welcome.)


User Experience

  • The functionality and workflow will be preserved.
  • In reaction to a dist-git pull request, a scratch build and follow-up tests are triggered. Users are notified about progress via statuses that link to the Packit dashboard, which provides more details about builds and tests.
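
The flow described above can be sketched as a sequence of status updates on the pull request. This is only an illustration of the described behaviour: `handle_pull_request` and the status strings are assumptions, and the `build` and `test` callables stand in for the real scratch build and Testing Farm run.

```python
from typing import Callable, List, Tuple

Status = Tuple[str, str]  # (check name, state)


def handle_pull_request(build: Callable[[], bool],
                        test: Callable[[], bool]) -> List[Status]:
    """Return the statuses a user would see, in the order they are set."""
    statuses: List[Status] = [("scratch build", "pending")]
    build_ok = build()
    statuses.append(("scratch build", "success" if build_ok else "failure"))
    if not build_ok:
        # Tests need the built packages, so they are not started.
        return statuses
    statuses.append(("tests", "pending"))
    statuses.append(("tests", "success" if test() else "failure"))
    return statuses
```

In this sketch a failed scratch build ends the flow early, so the user sees a single failing status instead of a test result that could never have been produced.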


Dependencies

Ending support for STI tests (being submitted as another change) might ease the implementation.

Contingency Plan

  • Contingency mechanism: N/A (not a System Wide Change)
  • Contingency deadline: N/A (not a System Wide Change)
  • Blocks release? No

If the implementation is not ready in time, we can (maybe temporarily) stay with one of the current solutions.


Documentation

Team research about technical feasibility of this functionality: https://packit.dev/research/integrations/fedora-ci

Release Notes