From Fedora Project Wiki
 
Latest revision as of 20:27, 27 July 2012

Fedora Build System - beyond building the distro

Summary

Right now the fedora build system is focused solely on building the packages for the fedora distribution. It has little or no support for building any other arbitrary package set, nor for non-fedora (but free software) things.

Our goals

  • maintain existing build capabilities in koji (don't break koji)
  • maintain compatibility of our tools (don't break fedpkg)
  • create a fedora-contributor-accessible space where arbitrary pkgs from arbitrary repos, with arbitrary dependencies can be built against fedora securely and consistently. (coprs)
  • overhaul how we set up and maintain our buildboxes so we can redeploy these systems quickly and scale up during high demand. (builder git repo)


Various Problems

  • Koji is not built to handle arbitrary pkgs from arbitrary repos. It is an intentional non-feature of koji. So instead of trying to crowbar something into koji, it makes sense to do this in parallel, using some of the same infrastructure.
  • Koji has trust requirements that cannot be guaranteed for arbitrary repos, specifically in terms of what rpm scriptlets can do when packages are installed as build requirements. With this in mind, it is a non-feature to overlap in ways that could compromise the trust in koji.
  • Spawning and maintaining a massive number of builders (scaling out to 300+ arm builders or needing 100 new builders for a ftbfs/mass-rebuild run)
  • enhancing the user experience in our current build tools for their own systems.

How we get there

There are a lot of areas needed to make this all come together. I broke the process into a series of smaller components such that we had something useful at each step and could assemble them to become the whole system, like voltron.

Things done or mostly done

Over the last 4 or 5 months I've been working on a series of tools to achieve our goals. They are listed below.

Changes to mock

  • added mockchain to mock - implements recursive chain builds and creates the mock config from base configs, adding arbitrary user repos. http://git.fedorahosted.org/cgit/mock.git/tree/py/mockchain.py http://skvidal.wordpress.com/2012/04/20/mockchain-use-cases-and-examples/
  • added plugin to mock to output the chroot system state as well as the set of pkgs available in repositories to yum/mock at the time of the build. Important criteria for solving build failures due to missing build requirements.
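
To make the idea of a recursive chain build concrete, here is a minimal sketch of the retry loop. It is not mockchain's actual code. The names chain_build, try_build and make_fake_builder are made up for illustration, and try_build stands in for a single mock build that returns True on success. The assumption is that a package fails until its build requirements have been built, so failures are retried until a whole pass makes no progress.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def chain_build(
    packages: Iterable[str],
    try_build: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Build in passes, retrying failures until a pass makes no progress.

    try_build is a hypothetical stand-in for one mock build. A real driver
    would also add each successful result to a local repo so that later
    builds can pull it in as a build requirement.
    """
    built: List[str] = []
    pending = list(packages)
    while pending:
        failed: List[str] = []
        for pkg in pending:
            if try_build(pkg):
                built.append(pkg)
            else:
                failed.append(pkg)
        if len(failed) == len(pending):
            break  # nothing built in this pass, so retrying cannot help
        pending = failed
    return built, pending


def make_fake_builder(deps: Dict[str, List[str]]) -> Callable[[str], bool]:
    """Toy builder: a package builds only once all of its deps are built."""
    done = set()

    def try_build(pkg: str) -> bool:
        if all(dep in done for dep in deps[pkg]):
            done.add(pkg)
            return True
        return False

    return try_build
```

With a toy dependency map where c needs b, b needs a, and d needs something that is never available, submitting the packages in the "wrong" order still builds a, b and c over three passes and leaves d as unbuildable.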

Changes to builders

  • reinstall and vm-ize all of our existing builder hw in the bladecenters
  • set up a newbuilder process using libvirtd and ansible to make our builder deployment trivial and consistent.


Work Started

Changes to our infrastructure

Preparing for our private cloud deployment inside phx2. This should allow us to create arbitrary and COMPLETELY isolated instances on demand without a concern for where to deploy them or whether we have sufficient resources (or a way of paying for them :).

Coprs web interface

For users/contributors to submit urls to srpms and build repos, and have the pkgs from the srpm urls built (recursively) against each other and the packages in the build repos. I've started the coprs interface as a test case for trying out flask. I have a limited interface and the data saving out to a db. This is insufficient and incomplete currently. https://copr-skvidal.rhcloud.com/

mockremote/ftbfs tools

Main control job for spawning arbitrary jobs on remote systems.

  • mockremote is a single buildsystem mechanism to parcel out jobs one at a time and maintain a local repository for them to draw build requirements from.
  • ftbfs - multi-builder/multijob builder to attempt to rebuild all of fedora distribution packages. This works but needs refactoring to scale to hundreds of builder processes. It also has a couple of race issues creating a local repo for the whole process. Tested on part of rawhide (2500ish pkgs) with good results.
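
The race around creating a local repo can be illustrated with a small sketch. This is not the ftbfs code, and it assumes a threaded worker model that the real tool may not use. LocalRepo and run_builders are hypothetical names, and the build callable stands in for a remote mock build. The point is only that many builders can run in parallel while updates to the shared repo go through a single lock.

```python
import queue
import threading
from typing import Callable, Iterable, List


class LocalRepo:
    """Toy stand-in for the local repo that all builders draw from."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.packages: List[str] = []
        self.generation = 0

    def add(self, pkg: str) -> None:
        # Only one worker at a time may change the repo. The counter
        # stands in for regenerating the repo metadata.
        with self._lock:
            self.packages.append(pkg)
            self.generation += 1


def run_builders(
    jobs: Iterable[str],
    build: Callable[[str], bool],
    workers: int = 4,
) -> LocalRepo:
    """Hand out jobs one at a time to a pool of worker threads."""
    repo = LocalRepo()
    todo: "queue.Queue[str]" = queue.Queue()
    for job in jobs:
        todo.put(job)

    def worker() -> None:
        while True:
            try:
                job = todo.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            if build(job):  # hypothetical build step, True on success
                repo.add(job)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return repo
```

Builds themselves run concurrently; only the short repo update is serialized, so the lock should not become the bottleneck unless regenerating the metadata is slow.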

Work Needed/Not Started

Changes to koji

Script to run in kojira to monitor builders and tasks which have not checked in at their regular schedule. This would check for abandoned/unfinished jobs and free them back up for another koji builder to take up. This will allow us to allocate/deallocate builders without having to worry about:

  1. disabling the builder in koji (finding a kojiadmin)
  2. if a build is currently running on any given host

Storage of COPRS/BUILDS

Figure out a storage destination for the results of any of these jobs. Right now we don't have a destination in mind for where these pkgs should go. Fedorapeople.org was suggested but it is a difficult fit to get files to that system and make sure they are accessible and owned by the right users.

Maintenance and harvesting/clean up of all of the above

  • builds/coprs
    • how long to keep them around
    • how to limit space per-user/group/etc?
    • do we let people update repos with new pkgs?
  • repos
    • for how long?
    • dealing with crank-tastic things like groups?
    • getting them to sync them off to somewhere else?
  • builders spawned in scale-up
    • kill any non-active builder
    • do they need to be reset on EVERY build (probably every build from a different repo)
  • ftbfs dumps.
    • just capture log results and dump the rest
  • more? I'm sure.