From Fedora Project Wiki

Revision as of 15:42, 10 July 2008 by Mikeb

Supporting EPEL Builds in Koji

Using Koji to build EPEL packages has been a goal for a long time. However, it has been held up by the desire to build against official RHEL packages without making those packages public. Right now Koji can only build against packages of which it has a local copy under /mnt/koji/packages, and that directory is served to the public over HTTP.

This is a proposal to enable Koji to populate the minimal buildroot and resolve dependencies using packages from an admin-configurable remote yum repository. A minimum amount of information about packages from the remote repository would be inserted into the local Koji database to allow the package to be traced back to its origin. This would enable Koji to pull RHEL packages from a private repo for the purpose of building EPEL packages, without making the RHEL packages public. Note that everything built by Koji would still be available to the public. This approach has the additional benefit of greatly simplifying the bootstrapping process for people running their own Koji instances. They can simply consume packages from an existing yum repo and skip the entire package import process.

Implementation

Finding the packages

Koji generates a yum config for every buildroot it creates and uses. This config points to a repo that has been created by Koji from packages imported into or built by Koji and associated with the build tag. Adding an additional, external repo raises questions about which repo a package should come from if the package sets overlap. There is a strong feeling that if a package exists in the Koji-managed local repo (whose contents the Koji admin has full control over) it should always be preferred over the external repo (whose contents the Koji admin may have little or no control over). As an example, if a package is available in the remote repo, and a custom version of the same package is built in Koji, the Koji version should always be the one available in the buildroots. It should not be overridden when the remote repo decides to update that package to a newer version, or the customizations (which we assume were made for good reason) would get silently lost. In particular, always preferring the local Koji packages over the remote repo packages allows for reverting a package to a version earlier than what is in the remote repo, which may be necessary to resolve build problems or conflicts.

After discussions with the yum developers, it was decided that the best way to support this preference for the Koji-managed repo would be to merge the local repodata and the remote repodata into a single repo. During this merge process, any packages provided by the local repo (filtering by package name) would be included in the merged repodata, and the corresponding packages from the remote repo would be elided. This filtering would need to be done at the source rpm level, to prevent subpackages from the remote repo from slipping into the repodata. Seth Vidal and James Antill have generously offered to spearhead development of this repomerge tool.
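The filtering rule can be illustrated with a short Python sketch. This is not the repomerge tool itself, which had yet to be written; the Pkg record, the srpm_name helper, and the merge_repos function are hypothetical names introduced only to show what filtering at the source rpm level means.

```python
from collections import namedtuple

# Hypothetical, simplified view of one repodata entry (not a real yum structure).
Pkg = namedtuple("Pkg", "name version release arch sourcerpm")


def srpm_name(sourcerpm):
    """Return the source package name from an SRPM filename.

    'bash-3.2-21.el5.src.rpm' -> 'bash'. The filename has the form
    name-version-release.src.rpm, so the last two dash-separated
    fields are dropped.
    """
    return sourcerpm.rsplit("-", 2)[0]


def merge_repos(local_pkgs, remote_pkgs):
    """Merge two package lists, always preferring the local repo.

    A remote package is dropped when its source package name matches the
    source package name of any local package. Comparing source names
    (rather than binary package names) also removes remote subpackages
    that the local build may not produce.
    """
    local_srpms = {srpm_name(p.sourcerpm) for p in local_pkgs}
    kept_remote = [p for p in remote_pkgs
                   if srpm_name(p.sourcerpm) not in local_srpms]
    return list(local_pkgs) + kept_remote
```

Under these assumptions, a locally built bash hides both bash and any other subpackage built from the remote bash source rpm, even when the remote version is newer.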

A new field, remote_repo_url, will be added to the tag_config table in the Koji database to hold the URL of the additional repo, configurable on a per-tag basis. The URL may contain placeholders for arch and tag name that will be replaced with the appropriate values for the repo being created. At repo creation time, the repodata will be retrieved from the processed URL and merged with the local repodata as described above. This single repo will then be used for subsequent builds against the tag.
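As a rough sketch of the placeholder substitution, the following Python function expands a URL for a given tag and arch. The proposal does not specify a placeholder syntax; the $tag and $arch forms, the expand_repo_url name, and the example host are all assumptions made for illustration.

```python
from string import Template


def expand_repo_url(url_template, tag_name, arch):
    """Replace the assumed $tag and $arch placeholders in a repo URL.

    Template.substitute raises KeyError for any other $placeholder,
    so a mistyped placeholder fails loudly instead of reaching yum.
    """
    return Template(url_template).substitute(tag=tag_name, arch=arch)
```

A URL that uses only one of the two placeholders, or neither, passes through unchanged apart from the substitutions it does contain.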

Tracking packages

When a build is complete, Koji uploads the list of rpms in the buildroot to the hub to be inserted into the database for tracking purposes. Currently, every rpm in the list must exist in the Koji database or an error will be raised. Under this proposal, if the remote_repo_url field in the tag_config table is null, this behaviour doesn't change, and Koji will work exactly as it does now.

If the remote repo is not null, Koji will assume that any rpms in the buildroot that are not in the database came from the remote repo. For each of these rpms an entry will be created in the rpminfo table that stores the name, version, release, and some additional metadata that allows the rpm to be identified. An additional field, origin, will be added to the table. For rpms from the remote repo it will be populated with the processed URL of the repo associated with the tag; for all locally-managed rpms it will hold a common value, local. The new rpminfo entry will then be associated with the buildroot via the buildroot_listing, just as a locally-managed rpm would. A remote rpminfo entry will not be associated with a build, and a constraint will be added to the table to ensure that only entries whose origin is not local may have a null build_id. The XML-RPC API will be updated to include the origin information in the data structures it returns, and the web UI will be updated to indicate that an rpm came from a remote repository, and provide the URL to that repository.
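The tracking rule might look roughly like the following Python sketch. It is not Koji hub code: the record_buildroot_rpms function, the dictionary standing in for the rpminfo table, and the entry layout are hypothetical, and the additional identifying metadata mentioned above is left out.

```python
LOCAL_ORIGIN = "local"


def record_buildroot_rpms(buildroot_rpms, rpminfo, remote_repo_url):
    """Return the rpminfo entries to associate with a buildroot.

    buildroot_rpms  -- iterable of (name, version, release, arch) tuples
    rpminfo         -- dict keyed by (nvra, origin); stands in for the table
    remote_repo_url -- processed repo URL for the tag, or None if unset
    """
    listing = []
    for nvra in buildroot_rpms:
        entry = rpminfo.get((nvra, LOCAL_ORIGIN))
        if entry is None:
            if remote_repo_url is None:
                # No remote repo configured: behave as Koji does today.
                raise LookupError("unknown rpm in buildroot: %s-%s-%s.%s" % nvra)
            # Assume the rpm came from the remote repo; reuse an existing
            # remote entry or create one with no associated build.
            key = (nvra, remote_repo_url)
            entry = rpminfo.get(key)
            if entry is None:
                entry = {"nvra": nvra, "origin": remote_repo_url, "build_id": None}
                rpminfo[key] = entry
        listing.append(entry)
    return listing
```

In this sketch a remote entry is created once per repo URL and reused by later buildroots, which matches the idea of tracing each rpm back to its origin.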

Adding rpms from remote repositories into the rpminfo table does raise some issues. Right now that table enforces uniqueness of (name, version, release, arch). This is appropriate when all rpms are being managed locally in Koji, and we want to prevent rpms with the same N-V-R.A but different contents from existing in the system. However, remote repos may have rpms that duplicate locally-managed rpms, and it may be appropriate for one tag to pull in an rpm from a remote repo that exists locally in another tag. For this reason, the rpminfo_unique_nvra constraint on the rpminfo table will be expanded to include the origin field as well. Each remote repo will now have its own namespace for N-V-R.A. Locally-managed rpms will still have the same uniqueness constraints because they will all share the same local namespace.
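The effect of the expanded constraint can be demonstrated with a small, self-contained Python example. It uses an in-memory SQLite database purely as a stand-in for Koji's real database, and the table is cut down to the columns discussed here. The local_needs_build constraint name and the add_rpm helper are hypothetical; rpminfo_unique_nvra is the constraint name used in this proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE rpminfo (
        id INTEGER PRIMARY KEY,
        build_id INTEGER,
        name TEXT NOT NULL,
        version TEXT NOT NULL,
        release TEXT NOT NULL,
        arch TEXT NOT NULL,
        origin TEXT NOT NULL DEFAULT 'local',
        -- Each origin gets its own N-V-R.A namespace.
        CONSTRAINT rpminfo_unique_nvra
            UNIQUE (name, version, release, arch, origin),
        -- Only non-local entries may lack a build (hypothetical name).
        CONSTRAINT local_needs_build
            CHECK (origin <> 'local' OR build_id IS NOT NULL)
    )
""")


def add_rpm(name, version, release, arch, origin="local", build_id=None):
    """Insert an rpm; return True on success, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO rpminfo (build_id, name, version, release, arch, origin)"
                " VALUES (?, ?, ?, ?, ?, ?)",
                (build_id, name, version, release, arch, origin),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, a second local rpm with the same N-V-R.A is rejected, while the same N-V-R.A from a remote origin is accepted once per origin.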