Revision as of 19:00, 10 July 2008
Supporting EPEL Builds in Koji
Using Koji to build EPEL packages has been a goal for a long time. However, it has been held up by the desire to build against official RHEL packages while not making those packages public. Right now Koji can only build against packages it has a local copy of, under /mnt/koji/packages, and that directory is served via http to the public.
This is a proposal to enable Koji to populate the minimal buildroot and resolve dependencies using packages from an admin-configurable remote yum repository. A minimum amount of information about packages from the remote repository would be inserted into the local Koji database to allow the package to be traced back to its origin. This would enable Koji to pull RHEL packages from a private repo for the purpose of building EPEL packages, without making the RHEL packages public. Note that everything built by Koji would still be available to the public. This approach has the additional benefit of greatly simplifying the bootstrapping process for people running their own Koji instances. They can simply consume packages from an existing yum repo and skip the entire package import process.
Finding the packages
Koji generates a yum config for every buildroot it creates and uses. This config points to a repo that has been created by Koji from packages imported into or built by Koji and associated with the build tag. Adding an additional, external repo raises questions about which repo a package should come from if the package sets overlap. There is a strong feeling that if a package exists in the Koji-managed local repo (whose contents the Koji admin has full control over) it should always be preferred over the external repo (whose contents the Koji admin may have little or no control over). As an example, if a package is available in the remote repo, and a custom version of the same package is built in Koji, the Koji version should always be the one available in the buildroots. It should not be overridden when the remote repo decides to update that package to a newer version, or the customizations (which we assume were made for good reason) would get silently lost. In particular, always preferring the local Koji packages over the remote repo packages allows for reverting a package to a version earlier than what is in the remote repo, which may be necessary to resolve build problems or conflicts.
After discussions with the yum developers, it was decided that the best way to support this preference for the Koji-managed repo would be to merge the local repodata and the remote repodata into a single repo. During this merge process, any packages provided by the local repo (filtering by package name) would be included in the merged repodata, and the corresponding packages from the remote repo would be elided. This filtering would need to be done at the source rpm level, so that subpackages from the remote repo cannot slip into the repodata. Seth Vidal and James Antill have generously offered to spearhead development of this repomerge tool.
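The merge rule described above can be sketched in a few lines of Python. This is only an illustration of first-match-wins filtering at the source rpm level, not the repomerge tool itself; the Pkg record, the merge_repos function, and the sample package data are all hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple


class Pkg(NamedTuple):
    """Hypothetical, minimal view of one package record in repodata."""
    name: str       # binary package name
    srpm_name: str  # name of the source rpm it was built from
    origin: str     # label for the repo the record came from


def merge_repos(repos: Iterable[List[Pkg]]) -> List[Pkg]:
    """Merge repos in priority order; the first repo to provide a
    source rpm name owns every binary package built from it."""
    owner: Dict[str, int] = {}  # srpm name -> index of the owning repo
    merged: List[Pkg] = []
    for index, repo in enumerate(repos):
        # Claim any source rpm names not already owned by an earlier repo.
        for pkg in repo:
            owner.setdefault(pkg.srpm_name, index)
        # Keep only the packages whose source rpm this repo owns.
        merged.extend(p for p in repo if owner[p.srpm_name] == index)
    return merged


# The Koji-managed repo goes first, so its packages always win.
local = [Pkg("bash", "bash", "local")]
remote = [
    Pkg("bash", "bash", "remote"),
    Pkg("bash-doc", "bash", "remote"),  # subpackage of a locally built srpm
    Pkg("zlib", "zlib", "remote"),
]
merged = merge_repos([local, remote])
```

In this example the remote bash-doc subpackage is dropped along with the remote bash, because the local repo already provides the bash source rpm; only zlib is taken from the remote repo.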
A new field, remote_repo_url, will be added to the tag_config table in the Koji database to hold the URL of an additional repo, configurable on a per-tag basis. The URL may contain placeholders for arch and tag name that will be replaced with the appropriate values for the repo being created. At repo creation time, the tag inheritance tree will be walked and the URLs for all remote repos associated with tags in that tree will be collected. These URLs will be passed, in inheritance order, to the createrepo/mergerepo tool, which will generate the final repodata by filtering out duplicate packages, using either the first-match-wins rule described above (with the Koji-managed repo being the first repo), or possibly some other configurable filtering rule (highest-nvr-wins being another possibility). The resulting repo will then be used for subsequent builds against the tag.
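As a rough sketch of the URL collection step, the Python below walks a toy inheritance tree and expands the placeholders. The proposal does not fix a placeholder syntax, so the $arch and $tag forms are assumptions, as are the tag names, the example URLs, the in-memory dictionaries standing in for tag_config and tag inheritance, and the choice to substitute the name of the tag the repo is being created for. Real Koji inheritance ordering is more involved than this depth-first walk.

```python
from string import Template
from typing import Dict, List, Optional

# Hypothetical stand-in for the remote_repo_url column of tag_config.
REMOTE_REPO_URL: Dict[str, Optional[str]] = {
    "dist-epel5-build": None,
    "dist-epel5": "http://repo.example.com/$tag/$arch/",
    "rhel5-base": "http://private.example.com/rhel5/$arch/",
}

# Hypothetical stand-in for tag inheritance: tag -> parents, in priority order.
PARENTS: Dict[str, List[str]] = {
    "dist-epel5-build": ["dist-epel5"],
    "dist-epel5": ["rhel5-base"],
    "rhel5-base": [],
}


def remote_repo_urls(tag: str, arch: str) -> List[str]:
    """Collect expanded remote repo URLs for a tag, depth first,
    visiting each tag in the inheritance tree at most once."""
    urls: List[str] = []
    seen = set()
    stack = [tag]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        template = REMOTE_REPO_URL.get(current)
        if template:
            # Assumption: $tag expands to the tag the repo is created for.
            urls.append(Template(template).safe_substitute(arch=arch, tag=tag))
        # Push parents in reverse so the first parent is visited first.
        stack.extend(reversed(PARENTS.get(current, [])))
    return urls


urls = remote_repo_urls("dist-epel5-build", "i386")
```

The resulting list would then be handed, in order and after the Koji-managed repo, to whatever merge tool is used.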
Tracking packages
When a build is complete, Koji uploads the list of rpms in the buildroot to the hub to be inserted into the database for tracking purposes. Currently, every rpm in the list must exist in the Koji database or an error will be raised. Under this proposal, if there are no remote repos configured in the tag hierarchy, this behaviour doesn't change, and Koji will work exactly as it does now.
If there are remote repos enabled in the tag hierarchy, Koji will load the repodata used to perform that build. For each rpm found in the buildroot it will query the repodata for the baseurl associated with that rpm. If the baseurl corresponds to the location of the locally-managed Koji packages, then information about that rpm already exists in the Koji database, and it will be handled normally. If the baseurl points to somewhere other than the Koji package store, then that rpm came from a remote repo. For each of these rpms an entry will be created in the rpminfo table that stores the name, version, release, and some additional metadata, pulled from the repodata, that allows the rpm to be identified. An additional field, origin, will be added to the table, and this will be populated with the baseurl associated with the rpm. This field will be populated with a common value, local, for all locally-managed rpms. The new rpminfo entry will then be associated with the buildroot via the buildroot_listing, just as a locally-managed rpm would. A remote rpminfo entry will not be associated with a build, and a constraint will be added to the table to ensure that only entries whose origin is not local may have a null build_id. The XML-RPC API will be updated to include the origin information in the data structures it returns, and the web UI will be updated to indicate that an rpm came from a remote repository, and provide the URL to that repository.
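The origin decision described above might look something like the following Python sketch. The package store URL, the function names, and the dictionary layout are hypothetical and only illustrate the local/remote split; they are not Koji hub code.

```python
from typing import Dict, Optional

LOCAL_ORIGIN = "local"
# Hypothetical URL under which the Koji package store is served.
KOJI_PKG_URL = "http://koji.example.com/packages"


def rpm_origin(baseurl: str, local_url: str = KOJI_PKG_URL) -> str:
    """Return 'local' if baseurl points into the Koji package store,
    otherwise return the baseurl itself as the origin."""
    base = baseurl.rstrip("/")
    local = local_url.rstrip("/")
    if base == local or base.startswith(local + "/"):
        return LOCAL_ORIGIN
    return baseurl


def rpminfo_entry(rpm: Dict[str, str], baseurl: str,
                  build_id: Optional[int] = None) -> Dict[str, object]:
    """Build the row that would be recorded for one buildroot rpm."""
    origin = rpm_origin(baseurl)
    return {
        "name": rpm["name"],
        "version": rpm["version"],
        "release": rpm["release"],
        "arch": rpm["arch"],
        "origin": origin,
        # Remote rpms are never associated with a build.
        "build_id": build_id if origin == LOCAL_ORIGIN else None,
    }


bash = {"name": "bash", "version": "3.2", "release": "24.el5", "arch": "i386"}
local_row = rpminfo_entry(bash, "http://koji.example.com/packages/bash/", build_id=7)
remote_row = rpminfo_entry(bash, "http://private.example.com/rhel5/i386/", build_id=7)
```

Here the same N-V-R.A yields a local row tied to a build and a remote row whose build_id is forced to None.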
Adding rpms from remote repositories into the rpminfo table does raise some issues. Right now that table enforces uniqueness of (name, version, release, arch). This is appropriate when all rpms are being managed locally in Koji, and we want to prevent rpms with the same N-V-R.A but different contents from existing in the system. However, remote repos may have rpms that duplicate locally-managed rpms, and it may be appropriate for one tag to pull in an rpm from a remote repo that exists locally in another tag. For this reason, the rpminfo_unique_nvra constraint on the rpminfo table will be expanded to include the origin field as well. Each remote repo will now have its own namespace for N-V-R.A. Locally-managed rpms will still have the same uniqueness constraints because they will all share the same local namespace.
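To see how the two constraints interact, here is a small self-contained Python sketch using an in-memory SQLite table. Koji itself uses PostgreSQL and its real rpminfo table has more columns; the reduced schema, the local_needs_build constraint name, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE rpminfo (
        id INTEGER PRIMARY KEY,
        build_id INTEGER,
        name TEXT NOT NULL,
        version TEXT NOT NULL,
        release TEXT NOT NULL,
        arch TEXT NOT NULL,
        origin TEXT NOT NULL DEFAULT 'local',
        -- N-V-R.A must be unique within each origin namespace
        CONSTRAINT rpminfo_unique_nvra
            UNIQUE (name, version, release, arch, origin),
        -- only non-local rpms may lack a build
        CONSTRAINT local_needs_build
            CHECK (origin <> 'local' OR build_id IS NOT NULL)
    )
""")


def try_insert(name, build_id, origin):
    """Insert one rpm row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO rpminfo "
                "(build_id, name, version, release, arch, origin) "
                "VALUES (?, ?, '1.0', '1.el5', 'i386', ?)",
                (build_id, name, origin),
            )
        return True
    except sqlite3.IntegrityError:
        return False


remote = "http://private.example.com/rhel5/i386/"
results = [
    try_insert("bash", 1, "local"),     # first local copy: accepted
    try_insert("bash", None, remote),   # same N-V-R.A, remote namespace: accepted
    try_insert("bash", 2, "local"),     # duplicate in local namespace: rejected
    try_insert("zlib", None, "local"),  # local rpm without a build: rejected
]
```

The second insert succeeds only because origin is part of the uniqueness constraint; with the current (name, version, release, arch) constraint it would collide with the first.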