Revision as of 04:10, 14 March 2017
This page describes Fedora's package source control system. This covers:
- repository setup
- authentication and authorization system
- repository contents
- user interaction
- interaction with our build system.
Repository Setup
Fedora package source control consists of a set of individual git repositories, one per Fedora package. These repositories all live on a central server within the Fedora infrastructure.
The server name is pkgs.fedoraproject.org and all the repositories are named after the source RPM. For example, the repository for the yum package is http://pkgs.fedoraproject.org/cgit/rpms/yum.
Repository Filesystem Configuration
All the repositories live in the /srv/git/rpms/ path on the server. This path has the group sticky (setgid) bit set and is group-owned by the packager group. Each repository is created using the --shared option, making each repository "group shared". This ensures that multiple maintainers will be able to share commit access to the repositories, provided they are all in the packager group.
Branch Structure
Within each package repository there can be a set of "top level" or "default" in-repo branches. These branches are created for each Fedora or EPEL release a package may be built for. This allows changes for one release to neither depend on nor conflict with changes from another release. The naming of these branches currently follows a syntax of fNN/master or elN/master. Branches that start with f, followed by the number of the release, are for Fedora, e.g. the branch for Fedora 14 is f14/master. Branches that start with el, followed by a number, are for EPEL, e.g. the branch for EPEL 6 is el6/master.
Currently "/master" is appended to the top level branches. This provides a namespace for branches beyond the top level branches, e.g. f14/mytopicbranch. The design is such that all branches that are related to Fedora 14 would start with f14/. Because of a git shortcoming (branch names are stored as filesystem-like paths, so a name cannot be both a branch and a prefix of other branches) we cannot have a "f14" and "f14/mytopicbranch" at the same time. Thus, "fNN/master" was made the default branch name. There is a proposal to change this.
Rawhide builds come from the master branch. When a new Fedora release branches from Rawhide, a new top level branch in git is created from "master".
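The naming scheme above can be illustrated with a small classifier. This is only a sketch: the function name and the returned tuple are hypothetical and not part of any Fedora tool; it assumes nothing beyond the fNN/..., elN/... and master conventions described in this section.

```python
import re

# Release branches as described above: "fNN/<name>" for Fedora,
# "elN/<name>" for EPEL. "<name>" is "master" for the top level branch.
_RELEASE_BRANCH = re.compile(r"^(?P<prefix>f|el)(?P<number>\d+)/(?P<name>.+)$")


def classify_branch(branch):
    """Return (product, release, is_top_level) for a branch name, or None.

    Hypothetical helper for illustration only.
    """
    if branch == "master":
        # Rawhide builds come from the plain "master" branch.
        return ("Fedora", "rawhide", True)
    match = _RELEASE_BRANCH.match(branch)
    if match is None:
        return None
    product = "Fedora" if match.group("prefix") == "f" else "EPEL"
    is_top_level = match.group("name") == "master"
    return (product, int(match.group("number")), is_top_level)
```

With this sketch, f14/master and f14/mytopicbranch both classify as Fedora 14 branches, while a bare "f14" is not recognised, which mirrors the namespace restriction described above.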
Commit Emails
When changes are pushed to the central git repos, information about those changes is emailed to two locations.
- <package>-owner@fedoraproject.org
- scm-commits@fedoraproject.org
The first is an email alias whose recipients are controlled by the Fedora Package Database. The second is an open mailing list hosted by the Fedora Project.
We currently use a post-receive hook from the GNOME project to perform the emailing.
Authentication and Authorization
The Fedora Package Source Control system uses a layered authentication and authorization system to control access to the git repositories.
Authentication
Currently there are two ways to obtain clones of the git repos.
- authenticated ssh:// based clones
- anonymous git:// based clones
Authenticated ssh-based clones require the client to have a valid Fedora account within our Account System and to belong to the packager group. SSH authentication is carried out via ssh keys, which are preloaded onto the git server for each potential user.
Anonymous access is through the git:// protocol and requires no authentication.
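As a rough illustration of the two access modes, the helper below builds a clone URL for either case. It is a sketch only: the function name is hypothetical, and the /rpms/<package> path is an assumption that mirrors the cgit URL shown earlier rather than a documented clone path.

```python
GIT_HOST = "pkgs.fedoraproject.org"


def clone_url(package, username=None):
    """Build a clone URL for a package repository (hypothetical helper).

    ASSUMPTION: repositories are reachable under /rpms/<package>.
    """
    if username:
        # Authenticated clone: requires a Fedora account and ssh key.
        return f"ssh://{username}@{GIT_HOST}/rpms/{package}"
    # Anonymous clone: no authentication required.
    return f"git://{GIT_HOST}/rpms/{package}"
```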
Authorization
By default git has no built-in system for authorization. It relies upon filesystem permissions controlled by the operating system where the repository lives. Typically this works well enough. However, within Fedora we have the concept of different access rights per Fedora/EPEL release. This manifests in the source control at the branch level. Because git has no specific filesystem segregation based on branch names, we have to use an add-on to git in order to achieve per-branch access rights.
Fedora Package Source Control makes use of gitolite to provide per-branch access rights. When a user attempts to push changes to a repository, the user must first authenticate using ssh. Then gitolite is invoked to verify that the user has any rights to the repository. If the user has rights, the push attempt is forwarded on to git itself, which will start its process. Eventually an update hook within the git repo is invoked, which calls gitolite again to check whether the user in question has rights to commit to a particular branch path. Gitolite will check the user name against a pre-generated ACL (Access Control List) and either allow or deny the action.
The use of gitolite also allows us to have users who are allowed to have shell access to the git server and users who are not, without changing the path to where the repositories can be found.
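The two-stage check described above (repository-level first, then branch-level from the update hook) can be sketched as follows. This is not gitolite's implementation: the ACL structure, user names and function names are all hypothetical and serve only to show the order of the checks.

```python
# Hypothetical ACL: repository -> branch prefix -> users allowed to push.
ACL = {
    "yum": {
        "master": {"alice", "bob"},
        "f14/": {"alice"},
    },
}


def has_repo_access(acl, user, repo):
    """Stage 1: does the user have any rights at all on this repository?"""
    return any(user in users for users in acl.get(repo, {}).values())


def may_push(acl, user, repo, branch):
    """Stage 2: may the user push to this particular branch path?"""
    if not has_repo_access(acl, user, repo):
        return False
    return any(
        branch.startswith(prefix) and user in users
        for prefix, users in acl[repo].items()
    )
```

In this sketch bob passes the first stage for yum but is still denied a push to f14/master, which is the per-branch behaviour that plain filesystem permissions cannot express.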
ACL Generation
The ACLs used by gitolite are generated using data from the Fedora Package Database. This Database allows package maintainers to define who has commit access to each Fedora/EPEL release for a given package. This data is used to construct an ACL for each package and is combined with global settings which give SCM admins and secondary arch maintainers commit access to every package/branch.
ACLs are regenerated every 10 minutes via a cron job on the git server.
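The generation step can be pictured as a function that merges per-package data with the global committers and renders gitolite-style rules. The sketch below is an illustration under assumptions: the input dictionary shape, the function name, the rpms/ repository prefix and the use of plain RW rules are hypothetical and are not taken from the actual generation script.

```python
def build_acl(package_db, global_committers):
    """Render gitolite-style rules from {package: {branch: [users]}}.

    Hypothetical sketch; the real generator and its output may differ.
    """
    lines = []
    for package in sorted(package_db):
        # ASSUMPTION: repositories are named rpms/<package>.
        lines.append(f"repo rpms/{package}")
        for branch in sorted(package_db[package]):
            # Global committers get access to every package/branch.
            users = sorted(set(package_db[package][branch]) | set(global_committers))
            lines.append(f"    RW {branch} = {' '.join(users)}")
        lines.append("")
    return "\n".join(lines)
```

A cron job could then write the returned text to the configuration that gitolite reads; that wiring is left out here.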
Repository Contents
Our repositories track changes that are important and relevant to the Fedora project as opposed to upstream changes. As such our repositories have rpm .spec files, any patches we apply, or any supplementary source files we supply. Upstream content is stored in a lookaside cache.
Lookaside Cache
The Lookaside Cache is a storage system for upstream source archives. Most source control systems do not handle large binary files very well so we have designed a system to archive them and reference them from within our package source control.
Every package repository will have a sources file. Within this file there is a sha512 hash and source file name for each source archive the package uses. When client programs require the sources for modification or building, they are fetched from the lookaside cache using the sources file for reference. When a contributor needs to add a new source file or replace an existing source file, client software will securely upload the file (using SSL certs) and update the sources file accordingly.
The Lookaside Cache keeps all the previous versions of every uploaded file available and accessible by their hash. The historical files are never deleted.
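To make the role of the sources file concrete, the sketch below parses such a file and verifies downloaded data against the recorded sha512 hash. The line layout "SHA512 (filename) = hexdigest" is an assumption made for this example, and the function names are hypothetical; this is not the client software's actual code.

```python
import hashlib
import re

# ASSUMED line shape: "SHA512 (name.tar.gz) = <128 hex digits>"
_SOURCES_LINE = re.compile(
    r"^SHA512 \((?P<name>.+)\) = (?P<digest>[0-9a-f]{128})$"
)


def parse_sources(text):
    """Return {file name: sha512 hex digest} from the sources file text."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = _SOURCES_LINE.match(line)
        if match is None:
            raise ValueError(f"unrecognised sources line: {line!r}")
        entries[match.group("name")] = match.group("digest")
    return entries


def verify_archive(data, expected_digest):
    """Check that the bytes fetched from the cache match the recorded hash."""
    return hashlib.sha512(data).hexdigest() == expected_digest
```

Because files are addressed by their hash, a changed archive gets a new entry rather than overwriting the old one, which fits the statement that historical files remain accessible.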
User Interaction
Users can interact with the Fedora Package Source Control in multiple ways.
- Using git clients directly
- Using fedpkg
- Using Eclipse Fedora Packager and EGit
- Browsing web frontends
Git Clients
One of the goals of the Package Source Control was to allow the use of standard and well-known clients to interact with it. While we do provide a helper tool for the source control actions, all interaction with the source control can be done using standard git clients. For that reason, little or no special setup is required when working with our git repositories: clone, commit, push. As stated above, git clients can be used either anonymously or authenticated with ssh keys.
Fedpkg
fedpkg, which is part of the fedora-packager project, provides some "targets" to interact with the source control system. These targets are very loose wrappers around git itself. The intent is that the options one would pass to git are the same options one would pass to fedpkg. The wrappers exist so that there can be a single tool maintainers use to interact with the source control system, the lookaside cache, and the build system. Should we ever change backend systems, maintainers can continue to use fedpkg to interact.
Eclipse Fedora Packager
Eclipse Fedora Packager is an Eclipse plug-in which allows for easy interaction with Fedora Git repositories using the graphical interface of EGit. Moreover, you can push builds to Koji, upload/download sources from and to the lookaside cache and push Bodhi updates, all from within Eclipse. Have a look at its project page and/or user guide for more information.
Web Frontends
Currently we are using gitweb-caching to provide a web frontend to browse package repositories. Performance has been an issue, particularly with loading the index of all 10K+ repositories. The web frontend provides another way to anonymously interact with the repositories.
Buildsystem Interaction
Fedora's build system, Koji, interacts with our source control in a read-only manner. When a maintainer requests a build, the maintainer can request that the build come from a particular commit hash within the package repository. Koji configuration allows for a list of "allowed" SCM sources, as well as configuration as to what commands to run in order to populate a directory with the spec file, any and all patches, and any and all source archives necessary to build a source RPM.
Currently builds are initiated using a commit hash as a reference for the source. Tagging the source is unnecessary.
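A build request therefore only needs to name a repository and a commit. The helper below sketches how such a source reference might be assembled. The function name is hypothetical, and both the /rpms/<package> path and the "repository?#commit" URL shape are assumptions made for this example; check the Koji configuration in use for the accepted form.

```python
import re

_COMMIT_HASH = re.compile(r"^[0-9a-f]{40}$")


def build_source_url(package, commit, host="pkgs.fedoraproject.org"):
    """Return a source reference pinned to one commit (hypothetical helper).

    ASSUMPTION: the URL shape is git://<host>/rpms/<package>?#<commit>.
    """
    if not _COMMIT_HASH.match(commit):
        raise ValueError("expected a full 40-character commit hash")
    return f"git://{host}/rpms/{package}?#{commit}"
```

Pinning the request to a full commit hash is what makes a separate tag unnecessary: the hash already identifies the exact source state.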