Support Noarch Sub Packages in Fedora
Summary
RPM has supported noarch subpackages for some months, but the Fedora infrastructure does not yet support them. This feature will provide the technical means to use noarch subpackages and help packagers adopt them across the distribution.
Owner
Current status
- Targeted release: Fedora 11
- Last updated: Ffesti 20:40, 12 February 2009 (UTC); previously Ffesti 19:17, 29 January 2009 (UTC)
- Percentage of completion: 33%
Detailed Description
There are several steps needed:
- Support in rpm (100%)
- Support in koji (75%)
- see Ticket
- Fedora infrastructure still needs to be updated
- Support in other parts of the infrastructure (unknown)
- Support in test/verification tools (unknown)
- rpmlint (?)
- ... (?)
- Get a list of possible candidates (sub packages) (100%)
- Write a mail to f-d-l and package owners (33%)
- Write best practice documentation (0%)
- Get packaging policy adjusted (see /PolicyChanges) (10%)
- Get the /PackagesChanged
Benefit to Fedora
Noarch packages have several benefits over arch dependent packages:
- They can be shared between different architectures and thus use up less disk space and bandwidth on both the Fedora infrastructure and our mirrors
- They avoid double installation of data for multilib packages.
- They tell the user that the content of the package is arch independent.
By increasing the use of noarch packages we also increase the effect of these benefits.
Additionally we can get rid of some hacks that are used to generate noarch sub packages for very few packages right now.
Scope
A small statistic on Fedora rawhide x86_64 (2009-01-22) to give an idea how many packages/files/bytes could be affected:
All files were put into one of the following categories:
- bin32: 32 bit binaries including libraries(!) (as known to rpm, file color==1)
- bin64: 64 bit binaries including libraries (file color==2)
- lib32: other files in /lib or /usr/lib
- lib64: other files in /lib64 or /usr/lib64
- noarch: everything else
Sizes are (uncompressed) bytes in files and thus do not map directly to package sizes or to used disk space.
15560 packages (44 GB in 2.0 M files):
- 11 k bin32 files (2.1 GB)
- 31 k bin64 files (6.8 GB)
- 142 k lib32 files (1.7 GB)
- 161 k lib64 files (5.4 GB)
- 1.7 M noarch files (28 GB)
8906 x86_64 packages (25 GB in 1.0 M files):
- 31 k bin64 files (6.8 GB)
- 21 k lib32 files (503 MB)
- 161 k lib64 files (5.4 GB)
- 828 k noarch files (13 GB)
3489 noarch packages (14 GB in 763 k files):
- 88 bin32 files (2.3 MB)
- 87 k lib32 files (648 MB)
- 676 k noarch files (13 GB)
3163 i386 packages (5.4 GB in 282 k files):
- 10 k bin32 files (2.1 GB)
- 34 k lib32 files (571 MB)
- 237 k noarch files (2.7 GB)
Test Plan
- Create one noarch subpackage by adding BuildArch: noarch to the subpackage section
- Scratch build the package to see whether there are any problems with koji
- Build package for rawhide - check that it correctly shows up in the repository and is shown as noarch package in the metadata
- See if the package installs correctly via yum
- Check whether updating from an arch-dependent previous version to the new noarch package works
User Experience
- Slightly less load on mirrors due to smaller transfer sizes
- Only packages containing binaries will be arch dependent
Dependencies
- rpm >= 4.6.0 (has been in Fedora for months, counting release candidates)
- the steps listed in the #Detailed Description.
Contingency Plan
- Move target to Fedora 12
- As soon as the technical problems have been fixed, moving more subpackages to noarch can be a continuing process.
Documentation
What's this all about?
As of version 4.6.0, RPM supports noarch subpackages: just add "BuildArch: noarch" to the subpackage's section in the spec file.
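As a rough illustration, the sketch below embeds a small spec fragment with one noarch subpackage and lists the subpackages that declare "BuildArch: noarch". The package names (foo, foo-data, foo-libs) and the helper noarch_subpackages are hypothetical, and the line-based parsing is an assumption made for brevity: it ignores macros and conditionals and is not how rpm itself reads spec files.

```python
import re

# Hypothetical spec fragment: only the "data" subpackage is noarch.
SPEC = """\
Name: foo
Version: 1.0
Release: 1
Summary: Example package

%description
Example main package.

%package data
Summary: Arch independent data files for foo
BuildArch: noarch

%description data
Data files for foo.

%package -n foo-libs
Summary: Libraries for foo

%description -n foo-libs
Libraries for foo.
"""

# Section markers that end a (sub)package preamble in this simplified model.
SECTION = re.compile(
    r"^%(package|description|prep|build|install|check|files|changelog"
    r"|pre|post|preun|postun)\b"
)
NOARCH = re.compile(r"^buildarch\s*:\s*noarch$", re.IGNORECASE)


def noarch_subpackages(spec_text):
    """Return the names of subpackages whose preamble sets BuildArch: noarch."""
    main_name = None
    current = None  # subpackage whose preamble we are currently reading
    found = []
    for raw in spec_text.splitlines():
        line = raw.strip()
        match = SECTION.match(line)
        if match:
            current = None
            if match.group(1) == "package":
                args = line.split()[1:]
                if args[:1] == ["-n"] and len(args) >= 2:
                    current = args[1]                 # %package -n full-name
                elif args:
                    current = f"{main_name}-{args[0]}"  # %package suffix
            continue
        if main_name is None and line.lower().startswith("name:"):
            main_name = line.split(":", 1)[1].strip()
        elif current is not None and NOARCH.match(line):
            found.append(current)
    return found


print(noarch_subpackages(SPEC))
```

A check like this only confirms that the tag is present; whether the content really is arch independent still has to be verified by the packager.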
The noarch subpackages built on the different arches are going to be compared by koji with rpmdiff, ignoring time stamps, sizes and md5 sums of files. If any other differences are found, the build will be rejected. Even with those automatic checks in place, it is the responsibility of the packager to make sure that the package really is arch independent, just as for regular noarch packages.
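The acceptance rule can be sketched as follows. This is only an illustration of the idea, not koji's implementation: it assumes each diff line has the form "<flags> <path>", where "." marks an unchanged attribute and the letters S, 5 and T stand for size, md5 sum and time stamp. The function names are hypothetical.

```python
# Attribute flags that may differ between arches without rejecting the build:
# S (size), 5 (md5 sum), T (time stamp).
IGNORABLE = set("S5T")


def only_ignorable(flags):
    """True if every changed attribute in the flag string is ignorable."""
    changed = {c for c in flags if c != "."}
    return changed <= IGNORABLE


def rejecting_lines(diff_lines):
    """Return the diff lines that would cause a rejection under this rule.

    Assumed line format: "<flags> <path>". Anything whose first field is not
    made up solely of ignorable flags and dots (for example a hypothetical
    "removed" or "added" marker) counts as a real difference.
    """
    bad = []
    for line in diff_lines:
        line = line.strip()
        if not line:
            continue
        flags, _, _path = line.partition(" ")
        if not only_ignorable(flags):
            bad.append(line)
    return bad


sample = [
    "S.5....T /usr/share/foo/index.html",   # size, md5 and time differ: fine
    ".M...... /usr/share/foo/run.sh",       # some other attribute differs
    "removed /usr/lib/foo/plugin.so",       # file missing on one arch
]
for line in rejecting_lines(sample):
    print("would reject:", line)
```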
Candidates for being switched to noarch
To get a list of good candidates, all x86_64 packages that contain no binaries or libraries (as known to rpm) and no files in /lib64 or /usr/lib64 were selected as a starting point. To refine the selection further and to get an idea of what can go wrong, rpmdiff was run against the i386 sister packages, both with the relaxed koji settings and with the strict -t settings. This showed a small number of false positives, mostly development packages that put files in different locations, or undetected binary packages. Subpackages are marked with one surrounding '*' if they fail only the stricter rpmdiff -t check and with two if they also fail the rpmdiff check as used by koji. It is assumed that packages without '*' can be switched to noarch directly (assuming they don't do weird stuff on other arches). One '*' requires a more detailed look but should be OK in most cases, and two '*'s most likely indicate a false positive. The diffs can be found below in a full and a hand-shortened version.
- media:NoarchCandidates.txt
- media:NoarchRpmdiff.txt (1.6MB), media:NoarchRpmdiffShort.txt (28kB)
- media:NoarchStrictRpmdiff.txt (4.6MB), media:NoarchStrictRpmdiffShort.txt (180kB)
The 949 (sub) packages (1.4 GB package size) contain 4.1 GB data and are distributed over 584 source packages.
Candidates for splitting off noarch subpackages
To search for more data that could be moved into noarch subpackages, all files in the distribution were put into one of the following categories:
- bin32: 32 bit binaries including libraries(!) (as known to rpm, file color==1)
- bin64: 64 bit binaries including libraries (file color==2)
- lib32: other files in /lib or /usr/lib
- lib64: other files in /lib64 or /usr/lib64
- noarch: everything else
To be able to detect arch independent files in (/usr)/lib, x86_64 packages have been examined. It is assumed that lib32 and noarch files can be moved to noarch subpackages, that bin64 and lib64 files can't, and that bin32 files should not be found. This is only a very rough estimate: it must be checked for each package and doesn't take other architectures into account. Nevertheless, it gives a good idea of which packages should be considered and what results can be expected.
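The categories and the rough estimate above can be expressed as a small sketch. The function names are hypothetical, and the file color is assumed to be passed in as an integer already obtained from rpm (1 for 32 bit, 2 for 64 bit, anything else for files rpm does not consider binaries).

```python
def classify(path, color):
    """Assign a file to one of the five categories used in the statistics."""
    if color == 1:
        return "bin32"
    if color == 2:
        return "bin64"
    # Check the 64 bit directories first; "/lib/" does not match "/lib64/".
    if path.startswith(("/lib64/", "/usr/lib64/")):
        return "lib64"
    if path.startswith(("/lib/", "/usr/lib/")):
        return "lib32"
    return "noarch"


def movable_to_noarch(category):
    """Rough estimate for x86_64 packages: lib32 and noarch files can move."""
    return category in ("lib32", "noarch")


files = [
    ("/usr/bin/foo", 2),
    ("/usr/lib64/foo/plugin.so", 2),
    ("/usr/lib64/foo/settings.conf", 0),
    ("/usr/lib/python2.6/site-packages/foo.py", 0),
    ("/usr/share/foo/help.html", 0),
]
for path, color in files:
    category = classify(path, color)
    print(f"{category:6} movable={movable_to_noarch(category)} {path}")
```

As the text notes, such a classification only points at candidates; each package still needs to be checked by hand and on other architectures.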
- media:SplitCandidates.txt - 2000 most worthy splitting candidates Updated!
- media:PackageFileTypes.txt (0.9 MB) - file type distribution for each x86_64 package sorted by owner
- media:PackageFileTypesForComaintainers.txt (1.4 MB) - same as above, but each package is listed for every comaintainer
For some packages it might be better to just change the borders among the subpackages instead of blindly splitting them. Such situations are not reflected well in the above lists.
How many packages are worth splitting
The table below shows how much content can be moved to noarch packages by splitting a given number of packages - assuming that the most worthy packages are split.
   # | noarch content | other content | all content | ratio | pkg size of new noarch sub packages
  10 | 1.4 GB         | 34 MB         | 1.4 GB      | 97%   | 837 MB
  20 | 2.1 GB         | 168 MB        | 2.2 GB      | 92%   | 1.2 GB
  30 | 2.6 GB         | 205 MB        | 2.8 GB      | 92%   | 1.4 GB
  50 | 3.3 GB         | 335 MB        | 3.6 GB      | 90%   | 1.8 GB
  70 | 3.8 GB         | 474 MB        | 4.3 GB      | 89%   | 2.1 GB
 100 | 4.4 GB         | 573 MB        | 5.0 GB      | 88%   | 2.3 GB
 200 | 5.7 GB         | 1.3 GB        | 6.9 GB      | 81%   | 3.1 GB
 300 | 6.4 GB         | 1.6 GB        | 8.0 GB      | 79%   | 3.5 GB
 400 | 6.8 GB         | 2.0 GB        | 8.9 GB      | 77%   | 3.7 GB
 500 | 7.2 GB         | 2.2 GB        | 9.4 GB      | 76%   | 3.9 GB
 600 | 7.4 GB         | 2.5 GB        | 10 GB       | 74%   | 4.1 GB
 700 | 7.7 GB         | 2.7 GB        | 10 GB       | 73%   | 4.3 GB
 800 | 7.8 GB         | 2.9 GB        | 11 GB       | 73%   | 4.4 GB
 900 | 8.0 GB         | 3.1 GB        | 11 GB       | 72%   | 4.5 GB
1000 | 8.1 GB         | 3.3 GB        | 11 GB       | 70%   | 4.6 GB
2000 | 8.7 GB         | 4.8 GB        | 13 GB       | 64%   | 5.3 GB
3000 | 8.9 GB         | 5.6 GB        | 14 GB       | 61%   | 5.7 GB
4000 | 9.0 GB         | 6.6 GB        | 16 GB       | 57%   | 6.0 GB
7957 | 9.0 GB         | 12 GB         | 21 GB       | 42%   | 7.9 GB
Note that there is not much gain above 1000 packages. Even the 500 packages between 500 and 1000 add less than another gigabyte to the 7.2 GB we get from the first 500. The decision to split a package is of course left to its maintainers, but we should try to split as many of the first 300 or 400 packages as possible. Together with the packages that can be converted to noarch directly, they contain nearly 11 out of 13 GB of the noarch content not yet in noarch packages (see #Scope and #Candidates for being switched to noarch).
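The figures in this paragraph can be checked with a few lines of arithmetic. All numbers are copied from the table above, from the 4.1 GB of content in the directly convertible (sub)packages, and from the 13 GB of noarch files in x86_64 packages listed under Scope; the variable names are only illustrative.

```python
# Noarch content (GB) gained by splitting the N most worthy packages.
noarch_gain_gb = {300: 6.4, 400: 6.8, 500: 7.2, 1000: 8.1}
direct_candidates_gb = 4.1  # content of packages that can be switched directly
total_noarch_gb = 13.0      # noarch content currently in x86_64 packages

# Extra gain from splitting packages 501 to 1000.
extra_500_to_1000 = round(noarch_gain_gb[1000] - noarch_gain_gb[500], 1)

# Noarch content covered by splitting the first 300 or 400 packages
# plus converting the direct candidates.
covered = {
    n: round(noarch_gain_gb[n] + direct_candidates_gb, 1) for n in (300, 400)
}

print(f"packages 501-1000 add {extra_500_to_1000} GB")
for n, gb in covered.items():
    print(f"first {n} + direct candidates: {gb} of {total_noarch_gb} GB")
```

With the first 400 packages the sum is 10.9 GB, which is the "nearly 11 out of 13 GB" quoted above; with only the first 300 it is 10.5 GB.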
What about other packages?
A lot of other packages could also make use of this feature. When considering splitting your package, please avoid overly complicated spec files and do not increase the number of packages unnecessarily. Use your best judgement.
What can you do as a packager?
There are still fixes for koji that must reach the Fedora build system first, so noarch subpackages DO NOT WORK within Fedora yet. We hope that this can be solved soon.
If you are interested, you can already play with noarch subpackages by building with mock and comparing the results on different arches with the koji version of rpmdiff (files differing in S, 5 and T might be OK). There is going to be little time between support in koji and the feature freeze, so being prepared for this short time slot is a good thing.
Please add the packages you changed or plan to change to /PackagesChanged. Put the latter in parentheses. Thanks!
What if you don't want to change your packages?
That's perfectly fine. There is no plan to force packagers to use noarch subpackages. I hope we can develop a more detailed plan on how to make use of this feature in future Fedora releases. You might be interested in taking part in this discussion.
What does that mean for the Packaging Policy?
The packaging policy will require a few additions. See /PolicyChanges. Any comments and help are welcome.
Release Notes
Not applicable, as visibility to users is low and developers need to know before the release.