Packaging Python modules for Python 3
I hope to add a parallel-installable Python 3 stack to Fedora 13.
See the feature page: https://fedoraproject.org/wiki/Features/Python3F13 and also this thread: https://www.redhat.com/archives/fedora-devel-list/2009-October/msg00054.html
This requires a sane way to package Python 3 modules, which in turn requires generalizing our Python packaging rules to support more than one Python runtime.
The existing Python packaging guidelines are here: Packaging/Python
Multiple Python Runtimes
In Fedora we have multiple python runtimes, one for each supported major release.
Each runtime corresponds to a binary of the form /usr/bin/python$MAJOR.$MINOR.
One of these Python runtimes is the "system runtime". It can be identified by the destination of the symlink /usr/bin/python. Currently this is /usr/bin/python2.6.
All Python runtimes have a virtual provide for python(abi) = $MAJOR.$MINOR. For example, the Python 3.1 runtime rpm has:
    $ rpm -q --provides python3 | grep -i abi
    python(abi) = 3.1
Python modules using these runtimes should have a corresponding "Requires" line on the Python runtime that they are used with. This is done automatically for files below /usr/lib[^/]*/python${PYVER}.
Byte Compiling
When byte compiling a .py file, Python embeds a magic number in the byte-compiled file that corresponds to the runtime. Files in %{python_sitelib} and %{python_sitearch} must correspond to the runtime for which they were built. For instance, a pure Python module compiled for the 3.1 runtime needs to be below %{_usr}/lib/python3.1/site-packages.
Normally, this is done for you by the brp-python-bytecompile script. This script runs after the %install section of the spec file has been processed and byte-compiles any .py files that it finds (this recompilation puts the proper filesystem paths into the modules; otherwise tracebacks would include the %{buildroot} in them). The script determines which interpreter to byte compile the module with by following these steps:
- What directory is the module installed in? If it's /usr/lib/pythonX.Y, then pythonX.Y is used to byte compile the module. If pythonX.Y is not installed, an error is returned and the rpm build exits with an error, so remember to BuildRequire the proper python package.
- Otherwise, the interpreter defined in %{__python} is used to compile the modules. This defaults to the latest python2 version on Fedora. If you need to compile the module for python3, set it to /usr/bin/python3 instead, like this:
%global __python %{__python3}
This step is useful when you have a python3 application that installs a private module into its own directory, for instance if the foobar application installs a module in %{_datadir}/foobar for use only by its command line application. Since these files are not in one of the python3 library paths (like /usr/lib/python3.1), you have to set %{__python} manually to tell brp-python-bytecompile which python interpreter to byte compile for.
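As a rough illustration of the private-module case, a spec fragment for the hypothetical foobar application might look like the sketch below. The package name, the use of "make install" and the file list are assumptions made for the example; only the %global __python line comes from the text above.

```spec
# Hypothetical python3 application that ships private modules in its own
# data directory rather than under /usr/lib/python3.1.
Name:           foobar
# [...]
BuildRequires:  python3-devel

# The private modules are outside the python3 library paths, so tell
# brp-python-bytecompile which interpreter to byte compile with.
%global __python %{__python3}

# [...]

%install
# Assumed upstream install step; installs the private modules under the
# application's data directory.
make install DESTDIR=%{buildroot}

%files
%{_bindir}/foobar
%{_datadir}/foobar/
```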
These settings are enough to properly byte compile any package that only builds python modules (in %{python_sitelib} or %{python_sitearch}) or builds for only a single python interpreter. However, if the application you're packaging needs to build with both python2 and python3 and installs into a private module directory (perhaps because it provides one utility written in python2 and a second utility written in python3), then you need to do this manually. Here's a sample spec file snippet that shows what to do:
    # Turn off the brp-python-bytecompile script
    %global brp_python_bytecompile %{nil}

    # Buildrequire both python2 and python3
    BuildRequires: python-devel python3-devel
    [...]

    %install
    # Installs a python2 private module into %{buildroot}%{_datadir}/mypackage/foo
    # and installs a python3 private module into %{buildroot}%{_datadir}/mypackage/bar
    make install DESTDIR=%{buildroot}

    # Manually invoke the python byte compile macro for each path that needs byte
    # compilation.
    %{py_byte_compile} /usr/bin/python2 %{buildroot}%{_datadir}/mypackage/foo
    %{py_byte_compile} /usr/bin/python3 %{buildroot}%{_datadir}/mypackage/bar
Python modules for non-standard runtimes
Naming
Addon Packages (python3 modules)
An rpm with a python prefix or suffix means a python2 rpm, so we need a different prefix to denote python3 packages. For this, we use python3. We have two constraints that the python2 packages don't operate under:
- We need to be clear that these modules are for python3, so, unlike for python2 modules, there is no exception for packages that already have "py" in their names.
- Consumers of the packages need to be able to find them even if they don't know whether they're using the python2 or python3 version.
So all python3 modules MUST have python3 in their name. Other than that, the name must follow the same format as the python2 package name. Some examples:
Fedora python 2 package | Upstream name | Proposed python 3 package name |
---|---|---|
python-lxml | lxml | python3-lxml |
pygtk2 | pygtk | python3-pygtk |
gstreamer-python | gst-python | gstreamer-python3 |
gnome-python2 | gnome-python | gnome-python3 |
rpm-python | (part of rpm) | rpm-python3 |
Common SRPM vs split SRPMs
There are two approaches I'm experimenting with for packaging modules for python 3:
- create a separate specfile/srpm for the python 3 version
- extend an existing specfile so that it emits a python3- subpackage as part of the build.
I've experimented with both approaches for python3-setuptools.
Split/separate SRPMs: a src.rpm for python- and another for python3-
Given package python-foo in packaging CVS, there would be a separate python3-foo for the python 3 version. There would be no expectation that the two would need to upgrade in lock-step. (The two SRPMs could have different maintainers within Fedora: the packager of a python 2 module might not yet have any interest in python 3.)
Example: python3-setuptools
https://bugzilla.redhat.com/show_bug.cgi?id=531648
(simple adaptation of python-setuptools, apparently without needing an invocation of 2to3)
Dave Malcolm has written a tool which generates a python3-foo.spec from a python-foo.spec; see http://dmalcolm.fedorapeople.org/python3-packaging/rpm2to3.py
Advantages:
- if the python-foo maintainer doesn't care about python 3, he/she doesn't need to
- the two specfiles can evolve separately; if 2 and 3 need to have different versions, they can
Disadvantages:
- the two specfiles have to be maintained separately
- when upstream releases e.g. security fixes, they have to be tracked in two places
Common SRPM: a single src.rpm that emits both the python- and python3- packages
Method
- Use the -n syntax to emit a python3-foo subpackage from a python-foo build.
- Towards the end of the %prep phase, copy the code to a parallel subdirectory, and invoke 2to3 --write upon it.
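The two steps above can be sketched as a spec fragment. This is a minimal illustration rather than a tested spec: the package name foo, the tarball directory name and the installed paths are hypothetical, and the use of 2to3-3 from python3-tools (with --nobackups to avoid leaving .bak files in the tree) follows the preference stated in the Best Practices section.

```spec
Name:           python-foo
Version:        1.0
# [...]
BuildRequires:  python-devel
BuildRequires:  python3-devel
# Provides /usr/bin/2to3-3
BuildRequires:  python3-tools

%description
Hypothetical python 2 module.

# The -n option gives the subpackage an exact name instead of the default
# python-foo-SUFFIX form.
%package -n python3-foo
Summary:        Python 3 build of foo

%description -n python3-foo
Hypothetical python 3 build of the same module.

%prep
%setup -q -n foo-%{version}
# Copy the unpacked sources to a parallel directory and convert that copy
# in place for python 3.
cp -a . ../python3-foo-%{version}
2to3-3 --write --nobackups ../python3-foo-%{version}

# [...]

%files
%{python_sitelib}/foo/

%files -n python3-foo
%{python3_sitelib}/foo/
```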
Examples:
- Emitting python3-setuptools as a subpackage from python-setuptools: https://bugzilla.redhat.com/show_bug.cgi?id=531895
- Emitting python3-lxml as a subpackage from python-lxml: https://bugzilla.redhat.com/show_bug.cgi?id=533290
Advantages:
- single src.rpm and build; avoid having to update multiple packages when things change.
Disadvantages:
- The Fedora maintainer needs to care about python 3. By adding python 3 to the mix, we're giving them extra work.
- 2 and 3 versions are in lockstep. Requires upstream to care about Python 3 as well (or about Python 2, for that matter)
- Bugzilla components are set up by source RPM, so they would have a single shared bugzilla component. This could be confusing to end-users, as it would be more difficult to figure out e.g. that a bug with python3-foo needs to be filed against python-foo. There's a similar problem with checking out package sources from CVS, though this is less serious as it doesn't affect end-users so much.
The easy case is when upstream releases separate tarballs for the python 2 and python 3 versions of the code. In that case, it makes sense to follow upstream and have separate specfiles, separate source rpms, etc.
The more difficult case is when the python module is emitted as part of the build of a larger module.
One case is an extension module giving python bindings for a library built within the larger rpm. Some examples:
- the build of rpm itself emits an rpm-python subpackage (see https://bugzilla.redhat.com/show_bug.cgi?id=531543 )
- Another example is the postgresql srpm, which emits a postgresql-python subpackage.
- libvirt
I believe the ideal here is to patch the code so that it will build against both python versions, then take a copy of the sources during the %prep phase, and configure one subdirectory to build against python 2, another to build against python 3.
Macros
The python3-devel subpackage contains a /etc/rpm/macros.python3 file which contains definitions of __python3, python3_sitelib and python3_sitearch, which makes it unnecessary to define these in every module specfile (see https://bugzilla.redhat.com/show_bug.cgi?id=526126#c43 ).
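For orientation, the definitions in that file could plausibly take the shape below. This is an assumed sketch modelled on the usual python2 macro definitions, not a verbatim copy of the file shipped in python3-devel.

```spec
# Assumed shape of /etc/rpm/macros.python3 (illustrative only)
%__python3 /usr/bin/python3
%python3_sitelib %(%{__python3} -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())")
%python3_sitearch %(%{__python3} -c "from distutils.sysconfig import get_python_lib; print(get_python_lib(1))")
```

With python3-devel in the BuildRequires, a module specfile can then use the macros directly, for example in a file list (the subpackage and module names here are hypothetical):

```spec
BuildRequires:  python3-devel

%files -n python3-foo
# Pure python modules go in sitelib; compiled extensions go in sitearch.
%{python3_sitelib}/foo/
%{python3_sitearch}/_foo*.so
```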
Guidelines for adding python3 subpackages to an existing package
Use a with_python3 conditional
All parts of the build relating to python3 should be conditionalized, to make it easy to turn off the python3 build in case of problems.
You should add this fragment to the top of the spec file:
    %if 0%{?fedora} > 12
    %global with_python3 1
    %endif
Rationale: we should consistently use "with_python3". The conditionals make it easy to use the same spec for RHEL and for branches other than devel.
All usage of this macro should look like this:
    %if 0%{?with_python3}
    ...
    %endif # with_python3
This way the code will be disabled if the macro is not defined, and it is easy to visually match if/endif pairs.
Separate python 2 and python 3 build directories
The python 2 and python 3 build should be as independent as possible.
You should define a macro "py3dir" giving the location of the python 3 build directory near the top of the .spec file.
A typical definition of this macro might look like this:
    %if 0%{?with_python3}
    %global py3dir ../python3-%{name}-%{version}
    %endif # with_python3
If you have had to use a %{srcname} macro to work around differences between the rpm name and the tarball's directory name, the definition should look like this:
    %if 0%{?with_python3}
    %global py3dir ../python3-%{srcname}-%{version}
    %endif # with_python3
The %prep phase
The %prep phase of the build should prepare an entirely distinct source tree for the python3 build in the py3dir.
A recommended way to do this is to add this to the end of the %prep code:
    %if 0%{?with_python3}
    cp -a . %{py3dir}
    %endif # with_python3
Make sure that you are copying the correct code. The above code assumes that you are at the top of the sources directory (typically the "Foo-1.0" directory within the build directory). If the %prep has changed directory you will need to change back to the tarball location.
If your package requires you to apply some patches only to the python 2 build, and some patches only to the python 3 build, you should structure your %prep like this:
    %setup
    # Apply patches relevant to both python 2 and python 3:
    %patch0
    %patch1
    ...
    # Create source tree for python3 build:
    %if 0%{?with_python3}
    cp -a . %{py3dir}
    %endif # with_python3
    # Apply patches only relevant to python 2:
    %patch ...
    # Apply patches only relevant to python 3:
    %if 0%{?with_python3}
    cd %{py3dir}
    %patch ...
    %endif # with_python3
rpmbuild resets the directory at the end of each phase, so you don't need to restore the directory at the end of %prep.
Other phases
For each of the %build, %check and %install phases, you should copy the existing code, wrapping it with a pushd/popd of %{py3dir}, and convert all macro references:
- from %{__python} to %{__python3},
- from %{python_sitelib} to %{python3_sitelib}, and
- from %{python_sitearch} to %{python3_sitearch}.
For example, this %build section:
    CFLAGS="$RPM_OPT_FLAGS" %{__python} setup.py build
should become:
    # Python 2:
    CFLAGS="$RPM_OPT_FLAGS" %{__python} setup.py build

    # Python 3:
    %if 0%{?with_python3}
    pushd %{py3dir}
    CFLAGS="$RPM_OPT_FLAGS" %{__python3} setup.py build
    popd
    %endif # with_python3
so that the python 2 and python 3 versions of the code line up vertically, making it easier to see differences. The usage of pushd/popd commands will ensure that the directories are logged.
Rationale: it's not easily possible to turn this into a loop (FIXME: is it?) due to the macro differences, so we must unroll the loop and repeat ourselves.
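The same unrolling applies to the other phases. As a sketch, assuming a distutils/setuptools-based package whose %build has already run "setup.py build" in both trees and whose upstream supports "setup.py test" (both are assumptions about the package, not requirements of these guidelines), the %install and %check sections might look like this:

```spec
%install
# Python 2:
%{__python} setup.py install --skip-build --root %{buildroot}

# Python 3:
%if 0%{?with_python3}
pushd %{py3dir}
%{__python3} setup.py install --skip-build --root %{buildroot}
popd
%endif # with_python3

%check
# Python 2:
%{__python} setup.py test

# Python 3:
%if 0%{?with_python3}
pushd %{py3dir}
%{__python3} setup.py test
popd
%endif # with_python3
```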
Avoiding collisions between the python 2 and python 3 stacks
The python 2 and python 3 stacks are intended to be fully-installable in parallel. When generalizing the package for both python 2 and python 3, it is important to ensure that two different built packages do not attempt to place different payloads into the same path.
Executables in /usr/bin
The problem
Many existing python packages install executables into /usr/bin.
For example, if a setup.py shared between the python 2 and python 3 builds has a console_scripts entry, both builds will write files of the same name into /usr/bin/, and these will collide.
For example, python-coverage has a setup.py that contains:
    entry_points = {
        'console_scripts': [
            'coverage = coverage:main',
        ]
    },
which thus generates a /usr/bin/coverage executable (this is a python script that runs another python script whilst generating code-coverage information on the latter).
Similarly for the 'scripts' clause; see e.g. python-pygments: Pygments-1.1.1/setup.py has:
    scripts = ['pygmentize'],
which generates a /usr/bin/pygmentize (this is a python script that leverages the pygments syntax-highlighting module, giving a simple command-line interface for generating syntax-highlighted files).
Guidelines
If the executables provide the same functionality regardless of whether they run on top of Python 2 or Python 3, then only one version of the executable should be packaged. Currently it will be the python 2 implementation, but once the Python 3 implementation is proven to work, the executable can be retired from the python 2 build and enabled in the python 3 package. Be sure to test the new implementation. FOR DISCUSSION: how do we do the transition period?
Examples of this:
- /usr/bin/pygmentize ought to generate the same output regardless of whether it's implemented via Python 2 or Python 3, so only one version needs to be shipped.
If the executables provide different functionality for Python 2 and Python 3, then both versions should be packaged.
Examples of this:
- /usr/bin/coverage runs a python script, augmenting the interpreter with code-coverage information. Given that the interpreter itself is the thing being worked with, it's reasonable to package both versions of the executable.
- /usr/bin/bpython augments the interpreter with a "curses" interface. Again, it's reasonable to package both versions of this.
- /usr/bin/easy_install installs a module into one of the Python runtimes: we need a version for each runtime.
As an exception, for the rpms that are part of a python runtime itself, we plan to package both versions of the executables, so that e.g. both the python 2 and python 3 versions of 2to3 are packaged.
Naming
Many executables already contain a "-MAJOR.MINOR" suffix, for example /usr/bin/easy_install-3.1. These can be used as-is, as they won't conflict.
For other executables, the general rule is:
- if only one executable is to be shipped, then it owns its own slot
- if executables are to be shipped for both python 2 and python 3, then the python 3 version of the executable gains a python3- prefix. For example, the python 2 version of "coverage" remains /usr/bin/coverage and the python 3 version is /usr/bin/python3-coverage. FOR DISCUSSION: should the python 2 version gain a python2- prefix, with the main path becoming a symlink to the python2- version?
See this thread for a discussion of this.
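One way the python3- prefix rule could be applied in practice is sketched below, using the coverage example. It assumes a with_python3/py3dir layout as described in the guidelines for python3 subpackages, and that %build has already run "setup.py build" in both trees; the python 3 build is installed first so that its script can be renamed before the python 2 build writes the unprefixed name.

```spec
%install
# Python 3 first: install, then move the script to its prefixed name.
%if 0%{?with_python3}
pushd %{py3dir}
%{__python3} setup.py install --skip-build --root %{buildroot}
mv %{buildroot}%{_bindir}/coverage %{buildroot}%{_bindir}/python3-coverage
popd
%endif # with_python3

# Python 2: keeps the unprefixed /usr/bin/coverage.
%{__python} setup.py install --skip-build --root %{buildroot}

%files
%{_bindir}/coverage

%if 0%{?with_python3}
%files -n python3-coverage
%{_bindir}/python3-coverage
%endif # with_python3
```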
Best Practices
Recommended best-practices for keeping python 2 and python 3 in sync:
- when packaging a module for python 3, you should approach the python 2 package owners.
- if the python 2 and python 3 modules have separate maintainers, each maintainer should request a watchbugzilla and watchcommit on the other's package
- complete any python 2 Merge Review before doing a python 3 version
- add a link to the python 2 Merge Review/Package Review to the python 3 Package Review
- if you need to run 2to3 to fix code, use 2to3-3 to use the /usr/bin/2to3-3 from the python3-tools rpm, rather than /usr/bin/2to3 from the python-tools rpm (rationale: this is a somewhat arbitrary decision, but it seems worthwhile to have a policy here).
- if 2to3-3 runs into a problem, please file a bug. Please try to isolate a minimal test case that reproduces the problem when doing so.
- remember to test the built RPMs and verify that they actually work!
TODO
These items need to be addressed before the Guidelines can be brought to the Packaging Committee
- Must address bug 532118 so the Requires: python(abi) is automatically extracted. Then we can remove the warning from the #Multiple_Runtimes section.
- [done] brp_python_bytecompile updated
- Write py_byte_compile macro to do manual byte compilation and get it into rpm or redhat-rpm-config so it's available to packagers
- Approve the Naming Guidelines rewrite