From Fedora Project Wiki
 
(47 intermediate revisions by 3 users not shown)
Line 1: Line 1:
[[File:Automobile-number-odometer.jpg|500px|thumb|right]]
= Build Python 3 to statically link with libpython3.8.a for better performance =


== Summary ==
Python 3 traditionally in Fedora was built with a shared library libpython3.?.so and the final binary was dynimically linked against that shared library. This change is about creating the static library and linking the final python3 binary against it, as it provides significant performance improvement, up to 15% depending on the workload. The static library will not be shipped. The shared library will continue to exist in a separate subpackage. In essence, python3 will no longer depend on libpython.
Traditionally, Python 3 in Fedora has been built with the shared library libpython3.?.so, and the final binary was dynamically linked against it. This change builds the static library and links the final python3 binary against it instead, which improves performance significantly, by up to 27% depending on the workload. The static library will not be shipped. The shared library will continue to exist in a separate subpackage. In essence, python3 will no longer depend on libpython.


== Owner ==
* Name: [[User:Cstratak| Charalampos Stratakis]], [[User:Churchyard| Miro Hrončok]], [[User:Vstinner| Victor Stinner]]
* Name: [[User:Cstratak| Charalampos Stratakis]], [[User:Vstinner| Victor Stinner]], [[User:Churchyard| Miro Hrončok]]
* Email: python-maint@redhat.com
<!--- UNCOMMENT only for Changes with assigned Shepherd (by FESCo)
<!--- UNCOMMENT only for Changes with assigned Shepherd (by FESCo)
Line 30: Line 32:
== Detailed Description ==


When we compile the python3 package on Fedora (prior to this change), we create the libpython3.?.so shared library and the final python3 binary (<code>/usr/bin/python3</code>) is dynamically linked against it. However by building the libpython3.?.a static library and statically linking the final binary against it, we can achieve a performance gain of approximately 15% depending on the workload. Link time optimizations and profile guided optimizations also have a greater impact when python3 is linked statically.
When we compile the python3 package on Fedora (prior to this change), we create the libpython3.?.so shared library, and the final python3 binary (<code>/usr/bin/python3</code>) is dynamically linked against it. However, by building the libpython3.?.a static library and statically linking the final binary against it, we can achieve a performance gain of 5% to 27% depending on the workload. Link-time optimizations and profile-guided optimizations also have a greater impact when python3 is linked statically.
 
Since Python 3.8, [https://docs.python.org/3.8/whatsnew/3.8.html#debug-build-uses-the-same-abi-as-release-build C extensions must no longer be linked to libpython by default]. Applications embedding Python now need to utilize the --embed flag for python3-config to be linked to libpython. During the [[Changes/Python3.8|Python 3.8 upgrade and rebuilds]] we've uncovered various cases of packages linking to libpython implicitly through various hacks within their buildsystems and fixed as many as possible. However, there are legitimate reasons to link an application to libpython and for those cases libpython should be provided so applications that embed Python can continue to do so.
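
As a rough illustration of the difference the <code>--embed</code> flag makes, the following sketch inspects linker flags such as those printed by <code>python3-config --libs</code>. The helper name <code>links_libpython</code> and the two sample flag strings are assumptions made for this example, not output captured from a Fedora system; the actual flags depend on the build.

```python
import shlex


def links_libpython(linker_flags, version="3.8"):
    """Return True if the linker flags request libpython for this version."""
    wanted = "-lpython" + version
    # shlex.split handles quoting the same way a POSIX shell would.
    return any(flag.startswith(wanted) for flag in shlex.split(linker_flags))


# Illustrative flag strings (assumed, not captured from a real system):
# what "python3-config --libs" and "python3-config --libs --embed" might print.
extension_flags = "-lcrypt -lpthread -ldl -lutil -lm"
embed_flags = "-lpython3.8 -lcrypt -lpthread -ldl -lutil -lm"

print("C extension links libpython:", links_libpython(extension_flags))
print("Embedding app links libpython:", links_libpython(embed_flags))
```

A build system could run a similar check on its own link line to spot a C extension that still pulls in libpython implicitly.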
 
This mirrors the Debian/Ubuntu way of building Python, where they offer a statically linked binary and an additional libpython subpackage. The libpython subpackage will be created and python3-devel will depend on it, so packages that embed Python will keep working.
 
Debian and Ubuntu made this change years ago, and Python 3.8 followed. The manylinux1 and manylinux2010 ABIs do not link C extensions to libpython either (in order to support Debian/Ubuntu).
 
By applying this change, libpython's namespace will be separated from Python's, so '''C extensions that are still linked to libpython''' might experience side effects or break.
 
There is one exception for C extensions. If an application is linked to libpython in order to embed Python, C extensions used only within this application can continue to be linked to libpython.


Since Python 3.8, [https://docs.python.org/3.8/whatsnew/3.8.html#debug-build-uses-the-same-abi-as-release-build C extensions are no longer linked to libpython by default] (for example, they need to utilize the --embed flag for python3-config to do so). During the [[Changes/Python3.8|Python 3.8 upgrade and rebuilds]] we've uncovered various cases of packages linking to libpython implicitly through various hacks within their buildsystems and fixed as many as possible. However, there are legitimate reasons to link to libpython and for those cases libpython should be provided so applications that embed Python can continue to do so.
Currently there is no upstream option to build both the static and the shared library and statically link the final binary to the static one, so we have to rely on a downstream patch to achieve it. We plan to work with upstream to incorporate the changes there as well.


This mirrors the Debian/Ubuntu way of building python, where they offer a statically linked binary and an additional libpython subpackage. The libpython subpackage will be created and python3-devel will depend on it, so packages that embed Python will keep working.
Before the change, python3.8 is dynamically linked to libpython3.8:


Caveats: There is currently no upstream option to build the static library, as well as the shared one and statically link the final binary to it, so we have to rely on a downstream patch to achieve it.
<pre>
+-------------------+
|                   |
|                   |         +--------------------+
|  libpython3.8.so  <---------+ /usr/bin/python3.8 |
|                   |         +--------------------+
|                   |
+-------------------+
</pre>


Also libpython's namespace will be separated from python's, so packages that are python C extension and also embed python within their code might experience side effects or break, as with Python 3.8 there is a clear distinction between a C extension and embedding python.
After the change, python3.8 is statically linked to libpython3.8:
 
<pre>
                              +-----------------------+
                              |                       |
                              |   /usr/bin/python3.8  |
                              |                       |
+-------------------+         | +-------------------+ |
|                   |         | |                   | |
|                   |         | |                   | |
|  libpython3.8.so  |         | |  libpython3.8.a   | |
|                   |         | |                   | |
|                   |         | |                   | |
+-------------------+         | +-------------------+ |
                              +-----------------------+
</pre>
 
As a negative side effect, when both libpython3.8.so and /usr/bin/python3.8 are installed, the filesystem footprint will increase slightly (libpython3.8.so on Python 3.8.0, x86_64 is ~3.4M). On the other hand, only a very small number of packages will depend on libpython3.8.so.


== Benefit to Fedora ==
Line 70: Line 107:
-->
-->


Python's performance will increase significantly depending on the workload. Since many core components of the OS also depend on python this could lead to an increase in their performance as well.
Python's performance will increase significantly depending on the workload. Since many core components of the OS also depend on Python, this could lead to an increase in their performance as well. However, individual benchmarks will need to be conducted to verify the performance gain for those components.
 
[https://pyperformance.readthedocs.io/ pyperformance] results, ignoring differences smaller than 5%:
 
<pre>
+-------------------------+------------------+------------------------------+
| Benchmark               | python38-3.8.0-1 | python38-3.8.0-666           |
+=========================+==================+==============================+
| nbody                   | 238 ms           | 174 ms: 1.36x faster (-27%)  |
+-------------------------+------------------+------------------------------+
| raytrace                | 919 ms           | 686 ms: 1.34x faster (-25%)  |
+-------------------------+------------------+------------------------------+
| scimark_lu              | 285 ms           | 215 ms: 1.33x faster (-25%)  |
+-------------------------+------------------+------------------------------+
| scimark_sparse_mat_mult | 8.20 ms          | 6.20 ms: 1.32x faster (-24%) |
+-------------------------+------------------+------------------------------+
| django_template         | 204 ms           | 156 ms: 1.31x faster (-24%)  |
+-------------------------+------------------+------------------------------+
| chaos                   | 203 ms           | 156 ms: 1.30x faster (-23%)  |
+-------------------------+------------------+------------------------------+
| logging_simple          | 15.6 us          | 12.2 us: 1.28x faster (-22%) |
+-------------------------+------------------+------------------------------+
| richards                | 124 ms           | 97.0 ms: 1.28x faster (-22%) |
+-------------------------+------------------+------------------------------+
| scimark_fft             | 652 ms           | 511 ms: 1.27x faster (-22%)  |
+-------------------------+------------------+------------------------------+
| hexiom                  | 17.4 ms          | 13.8 ms: 1.27x faster (-21%) |
+-------------------------+------------------+------------------------------+
| logging_format          | 17.1 us          | 13.5 us: 1.27x faster (-21%) |
+-------------------------+------------------+------------------------------+
| nqueens                 | 174 ms           | 137 ms: 1.26x faster (-21%)  |
+-------------------------+------------------+------------------------------+
| crypto_pyaes            | 201 ms           | 160 ms: 1.26x faster (-20%)  |
+-------------------------+------------------+------------------------------+
| deltablue               | 12.6 ms          | 10.0 ms: 1.25x faster (-20%) |
+-------------------------+------------------+------------------------------+
| unpickle_pure_python    | 576 us           | 463 us: 1.24x faster (-20%)  |
+-------------------------+------------------+------------------------------+
| pickle_pure_python      | 799 us           | 644 us: 1.24x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| go                      | 449 ms           | 362 ms: 1.24x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| spectral_norm           | 247 ms           | 200 ms: 1.24x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| scimark_monte_carlo     | 185 ms           | 151 ms: 1.23x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| logging_silent          | 340 ns           | 276 ns: 1.23x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| unpickle                | 23.3 us          | 19.1 us: 1.22x faster (-18%) |
+-------------------------+------------------+------------------------------+
| float                   | 200 ms           | 166 ms: 1.21x faster (-17%)  |
+-------------------------+------------------+------------------------------+
| mako                    | 26.6 ms          | 22.0 ms: 1.21x faster (-17%) |
+-------------------------+------------------+------------------------------+
| xml_etree_generate      | 159 ms           | 133 ms: 1.20x faster (-17%)  |
+-------------------------+------------------+------------------------------+
| xml_etree_process       | 128 ms           | 107 ms: 1.20x faster (-16%)  |
+-------------------------+------------------+------------------------------+
| fannkuch                | 795 ms           | 670 ms: 1.19x faster (-16%)  |
+-------------------------+------------------+------------------------------+
| chameleon               | 15.7 ms          | 13.3 ms: 1.18x faster (-15%) |
+-------------------------+------------------+------------------------------+
| scimark_sor             | 347 ms           | 294 ms: 1.18x faster (-15%)  |
+-------------------------+------------------+------------------------------+
| pathlib                 | 35.7 ms          | 30.2 ms: 1.18x faster (-15%) |
+-------------------------+------------------+------------------------------+
| regex_compile           | 301 ms           | 255 ms: 1.18x faster (-15%)  |
+-------------------------+------------------+------------------------------+
| genshi_text             | 48.3 ms          | 41.2 ms: 1.17x faster (-15%) |
+-------------------------+------------------+------------------------------+
| sympy_str               | 459 ms           | 394 ms: 1.17x faster (-14%)  |
+-------------------------+------------------+------------------------------+
| genshi_xml              | 102 ms           | 87.6 ms: 1.16x faster (-14%) |
+-------------------------+------------------+------------------------------+
| 2to3                    | 540 ms           | 465 ms: 1.16x faster (-14%)  |
+-------------------------+------------------+------------------------------+
| sqlite_synth            | 4.89 us          | 4.25 us: 1.15x faster (-13%) |
+-------------------------+------------------+------------------------------+
| sympy_expand            | 704 ms           | 613 ms: 1.15x faster (-13%)  |
+-------------------------+------------------+------------------------------+
| html5lib                | 162 ms           | 141 ms: 1.15x faster (-13%)  |
+-------------------------+------------------+------------------------------+
| sympy_integrate         | 34.2 ms          | 30.0 ms: 1.14x faster (-12%) |
+-------------------------+------------------+------------------------------+
| dulwich_log             | 121 ms           | 107 ms: 1.13x faster (-11%)  |
+-------------------------+------------------+------------------------------+
| sympy_sum               | 286 ms           | 253 ms: 1.13x faster (-11%)  |
+-------------------------+------------------+------------------------------+
| xml_etree_iterparse     | 170 ms           | 152 ms: 1.12x faster (-11%)  |
+-------------------------+------------------+------------------------------+
| telco                   | 10.2 ms          | 9.14 ms: 1.11x faster (-10%) |
+-------------------------+------------------+------------------------------+
| meteor_contest          | 171 ms           | 154 ms: 1.11x faster (-10%)  |
+-------------------------+------------------+------------------------------+
| json_dumps              | 20.0 ms          | 18.0 ms: 1.11x faster (-10%) |
+-------------------------+------------------+------------------------------+
| tornado_http            | 425 ms           | 384 ms: 1.11x faster (-10%)  |
+-------------------------+------------------+------------------------------+
| xml_etree_parse         | 249 ms           | 226 ms: 1.10x faster (-9%)   |
+-------------------------+------------------+------------------------------+
| sqlalchemy_imperative   | 53.4 ms          | 49.6 ms: 1.08x faster (-7%)  |
+-------------------------+------------------+------------------------------+
| python_startup          | 13.7 ms          | 12.7 ms: 1.07x faster (-7%)  |
+-------------------------+------------------+------------------------------+
| json_loads              | 43.3 us          | 40.7 us: 1.06x faster (-6%)  |
+-------------------------+------------------+------------------------------+
| python_startup_no_site  | 9.29 ms          | 8.75 ms: 1.06x faster (-6%)  |
+-------------------------+------------------+------------------------------+
| pickle_dict             | 33.8 us          | 32.0 us: 1.06x faster (-5%)  |
+-------------------------+------------------+------------------------------+
| sqlalchemy_declarative  | 272 ms           | 258 ms: 1.05x faster (-5%)   |
+-------------------------+------------------+------------------------------+
</pre>
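
The two figures in the last column are related by simple arithmetic. The sketch below recomputes them from the rounded timings shown in the table; the function name <code>compare</code> is made up for this example. Because pyperformance works from unrounded means, a recomputed ratio can differ slightly from the printed one.

```python
def compare(old, new):
    """Return (speedup ratio, percent change) for two timings in the same unit."""
    ratio = old / new
    percent = (new - old) / old * 100.0
    return ratio, percent


# nbody timings from the table above, in milliseconds
ratio, percent = compare(238.0, 174.0)
print("%.2fx faster (%+.0f%%)" % (ratio, percent))
```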


== Scope ==
* Proposal owners:
<!-- What work do the feature owners have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
** Review, merge and build the [https://src.fedoraproject.org/rpms/python3/pull-request/133 pull request with the implementation].
** Review and merge the [https://src.fedoraproject.org/rpms/python3/pull-request/133 pull request with the implementation].
** Go through the packages that embed python in their code and see if things work correctly. Will provide a copr repository to test.
** Go through the Python C extension packages that are linked to libpython and test if things work correctly. A copr repository will be provided for testing.


* Other developers: Other developers are encouraged to test the new statically linked python3 to see if their package works as expected <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Other developers: Other developers are encouraged to test the new statically linked python3 and check if their package works as expected <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- What work do other developers have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->


* Release engineering: [https://pagure.io/releng/issues #Releng issue number] (mass rebuild not needed, no releng impact anticipated) <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Release engineering: [https://pagure.io/releng/issue/8953 #8953]. This change does not require a mass rebuild, but the affected packages will need to be rebuilt. They will be rebuilt in copr first.
<!-- Does this feature require coordination with release engineering (e.g. changes to installer image generation or update package delivery)?  Is a mass rebuild required?  include a link to the releng issue.  
The issue is required to be filed prior to feature submission, to ensure that someone is on board to do any process development work and testing, and that all changes make it into the pipeline; a bullet point in a change is not sufficient communication -->


* Policies and guidelines: No changes are required to the packaging guidelines or other documents <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Policies and guidelines: The packaging guidelines will need to be updated to explicitly mention that C extensions should not be linked to libpython, and that the python3 binary is statically linked. <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- Do the packaging guidelines or other documents need to be updated for this feature?  If so, does it need to happen before or after the implementation is done?  If a FPC ticket exists, add a link here. -->


Line 95: Line 242:


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)
Affected package maintainers should verify that their packages work as expected. The only impact end users should see is a performance increase for workloads relying on Python.


== How To Test ==
Line 113: Line 260:


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)  
 
Copr repo with instructions: https://copr.fedorainfracloud.org/coprs/g/python/Python3_statically_linked/
 
=== Package changes test ===
The change will bring the new <code>libpython3</code> subpackage as a dependency of <code>python3-devel</code>.
 
Test that it's installed:
<pre>
$ rpm -q libpython3
</pre>
 
Test that it's uninstalled if <code>python3-devel</code> is removed:
<pre>
$ dnf remove python3-devel
</pre>
 
Test that <code>python3-libs</code> no longer includes the libpython shared library.
<pre>
$ rpm -ql python3-libs | grep libpython3
</pre>
 
=== Dynamic linker test ===
 
To check that the python3.8 program is not linked to libpython, ldd can be used. For example, Python 3.7 will still be linked to libpython:
 
<pre>
$ ldd /usr/bin/python3.7|grep libpython
libpython3.7m.so.1.0 => /lib64/libpython3.7m.so.1.0 (0x00007fbb57333000)
</pre>
 
But python3.8 will no longer be linked to libpython:
 
<pre>
$ ldd /usr/bin/python3.8|grep libpython
</pre>
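
The same check can be scripted. The following is a minimal sketch, assuming <code>ldd</code> is available on the system; the helper names <code>libpython_lines</code> and <code>check_interpreter</code> are made up for this example.

```python
import subprocess
import sys


def libpython_lines(ldd_output):
    """Return the lines of ldd output that mention libpython."""
    return [line.strip() for line in ldd_output.splitlines()
            if "libpython" in line]


def check_interpreter(path=sys.executable):
    """Run ldd on an interpreter and return its libpython dependencies."""
    result = subprocess.run(["ldd", path], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return libpython_lines(result.stdout)


if __name__ == "__main__":
    try:
        found = check_interpreter()
    except (OSError, subprocess.CalledProcessError) as exc:
        # ldd is missing, or it could not inspect the binary
        print("could not run ldd:", exc)
    else:
        if found:
            print("dynamically linked to libpython:", found)
        else:
            print("not linked to libpython")
```

An empty list corresponds to the empty <code>grep</code> output shown for python3.8 above.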
 
=== Performance test ===
 
The performance speedup can be measured using the official Python benchmark suite [https://pyperformance.readthedocs.io/ pyperformance]: see [https://pyperformance.readthedocs.io/usage.html#run-benchmarks Run benchmarks].
 
=== Namespace test ===
 
The following script can be used to verify that the change is in effect:
 
<pre>
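import ctypes
import sys

# In CPython the empty tuple is a singleton within one copy of the runtime,
# so it can be used to tell whether two libraries share the same state.
EMPTY_TUPLE_SINGLETON = ()


def get_empty_tuple(lib):
    """Call PyTuple_New(0) from the given library and return the result."""
    func = lib.PyTuple_New
    func.argtypes = (ctypes.c_ssize_t,)
    func.restype = ctypes.py_object
    return func(0)


def test_lib(libname, lib):
    """Report whether lib shares its namespace with the running interpreter."""
    obj = get_empty_tuple(lib)
    if obj is EMPTY_TUPLE_SINGLETON:
        print("%s: SAME namespace" % libname)
    else:
        print("%s: DIFFERENT namespace" % libname)


def test():
    # Symbols as seen by the running program
    program = ctypes.pythonapi

    if hasattr(sys, 'abiflags'):
        abiflags = sys.abiflags
    else:
        # Python 2
        abiflags = ''
    ver = sys.version_info
    filename = ('libpython%s.%s%s.so.1.0'
                % (ver.major, ver.minor, abiflags))
    # Explicitly load the libpython shared library
    libpython = ctypes.cdll.LoadLibrary(filename)

    test_lib('program', program)
    test_lib('libpython', libpython)


test()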
</pre>
 
Output before the change:
<pre>
program: SAME namespace
libpython: SAME namespace
</pre>
 
Output after the change:
 
<pre>
program: SAME namespace
libpython: DIFFERENT namespace
</pre>


== User Experience ==
Line 126: Line 367:
  - Green has been scientifically proven to be the most relaxing color. The move to a default background color of green with green text will result in Fedora users being the most relaxed users of any operating system.
  - Green has been scientifically proven to be the most relaxing color. The move to a default background color of green with green text will result in Fedora users being the most relaxed users of any operating system.
-->
-->
Python-based workloads should see a performance gain of up to 27%.


== Dependencies ==
Line 131: Line 374:


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)  
While this specific change is not dependent on anything else, we would like to ensure that all the packages that link to libpython continue to work as expected.
 
As of 2019-10-30, 118 packages in Rawhide depend on libpython.
 
Result of the "repoquery --repo=rawhide --source --whatrequires 'libpython3.8.so.1.0()(64bit)' " command on Fedora Rawhide, x86_64:
 
*COPASI
*Io-language
*OpenImageIO
*YafaRay
*antimony
*blender
*boost
*calamares
*calibre
*cantor
*ceph
*clingo
*condor
*createrepo_c
*csound
*cvc4
*dionaea
*dmlite
*domoticz
*fontforge
*freecad
*gdb
*gdcm
*gdl
*getdp
*glade
*globus-net-manager
*glom
*gnucash
*gpaw
*hamlib
*hokuyoaist
*hugin
*insight
*kdevelop-python
*kicad
*kitty
*krita
*lammps
*ldns
*libCombine
*libarcus
*libarcus-lulzbot
*libbatch
*libcec
*libcomps
*libdnf
*libftdi
*libkml
*libkolabxml
*libldb
*libnuml
*libpeas
*libplist
*libreoffice
*librepo
*libsavitar
*libsbml
*libsedml
*libtalloc
*libyang
*libyui-bindings
*link-grammar
*lldb
*mathgl
*med
*mod_wsgi
*nautilus-python
*nbdkit
*nest
*netgen-mesher
*neuron
*nextpnr
*nordugrid-arc
*nwchem
*openbabel
*openscap
*opentrep
*openvdb
*pam_wrapper
*paraview
*perl-Inline-Python
*pidgin
*pitivi
*plplot
*postgresql
*pynac
*pyotherside
*pythia8
*python
*python-gstreamer1
*python-jep
*python-qt5
*python3
*qgis
*qpid-dispatch
*qpid-proton
*rdkit
*renderdoc
*rmol
*root
*samba
*scidavis
*sigil
*swift-lang
*texworks
*thunarx-python
*trademgen
*trellis
*unbound
*uwsgi
*vdr-epg-daemon
*vigra
*vim
*vrpn
*vtk
*weechat
*znc


== Contingency Plan ==


<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "Revert the shipped configuration".  Or it might not (e.g. rebuilding a number of dependent packages).  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
* Contingency mechanism: (What to do? Who will do it?) N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Contingency mechanism: If issues appear that cannot be fixed in a timely manner, the change can be easily reverted and will be considered again for the next Fedora release. A proper upgrade path will also be provided if the change is reverted, since libpython3.?.so will be a separate package with this change.
<!-- When is the last time the contingency mechanism can be put in place?  This will typically be the beta freeze. -->
* Contingency deadline: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Contingency deadline: Before the beta freeze of Fedora 32 (2020-02-25)
<!-- Does finishing this feature block the release, or can we ship with the feature in incomplete state? -->
* Blocks release? N/A (not a System Wide Change), Yes/No <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Blocks release? Yes
* Blocks product? product <!-- Applicable for Changes that blocks specific product release/Fedora.next -->
* Blocks product? None


== Documentation ==
Line 147: Line 513:


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)
The change will be documented in the updated Python packaging guidelines.


== Release Notes ==

Latest revision as of 13:25, 31 October 2019

Build Python 3 to statically link with libpython3.8.a for better performance

Summary

Traditionally, Python 3 in Fedora has been built with the shared library libpython3.?.so, and the final binary was dynamically linked against it. This change builds the static library and links the final python3 binary against it instead, which improves performance significantly, by up to 27% depending on the workload. The static library will not be shipped. The shared library will continue to exist in a separate subpackage. In essence, python3 will no longer depend on libpython.

Owner

Current status

  • Targeted release: Fedora 32
  • Last updated: 2019-10-31
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

When we compile the python3 package on Fedora (prior to this change), we create the libpython3.?.so shared library, and the final python3 binary (/usr/bin/python3) is dynamically linked against it. However, by building the libpython3.?.a static library and statically linking the final binary against it, we can achieve a performance gain of 5% to 27% depending on the workload. Link-time optimizations and profile-guided optimizations also have a greater impact when python3 is linked statically.

Since Python 3.8, C extensions must no longer be linked to libpython by default. Applications embedding Python now need to utilize the --embed flag for python3-config to be linked to libpython. During the Python 3.8 upgrade and rebuilds we've uncovered various cases of packages linking to libpython implicitly through various hacks within their buildsystems and fixed as many as possible. However, there are legitimate reasons to link an application to libpython and for those cases libpython should be provided so applications that embed Python can continue to do so.

This mirrors the Debian/Ubuntu way of building Python, where they offer a statically linked binary and an additional libpython subpackage. The libpython subpackage will be created and python3-devel will depend on it, so packages that embed Python will keep working.

Debian and Ubuntu made this change years ago, and Python 3.8 followed. The manylinux1 and manylinux2010 ABIs do not link C extensions to libpython either (in order to support Debian/Ubuntu).

By applying this change, libpython's namespace will be separated from Python's, so C extensions that are still linked to libpython might experience side effects or break.

There is one exception for C extensions. If an application is linked to libpython in order to embed Python, C extensions used only within this application can continue to be linked to libpython.

Currently there is no upstream option to build both the static and the shared library and statically link the final binary to the static one, so we have to rely on a downstream patch to achieve it. We plan to work with upstream to incorporate the changes there as well.

Before the change, python3.8 is dynamically linked to libpython3.8:

+-------------------+
|                   |
|                   |         +--------------------+
|  libpython3.8.so  <---------+ /usr/bin/python3.8 |
|                   |         +--------------------+
|                   |
+-------------------+

After the change, python3.8 is statically linked to libpython3.8:

                              +-----------------------+
                              |                       |
                              |   /usr/bin/python3.8  |
                              |                       |
+-------------------+         | +-------------------+ |
|                   |         | |                   | |
|                   |         | |                   | |
|  libpython3.8.so  |         | |  libpython3.8.a   | |
|                   |         | |                   | |
|                   |         | |                   | |
+-------------------+         | +-------------------+ |
                              +-----------------------+

As a negative side effect, when both libpython3.8.so and /usr/bin/python3.8 are installed, the filesystem footprint will increase slightly (libpython3.8.so on Python 3.8.0, x86_64 is ~3.4M). On the other hand, only a very small number of packages will depend on libpython3.8.so.

Benefit to Fedora

Python's performance will increase significantly depending on the workload. Since many core components of the OS also depend on Python, this could lead to an increase in their performance as well. However, individual benchmarks will need to be conducted to verify the performance gain for those components.

pyperformance results, ignoring differences smaller than 5%:

+-------------------------+------------------+------------------------------+
| Benchmark               | python38-3.8.0-1 | python38-3.8.0-666           |
+=========================+==================+==============================+
| nbody                   | 238 ms           | 174 ms: 1.36x faster (-27%)  |
+-------------------------+------------------+------------------------------+
| raytrace                | 919 ms           | 686 ms: 1.34x faster (-25%)  |
+-------------------------+------------------+------------------------------+
| scimark_lu              | 285 ms           | 215 ms: 1.33x faster (-25%)  |
+-------------------------+------------------+------------------------------+
| scimark_sparse_mat_mult | 8.20 ms          | 6.20 ms: 1.32x faster (-24%) |
+-------------------------+------------------+------------------------------+
| django_template         | 204 ms           | 156 ms: 1.31x faster (-24%)  |
+-------------------------+------------------+------------------------------+
| chaos                   | 203 ms           | 156 ms: 1.30x faster (-23%)  |
+-------------------------+------------------+------------------------------+
| logging_simple          | 15.6 us          | 12.2 us: 1.28x faster (-22%) |
+-------------------------+------------------+------------------------------+
| richards                | 124 ms           | 97.0 ms: 1.28x faster (-22%) |
+-------------------------+------------------+------------------------------+
| scimark_fft             | 652 ms           | 511 ms: 1.27x faster (-22%)  |
+-------------------------+------------------+------------------------------+
| hexiom                  | 17.4 ms          | 13.8 ms: 1.27x faster (-21%) |
+-------------------------+------------------+------------------------------+
| logging_format          | 17.1 us          | 13.5 us: 1.27x faster (-21%) |
+-------------------------+------------------+------------------------------+
| nqueens                 | 174 ms           | 137 ms: 1.26x faster (-21%)  |
+-------------------------+------------------+------------------------------+
| crypto_pyaes            | 201 ms           | 160 ms: 1.26x faster (-20%)  |
+-------------------------+------------------+------------------------------+
| deltablue               | 12.6 ms          | 10.0 ms: 1.25x faster (-20%) |
+-------------------------+------------------+------------------------------+
| unpickle_pure_python    | 576 us           | 463 us: 1.24x faster (-20%)  |
+-------------------------+------------------+------------------------------+
| pickle_pure_python      | 799 us           | 644 us: 1.24x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| go                      | 449 ms           | 362 ms: 1.24x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| spectral_norm           | 247 ms           | 200 ms: 1.24x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| scimark_monte_carlo     | 185 ms           | 151 ms: 1.23x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| logging_silent          | 340 ns           | 276 ns: 1.23x faster (-19%)  |
+-------------------------+------------------+------------------------------+
| unpickle                | 23.3 us          | 19.1 us: 1.22x faster (-18%) |
+-------------------------+------------------+------------------------------+
| float                   | 200 ms           | 166 ms: 1.21x faster (-17%)  |
+-------------------------+------------------+------------------------------+
| mako                    | 26.6 ms          | 22.0 ms: 1.21x faster (-17%) |
+-------------------------+------------------+------------------------------+
| xml_etree_generate      | 159 ms           | 133 ms: 1.20x faster (-17%)  |
+-------------------------+------------------+------------------------------+
| xml_etree_process       | 128 ms           | 107 ms: 1.20x faster (-16%)  |
+-------------------------+------------------+------------------------------+
| fannkuch                | 795 ms           | 670 ms: 1.19x faster (-16%)  |
+-------------------------+------------------+------------------------------+
| chameleon               | 15.7 ms          | 13.3 ms: 1.18x faster (-15%) |
+-------------------------+------------------+------------------------------+
| scimark_sor             | 347 ms           | 294 ms: 1.18x faster (-15%)  |
+-------------------------+------------------+------------------------------+
| pathlib                 | 35.7 ms          | 30.2 ms: 1.18x faster (-15%) |
+-------------------------+------------------+------------------------------+
| regex_compile           | 301 ms           | 255 ms: 1.18x faster (-15%)  |
+-------------------------+------------------+------------------------------+
| genshi_text             | 48.3 ms          | 41.2 ms: 1.17x faster (-15%) |
+-------------------------+------------------+------------------------------+
| sympy_str               | 459 ms           | 394 ms: 1.17x faster (-14%)  |
+-------------------------+------------------+------------------------------+
| genshi_xml              | 102 ms           | 87.6 ms: 1.16x faster (-14%) |
+-------------------------+------------------+------------------------------+
| 2to3                    | 540 ms           | 465 ms: 1.16x faster (-14%)  |
+-------------------------+------------------+------------------------------+
| sqlite_synth            | 4.89 us          | 4.25 us: 1.15x faster (-13%) |
+-------------------------+------------------+------------------------------+
| sympy_expand            | 704 ms           | 613 ms: 1.15x faster (-13%)  |
+-------------------------+------------------+------------------------------+
| html5lib                | 162 ms           | 141 ms: 1.15x faster (-13%)  |
+-------------------------+------------------+------------------------------+
| sympy_integrate         | 34.2 ms          | 30.0 ms: 1.14x faster (-12%) |
+-------------------------+------------------+------------------------------+
| dulwich_log             | 121 ms           | 107 ms: 1.13x faster (-11%)  |
+-------------------------+------------------+------------------------------+
| sympy_sum               | 286 ms           | 253 ms: 1.13x faster (-11%)  |
+-------------------------+------------------+------------------------------+
| xml_etree_iterparse     | 170 ms           | 152 ms: 1.12x faster (-11%)  |
+-------------------------+------------------+------------------------------+
| telco                   | 10.2 ms          | 9.14 ms: 1.11x faster (-10%) |
+-------------------------+------------------+------------------------------+
| meteor_contest          | 171 ms           | 154 ms: 1.11x faster (-10%)  |
+-------------------------+------------------+------------------------------+
| json_dumps              | 20.0 ms          | 18.0 ms: 1.11x faster (-10%) |
+-------------------------+------------------+------------------------------+
| tornado_http            | 425 ms           | 384 ms: 1.11x faster (-10%)  |
+-------------------------+------------------+------------------------------+
| xml_etree_parse         | 249 ms           | 226 ms: 1.10x faster (-9%)   |
+-------------------------+------------------+------------------------------+
| sqlalchemy_imperative   | 53.4 ms          | 49.6 ms: 1.08x faster (-7%)  |
+-------------------------+------------------+------------------------------+
| python_startup          | 13.7 ms          | 12.7 ms: 1.07x faster (-7%)  |
+-------------------------+------------------+------------------------------+
| json_loads              | 43.3 us          | 40.7 us: 1.06x faster (-6%)  |
+-------------------------+------------------+------------------------------+
| python_startup_no_site  | 9.29 ms          | 8.75 ms: 1.06x faster (-6%)  |
+-------------------------+------------------+------------------------------+
| pickle_dict             | 33.8 us          | 32.0 us: 1.06x faster (-5%)  |
+-------------------------+------------------+------------------------------+
| sqlalchemy_declarative  | 272 ms           | 258 ms: 1.05x faster (-5%)   |
+-------------------------+------------------+------------------------------+
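
The two figures in the last column can be recomputed from the two timings. The following is a minimal sketch, assuming the speedup is the old timing divided by the new one and the percentage is the relative change in run time; the function names are illustrative. The table shows rounded timings, so recomputing from them may differ from the published figure in the last digit for some rows.

```python
def speedup(old, new):
    """Return how many times faster the new timing is (old / new)."""
    return old / new


def relative_change(old, new):
    """Return the change in run time as a percentage of the old timing."""
    return (new - old) / old * 100.0


# raytrace row: 919 ms before the change, 686 ms after it
old_ms, new_ms = 919, 686
print("%.2fx faster (%+.0f%%)"
      % (speedup(old_ms, new_ms), relative_change(old_ms, new_ms)))
# prints: 1.34x faster (-25%)
```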

Scope

  • Proposal owners:
    • Review and merge the pull request with the implementation.
    • Go through the Python C extension packages that are linked to libpython and test if things work correctly. A copr repository will be provided for testing.
  • Other developers: Other developers are encouraged to test the new statically linked python3 and check if their package works as expected.
  • Release engineering: #8953. This change does not require a mass rebuild; however, a rebuild of the affected packages will be required. The affected packages will be rebuilt in copr first.
  • Policies and guidelines: The packaging guidelines will need to be updated to explicitly mention that C extensions should not be linked to libpython, and that the python3 binary is statically linked.
  • Trademark approval: N/A (not needed for this Change)
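
When going through C extension packages, one way to see whether a built extension module is linked to libpython is to look at its direct dependencies (the NEEDED entries printed by readelf -d), since ldd also lists transitive ones. The sketch below is only an illustration: the helper names are made up for this example, and it assumes the readelf tool from binutils is installed.

```python
import re
import subprocess

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libpython3.8.so.1.0]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[(.+?)\]")


def needed_libraries(readelf_output):
    """Extract the directly needed library names from `readelf -d` output."""
    return NEEDED_RE.findall(readelf_output)


def links_libpython(readelf_output):
    """Return True if any directly needed library is a libpython."""
    return any(name.startswith("libpython")
               for name in needed_libraries(readelf_output))


def extension_links_libpython(path):
    """Run `readelf -d` on an extension module and check its direct deps."""
    result = subprocess.run(["readelf", "-d", path],
                            capture_output=True, text=True, check=True)
    return links_libpython(result.stdout)
```

For example, extension_links_libpython() could be called on each .so file that a package installs into site-packages; a True result marks an extension that is still linked to libpython.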

Upgrade/compatibility impact

Affected package maintainers should verify that their packages work as expected. The only impact end users should see is a performance increase for workloads relying on Python.

How To Test

Copr repo with instructions: https://copr.fedorainfracloud.org/coprs/g/python/Python3_statically_linked/

Package changes test

The change will bring the new libpython3 subpackage as a dependency of python3-devel.

Test that it's installed:

$ rpm -q libpython3

Test that it's uninstalled if python3-devel is removed:

$ dnf remove python3-devel

Test that python3-libs no longer includes the libpython shared library.

$ rpm -ql python3-libs | grep libpython3

Dynamic linker test

To check that the python3.8 program is not linked to libpython, ldd can be used. For example, Python 3.7 will still be linked to libpython:

$ ldd /usr/bin/python3.7|grep libpython
libpython3.7m.so.1.0 => /lib64/libpython3.7m.so.1.0 (0x00007fbb57333000)

But python3.8 will no longer be linked to libpython:

$ ldd /usr/bin/python3.8|grep libpython
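
The same ldd check can be scripted, for instance to run it over several interpreters at once. This is a minimal sketch with illustrative helper names; it assumes ldd is available and simply filters its output for libpython, like the grep above.

```python
import subprocess


def libpython_lines(ldd_output):
    """Return the lines of ldd output that mention libpython."""
    return [line.strip()
            for line in ldd_output.splitlines()
            if "libpython" in line]


def is_linked_to_libpython(binary):
    """Run ldd on a binary and report whether libpython shows up."""
    result = subprocess.run(["ldd", binary],
                            capture_output=True, text=True, check=True)
    return bool(libpython_lines(result.stdout))
```

With this change in place, is_linked_to_libpython("/usr/bin/python3.8") is expected to return False, while a dynamically linked interpreter such as /usr/bin/python3.7 would return True.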

Performance test

The performance speedup can be measured using the official Python benchmark suite pyperformance: see Run benchmarks.

Namespace test

The following script can be used to verify that the change is in effect:

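import ctypes
import sys

# CPython keeps a single shared empty tuple per copy of the runtime.
EMPTY_TUPLE_SINGLETON = ()

def get_empty_tuple(lib):
    # Call PyTuple_New(0) through the given library handle
    func = lib.PyTuple_New
    func.argtypes = (ctypes.c_ssize_t,)
    func.restype = ctypes.py_object
    return func(0)

def test_lib(libname, lib):
    # Getting our own empty tuple back means the library shares state with
    # the running interpreter; a different object means a second runtime copy.
    obj = get_empty_tuple(lib)
    if obj is EMPTY_TUPLE_SINGLETON:
        print("%s: SAME namespace" % libname)
    else:
        print("%s: DIFFERENT namespace" % libname)

def test():
    # Handle to the Python C API of the running process
    program = ctypes.pythonapi

    if hasattr(sys, 'abiflags'):
        abiflags = sys.abiflags
    else:
        # Python 2
        abiflags = ''
    # Explicitly load the shared libpython matching the running version
    ver = sys.version_info
    filename = ('libpython%s.%s%s.so.1.0'
                % (ver.major, ver.minor, abiflags))
    libpython = ctypes.cdll.LoadLibrary(filename)

    test_lib('program', program)
    test_lib('libpython', libpython)

test()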

Output before the change:

program: SAME namespace
libpython: SAME namespace

Output after the change:

program: SAME namespace
libpython: DIFFERENT namespace

User Experience

Python-based workloads should see a performance gain of up to 27%.

Dependencies

While this specific change is not dependent on anything else, we would like to ensure that all the packages that link to libpython continue to work as expected.

Currently (30/10/2019) 118 packages on rawhide depend on libpython.

Result of the "repoquery --repo=rawhide --source --whatrequires 'libpython3.8.so.1.0()(64bit)' " command on Fedora Rawhide, x86_64:

  • COPASI
  • Io-language
  • OpenImageIO
  • YafaRay
  • antimony
  • blender
  • boost
  • calamares
  • calibre
  • cantor
  • ceph
  • clingo
  • condor
  • createrepo_c
  • csound
  • cvc4
  • dionaea
  • dmlite
  • domoticz
  • fontforge
  • freecad
  • gdb
  • gdcm
  • gdl
  • getdp
  • glade
  • globus-net-manager
  • glom
  • gnucash
  • gpaw
  • hamlib
  • hokuyoaist
  • hugin
  • insight
  • kdevelop-python
  • kicad
  • kitty
  • krita
  • lammps
  • ldns
  • libCombine
  • libarcus
  • libarcus-lulzbot
  • libbatch
  • libcec
  • libcomps
  • libdnf
  • libftdi
  • libkml
  • libkolabxml
  • libldb
  • libnuml
  • libpeas
  • libplist
  • libreoffice
  • librepo
  • libsavitar
  • libsbml
  • libsedml
  • libtalloc
  • libyang
  • libyui-bindings
  • link-grammar
  • lldb
  • mathgl
  • med
  • mod_wsgi
  • nautilus-python
  • nbdkit
  • nest
  • netgen-mesher
  • neuron
  • nextpnr
  • nordugrid-arc
  • nwchem
  • openbabel
  • openscap
  • opentrep
  • openvdb
  • pam_wrapper
  • paraview
  • perl-Inline-Python
  • pidgin
  • pitivi
  • plplot
  • postgresql
  • pynac
  • pyotherside
  • pythia8
  • python
  • python-gstreamer1
  • python-jep
  • python-qt5
  • python3
  • qgis
  • qpid-dispatch
  • qpid-proton
  • rdkit
  • renderdoc
  • rmol
  • root
  • samba
  • scidavis
  • sigil
  • swift-lang
  • texworks
  • thunarx-python
  • trademgen
  • trellis
  • unbound
  • uwsgi
  • vdr-epg-daemon
  • vigra
  • vim
  • vrpn
  • vtk
  • weechat
  • znc

Contingency Plan

  • Contingency mechanism: If issues appear that cannot be fixed in a timely manner, the change can be easily reverted and will be considered again for the next Fedora release. A proper upgrade path will also be provided in case of reversion, since libpython3.?.so will be a separate package with this change.
  • Contingency deadline: Before the beta freeze of Fedora 32 (2020-02-25)
  • Blocks release? Yes
  • Blocks product? None

Documentation

The change will be documented in the updated Python packaging guidelines.

Release Notes