From Fedora Project Wiki

Revision as of 16:34, 29 June 2010 by Toshio (talk | contribs) (More ways to do this)

Here are some ways that you can unbundle Python modules. Remember that upstreams usually feel they have a reason to bundle libraries, so we need to offer them solutions that satisfy their concerns while also allowing us to unbundle for maintenance and security reasons.

Private API

When upstream considers the module an implementation detail and not something that software outside of their module should be importing, we have more flexibility in how to fix things.

This example can be used when upstream just wants to make sure that a copy of a library is available (bundling as a "just in case this library is not installed" measure). In the RPM spec file we'll specify the Requires that pulls in the library, so the system library is always found first and the bundled code is never used.

try:
    from system.library import foo
except ImportError:
    from bundled._copy import foo

Sometimes upstream will claim that they are also making sure that the library has a compatible API or is of a recent enough version to have a bugfix. When this occurs we should make sure that our packages have the bugfix or required version and do something like this:

try:
    from system.library import foo
    if foo.__version_info__ < (1, 0):
        foo = None
except ImportError:
    foo = None

if foo is None:
    from bundled._copy import foo
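
The same check can be wrapped in a small helper when several modules need it. The following is only a sketch: load_preferred, fake_system_lib and fake_bundled_copy are hypothetical names, and __version_info__ is assumed to be how the library exposes its version (many libraries use a different attribute). Stand-in modules are registered in sys.modules so the example runs without any real package installed.

```python
import importlib
import sys
import types


def load_preferred(system_name, bundled_name, min_version):
    """Return the system module if it is new enough, else the bundled one."""
    try:
        module = importlib.import_module(system_name)
    except ImportError:
        module = None

    if module is not None:
        # __version_info__ is an assumed attribute name; adapt it to
        # whatever the real library provides.
        version = getattr(module, "__version_info__", None)
        if version is None or tuple(version) < tuple(min_version):
            module = None

    if module is None:
        module = importlib.import_module(bundled_name)
    return module


# Stand-in modules so the sketch runs without real packages installed.
old_system = types.ModuleType("fake_system_lib")
old_system.__version_info__ = (0, 9)
bundled = types.ModuleType("fake_bundled_copy")
bundled.__version_info__ = (1, 2)
sys.modules["fake_system_lib"] = old_system
sys.modules["fake_bundled_copy"] = bundled

# The system copy is older than 1.0, so the bundled copy is chosen.
chosen = load_preferred("fake_system_lib", "fake_bundled_copy", (1, 0))
print(chosen.__name__)
```

A system module that does not advertise a version at all is treated as too old here, which is the cautious choice; upstream may prefer the opposite default.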

If there's no way to check whether the version matches (or test for the feature), we can actually remove the bundled code from the filesystem and submit code like this upstream:

try:
    from bundled._copy import foo
except ImportError:
    from system.library import foo

The bundled copy will be used by people who just download upstream's code. In the Fedora package, since we've removed bundled/_copy, the import of the bundled copy fails and the system library is loaded instead.
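
To see the effect of removing the bundled copy, here is a rough simulation that builds both layouts in a temporary directory. All names (app_upstream, app_fedora, system_library_demo) are hypothetical stand-ins, and the wrapper uses a relative import so the same text works under both package names.

```python
import importlib
import os
import sys
import tempfile

# Same wrapper code in both layouts: bundled copy first, system library second.
WRAPPER = (
    "try:\n"
    "    from ._copy import foo\n"
    "except ImportError:\n"
    "    from system_library_demo import foo\n"
)


def write(path, text):
    with open(path, "w") as handle:
        handle.write(text)


root = tempfile.mkdtemp()

# The "system" library, as the distribution would install it.
write(os.path.join(root, "system_library_demo.py"), "foo = 'system'\n")

# Upstream layout: the private bundled copy is present.
os.mkdir(os.path.join(root, "app_upstream"))
write(os.path.join(root, "app_upstream", "__init__.py"), WRAPPER)
write(os.path.join(root, "app_upstream", "_copy.py"), "foo = 'bundled'\n")

# Packaged layout: same wrapper, but _copy.py was removed at build time.
os.mkdir(os.path.join(root, "app_fedora"))
write(os.path.join(root, "app_fedora", "__init__.py"), WRAPPER)

sys.path.insert(0, root)
importlib.invalidate_caches()

upstream = importlib.import_module("app_upstream")
packaged = importlib.import_module("app_fedora")
print(upstream.foo, packaged.foo)
```

In this simulation the upstream layout ends up with the bundled value and the packaged layout with the system one, without any difference in the Python code itself.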

Delete all bundled code?
Question for security team: Since the code is private, should we delete the bundled copies in our spec files in all these examples? The last one is the only one where it's strictly necessary in order for us not to use the code in our package.


Public API

Sometimes upstream considers the code to be part of their public API. Upstream expects people to do from bundled.copy import foo in their code. There are fewer options for handling this case.

Let's say that bundled.copy is a single Python file that looks like this:

# File: bundled/copy.py
class foo(object):
    pass

There are several ways to make this prefer a system library. This one makes the change in a single file and is appropriate when the bundled copy is small:

# File: bundled/copy.py
try:
    from system.library import foo
except ImportError:
    class foo(object):
        pass

A slight variant on the above can be used if upstream wants to do something like unit test the bundled code. By always defining the bundled class (under the private name _foo), unit tests can access the bundled code when necessary, but normal code will use the system library:

# File: bundled/copy.py
try:
    from system.library import foo
except ImportError:
    foo = None

class _foo(object):
    pass

if foo is None:
    foo = _foo

When a module is large, reindenting the whole file may not be the best idea. You can turn the module into a package (a directory) instead:

# File: bundled/copy.py => Delete this file

# Directory: bundled/copy => Create this new directory

# File: bundled/copy/__init__.py
try:
    # Sometimes you can get away with import * here
    # Other times you have to list all the things you want to import explicitly
    from system.library import foo
except ImportError:
    # Explicit relative import; a bare "from _copy import foo" fails on Python 3
    from ._copy import foo

# File: bundled/copy/_copy.py
class foo(object):
    pass
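
The comment in bundled/copy/__init__.py about import * deserves a short illustration. A star import only binds the names listed in the module's __all__ (or, when __all__ is absent, the names that do not start with an underscore), so anything else that callers rely on has to be imported explicitly. In this sketch system_library_star_demo and helper are hypothetical stand-ins for the system library and one of its unexported names.

```python
import sys
import types

# Hypothetical stand-in for the system library, built in memory.
lib = types.ModuleType("system_library_star_demo")
exec(
    "__all__ = ['foo']\n"
    "class foo(object):\n"
    "    pass\n"
    "def helper():\n"
    "    return 'not exported'\n",
    lib.__dict__,
)
sys.modules[lib.__name__] = lib

# What a star import brings in: only the names listed in __all__.
star = {}
exec("from system_library_star_demo import *", star)
star_names = sorted(name for name in star if not name.startswith("__"))
print(star_names)

# Listing names explicitly also reaches what __all__ leaves out.
explicit = {}
exec("from system_library_star_demo import foo, helper", explicit)
print(explicit["helper"]())
```

If upstream's public API includes names like helper here, the wrapper has to list them one by one in both the try and the except branch.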