| = Python =
{{admon/important|This page is deprecated|All Fedora Modularity documentation has moved to the new [https://docs.pagure.org/modularity/ Fedora Modularity Documentation website], with its source hosted alongside the code in the [https://pagure.io/modularity Fedora Modularity website git repository].}}
| | |
| Most of our code is written in Python, so this document will concentrate on it.
| |
| | |
| == Upstream guidelines ==
| |
| | |
| Fortunately, with PEP 8 there's an extensive official [https://www.python.org/dev/peps/pep-0008/ Style Guide for Python Code]. All new Python code you submit should conform to it, unless you have good reasons to deviate from it, [https://www.python.org/dev/peps/pep-0008/#id15 for instance readability].
| |
| | |
| Keep PEP 20, the [https://www.python.org/dev/peps/pep-0020/ Zen of Python], under your pillow.
| |
| | |
| == Keep It Simple ==
| |
| | |
The code you write now will probably need to be touched by someone else down the road, and that someone else might be less experienced than you, or have a terrible headache and be under time pressure. So while a particular construct may be a clever way of doing something, a simple way of doing the same thing often is preferable. If (when) complexity can't be avoided, try to isolate it: put a difficult operation into its own function, method or class, and add comments liberally. If complexity can be hidden from upper layers of the code, do so.
| |
| | |
| == Python 2 and 3 ==
| |
| | |
| Python comes in two major versions nowadays:
| |
| | |
| * The legacy version 2, of which the [https://www.python.org/download/releases/2.0/ first release 2.0 came out in October 2000]. The Python project [http://legacy.python.org/dev/peps/pep-0373/ will maintain its final minor release 2.7 until 2020].
| |
| | |
* The current version 3, whose [https://www.python.org/download/releases/3.0/ first release 3.0 was published in December 2008]. At the time of writing, the current minor release is version 3.5, to be superseded by 3.6 around the end of 2016.
| |
| | |
Version 3 is not backwards compatible with version 2. While we mainly target "the future", there are some components we have to work with that haven't yet been ported to Python 3, most notably [https://fedorahosted.org/koji/ <code>koji</code>]. Additionally, we may want to support the "user tools" we create on legacy systems, so we can't write code that uses all the latest features. Fortunately, many of the original Python 3 features have been back-ported to Python 2.7, so we can and should write code that is very close to idiomatic Python 3 but still runs on version 2.7. Targeting older minor releases (Python 2.6 and earlier) is much more of a balancing act, so we won't aim for it.
| |
| | |
| The following sections cover areas that require some attention. The Python project itself has a great [https://docs.python.org/3/howto/pyporting.html Porting Python 2 Code to Python 3] document which goes into much detail about the differences and is worth a read, even though it mainly addresses existing Python 2 code bases.
| |
| | |
| === Absolute and relative imports ===
| |
| | |
In Python 2, importing a module can be ambiguous when a module of that name exists both in the same package and elsewhere in the module search path <code>sys.path</code>. To work around this ambiguity, programmers often resorted to adding project-private paths to the beginning of <code>sys.path</code> to force loading modules from a project-internal location (which adds unwanted noise and can make it difficult to, for example, test code that isn't installed). Python 3 introduces new syntax for import statements which makes the two cases distinct. It has been available since version 2.5 through the <code>__future__</code> module:
| |
| | |
| <pre>
| |
| from __future__ import absolute_import
| |
| | |
| # Import the sys module from the module search path
| |
| import sys
| |
| | |
| # Import the foo module from the same directory
| |
| from . import foo
| |
| | |
| # Import snafu from the bar module one directory above
| |
| from ..bar import snafu
| |
| </pre>
| |
| | |
| === Print function ===
| |
| | |
| Python 3 did away with <code>print</code> as a statement and introduced it as a function. In order to use it the same way in Python 2.7, add the following to the top of source code files where you use <code>print</code>:
| |
| | |
| <pre>
| |
| from __future__ import print_function
| |
| </pre>
| |
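As a minimal sketch of what the function form allows, the snippet below passes the separator and the output stream as keyword arguments. The helper name <code>report</code> and the sample strings are made up for illustration.

```python
from __future__ import print_function

import sys


def report(items, stream=sys.stdout):
    """Write all items on one line, separated by commas (hypothetical helper)."""
    # As a function, print() accepts keyword arguments for the separator,
    # the line ending and the stream to write to.
    print(*items, sep=", ", end="\n", file=stream)


# Send diagnostics to standard error instead of standard output.
report(["spam", "eggs", "ham"], stream=sys.stderr)
```

None of this can be expressed with the Python 2 <code>print</code> statement without falling back to its special <code>>></code> syntax.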
| | |
| === Numbers ===
| |
| | |
Python 2 has two integer types: <code>int</code>, which is whatever integer type is native to the system (it has certain maximal and minimal values and can overflow), and <code>long</code>, which can store arbitrarily large integers. Python 3 only has the latter type, but it's called <code>int</code>.
| |
| | |
Dividing integers using <code>/</code> rounds the result down to an integer in Python 2 by default, but yields a floating point number in Python 3. For code to behave the same on either version, include the following line at the top of source files in which you divide numbers, and use <code>/</code> for true division and <code>//</code> for division that should round down to a whole number:
| |
| | |
| <pre>
| |
| from __future__ import division
| |
| </pre>
| |
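A short sketch of how the two operators behave once the import is in place. The variable names are only illustrative.

```python
from __future__ import division, print_function

half = 7 / 2        # true division, yields a float
floored = 7 // 2    # floor division, yields an integer
negative = -7 // 2  # floor division rounds towards negative infinity
big = 2 ** 64       # integers grow beyond the native machine word

print(half)      # 3.5
print(floored)   # 3
print(negative)  # -4
print(big)       # 18446744073709551616
```

Note that <code>//</code> rounds down rather than towards zero, which matters for negative operands.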
| | |
| === Strings ===
| |
| | |
Some consider this the main difference between Python 2 and 3: both versions have one type for strings of bytes and another for strings of Unicode code points. They are called <code>str</code> and <code>unicode</code> in version 2 and <code>bytes</code> and <code>str</code> in version 3, respectively.
| |
| | |
| ==== String Literals ====
| |
| | |
| Python 2 and 3 use different ways of marking literals of the different types by default. Byte strings can have no prefix or <code>b</code> in Python 2.7, but must be prefixed in Python 3, and text strings must have the <code>u</code> prefix in Python 2 which can be and usually is omitted in Python 3:
| |
| | |
| <pre>
| |
| # a byte string in Python 2 and 3
| |
| string1 = b"abc"
| |
| | |
| # a byte string in Python 2, but a text string in Python 3
| |
| string2 = "def"
| |
| | |
| # a text string in Python 2 and 3
| |
| string3 = u"ghi"
| |
| </pre>
| |
| | |
| In order to ease writing code that is compatible between the versions, you can switch Python 2 to treat unprefixed string literals as <code>unicode</code>, the text string type, by adding this snippet to the top of the relevant source code files:
| |
| | |
| <pre>
| |
| from __future__ import unicode_literals
| |
| </pre>
| |
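The following sketch shows the effect of the import on literals. The sample word is arbitrary, and escape sequences are used so that the source file itself stays plain ASCII.

```python
from __future__ import print_function, unicode_literals

text = "caf\u00e9"     # unprefixed literal: a text string on Python 2.7 and 3
data = b"caf\xc3\xa9"  # prefixed literal: a byte string on Python 2.7 and 3

print(len(text))                     # 4 code points
print(len(data))                     # 5 bytes, "e acute" takes two in UTF-8
print(data.decode("utf-8") == text)  # True
```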
| | |
| ==== Explicit Encoding and Decoding ====
| |
| | |
In Python 2, the byte and text string types are interchangeable in many places, taking the user's or system default locale into account (and sometimes failing, when the locale doesn't match the encoded data). Apart from the change in type names and in what literals look like, Python 3 requires you to explicitly encode <code>str</code> and decode <code>bytes</code> objects if you need them converted into the respective other string type. It is good practice to exclusively use text strings for strings that represent text in a program, and to decode byte strings as early and encode text strings as late as possible at interfaces that produce or consume encoded data.
| |
| | |
{{admon/note|Implicit string type conversion in Python 2|Python 2 lets you attempt to replace a <code>str</code> substring in a <code>unicode</code> object (or vice versa) and would attempt to cast the one into the other by encoding or decoding on the fly as needed. This piece of code won't work in Python 3:}}
| | |
<pre>
from __future__ import print_function
text_string = u"Hello, world!"
# b"..." is a byte string in Python 2.7 and 3; only Python 2 decodes it implicitly
print(text_string.replace(b"world", b"gang"))
</pre>
| |
| | |
| {{admon/tip|Explicit string type conversion in Python 2 and 3|Python 3 requires explicit encoding/decoding to cast between byte and text strings. This also works in Python 2 and is preferred of course.}}
| |
| | |
| <pre>
| |
| from __future__ import print_function, unicode_literals
| |
| text_string = "Hello, world!"
| |
| print(text_string.replace(b"world".decode('utf-8'), b"gang".decode('ascii')))
| |
| </pre>
| |
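The "decode early, encode late" advice can be sketched as a function that receives and returns encoded data but works on text internally. The function name <code>shout</code> and the default of UTF-8 are assumptions made for this illustration.

```python
from __future__ import unicode_literals


def shout(raw, encoding="utf-8"):
    """Return an upper-cased, encoded copy of an encoded byte string."""
    text = raw.decode(encoding)     # decode as early as possible
    result = text.upper() + "!"     # work on text strings only
    return result.encode(encoding)  # encode as late as possible


reply = shout(b"caf\xc3\xa9")
```

Because the upper-casing happens on a text string, the accented character is handled as one character rather than as two unrelated bytes.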
| | |
| ==== String formatting ====
| |
| | |
| With version 3.6 around the corner, there are four ways to format strings in Python now:
| |
| | |
| # using the <code>%</code> operator
| |
| # using <code>string.Template</code> of [https://www.python.org/dev/peps/pep-0292/ PEP 292]
| |
| # with the <code>str.format()</code> method
| |
| # using [https://www.python.org/dev/peps/pep-0498/ PEP 498 literal string interpolation]
| |
| | |
The last method isn't available yet in a stable Python release and will never be in Python 2, so it's not suitable for our purposes. The other three variants work in all Python versions we're interested in, although formatting with <code>string.Template</code> is very rarely done. The remaining two ways, commonly called old-style (<code>%</code> operator) and new-style (<code>str.format()</code>), are both in widespread use; [https://pyformat.info/ here's a site showcasing the differences between them]. New-style formatting is more powerful and often easier to read, but can be a little more to type. From a technical point of view, this is a case of "use what works for you", but for consistency's sake the new-style <code>str.format()</code> way is preferable if you're comfortable with using it. If not, others can convert old-style to new-style formatting for you during review or when happening across it. At any rate, consistently use one way or the other in what you submit.
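For comparison, here is the same message built with old-style and new-style formatting. The variable names and values are made up for the example.

```python
from __future__ import print_function, unicode_literals

name, builds = "modularity", 3

# Old-style formatting with the % operator
old_style = "%s has %d builds" % (name, builds)

# New-style formatting with str.format(), positional and named fields
new_style = "{} has {} builds".format(name, builds)
named = "{name} has {count:03d} builds".format(name=name, count=builds)

print(old_style)  # modularity has 3 builds
print(new_style)  # modularity has 3 builds
print(named)      # modularity has 003 builds
```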
| |
| | |
| === Old- and New-style Classes ===
| |
| | |
Python 2 and earlier know two types of classes: old-style classes, which don't derive from <code>object</code>, and new-style classes, which have <code>object</code> as their (direct or indirect) base class. Because their behavior is slightly different in some places, and some things can't be done with old-style classes, we want to stick to new-style classes wherever possible.
| |
| | |
| The syntactical difference is that new-style classes have to explicitly be derived from <code>object</code> or another new-style class.
| |
| | |
<pre>
# old-style classes
class OldFoo:
    pass

class OldBar(OldFoo):
    pass

# new-style classes
class NewFoo(object):
    pass

class NewBar(NewFoo):
    pass
</pre>
| |
| | |
| Python 3 only knows new-style classes and the requirement to explicitly derive from <code>object</code> was dropped. In projects that will only ever run on Python 3, it's acceptable not to explicitly derive classes without parents from <code>object</code>, but if in doubt, do it just the same.
| |
| | |
| == Idiomatic code ==
| |
| | |
In Python, it's easy to inadvertently emulate idiomatic styles of other languages like C/C++ or Java. In cases where there are constructs "native" to the language, it's preferable to use them.
| |
| | |
| Here are some examples:
| |
| | |
| === Looping ===
| |
| | |
| Languages like C normally use incremented indices to loop over arrays:
| |
| | |
<pre>
float pixels[NUMBER_OF_PIXELS] = {...};

for (int i = 0; i < NUMBER_OF_PIXELS; i++)
{
    do_something_with_a_pixel(pixels[i]);
}
</pre>
| |
| | |
{{admon/warning|Looping C-style in Python|In Python, loop over sequences themselves rather than over their indices.}}
| |
| | |
| Implementing the loop like this would give away that you've programmed in C or a similar language before:
| |
| | |
<pre>
pixels = [...]

for i in range(len(pixels)):
    do_something_with_a_pixel(pixels[i])
</pre>
| |
| | |
| {{admon/note|Looping over iterables in Python|In Python, you can simply iterate over many non-scalar data types.}}
| |
| | |
| Here's the "native" way to implement the above loop:
| |
| | |
<pre>
pixels = [...]

for p in pixels:
    do_something_with_a_pixel(p)
</pre>
| |
| | |
| {{admon/tip|Using <code>enumerate()</code>|If you need to keep track of the current count of looped-over items, use the <code>enumerate()</code> built-in.}}
| |
| | |
| It yields pairs of count (starting at 0 by default) and the current value like this:
| |
| | |
<pre>
pixels = [...]

for p_no, p in enumerate(pixels, 1):
    print("Working on pixel no. {}".format(p_no))
    do_something_with_a_pixel(p)
</pre>
| |
| | |
| === Properties rather than explicit accessor methods ===
| |
| | |
| In order to allow future changes in how object attributes (member variables) are set, some languages encourage always using getter and/or setter methods. This is unnecessary in Python, as you can intercept access to an attribute by wrapping it into a [https://docs.python.org/2/howto/descriptor.html#properties property] when this becomes necessary. Properties allow having accessor methods without making the user of the class have to use them explicitly. This way you can validate values when an attribute is set, or translate back and forth between the interface used on the attribute and an internal representation.
| |
| | |
| ==== Validating a value when setting an attribute ====
| |
| | |
| To ensure that an <code>Employee</code> object only has positive values for its <code>salary</code> attribute, you'd put a property in its place which checks values before storing them in an attribute called e.g. <code>_salary</code>:
| |
| | |
<pre>
class Employee(object):

    @property
    def salary(self):
        return self._salary

    @salary.setter
    def salary(self, salary):
        if salary <= 0:
            raise ValueError("Salary must be positive.")
        self._salary = salary
</pre>
| |
| | |
{{admon/caution|Avoid recursion|In order to avoid endless recursion, the accessor methods must store the actual value in an attribute with a different name than the property itself.}}
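Seen from the calling side, the property behaves like a plain attribute. The sketch below repeats the <code>Employee</code> class from the example above so that it runs on its own; the salary figures are arbitrary.

```python
from __future__ import print_function


class Employee(object):
    """Same class as in the example above, repeated to keep this runnable."""

    @property
    def salary(self):
        return self._salary

    @salary.setter
    def salary(self, salary):
        if salary <= 0:
            raise ValueError("Salary must be positive.")
        self._salary = salary


employee = Employee()
employee.salary = 3000  # plain attribute syntax, but runs the setter
print(employee.salary)  # reading runs the getter

try:
    employee.salary = -1  # rejected by the validation in the setter
except ValueError as exc:
    error = str(exc)
    print(error)
```

The rejected assignment leaves the previously stored value untouched.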
| |
| | |
| ==== Translating between attribute interface and internal representation ====
| |
| | |
| Take these classes of geometric primitives, <code>Point</code> and <code>Circle</code>:
| |
| | |
<pre>
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Circle(object):
    def __init__(self, point, radius):
        self.point = point
        self.radius = radius
</pre>
| |
| | |
If you want to add a <code>diameter</code> attribute, you can do so as a property which translates back and forth between it and the existing <code>radius</code> attribute:
| |
| | |
<pre>
...
class Circle(object):
    def __init__(self, point, radius=None, diameter=None):
        self.point = point
        if (radius is None) == (diameter is None):
            raise ValueError("Exactly one of radius or diameter must be set")
        if radius is not None:
            self.radius = radius
        else:
            self.diameter = diameter

    @property
    def diameter(self):
        return self.radius * 2

    @diameter.setter
    def diameter(self, diameter):
        self.radius = diameter / 2.0
    ...
</pre>
| |
| | |
| Even setting <code>self.diameter</code> in the constructor goes by way of the property and therefore the setter method.
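A brief usage sketch of how the two attributes stay in sync. It repeats the <code>Point</code> and <code>Circle</code> classes from above so that it is self-contained; the coordinates and sizes are arbitrary.

```python
from __future__ import print_function


class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Circle(object):
    def __init__(self, point, radius=None, diameter=None):
        self.point = point
        if (radius is None) == (diameter is None):
            raise ValueError("Exactly one of radius or diameter must be set")
        if radius is not None:
            self.radius = radius
        else:
            self.diameter = diameter  # goes through the setter below

    @property
    def diameter(self):
        return self.radius * 2

    @diameter.setter
    def diameter(self, diameter):
        self.radius = diameter / 2.0


circle = Circle(Point(0, 0), diameter=10)
print(circle.radius)    # 5.0, stored by the diameter setter

circle.radius = 2
print(circle.diameter)  # 4, computed from the radius on access

try:
    Circle(Point(0, 0))  # neither radius nor diameter given
except ValueError as exc:
    error = str(exc)
```

Only <code>radius</code> is ever stored, so the two values cannot drift apart.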
| |
| | |
| [[Category:Modularity]]
| |