Spotlight on ...
Python developers will be particularly interested in the latest Fedora 13: easier Python debugging, a parallel-installable Python 3 stack, and Python support in SystemTap. We present an interview with David Malcolm, one of the developers responsible for these features.
A video interview with David Malcolm is available via the Red Hat press blog.
Interview with David Malcolm
Tell us about yourself first.
Hi, I'm David Malcolm. I've been interested in Linux for about 10 years, and I've worked on various things in the GNOME community. I work for Red Hat, and I'm lucky enough to be paid to work on free software (yay!). I learned Python a few years ago and it very quickly became my favorite programming language. Red Hat pays me to make Python even better.
What do you like about Python?
It fits the way I program very well: things that should be simple to do are simple, yet it can handle complex tasks without introducing unnecessary complexity. So I can write a simple script for everyday tasks, but also potentially grow it into something more substantial.
Fedora 13 has three Python-related features; let's start with the parallel-installable Python 3 stack. What is it, and why is it useful?
Python 3 fixes problems inherent in the language, but that means a lot changes between Python 2 and Python 3. In a sense, you can think of them as different languages.
When we talk about a Python stack, it rests on three components: the "interpreter" at its core, the "standard library", and a collection of third-party modules on top. The standard library is often described as "batteries included" because it is so rich, but even so, there is a need for third-party modules. There are hundreds, if not thousands, of modules, some of which depend on other modules, and most of them plan to move to Python 3.
So many Python developers will face a decision about moving to Python 3: "Does the Python 2 stack or the Python 3 stack provide the modules I need?"
Python provides a tool called "2to3" that can automatically convert much of your Python 2 code to Python 3 code, provided the code follows certain rules. Unfortunately, it isn't very clear which modules have been ported and which still need to be. Some new modules are written directly for Python 3, some pre-existing modules already support it, and others have only just started porting.
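(Ed. note: to make the Python 2 versus Python 3 differences concrete, here is a small sketch of the kind of mechanical rewrites that "2to3" handles. The function `parse_counts` and its data are made up for illustration; the code is the Python 3 form, with the Python 2 spelling shown in the comments.)

```python
def parse_counts(pairs):
    """Sum integer values per name, skipping entries that are not numbers."""
    counts = {}
    for i in range(len(pairs)):            # Python 2: xrange(len(pairs))
        name, raw = pairs[i]
        try:
            value = int(raw)
        except ValueError as err:          # Python 2: except ValueError, err:
            print("skipping", name, err)   # Python 2: print "skipping", name, err
            continue
        if name in counts:                 # Python 2: counts.has_key(name)
            counts[name] += value
        else:
            counts[name] = value
    return counts


result = parse_counts([("a", "1"), ("b", "x"), ("a", "2")])
print(result)
```

Rewrites like these are the "fairly mindless" half of a port; changes in behavior, such as text versus bytes handling, still need a human to check the result.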
And Fedora's answer is "OK, we'll have both."
In Fedora 13, we provide two Python stacks: a Python 2 stack and a Python 3 stack.
And you can use Python 2 and Python 3 side by side - there's no need to choose one or the other.
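(Ed. note: with both stacks installed, a quick way to see which interpreter a script is actually running under is to ask the `sys` module. The sketch below is written to run unchanged under either Python 2 or Python 3; the function name `describe_interpreter` is our own, and the exact interpreter command names on your system are not something this example assumes.)

```python
import sys


def describe_interpreter():
    """Return a one-line description of the running interpreter."""
    major, minor = sys.version_info[0], sys.version_info[1]
    # sys.executable is the path of the interpreter binary, when known
    return "Python %d.%d at %s" % (major, minor, sys.executable)


print(describe_interpreter())
```

Running the same file with each interpreter in turn should report a different version and a different path, which is a simple check that the two stacks really are independent.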
I'm not sure how many packages we have for Python 2 in Fedora, but it's a lot.
A note for our readers who may be Python developers - Python 2 is the existing Python stack in Fedora, so if you've been creating or running Python code in Fedora, Python 2 is what you have been using.
For the Python 3 one, we've tried to provide RPM packages of Python code known to work with Python 3. One approach we could have followed was to simply run "2to3" on everything, but doing that you have no guarantee that the end-result actually does what it's meant to.
So these packages in F13 have been tested to work with Python 3?
Yes. If you see a "python3-foo" RPM in Fedora 13, you know that it should actually work. We're not just throwing them over the wall; we've gone through various modules, picked the ones that are known to work (possibly requiring steps to make them do so e.g. "2to3"), and tested them.
And we're doing this in part because we need the Python 3 stuff ourselves.
We use Python 2 extensively within Fedora. Much of Fedora's web infrastructure is written in Python, and the system tools like the updater ("yum"), the installer ("anaconda"), and a slew of graphical config tools ("system-config-*"). My hope is that for Fedora 14 we can start cutting over some of our tools to Python 3.
What kind of development had to take place in order to make this possible?
We had to make some cleanups to RPM to support multiple Python stacks; I added some tests to the "rpmlint" tool for this. I helped port RPM's Python bindings so that they can work with Python 3 (this is in rpm-4.8.0 IIRC).
One other thing I did was write a tool to help people port their C extension modules. One nice thing about Python is that it makes it very easy to write wrapper code that bridges between Python and C, and there's a lot of this code around. Unfortunately it needs some changes between Python 2 and Python 3. I ran into this when porting RPM's Python bindings. Half the work requires thought, but the other half is fairly mindless, once you get the hang of it.
So I wrote a tool to help with the mindless parts, which I called 2to3c, in homage to the 2to3 tool. John Palmieri used this to help him port the DBus python bindings.
Nice. I see the download/usage instructions - looks like it's a pretty new project that's looking for testers/feedback/help.
Yes, it's rather bleeding edge right now. Help would be most welcome!
So hopefully we now have an excellent Python 3 platform in Fedora 13: I believe we have a well-tuned build of Python 3, and a good selection of add-on modules available via RPM. This should be useful to people looking to port their code or to learn the language; arguably Python 3 is easier to learn than Python 2; a lot of unnecessary complexity was removed.
What's the best set of instructions for people going "cool, how do I start?"
https://fedoraproject.org/wiki/Features/Python3F13#How_To_Test. I think that section could be improved.
We'll mark those as needing attention, and move on. If anyone would like to help with our Python 3 documentation, please feel free to edit the page! David, any last comments on parallel-installable?
It's something that people have wanted for a while. There have been a few proposals on the matter on the mailing list.
Ensuring that it was independent of the Python 2 stack was the most important detail, so that we can be sure we don't break it. "Don't cross the streams!" (How is this looking?)
You made a Ghostbusters reference. We're all good. Moving on to SystemTap probes! So... I've done a bit of Python development, but I look at the feature description for this and I'm confused. What is this?
So, SystemTap is a tracing/probing/monitoring tool. The idea is (metaphor alert!) that you can stick probes under the hood of the engine and see what's going on. In the past, most of the places where you could probe were in the kernel. For Fedora 13, I've added places to the Python 2 and Python 3 runtimes that you can monitor - specifically, Python function calls. So you can write scripts that watch for calls to a particular module, or watch for calls of a particular Python function, across the whole computer, or just in a given process.
Can you give some examples?
As examples, I provide precanned scripts. I've written a "top"-like tool that shows you all Python function calls per second across the whole system, and [another that] shows you the function call and return hierarchy for all Python that's running. These ought to be useful as is, and people can write their own scripts using SystemTap's mini-language.
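(Ed. note: to give a feel for what a per-function call count looks like, here is a minimal in-process sketch using Python's own `sys.setprofile` hook. This is not SystemTap and not one of David's scripts: SystemTap watches processes from the outside and can cover the whole system, while this only sees the interpreter it runs in. The names `count_calls`, `helper`, and `work` are made up for illustration.)

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func and return a Counter of Python-level function calls by name."""
    counts = Counter()

    def profiler(frame, event, arg):
        # "call" fires each time a Python function is entered
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook again
    return counts


def helper(n):
    return n * 2


def work():
    total = 0
    for i in range(3):
        total += helper(i)
    return total


counts = count_calls(work)
for name, n in counts.most_common():
    print(name, n)
```

The appeal of the SystemTap approach described above is that you get this kind of view without modifying or restarting the program being watched.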
What sort of Python programmers might care most immediately about this? Are there particular types of projects that this is good for?
I showed [my scripts] to Paul Frields (Ed. note: Paul is a relatively new Python programmer) running on a program that he wrote and his eyes lit up. It's a great teaching tool: you can see what your code is doing, directly.
So it's something that's made to be helpful for programming novices.
One other use case: a busy Python-based website could use this for profiling, see what parts are getting used a lot.
Are there any other technical details we should know?
I should mention that this relates to work done by Sun/Apple on DTrace, which is an analogue to SystemTap. There have been some patches to add this support to Python floating about on the upstream bug tracker for a while, written for DTrace. Mark Wielaard added some partial DTrace compatibility to SystemTap, so during the Python build it looks like we're running DTrace, but actually it's all shimming into SystemTap.
I'm still trying to figure out how a Normal Python Programmer would get started with this coolness.
I think a pair of screencasts is the way to go, showing rather than telling.
Ok - we'll make a note to make those screencasts. (Ed. note: watch for more Python on Fedora 13 material coming out soon!) On to debugging?
Yay!
Tell us about "Easier Python Debugging." What does that mean?
One of the great things about Python is how easy it is to wrap external libraries (e.g. written in C).
What this means is that if you have some code that's written in another language - C is a common example - that you want to interface with in the Python code you're writing - Python makes it easy to do that. You can have your C code and your Python code "talk" to each other by writing a little bit of Python code to go around the C.
The downside of this is that if one of these libraries has a bug, then that bug takes out the whole of the Python process, without giving you a nice Exception/traceback.
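(Ed. note: as a small illustration of Python talking to C, the sketch below calls the C library's `strlen` through the standard `ctypes` module. This is a different mechanism from the hand-written C extension modules David is describing, and it assumes a POSIX system such as Fedora. It shows the same trade-off, though: the call is easy to make, but handing the C function a bad pointer would crash the whole interpreter instead of raising an exception.)

```python
import ctypes
import ctypes.util

# Locate the C library. If find_library() returns None, CDLL(None) asks the
# loader for symbols already loaded into this process (POSIX assumption).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"fedora")
print(length)
```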
I found an example of a... not-nice Exception/traceback from when this kind of thing happens.
Since we added the ABRT tool, I see a lot of Python crashes - which typically aren't crashes in Python itself, they're crashes in the libraries. I've spent a lot of time debugging these things, and I wanted to make my life easier.
For example, in Fedora 12 (I believe), we shipped GTK-2.18, which contained Alex Larsson's big rewrite of how GTK writes stuff to the screen, greatly reducing on-screen flicker. But the downside is that a few applications broke. An example turned out to be the "istanbul" screencast-recording tool; figuring that out was "fun."
Python has long had a set of macros - small libraries - for gdb, the GNU debugger, that let you connect to a running (or dying) Python process and debug what's going on, but they're fiddly to use and they assume the process is only "lightly broken." For example, they add a "pyo" command, for printing Python objects. In theory, it's equivalent to "print" in Python on that object, but if the object is internally corrupt, running it will merely get you another crash.
The other big problem is that the macros really assume you're proficient with gdb and know your way around the insides of Python. So I started looking for a better way of doing this.
In Fedora 12 (I believe), Fedora gained a shiny new version of gdb. Various people worked on improving C++ debugging, but one of the by-products of that was that gdb 7 now has the ability to be extended using Python. A bunch of Red Hatters added this; it's now possible to write Python code that hooks into the debugger, to pretty-print data types.
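(Ed. note: for readers who have not seen a gdb extension written in Python, here is a minimal sketch of a pretty-printer. It targets a hypothetical C type, `struct point { int x; int y; }`, which we made up for illustration; it is not David's code, which deals with Python's own internal structures. The `gdb` module only exists inside a gdb session, so outside gdb the sketch falls back to exercising the printer with a plain dict.)

```python
class PointPrinter(object):
    """Pretty-printer for a hypothetical C 'struct point { int x; int y; }'."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # gdb calls to_string() to get the text it displays for the value
        return "point(x=%d, y=%d)" % (int(self.val["x"]), int(self.val["y"]))


def lookup_point(val):
    # gdb passes each value to the lookup function; return a printer or None
    if val.type.strip_typedefs().tag == "point":
        return PointPrinter(val)
    return None


try:
    import gdb  # only importable inside a gdb session
    gdb.pretty_printers.append(lookup_point)
except ImportError:
    # Outside gdb: a plain dict stands in for a gdb.Value
    print(PointPrinter({"x": 1, "y": 2}).to_string())
```

Inside gdb, a file like this would typically be loaded with the `source` command, after which printing a `struct point` shows the formatted text rather than raw fields.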
What I did was use this to write Python code that knows about the insides of Python itself, so you now have Python code running inside the gdb process, which knows how to scrape data out of another dying process. The practical upshot is that it's now possible to attach to an already-running Python process with gdb and type:
py-list
...which will show you the Python source code that's currently running,
py-bt
...which will show you a Python-level backtrace,
py-up
...which will take you up the call stack, and
py-down
...which will take you down the call stack. And when you print data, it will tell you what the data is, in a meaningful way. So rather than being told the hexadecimal address of where the object is stored in RAM, gdb should tell you that e.g. you have a [1, 2, 3]. Plus, now if ABRT, the Automatic Bug Reporting Tool, detects a crash of a Python process, the report should automatically include the file/line information at the Python level and the values of all of the Python variables, rather than just hexadecimal noise.
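(Ed. note: if the phrase "Python-level backtrace" is unfamiliar, the sketch below shows the same idea from inside a healthy process using the standard `traceback` module. The gdb commands above recover this kind of information from the outside, even when the process is too broken to report it itself. The function names `inner` and `outer` are made up for illustration.)

```python
import traceback


def inner():
    # extract_stack() returns one entry per frame, oldest call first,
    # ending with the frame that called it (this function)
    stack = traceback.extract_stack()
    return [frame.name for frame in stack]


def outer():
    return inner()


stack_names = outer()
print(" -> ".join(stack_names[-2:]))
```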
Sounds like another getting-started screencast we should make.
The caveat is that it works well on i686, but less well on x86_64; it ought to work on Python 3, but I think there are some bugs there. I've set it up so that if you install python-debuginfo, it should all Just Work. I think I still have some testing to do on Python 3 for this, so I'd recommend trying it out on Python 2, with i686.
Please file bug reports against "python" and "python3" as appropriate - this stuff lives in the -debuginfo subpackages of those src.rpms. If you see a Python traceback inside gdb, then that's likely a bug in my code; please file a bug if you do see this. The code tries to be robust in the face of arbitrary breakage of the process being debugged - we are trying to debug crashes, after all!
Now, this feature is something that was originally made for Fedora - this is the first place it's come out?
Yes. I also recently got this code into upstream, into Python's SVN repository, and it's likely to be in Python 2.7 when that comes out, though it works fine with 2.6.
In other words, the Python community liked the work you were doing so much they decided to make it part of the Python language itself. This is a nice example of Fedora being a place where innovation happens in free software, then goes upstream to benefit the rest of the open source ecosystem.
I believe Debian and Ubuntu have a version of my patch, though I believe their version of gdb doesn't have all of the patches needed to fully support all the extension commands (though the prettyprinting should work for them).
I'm guessing that testing and feedback are the most helpful things people can do for this feature.
Yes. Please test. I've tried to make it robust, but there are plenty of surprising ways in which a complicated program + libraries can fail - so if you see Python tracebacks inside gdb, please do file bugs. Also, suggestions for ways of making Python easier to debug would be good. For Fedora 14 I want to take this further, e.g. maybe adding Python-level breakpoints to gdb.
One nice thing about this feature is that although it's quite "low-level", the code is written in Python, so a Python developer with an idea for making this better may well be able to do so directly. I have a very keen, not-at-all-vested interest in making Python easier to debug!
Thanks, David. By the way, what do you do when you're not hacking?
Hanging out with my wife and cat, puttering about in our garden.
Sounds like the good life.
When it's not raining!
Thanks for taking the time to talk with us, David!
Thank you!