Revision as of 03:41, 24 January 2012
Static Analysis of Python Reference Counts
Summary
I've written a static analysis tool that can detect reference-counting errors made in Python extension modules written in C. We'll run the tool on all such code in Fedora 17 and make an effort to fix as many problems as time allows.
Owner
- Name: Dave Malcolm
- Email: dmalcolm@redhat.com
Current status
- Targeted release: Fedora 17
- Last updated: 2012-01-23
- Percentage of completion: 30%
The code works, and has found real bugs, but still contains bugs itself. It's only been run on a small subset of the Python code in Fedora.
Major TODO items remaining:
- there's a gcc-4.7 incompatibility that will need a couple of days to fix
- automate running it on all code
- go through the results, fixing the bugs in the checker itself, and reporting/fixing the real bugs that it finds.
Detailed Description
This is the continuation of the "Static Analysis of CPython Extensions" Fedora 16 feature.
Python makes it relatively easy to write wrapper code for C and C++ libraries, acting as a "glue" from which programs can be created.
Unfortunately, such wrapper code must manually manage the reference-counts of objects, and mistakes here can lead to /usr/bin/python leaking memory or segfaulting. There's also plenty of code out there that doesn't check for errors.
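To make the failure mode concrete, here is a toy model in pure Python of the kind of mistake described above. It is only an illustration: ToyObject, incref, decref and wrap_value are hypothetical names invented for this sketch, and it imitates the bookkeeping that C wrapper code has to do by hand rather than showing how the checker works.

```python
class ToyObject:
    """Hypothetical stand-in for a C-level Python object with a manual count."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1  # a freshly created object starts with one owned reference


def incref(obj):
    """Take an additional reference to obj."""
    obj.refcnt += 1


def decref(obj):
    """Give up one reference to obj; at zero the real object would be freed."""
    obj.refcnt -= 1


def wrap_value(fail_midway, release_on_error):
    """Imitate a wrapper function that creates a temporary, then may hit an error.

    The temporary is returned only so that its final count can be inspected.
    """
    tmp = ToyObject("tmp")  # like an API call that returns a new reference
    if fail_midway:
        if release_on_error:
            decref(tmp)  # correct: drop our reference before bailing out
        return tmp  # without the decref above, the count stays at 1: a leak
    decref(tmp)  # normal path: finished with the temporary
    return tmp


leaked = wrap_value(fail_midway=True, release_on_error=False)
cleaned = wrap_value(fail_midway=True, release_on_error=True)
print(leaked.refcnt, cleaned.refcnt)
```

A final count of 1 on the error path means the object would never be freed, which is the sort of leak that is easy to miss in review because it only occurs when something else has already gone wrong.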
In Fedora 16, we shipped an initial version of a static analysis tool I've written (gcc-with-cpychecker), implementing some basic checks.
The latest version of the checker can now detect reference-counting bugs, along with paths through code that doesn't properly handle errors from the Python extension API, and I've already used it to patch some significant memory leaks.
Benefit to Fedora
Fedora is already a great environment for doing Python development. Having a good-quality static analysis tool integrated into Fedora's build system for Python extension modules will make Fedora even more compelling for Python developers. (Naturally the tool will be Free Software, and thus usable on other platforms, but we'll have it first.)
The presence of the tool should also make it easier to fix certain awkward bugs, and make it easier to support secondary CPU architectures.
Scope
My hope was to integrate this with Fedora's packaging, so that all C extension modules packaged for Python 2 and Python 3 could be guaranteed free of such errors (by adding hooks to the python-devel and python3-devel packages).
Unfortunately, it's not possible to get the signal-to-noise ratio good enough for that in time for Fedora 17.
The plan now is to automate running it on all of the C extension modules in Fedora 17, and to analyze the results. Initially bugs would be filed against the tool itself (gcc-python-plugin), and I would then triage them. Genuine bugs would be reassigned to the appropriate components, and I'd try to fix the high-value ones, sending fixes upstream. False positives would remain as bugs in the checker itself, and I'd work on fixing them. This is a large task, and I'm likely to need help from package owners and other Python developers.
Work to be done:
- there's a gcc-4.7 incompatibility that will need a couple of days to fix
- automate running it on all code
- go through the results, fixing the bugs in the checker itself, and reporting/fixing the real bugs that it finds.
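As a rough idea of what the automation step might look like, the sketch below prepares an environment that selects the checker as the C compiler and groups GCC-style warning lines from a build log by source file, ready for triage. This is an assumption-laden sketch and not the actual tooling: checker_env and collect_warnings are hypothetical helper names, the regular expression only assumes the usual "file:line:col: warning: message" layout of GCC diagnostics, and selecting the checker through the CC variable is assumed to be honored by the package's build.

```python
import os
import re

# Assumes the usual GCC diagnostic layout: "file:line:col: warning: message"
# (the column number is treated as optional).
GCC_WARNING = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?:\d+:)? warning: (?P<msg>.*)$"
)


def checker_env(base_env=None, compiler="gcc-with-cpychecker"):
    """Return a copy of the environment with CC pointing at the checker.

    Assumption: the build being run picks its C compiler from CC.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = compiler
    return env


def collect_warnings(build_log):
    """Group warning lines from a captured build log by source file."""
    per_file = {}
    for line in build_log.splitlines():
        match = GCC_WARNING.match(line)
        if match:
            entry = (int(match.group("line")), match.group("msg"))
            per_file.setdefault(match.group("file"), []).append(entry)
    return per_file
```

The environment from checker_env() could then be passed to subprocess.run(..., env=...) when building each package, with the captured stderr fed to collect_warnings(). Grouping by file is just one possible first cut for deciding which component a report belongs to.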
How To Test
It's not clear that we need this section; the feature covers a distro-wide bug-fixing push.
I *have* written an extensive selftest suite for the checker itself, which is run when it is built.
User Experience
Non-technical end-users of Fedora should see no difference (other than a more robust operating system).
For examples of the output from the checker, see: http://dmalcolm.livejournal.com/6560.html
Dependencies
This is implemented via a GCC plugin that embeds Python; the checker itself is implemented in Python.
Contingency Plan
Given that this "Feature" is essentially a bug-sweep (using a new tool), we'll do as much as we can by the deadline. Any work that's been done is an improvement to Fedora, but if the amount doesn't look impressive, we can drop this as a feature.
Documentation
Upstream documentation: http://gcc-python-plugin.readthedocs.org/en/latest/cpychecker.html
Release Notes
Fedora now ships with a gcc-with-cpychecker variant of GCC, which adds additional compile-time checks to Python extension modules written in C, detecting various common problems (e.g. reference counting mistakes). This variant is itself written in Python.