From Fedora Project Wiki
Latest revision as of 08:09, 18 September 2016

Systemtap Static Probes

Summary

Systemtap can trace events in programs that have static probes inserted. Static probes expose application-level events that are meaningful to the application's users, so they do not need to know the exact source code details to trace what is happening. Language runtimes can benefit from this by exposing events that make sense to users of those languages and runtimes.

Owner

  • email: mjw@redhat.com

Current status

  • Targeted release: Fedora 13
  • Last updated: 24 Mar 2009
  • Percentage of completion: 100%.
  • systemtap 1.2-1 is now available.
  • java (since 1:1.6.0-21.b16 and since 34.b17 also jni and jstack support), postgresql (since 8.3.6-4) and python (2.6.4-19) have had static probes enabled.
  • Tracking bug: https://bugzilla.redhat.com/show_bug.cgi?id=546295
  • See under scope for individual package status.

TODO

  • Add examples for java, postgresql and tcl below, like done for python.

DONE

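  • Identified deficiencies upstream:
    • PR10013 (http://sourceware.org/bugzilla/show_bug.cgi?id=10013): support ENABLED sdt probe macro. Fixed in 1.1.
    • PR10601 (http://sourceware.org/bugzilla/show_bug.cgi?id=10601): user-space deref/registers in loc2c (mainly i386, but could also affect x86_64 and other arches). Fixed in 1.1.
  • elfutils cfi reading bugs (https://bugzilla.redhat.com/show_bug.cgi?id=563528): fixed, new package pushed.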

Detailed Description

This feature packages a new version of systemtap that supports programs which already have static dtrace probe markers in their sources. Those packages gain a build dependency on the new systemtap-sdt-devel package and are recompiled with probe points enabled, so that their users can trace any high-level events the packages provide.
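
As a rough illustration of what such a marker looks like, the sketch below writes a tiny C program containing one marker and, if the tools are present, builds it and lists the marker with stap -L. The provider name demo, the marker name greet and the working directory are hypothetical; DTRACE_PROBE1 comes from the <sys/sdt.h> header shipped in systemtap-sdt-devel.

```shell
#!/bin/sh
# Sketch only: "demo", "greet" and the sdt-demo directory are made-up names.
demo_dir="${TMPDIR:-/tmp}/sdt-demo"
mkdir -p "$demo_dir"

# A minimal program with a single static marker carrying one argument.
cat > "$demo_dir/hello.c" <<'EOF'
#include <stdio.h>
#include <sys/sdt.h>   /* from systemtap-sdt-devel */

int main(void)
{
    int count = 3;
    DTRACE_PROBE1(demo, greet, count);  /* the static marker */
    printf("hello\n");
    return 0;
}
EOF

# Build only when a compiler and the header are available
# (the header path below is the usual Fedora location, assumed here).
if command -v gcc >/dev/null 2>&1 && [ -f /usr/include/sys/sdt.h ]; then
    gcc -o "$demo_dir/hello" "$demo_dir/hello.c"
    # List the markers found in the freshly built binary.
    if command -v stap >/dev/null 2>&1; then
        stap -L "process(\"$demo_dir/hello\").mark(\"*\")"
    fi
else
    echo "gcc or sys/sdt.h missing; source left in $demo_dir/hello.c"
fi
```

A package that already carries markers upstream needs no source change like this; it only needs the build dependency and the configure switch described under Dependencies.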

Benefit to Fedora

It will be easier for developers and users to observe what is really happening on their system on a higher (application or language) level.

Scope

  • Work with upstream to identify any issues with the new capabilities while we activate probes in packages.
  • Package new version of Systemtap (including new subpackage systemtap-sdt-devel).
  • Identify packages that already include static user probes (see below)
  • Work with package maintainer to enable them in the Fedora build spec file.
  • Add documentation on enabled probes and how to use them with a systemtap tapset.

Currently identified packages:

postgresql

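Tracking bug: https://bugzilla.redhat.com/show_bug.cgi?id=488941

  • Already able to build something that works with current rpm.
  • Documentation: upstream docs (http://www.postgresql.org/docs/8.2/interactive/dynamic-trace.html)
  • Example: example trace (http://gnu.wildebeest.org/diary/2009/02/24/systemtap-09-markers-everywhere/)
  • Screencast: video presentation (http://people.redhat.com/wcohen/postgresql_example.ogv)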

java-1.6.0-openjdk

Tracking bug: https://bugzilla.redhat.com/show_bug.cgi?id=498109

  • Upstream docs: http://java.sun.com/javase/6/docs/technotes/guides/vm/dtrace.html
  • Static probes are ready; the hotspot tapset, jni tapset and java backtraces are all done.

tcl

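Tracking bug: https://bugzilla.redhat.com/show_bug.cgi?id=489017

  • tcl-8.4.16+: has a single generic/tclDTrace.d file.
  • Upstream docs: http://wiki.tcl.tk/19923
  • Needs probe_ENABLED(), which has since been implemented: http://sourceware.org/bugzilla/show_bug.cgi?id=10013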

Python

We're tracking our Python work in our downstream bugzilla as bug 545179.

  • DONE: Our Python 2 and Python 3 builds contain: (from python-2.6.4-19.fc13 and python3-3.1.1-25.fc13 onwards)
    • a tapset providing these probepoints:
      • python.function.entry
      • python.function.return
    • built with systemtap patches that add the static markers that implement the above to the libpython2.6 and libpython3.1 shared libraries
    • contains an example of usage added to docs in the python-libs and python3-libs subpackages, logging all Python function calls/return hierarchically across the whole system or for one process
    • dmalcolm has tested the example script on a rawhide box and verified that it works on i686 for both Python runtimes.
    • dmalcolm has done initial testing of the performance of the python 2 patch using the Unladen Swallow benchmark suite; initial indications suggest the patch imposes negligible performance cost when the probe points are compiled in but not in active use
    • dmalcolm has added another example script "pyfuntop.stp" which is a top-like view of python function calls (not yet tested in rpm context)
  • TODO:
    • Double-check generated machine code
    • Test with and without probes, on both architectures, with both python 2 and python 3, and with multilib installs on 64-bit
    • Test "pyfuntop.stp"
    • More documentation
    • Send this work upstream (we have taken an out-of-tree patch to the core adding DTrace static markers (upstream RFE 4111), reworked the patch to enable it to work with SystemTap, fixed a performance issue, added a tapset to make the markers easy to use, and written an example script that uses the resulting probe points, and ported the patch to python 3).
    • Address error handling within the Python 3 probe.
    • Ideas for additional probe points:
      • function calls/returns (this is what the dtrace probe has)
      • GIL events: instrument the raw functions that claim/release the lock, capture the times at which this happens, then render stats
      • threads starting/stopping
      • bytecode execution metrics: e.g. trace individual bytecodes; how often does LOAD_GLOBAL get invoked
      • exceptions being thrown
      • exceptions being handled (e.g. for tracking down exactly where code is "swallowing" an error)
      • unhandled exceptions
      • arenas being claimed/freed
      • dictionaries switching to inefficient form: http://lewk.org/blog/python-dictionary-optimizations
      • _warnings.c: do_warn() (e.g. whole-system python3 warnings for all python 2 running on your system)
      • py-level backtraces


See also Mark's blog post about our initial work on Python/SystemTap at FUDCon Toronto: http://gnu.wildebeest.org/diary/2009/12/07/fudcon-success-systemtap-meets-python

Notes

It seems as if several of the above were dtrace-instrumented in code that was never merged into the upstream versions of the package, but instead represented as run-time add-ons or private patches for Solaris distributions. Disappointing, but perhaps we can do better and engage the respective upstream teams. This will of course take time and panache.

At least the patches tend to be very small so we have some freedom to choose between approaches (adding STAP_PROBE/whatever hooks directly to the core upstream code; or fedora local patches; or add-on shared libraries like for php/httpd).

Another approach worth considering is adding tapsets that map process.mark() events to process.function/statement() to approximate the dtrace out-of-tree patches.
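
A minimal sketch of that last approach, assuming a hypothetical binary /usr/bin/myapp with a C function handle_request: the tapset gives scripts a stable, marker-style probe name that is backed by an ordinary function probe. None of these names come from a real package, and the function probe would need the package's debuginfo.

```shell
#!/bin/sh
# Sketch only: myapp, handle_request and request__start are hypothetical.
tapset_dir="${TMPDIR:-/tmp}/sdt-demo/tapset"
mkdir -p "$tapset_dir"

# The alias on the left is what users probe; the right-hand side can later
# be switched to process(...).mark("request__start") once a real marker exists.
cat > "$tapset_dir/myapp.stp" <<'EOF'
probe myapp.request.start =
    process("/usr/bin/myapp").function("handle_request")
{
    # Variables set here are visible to scripts that use the alias.
    probe_name = "request__start"
}
EOF

# -I adds the directory to the tapset search path.
if command -v stap >/dev/null 2>&1 && [ -x /usr/bin/myapp ]; then
    stap -I "$tapset_dir" -e 'probe myapp.request.start { println(probe_name) }'
else
    echo "stap or /usr/bin/myapp not present; tapset left in $tapset_dir"
fi
```

Scripts written against the alias keep working if the implementation moves from a function probe to a real static marker.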

How To Test

Whether systemtap and static markers work in general can be tested by installing systemtap, kernel-debuginfo and systemtap-testsuite, then running sudo make installcheck in /usr/share/systemtap/testsuite.

When applications get static markers enabled we should add them to a testing page listing:

  • Package install instructions.
  • Setup and sample run of the application
  • A reference to the probe names.
  • A simple example stap invocation listing the markers that can be enabled.
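
The last item could look roughly like the helper below, which wraps stap -L to list the markers in one binary. The function name list_markers and the postgres path are assumptions for illustration; adjust the path to the package under test.

```shell
#!/bin/sh
# List the static markers compiled into a binary.
# list_markers is a hypothetical helper, not part of systemtap.
list_markers() {
    target=$1
    if [ ! -e "$target" ]; then
        echo "no such file: $target" >&2
        return 1
    fi
    if ! command -v stap >/dev/null 2>&1; then
        echo "stap not installed; skipping $target" >&2
        return 0
    fi
    # -L prints each matching probe point together with its arguments.
    stap -L "process(\"$target\").mark(\"*\")"
}

# Example target (assumed install path of the postgresql server binary).
list_markers /usr/bin/postgres || true
```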

Question: Is there a convention/template for adding such test pages for test days?
Answer: QA/Test_Days/Create

User Experience

For packages that have static probes enabled, users will be able to trace high-level events through stap, such as database transactions or method calls in virtual machines.

Dependencies

  • A new version of systemtap with the systemtap-sdt-devel subpackage.
  • A new version of elfutils that provides access to the new gcc debuginfo, in particular the new cfi encodings.
  • Any package wishing to expose existing probes in its (upstream) sources must build-depend on systemtap-sdt-devel and add --enable-dtrace or an equivalent to its spec file.

Contingency Plan

Even if tracing does not work at all, packages converted to provide static probes will not be impacted, since the probe points have (near) zero overhead. In the worst case some packages were recompiled to enable the feature but users still cannot use it.

Documentation

The upstream wiki is the best description for now (http://sourceware.org/systemtap/wiki/UsingStaticUserMarkers). The systemtap list has an example of converting a package (http://sourceware.org/ml/systemtap/2009-q1/msg00140.html).

While working on this feature this section will be expanded to list packages that have probe points enabled and pointers to (upstream) package documentation on the probe names and semantics like for postgresql http://www.postgresql.org/docs/8.2/static/dynamic-trace.html

Python

The following Systemtap probe points have been added to Fedora 13's Python 2 and Python 3 packages:

Probe point: python.function.entry
  • Parameters: str filename, str funcname, int lineno
  • Overview: indicates that execution of a Python function has begun
  • Example of usage:

stap \
  -e'probe python.function.entry {log(filename);}' \
  -c "yum help"

Probe point: python.function.return
  • Parameters: str filename, str funcname, int lineno
  • Overview: indicates that the Python runtime has returned from a function
  • Example of usage, probing modules visited as the python runtime starts up:

stap \
  -e'probe python.function.return {log(filename);}' \
  -c "python -c 'pass'"

Sample scripts that use these probe points have been added to the python-libs and python3-libs subpackages.

Tracing the hierarchy of Python function calls

systemtap-example.stp shows the hierarchy of function calls and returns within a python process (or across the whole system)

Here's an example of running it (in verbose mode) to trace what happens during the invocation of a python script (the yum tool, as it happens).

# stap -v /usr/share/doc/python-libs-2.6.4/systemtap-example.stp -c "yum help"
Pass 1: parsed user script and 66 library script(s) using
20224virt/12248res/2040shr kb, in 150usr/10sys/160real ms.
Pass 2: analyzed script: 2 probe(s), 14 function(s), 2 embed(s), 2 global(s)
using 25184virt/14572res/3380shr kb, in 20usr/0sys/14real ms.
Pass 3: using cached
/root/.systemtap/cache/5a/stap_5a80297603ac4434b77b22e6f4127f00_5903.c
Pass 4: using cached
/root/.systemtap/cache/5a/stap_5a80297603ac4434b77b22e6f4127f00_5903.ko
Pass 5: starting run.
     0 yum(23287): => <module> in /usr/lib/python2.6/site.py:59
   439 yum(23287):  => <module> in /usr/lib/python2.6/os.py:22
  1021 yum(23287):   => <module> in /usr/lib/python2.6/posixpath.py:11
  1146 yum(23287):    => <module> in /usr/lib/python2.6/stat.py:4
  1163 yum(23287):    <= <module> in /usr/lib/python2.6/stat.py:94
  1272 yum(23287):    => <module> in /usr/lib/python2.6/genericpath.py:5
  1292 yum(23287):    <= <module> in /usr/lib/python2.6/genericpath.py:85
  1483 yum(23287):    => <module> in /usr/lib/python2.6/warnings.py:1
  1677 yum(23287):     => <module> in /usr/lib/python2.6/linecache.py:6
  1698 yum(23287):     <= <module> in /usr/lib/python2.6/linecache.py:68
(etc)
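
For readers without the package docs at hand, a script producing output of this shape can be sketched with the two probe points and the thread_indent() tapset function. This is an approximation written for illustration, not necessarily identical to the shipped systemtap-example.stp; the file name pytrace.stp is made up.

```shell
#!/bin/sh
# Sketch only: pytrace.stp is a hypothetical file name.
demo_dir="${TMPDIR:-/tmp}/sdt-demo"
mkdir -p "$demo_dir"

cat > "$demo_dir/pytrace.stp" <<'EOF'
# thread_indent(1) prints a timestamp, the process name and pid,
# then one extra level of indentation; thread_indent(-1) undoes it.
probe python.function.entry
{
    printf("%s => %s in %s:%d\n", thread_indent(1), funcname, filename, lineno)
}

probe python.function.return
{
    printf("%s <= %s in %s:%d\n", thread_indent(-1), funcname, filename, lineno)
}
EOF

# Running it needs systemtap, suitable privileges and a Python build
# with the static markers enabled.
if command -v stap >/dev/null 2>&1; then
    stap "$demo_dir/pytrace.stp" -c "python -c 'pass'"
else
    echo "stap not installed; script left in $demo_dir/pytrace.stp"
fi
```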

The Python 3 version of the probe point sends the strings back to SystemTap in UTF-8 encoding. For example, if you create a python script with an "interesting" name:

# echo 'print("Yaarrr!")' > ☠.py
# cat ☠.py
print("Yaarrr!")
# stap -v /usr/share/doc/python3-libs-3.1.1/systemtap-example.stp -c "python3 ☠.py"
(copious output snipped)
     0 python3(28262): => <module> in ☠.py:1
    22 python3(28262): <= <module> in ☠.py:1
Unicode filenames
For the curious, the filename of that python script in Unicode is:
U+2620 SKULL AND CROSSBONES
U+002E FULL STOP
U+0070 LATIN SMALL LETTER P
U+0079 LATIN SMALL LETTER Y

"top" for Python function calls

pyfuntop.stp gives a "top"-like view of Python function calls per second, either across the whole system, or for a given python process.

Here's an example of running it:

[david@fedora13 ~]$ stap /usr/share/doc/python3-libs-3.1.2/pyfuntop.stp

and the output at one instant (as it happens, showing PackageKit reading update information from a yum .xml file, I believe):

   PID                                                                         FILENAME   LINE                       FUNCTION  CALLS
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1156                       _fixname   5831
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1149                       _fixtext   5468
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py    749                        _encode   5468
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1046                         _flush   4050
 10802                                                   /usr/lib64/python2.6/string.py    308                           join   3687
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1184                          _data   3686
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1064                           data   3686
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1075                          start   2025
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py    190                       __init__   2025
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py    285                         append   2025
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py    726                      iselement   2025
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1187                           _end   2025
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1091                            end   2025
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1175                    _start_list   2024
 10802                                    /usr/lib64/python2.6/xml/etree/ElementTree.py   1244                           feed      4
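
A "top"-like view of this kind can be approximated by counting function entries in an array and printing the busiest entries once per second. The sketch below is written for illustration and is not the shipped pyfuntop.stp; the file name pytop.stp and the column widths are arbitrary choices.

```shell
#!/bin/sh
# Sketch only: pytop.stp is a hypothetical file name.
demo_dir="${TMPDIR:-/tmp}/sdt-demo"
mkdir -p "$demo_dir"

cat > "$demo_dir/pytop.stp" <<'EOF'
global fn_calls

# Count every Python function entry, keyed by process and location.
probe python.function.entry
{
    fn_calls[pid(), filename, funcname, lineno]++
}

# Once per second: redraw the screen with the 20 busiest functions,
# then reset the counters so the figures are calls per second.
probe timer.s(1)
{
    ansi_clear_screen()
    printf("%6s %60s %6s %30s %6s\n",
           "PID", "FILENAME", "LINE", "FUNCTION", "CALLS")
    foreach ([p, file, func, line] in fn_calls- limit 20)
        printf("%6d %60s %6d %30s %6d\n",
               p, file, line, func, fn_calls[p, file, func, line])
    delete fn_calls
}
EOF

# The script runs until interrupted, so it is only written out here.
echo "run with: stap $demo_dir/pytop.stp"
```

Sorting with fn_calls- orders the entries by descending call count, which is what gives the listing its "top" character.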

Release Notes

Systemtap has been extended to support user space tracing, and in particular to support static (dtrace compatible) markers enabled in various programs in Fedora 12. This gives users, developers and administrators a high-level overview of what is going on with their system, or a detailed view deep inside a specific program or subsystem.

Systemtap comes with a tutorial, a language reference manual, a tapsets reference and an examples directory under /usr/share/doc/systemtap-?.?/

  • TODO: Should have a list of which packages were enabled with markers when finished.

Comments and Discussion