Fedora Static Analysis Special Interest Group (SIG)
(Note that this SIG is merely tentative for now)
Goal and Scope
TBD
dmalcolm is interested in making it easy to run static code analysis tools on all of Fedora, and having a sane system for getting useful information from the firehose of data that doing so is likely to generate. See http://lists.fedoraproject.org/pipermail/devel/2012-December/175232.html
See also the Formal Methods SIG, with which there is a clear overlap.
Mission and Plan
TBD
dmalcolm's plan and status:
- http://lists.fedoraproject.org/pipermail/devel/2012-December/175232.html ("Dealing with static code analysis in Fedora")
- http://lists.fedoraproject.org/pipermail/devel/2013-January/176872.html ("results of FUDcon Lawrence hackfest" i.e. firehose and mock-with-analysis)
- http://lists.fedoraproject.org/pipermail/devel/2013-January/177633.html ("tweaks to data model and added cpychecker support")
- http://lists.fedoraproject.org/pipermail/devel/2013-February/178045.html ("some UI ideas")
Members
- Dave Malcolm
- Richard W.M. Jones
- Josh Bressers
- Alek Paunov (DB tasks)
- Ondrej Vasik
- Kamil Dudka
- Benjamin De Kosnik
- Jerry James
- Athos Ribeiro
Communication
TBD; Fedora's main devel list for now
Tasks
Packaging
Static Code Analysis tools already in Fedora
TODO
- gcc - arguably we should pay more attention to the compiler warnings that gcc already generates: sometimes it's correctly pointing out a bug.
- clang static analyzer (in Fedora as "clang-analyzer" subpackage of "llvm")
- cpychecker (part of gcc-python-plugin)
- flawfinder (that page has a great list of links to other static analysis tools)
- cppcheck - a static analysis tool for C/C++ code.
- sparse - a Semantic Parser for C, primarily used by kernel developers.
- Frama-C - (in Fedora as "frama-c" package)
Package Want List
TODO
"firehose"
dmalcolm: for Fedora 17 I attempted to run all of the Python C extension code in Fedora through my cpychecker tool. I want to repeat this analysis, but this time to capture the results in a database.
This has 4 parts:
- IN PROGRESS: firehose: an interchange format so we can capture results from all static analyzers in a consistent format. This consists of:
- an XML serialization format (with a RELAX NG schema)
- a Python module for in-memory creation/manipulation
- parsers for converting analysis results into the common format:
- DONE: gcc warnings
- DONE: cppcheck warnings (v2 of its XML output format)
- DONE: clang-static-analyzer (the .plist output format)
- IN PROGRESS(dmalcolm): cpychecker warnings (patching cpychecker so that internally it uses the classes of the above Python API)
- others?
- handle analyzer failures (where an analyzer choked and all or part of a source file failed; nice to capture where the failure happened).
- IN PROGRESS(dmalcolm): mock-with-analysis (need better name?): a way of doing a mock rebuild of a src.rpm with minimal effect on the main build, whilst injecting a side effect of running static analyzers on each C/C++ file compiled (other languages?), and dropping firehose XML files into the chroot as it goes, so that the results can be slurped into a database
- IN PROGRESS(dmalcolm): gccinvocation: a Python module for parsing GCC command lines, for use by mock-with-analysis
- TODO: make all of the above more robust
- TODO: "firehose-ui": a db and web UI for summarizing and reviewing results from many analyzers across many packages, with nice workflows
- TODO: gluing all of the above together and deploying it.
- having a team that comes up with filters that achieve a decent signal:noise ratio, so that J Random package maintainer doesn't have to wade through so much noise
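The interchange idea in the first item can be sketched roughly as follows. This is only an illustration, not the actual firehose API or schema: the `Issue` class, the `parse_gcc_warning` and `to_xml` functions, and the XML element and attribute names are all hypothetical, and the regular expression only covers the common one-line `file:line:column: warning: message [-Wflag]` form of GCC diagnostics.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Matches the common one-line GCC diagnostic form, e.g.
#   foo.c:12:5: warning: unused variable 'x' [-Wunused-variable]
# Multi-line diagnostics, notes and errors are deliberately ignored here.
GCC_WARNING = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?P<column>\d+): "
    r"warning: (?P<message>.*?)(?: \[(?P<flag>-W[^\]]+)\])?$"
)


@dataclass
class Issue:
    """Hypothetical analyzer-neutral record for one reported problem."""
    generator: str         # name of the tool that produced the report
    file: str
    line: int
    column: int
    message: str
    testid: Optional[str]  # e.g. the -W flag, when the tool reports one


def parse_gcc_warning(text):
    """Turn one line of GCC output into an Issue, or None if it is not a warning."""
    m = GCC_WARNING.match(text.strip())
    if m is None:
        return None
    return Issue("gcc", m["file"], int(m["line"]), int(m["column"]),
                 m["message"], m["flag"])


def to_xml(issue):
    """Serialize an Issue to an XML string (element names are made up)."""
    elem = ET.Element("issue", generator=issue.generator)
    if issue.testid is not None:
        elem.set("test-id", issue.testid)
    ET.SubElement(elem, "location", file=issue.file,
                  line=str(issue.line), column=str(issue.column))
    ET.SubElement(elem, "message").text = issue.message
    return ET.tostring(elem, encoding="unicode")
```

A parser for another analyzer would produce the same `Issue` records from that tool's own output, so that whatever consumes the XML never needs to know which tool produced a given report.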
Tasks seeking volunteers
C/C++ Hackers
- Patching cppcheck so that it provides richer output (to make it easier to find duplicate error reports across runs of the tool). Specifically, we're using version 2 of the XML format. We'd like it to emit:
- the name of the function in which each problem is found (rather than just the line number)
- CWE codes for the errors. sgrubb did some work on this in the past, but it didn't get as far as an upstream patch. See http://sourceforge.net/apps/phpbb/cppcheck/viewtopic.php?f=4&t=322&p=1686&hilit=cwe#p1686
- Patching clang-analyzer so that it provides richer output (to make it easier to find duplicate error reports across runs of the tool). Specifically, we're using the plist format. We'd like it to emit:
- the name of the function in which each problem is found (rather than just the line number)
- the internal ID of the test that found the problem (e.g. "core.AttributeNonNull")
- CWE codes for the errors
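To illustrate why the function name and the test ID would help with finding duplicate reports across runs, here is a minimal sketch. The `Report` record and the `stable_key` and `new_reports` functions are hypothetical and not part of any of the tools above. The point is only that a key built from the test ID, file, function and message survives unrelated edits that shift line numbers, whereas a key that includes the line number does not.

```python
from typing import NamedTuple


class Report(NamedTuple):
    """Hypothetical record for one analyzer report."""
    testid: str    # internal ID of the test that fired
    file: str
    function: str  # enclosing function, as requested in the list above
    line: int
    message: str


def stable_key(report):
    """Identity of a report that ignores the line number on purpose."""
    return (report.testid, report.file, report.function, report.message)


def new_reports(old_run, new_run):
    """Return the reports in new_run that have no match in old_run."""
    seen = {stable_key(r) for r in old_run}
    return [r for r in new_run if stable_key(r) not in seen]
```

This is a simplification: two distinct problems with the same test ID and message inside one function would collapse into a single key, so a real deduplication scheme would likely need more context than this.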
Python web developers
- Building a web UI for all of this.
Python developers
- Making mock-with-analysis more robust
Packagers
- Packaging "firehose" (as python-firehose)
- Packaging "gccinvocation" (as python-gccinvocation)
- Packaging "mock-with-analysis"
- Testing "mock-with-analysis" on your own packages (expect breakage for now!)
Talk to dmalcolm if you're interested in hacking on any of the above.