[Python-Dev] Static analysis of CPython using coccinelle/spatch
David Malcolm
dmalcolm at redhat.com
Mon Nov 16 21:27:53 CET 2009
Has anyone else looked at using Coccinelle/spatch[1] on CPython source
code?
It's a GPL-licensed tool for matching semantic patterns in C source
code. It's been used on the Linux kernel for detecting and fixing
problems, and for autogenerating patches when refactoring
(http://coccinelle.lip6.fr/impact_linux.php). Although it's implemented
in OCaml, it is scriptable using Python.
I've been experimenting with using it on CPython code, both on the core
implementation, and on C extension modules.
As a test, I've written a validator for the mini-language used by
PyArg_ParseTuple and its variants. My code examines the types of the
variables passed as varargs and checks them against the rules
documented at http://docs.python.org/c-api/arg.html (and implemented in
Python/getargs.c).
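To give a rough idea of the kind of check involved, here is a minimal
sketch in Python. It is not the code from my repository: the table
covers only a small subset of the documented format units, and the
names EXPECTED_TYPES, UnknownFormatChar, expected_types and check_call
are made up for this illustration.

```python
# Hypothetical, deliberately incomplete table mapping a format unit to
# the C types of the varargs it consumes (see the c-api/arg.html page).
EXPECTED_TYPES = {
    "b": ["unsigned char *"],
    "h": ["short *"],
    "H": ["unsigned short *"],
    "i": ["int *"],
    "I": ["unsigned int *"],
    "l": ["long *"],
    "k": ["unsigned long *"],
    "f": ["float *"],
    "d": ["double *"],
    "O": ["PyObject **"],
}


class UnknownFormatChar(Exception):
    """Raised when a format character is not in EXPECTED_TYPES."""


def expected_types(fmt):
    """Return the C types expected for the varargs described by fmt."""
    result = []
    for ch in fmt:
        if ch in ":;":
            break  # the rest is a function name or an error message
        if ch == "|":
            continue  # marks the remaining arguments as optional
        if ch not in EXPECTED_TYPES:
            raise UnknownFormatChar(ch)
        result.extend(EXPECTED_TYPES[ch])
    return result


def check_call(fmt, actual_types):
    """Return a list of mismatch messages (empty if everything matches)."""
    expected = expected_types(fmt)
    messages = []
    if len(expected) != len(actual_types):
        messages.append('Expected %d argument(s) for "%s" but got %d'
                        % (len(expected), fmt, len(actual_types)))
    # zip() stops at the shorter list, so only comparable pairs are checked
    for index, (exp, act) in enumerate(zip(expected, actual_types), start=1):
        if exp != act:
            messages.append(
                'Mismatching type of argument %d in "%s": '
                'expected "%s" but got "%s"' % (index, fmt, exp, act))
    return messages
```

For example, check_call("i:htons", ["unsigned long *"]) returns a single
message about argument 1, and expected_types("O#:roj") raises
UnknownFormatChar because '#' is not in this simplified table.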
It can detect this old error (fixed in svn r34931):
buggy.c:12:socket_htons:Mismatching type of argument 1 in ""i:htons"":
expected "int *" but got "unsigned long *"
Similarly, it finds the deliberate error in xxmodule.c:
xxmodule.c:207:xx_roj:unknown format char in "O#:roj": '#'
(Unfortunately, when run on the full source tree it reports numerous
other messages, and as far as I can tell they are all false positives.)
You can see the code here:
http://fedorapeople.org/gitweb?p=dmalcolm/public_git/check-cpython.git;a=tree
and download using anonymous git in this manner:
git clone git://fedorapeople.org/home/fedora/dmalcolm/public_git/check-cpython.git
The .cocci file detects invocations of PyArg_ParseTuple and determines
the types of the arguments. At each matching call site it invokes
Python code, passing the type information to validate.py's
validate_types.
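As a rough sketch of what the Python side of such a hook could look
like: the CallSite record, its field names, and the functions below are
hypothetical and are not the actual interface of validate.py. The
sketch also assumes that the format string arrives as C source text
with its surrounding quotes, which is a guess.

```python
import sys
from collections import namedtuple

# Hypothetical record describing one matched call site.
CallSite = namedtuple(
    "CallSite", "filename line function format_literal arg_types")


def strip_c_quotes(literal):
    """Turn the C source text '"i:htons"' into the bare string 'i:htons'."""
    if len(literal) >= 2 and literal[0] == '"' and literal[-1] == '"':
        return literal[1:-1]
    return literal


def format_report(site, message):
    """Build a 'file:line:function:message' diagnostic line."""
    return "%s:%d:%s:%s" % (site.filename, site.line, site.function, message)


def handle_call_site(site, checker, out=sys.stdout):
    """Run checker on one call site and write any diagnostics to out.

    checker is any callable taking (format_string, arg_types) and
    returning a list of message strings. Returns the number of messages.
    """
    fmt = strip_c_quotes(site.format_literal)
    messages = checker(fmt, site.arg_types)
    for message in messages:
        out.write(format_report(site, message) + "\n")
    return len(messages)
```

Keeping the checker as a plain callable means the type rules can be
tested without running spatch at all.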
(I suspect it's possible to use spatch to detect reference-counting
antipatterns; I've also attempted 2to3 refactoring of C code using
semantic patches, but so far macros tend to get in the way.)
Alternatively, are there any other non-proprietary static analysis tools
for CPython?
Thoughts?
Dave
[1] http://coccinelle.lip6.fr/