[Python-Dev] Static analysis of CPython using coccinelle/spatch
dmalcolm at redhat.com
Mon Nov 16 21:27:53 CET 2009
Has anyone else looked at using Coccinelle/spatch on the CPython source?
It's a GPL-licensed tool for matching semantic patterns in C source
code. It's been used on the Linux kernel for detecting and fixing
problems, and for autogenerating patches when refactoring
(http://coccinelle.lip6.fr/impact_linux.php). Although it's implemented
in OCaml, it is scriptable using Python.
I've been experimenting with using it on CPython code, both on the core
implementation, and on C extension modules.
As a test, I've written a validator for the mini-language used by
PyArg_ParseTuple and its variants. My code examines the types of the
variables passed as varargs, and attempts to check that they are
correct, according to the rules here:
http://docs.python.org/c-api/arg.html (and in Python/getargs.c).
It can detect this old error (fixed in svn r34931):
buggy.c:12:socket_htons:Mismatching type of argument 1 in "i:htons":
expected "int *" but got "unsigned long *"
Similarly, it finds the deliberate error in xxmodule.c:
xxmodule.c:207:xx_roj:unknown format char in "O#:roj": '#'
(Unfortunately, when run on the full source tree, it reports numerous
messages, and as far as I can tell, the others are false positives.)
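To illustrate the kind of check described above, here is a minimal
Python sketch. It is not the code from the repository: the table
EXPECTED_TYPES, the function check_format and the whitespace
normalization are all assumptions made for this example, and only a
small subset of the format units is covered. It takes a format string
plus the C types of the varargs (as strings) and returns messages in
the same style as the two shown above.

```python
# Hypothetical subset of format units and the pointer type each expects.
# The real rules live in Python/getargs.c; this table is illustrative only.
EXPECTED_TYPES = {
    "h": "short *",
    "i": "int *",
    "l": "long *",
    "f": "float *",
    "d": "double *",
    "O": "PyObject * *",
}


def normalize(c_type):
    """Drop all whitespace so 'int *' and 'int*' compare equal."""
    return "".join(c_type.split())


def check_format(fmt, arg_types):
    """Return a list of error messages for one PyArg_ParseTuple call.

    fmt is the format string literal; arg_types holds the C types of
    the varargs, in order, as strings.
    """
    # Anything after ':' (function name) or ';' (error message) is not
    # part of the format units.
    units = fmt.split(":", 1)[0].split(";", 1)[0]
    errors = []
    index = 0
    for ch in units:
        if ch in "|()":
            # Optional-argument marker and tuple grouping consume no vararg.
            continue
        expected = EXPECTED_TYPES.get(ch)
        if expected is None:
            errors.append("unknown format char in \"%s\": '%s'" % (fmt, ch))
            return errors  # later arguments can no longer be lined up
        if index >= len(arg_types):
            errors.append('Not enough arguments in "%s"' % fmt)
            return errors
        actual = arg_types[index]
        if normalize(actual) != normalize(expected):
            errors.append(
                'Mismatching type of argument %d in "%s": '
                'expected "%s" but got "%s"' % (index + 1, fmt, expected, actual)
            )
        index += 1
    if index < len(arg_types):
        errors.append('Too many arguments in "%s"' % fmt)
    return errors


if __name__ == "__main__":
    for message in check_format("i:htons", ["unsigned long *"]):
        print(message)
    for message in check_format("O#:roj", ["PyObject * *", "long *"]):
        print(message)
```

A sketch like this stops at the first unknown format character, since
after that point it can no longer tell which vararg belongs to which
format unit.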
You can see the code here:
and download using anonymous git in this manner:
git clone git://fedorapeople.org/home/fedora/dmalcolm/public_git/check-cpython.git
The .cocci file detects invocations of PyArg_ParseTuple and determines
the types of the arguments. At each matching call site it invokes
Python code, passing the type information to validate.py's
(I suspect it's possible to use spatch to detect reference counting
antipatterns; I've also attempted 2to3 refactoring of C code using
semantic patches, but so far macros tend to get in the way).
Alternatively, are there any other non-proprietary static analysis tools