[Python-Dev] Static analysis of CPython using coccinelle/spatch

David Malcolm dmalcolm at redhat.com
Mon Nov 16 21:27:53 CET 2009


Has anyone else looked at using Coccinelle/spatch[1] on CPython source
code?

It's a GPL-licensed tool for matching semantic patterns in C source
code. It's been used on the Linux kernel for detecting and fixing
problems, and for autogenerating patches when refactoring
(http://coccinelle.lip6.fr/impact_linux.php).  Although it's implemented
in OCaml, it is scriptable using Python.

I've been experimenting with using it on CPython code, both on the core
implementation, and on C extension modules.

As a test, I've written a validator for the mini-language used by
PyArg_ParseTuple and its variants.  My code examines the types of the
variables passed as varargs, and attempts to check that they are
correct, according to the rules documented here:
http://docs.python.org/c-api/arg.html (and implemented in Python/getargs.c).

It can detect this old error (fixed in svn r34931):
buggy.c:12:socket_htons:Mismatching type of argument 1 in ""i:htons"":
expected "int *" but got "unsigned long *"

Similarly, it finds the deliberate error in xxmodule.c:
xxmodule.c:207:xx_roj:unknown format char in "O#:roj": '#'
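
As a rough illustration of the kind of check behind both messages (this is
not the actual validate.py code), here is a minimal Python sketch. The
table EXPECTED_TYPES and the functions expected_types and check_call are
hypothetical names, and the table covers only a handful of format units;
nested tuples, "s#"-style units and the other rules in getargs.c are
deliberately left out.

```python
# Hypothetical, deliberately incomplete table: format unit -> C types of the
# varargs that PyArg_ParseTuple expects for that unit.
EXPECTED_TYPES = {
    "h": ["short *"],
    "i": ["int *"],
    "l": ["long *"],
    "f": ["float *"],
    "d": ["double *"],
    "O": ["PyObject * *"],
}


def expected_types(fmt):
    """Return the C types implied by a format string (sketch only)."""
    result = []
    for ch in fmt:
        if ch in ":;":
            # Everything after ':' or ';' is a function name or error text.
            break
        if ch == "|":
            # Marks the start of optional arguments; consumes no vararg.
            continue
        if ch not in EXPECTED_TYPES:
            raise ValueError('unknown format char in "%s": \'%s\'' % (fmt, ch))
        result.extend(EXPECTED_TYPES[ch])
    return result


def check_call(fmt, actual_types):
    """Compare the types seen at a call site against the format string."""
    try:
        expected = expected_types(fmt)
    except ValueError as exc:
        return [str(exc)]
    if len(expected) != len(actual_types):
        return ['Expected %d arguments for "%s" but got %d'
                % (len(expected), fmt, len(actual_types))]
    messages = []
    for index, (exp, act) in enumerate(zip(expected, actual_types), start=1):
        if exp != act:
            messages.append(
                'Mismatching type of argument %d in "%s": '
                'expected "%s" but got "%s"' % (index, fmt, exp, act))
    return messages


if __name__ == "__main__":
    for message in check_call("i:htons", ["unsigned long *"]):
        print(message)
    for message in check_call("O#:roj", ["PyObject * *"]):
        print(message)
```

In this sketch '#' is rejected simply because it is missing from the
table; a fuller checker would accept it after the units that allow it
(such as "s#") and reject it only elsewhere.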

(Unfortunately, when run on the full source tree, it reports numerous
other messages, and as far as I can tell those are all false positives.)

You can see the code here:
http://fedorapeople.org/gitweb?p=dmalcolm/public_git/check-cpython.git;a=tree
and download using anonymous git in this manner:
git clone git://fedorapeople.org/home/fedora/dmalcolm/public_git/check-cpython.git

The .cocci file detects invocations of PyArg_ParseTuple and determines
the types of the arguments.  At each matching call site it invokes
Python code, passing the type information to validate.py's
validate_types.
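
To make the Python side of that hand-off concrete, here is a small sketch
of what a per-call-site hook might do with the strings it receives. It is
an assumption-laden illustration, not the real validate_types: the names
normalize_c_type, format_report and on_call_site, and their parameters,
are hypothetical, and the checker is passed in as a plain callable.

```python
def normalize_c_type(type_str):
    """Normalize spacing so 'unsigned long*' equals 'unsigned long *'."""
    # Put spaces around every '*', then collapse runs of whitespace.
    spaced = type_str.replace("*", " * ")
    return " ".join(spaced.split())


def format_report(filename, line, function, message):
    """Build a 'file:line:function:message' line like the output above."""
    return "%s:%s:%s:%s" % (filename, line, function, message)


def on_call_site(filename, line, function, fmt, arg_types, check):
    """Hypothetical hook run once per matched call site.

    'check' is any callable taking (fmt, types) and returning a list of
    message strings; the type strings are normalized before it sees them.
    """
    types = [normalize_c_type(t) for t in arg_types]
    return [format_report(filename, line, function, message)
            for message in check(fmt, types)]


if __name__ == "__main__":
    # Stand-in checker used only to show the plumbing.
    def demo_check(fmt, types):
        return ['saw %r for "%s"' % (types, fmt)]

    for report in on_call_site("buggy.c", 12, "socket_htons",
                               "i:htons", ["unsigned long*"], demo_check):
        print(report)
```

Normalizing the type strings first keeps the comparison from depending on
how the matched expression types happen to be spelled.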

(I suspect it's possible to use spatch to detect reference counting
antipatterns.  I've also attempted 2to3-style refactoring of C code using
semantic patches, but so far macros tend to get in the way.)

Alternatively, are there any other non-proprietary static analysis tools
for CPython?

Thoughts?
Dave
 
[1] http://coccinelle.lip6.fr/


