On Tue Nov 25 2014 at 2:27:46 PM Dave Halter <davidhalter88@gmail.com> wrote:
2014-11-25 12:29 GMT+01:00 Stefan Bucur <stefan.bucur@gmail.com>:


On Tue Nov 18 2014 at 2:01:46 PM Dave Halter <davidhalter88@gmail.com> wrote:
Hi Stefan

I'm playing with this as well in Jedi. I'm pretty far with flow analysis and AttributeErrors. (This includes everything you mention above except integer division by zero). Would be easy to implement in Jedi, though. I just have different priorities, at the moment.

If you have some time on your hands you can watch my EuroPython talk about this: https://www.youtube.com/watch?v=DfVHSw0iOsk I'm also glad to skype (gaukler_) if you're interested. Jedi is not well known for doing static analysis. But it's my goal to change this now.

Thanks for the pointer, Dave. I watched your talk and had a look at Jedi's code; it's quite nice, and it would indeed be great to better expose the framework's static-analysis potential.

What I'm trying to achieve is something a bit different. My goal is to reuse the "implicit specs" of the interpreter itself (CPython) as much as possible, rather than re-implementing them in my analysis tool.

We already have an execution engine that uses the interpreter to automatically explore multiple paths through a piece of Python code. You can read the academic paper here, with case studies for Python and Lua: http://dslab.epfl.ch/pubs/chef.pdf

Stefan, I haven't read the full paper (but most of it). Interesting stuff! Are you doing any "argument propagation" or are you simply looking at a function without argument knowledge?

There is no explicit handling of the Python semantics in Chef (our tool)—this is the major advantage of the technique. The Python interpreter simply runs the code under test inside the symbolic x86 VM. An example may help clarify this (see below).
 

What does your output look like?

The output is a set of test cases that make up an automatically generated test suite for the program under test. These test cases capture all the execution paths (including corner cases and buggy ones) discovered during exploration.
 
Can you give us a real world example (maybe simplejson)? Would be really interesting.

Say we want to test the code in the simplejson package.

Traditionally, one would write a suite of unit tests that exercise a predetermined set of input-output pairs. For instance, you'd have to think of which inputs would best cover the package's behavior, e.g., valid JSON, invalid JSON, empty strings, slightly malformed JSON, and so on. This is quite tedious, and one may very well miss obscure corner cases.
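As a rough illustration (a minimal sketch using the standard unittest module; the inputs are purely illustrative), such a hand-written suite might look like this:

import unittest
import simplejson

class SimpleJSONUnitTest(unittest.TestCase):
    def test_valid_object(self):
        # One hand-picked valid input...
        self.assertEqual(simplejson.loads('{"key": 1}'), {"key": 1})

    def test_malformed_input(self):
        # ...and one hand-picked invalid input; every other corner case
        # remains untested unless someone thinks of it.
        with self.assertRaises(ValueError):
            simplejson.loads('{"key": ')

if __name__ == "__main__":
    unittest.main()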

With Chef, however, we instead write so-called "symbolic tests", which use Chef's API. Here is a simple example for simplejson, say simplejson_test.py:

import simplejson
import sys
from chef import light

class SimpleJSONTest(light.SymbolicTest):
    def setUp(self):
        pass

    def runTest(self):
        # Ask the framework for a 15-character symbolic string named "input",
        # with a default value of all null characters.
        simplejson.loads(self.getString("input", '\x00'*15))

if __name__ == "__main__":
    light.runFromArgs(SimpleJSONTest, arg_list=sys.argv)


This piece of code does several things:
* It encapsulates the test functionality in a test class that derives from light.SymbolicTest (as opposed to Python's own TestCase).
* Instead of defining a particular JSON string to pass to simplejson, it asks the framework to construct one, according to some specs. In this case, it asks for a string that will be referred to as "input", of 15 characters, with a default value of null characters everywhere.
* When the script is executed, the framework instantiates the symbolic test and runs it.

You can run simplejson_test.py in two modes:
1) In symbolic mode, the test runs inside the symbolic virtual machine. The call to SimpleJSONTest.getString(...) returns a special "symbolic string", which taints the returned variable and causes the execution of the interpreter to "fork" (akin to a process fork) every time a branch depending on the symbolic string is encountered. This works everywhere inside the interpreter -- for Python-level branches as well as native branches inside C extension modules -- because the symbolic VM runs at the x86 level.
Every time execution forks, the framework generates a new "test case" -- a concrete assignment to the symbolic string that causes the interpreter to execute precisely the newly discovered path (e.g., the '{ "key": 1}' string). A toy sketch of this forking follows after point 2.

2) In replay mode, all the test cases generated in symbolic mode can be replayed via the same simplejson_test.py file. In this mode, the SimpleJSONTest.getString(...) call returns one of the concrete input assignments generated and checks that the replayed output matches the output observed in symbolic mode (this is to weed out nondeterminism).
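To make the forking more concrete, here is a toy illustration (this is not Chef's API, just a hypothetical function the interpreter might execute on the symbolic string):

def classify(s):
    # In symbolic mode, s is the symbolic string returned by getString().
    if s[0] == "{":              # the VM forks here: one path gets an input starting with '{'
        if s[1] == "}":          # and forks again on the second character
            return "empty object"
        return "object-like"
    return "something else"      # the remaining path gets any other input

Exploring classify() on a symbolic string yields three test cases, one per path, without anyone enumerating the inputs by hand. simplejson's parser is the same idea with many more branches, including those inside its C extension module.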


 

BTW: I remember asking the same question that you asked here in the beginning: "What are your most typical mistakes that you would like a static analysis tool to pick up?" https://github.com/davidhalter/jedi/issues/408 The first ten or so answers might interest you.

Oh, this is excellent! Thanks a lot!


Cheers,
Stefan



I'm now working on extending this technique to other types of analyses, and I'm trying to determine the most relevant kinds of analyses and checks for Python. So far, my conclusion is that type inference would serve multiple purposes, so it is at the top of my list.

Stefan


2014-11-17 18:18 GMT+01:00 Stefan Bucur <stefan.bucur@gmail.com>:
I'm developing a Python static analysis tool that flags common programming errors in Python programs. The tool is meant to complement other tools like Pylint (which perform checks at the lexical and AST levels) by going deeper with the code analysis and keeping track of the possible control-flow paths in the program (path-sensitive analysis).

For instance, a path-sensitive analysis detects that the following snippet of code would raise an AttributeError exception:

if object is None: # If the True branch is taken, we know the object is None
  object.doSomething() # ... so this statement would always fail

I first wanted to tap into people's experience and get a sense of which common pitfalls in the language and its standard library such a static checker should look for. Just as an example of what I mean, here [1] is a list of static checks for the C++ language, part of the Clang static analyzer project.

My preliminary list of Python checks is quite rudimentary, but it could perhaps serve as a discussion starter (small illustrative snippets follow the list):

* Proper Unicode handling (for 2.x)
  - encode() is not called on str object
  - decode() is not called on unicode object
* Check for integer division by zero
* Check for None object dereferences
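To give a feel for the kind of code each check would flag, here are small, contrived Python 2 snippets (the function names are made up):

import re

# Unicode handling (2.x): calling encode() on a str first decodes it with the
# implicit ASCII codec, so non-ASCII bytes raise UnicodeDecodeError.
def to_utf8(data):
    return data.encode("utf-8")        # flagged when data is a str such as "caf\xc3\xa9"

# Integer division by zero.
def average(values):
    return sum(values) / len(values)   # flagged: ZeroDivisionError when values is empty

# None object dereference.
def first_match(pattern, text):
    match = re.match(pattern, text)
    return match.group(0)              # flagged: match is None when the pattern does not match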

Thanks a lot,
Stefan Bucur



_______________________________________________
code-quality mailing list
code-quality@python.org
https://mail.python.org/mailman/listinfo/code-quality