[code-quality] RedBaron, a bottom-up refactoring lib/tool for python

Laurent Peuch cortex at worlddomination.be
Fri Nov 14 13:05:53 CET 2014


Hello everyone,

Someone suggested that I present the project I'm currently working on
to this mailing list, since it is likely to interest you.

This tool is an answer to a frustration I've had while trying to
build tools for python and for the projects I was working on. While
python already has good capabilities for analysing code (ast.py, astroid
(although it wasn't out at that time)), the "write code that modifies
source code" part was really missing (in my opinion and to my knowledge
of the existing tools).

I wanted a very intuitive and easy to use library that allows me to
query and modify my source code only in the places I want to modify,
without touching the rest of the code. So I've built what can be
described as "the BeautifulSoup of python source code".

To do so, I've built what can be called "a lossless AST" for python
(designed to be used by humans), an AST that satisfies this equation:

    source_code == ast_to_source(source_to_ast(source_code))

It produces json-serializable python data structures (because data
structures are easier to use and don't hide anything from you).
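
To make the round-trip property concrete, here is a minimal toy sketch.
It is NOT Baron's real data structure or API: the node layout and the
names source_to_nodes / nodes_to_source are made up for illustration.
The only point is that keeping whitespace and comments as nodes makes
the representation both lossless and json-serializable:

```python
import json
import re

# Toy tokenizer (hypothetical, not Baron's grammar): every character of the
# input lands in exactly one token, whitespace and comments included.
TOKEN_RE = re.compile(
    r"(?P<space>\s+)|(?P<comment>#[^\n]*)|(?P<name>\w+)|(?P<other>.)"
)


def source_to_nodes(source):
    """Turn source code into a flat list of plain dicts."""
    return [
        {"type": match.lastgroup, "value": match.group()}
        for match in TOKEN_RE.finditer(source)
    ]


def nodes_to_source(nodes):
    """Rebuild the source by concatenating every node's text."""
    return "".join(node["value"] for node in nodes)


source = "some_value   =  42  # odd spacing is kept\n"
nodes = source_to_nodes(source)

# The equation from above: nothing is lost in the round trip.
assert nodes_to_source(nodes) == source
# Plain lists and dicts survive a trip through json unchanged.
assert json.loads(json.dumps(nodes)) == nodes
```

A real lossless AST is a tree rather than a flat list, but the same
invariant applies to it.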

And now the part that should interest you more: on top of that AST,
I've built a high level "query and modification" library that wraps
those AST nodes into objects. I've put a lot of effort into making this
library intuitive and very easy to use while relieving you of the burden
of dealing with low level details. This "BeautifulSoup of python
source code" is called RedBaron.

It looks like this:

    from redbaron import RedBaron

    # simple API

    # pass string
    red = RedBaron("some_value = 42")

    # get string back
    red.dumps()

Queries are like BeautifulSoup:

    red.find("int", value=4)
    red.find_all("def", name="stuff")

(You can pass lambdas, regexes, or a special syntax for globs and
regexes to queries; they should be powerful enough for the vast
majority of your needs).
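
As a rough illustration of how such flexible filters can work, here is
a small standalone sketch over plain dicts. This is not RedBaron's
implementation: the find_all function and the node dicts below are
hypothetical, and the special string syntax for globs/regexes is left
out. It only shows the idea of accepting a literal, a compiled regex or
a callable for each attribute:

```python
import re


def matches(expected, actual):
    """Compare one attribute against a literal, a compiled regex or a callable."""
    if callable(expected):
        return bool(expected(actual))
    if isinstance(expected, re.Pattern):
        return expected.search(str(actual)) is not None
    return expected == actual


def find_all(nodes, node_type, **filters):
    """Return the nodes of the given type whose attributes pass every filter."""
    return [
        node
        for node in nodes
        if node["type"] == node_type
        and all(
            key in node and matches(wanted, node[key])
            for key, wanted in filters.items()
        )
    ]


# Hypothetical flat node list, only for this sketch.
nodes = [
    {"type": "def", "name": "stuff"},
    {"type": "def", "name": "test_stuff"},
    {"type": "int", "value": "42"},
]

exact = find_all(nodes, "def", name="stuff")
by_regex = find_all(nodes, "def", name=re.compile(r"^test_"))
by_lambda = find_all(nodes, "int", value=lambda v: int(v) > 10)
```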

Modifying nodes is very simple: just pass source code as a string and
"voilà":

    red = RedBaron("some_value = 42")
    red[0].value = "1 + 1"  # some_value = 1 + 1

    red = RedBaron("def stuff():\n    plop")
    red[0].value = "some_code"  # def stuff():\n    some_code

    # notice that the input is correctly formatted and indented, and that
    # it takes care not to break the indentation of the next node
    # this also works with decorators and wherever you expect it to work

(It is also possible to pass it AST data structures or RedBaron
objects.)

And I've made an abstraction on top of "lists of things" so you don't
have to worry about when a separator is needed (e.g. a "," in a
list):

    red = RedBaron("[1, 2, 3]")
    red[0].append("plop")  # [1, 2, 3, plop]

    # want to add a new django app to INSTALLED_APPS? just do:
    red.find("assignment", target=lambda x: x.dumps() == "INSTALLED_APPS").value.append("'another_app'")
    # notice that the formatting of the list is detected

    # want to add "@profile" to every function of the root level for
    # line_profiler?
    red('def', recursive=False).map(lambda x: x.decorators.insert(0, '@profile'))

    # and remove them
    red("decorator", lambda x: x.dumps() == "@profile").map(lambda x: x.parent.parent.decorators.remove(x))

    # convert every "print a" to "logger.debug(a)"
    red('print', value=lambda x: len(x) == 1).map(lambda x: x.replace('logger.debug(%s)' % x.value.dumps()))

    # and print a, b, c to logger.debug("%s %s %s" % (a, b, c))
    red('print', value=lambda x: len(x) > 1).map(lambda x: x.replace('logger.debug("%s" %% (%s))' % (" ".join(["%s"] * len(x.value)), x.value.dumps())))
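
The replacement strings for the two print conversions are easy to get
wrong because of the nested "%" formatting, so here is the string
building on its own, without RedBaron. The helper name print_to_logger
is made up for this sketch, and it takes the printed expressions as a
plain list of strings:

```python
def print_to_logger(values):
    """Build the logger.debug(...) source for the given printed expressions."""
    if len(values) == 1:
        return "logger.debug(%s)" % values[0]
    # One "%s" placeholder per printed value, separated by spaces.
    placeholders = " ".join(["%s"] * len(values))
    # "%%" yields the literal "%" operator in the generated code.
    return 'logger.debug("%s" %% (%s))' % (placeholders, ", ".join(values))


single = print_to_logger(["a"])
several = print_to_logger(["a", "b", "c"])
```

Here single is the text logger.debug(a) and several is the text
logger.debug("%s %s %s" % (a, b, c)).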

Both libraries are fully tested (more than 2000 tests in total), fully
*documented* (with lots of examples) and released under free software
licences. I consider RedBaron to be in alpha stage: it is already very
stable, but a significant number of edge cases are probably not handled
yet.

Important point: RedBaron does not and will not do static analysis.
I'm probably going to integrate a tool that already does that, like
astroid or rope (or integrate RedBaron into one).

Links:

* RedBaron tutorial: https://redbaron.readthedocs.org/en/latest/tuto.html
* RedBaron documentation: https://redbaron.readthedocs.org
* RedBaron source code: https://github.com/psycojoker/redbaron

* Baron (the AST) source code: https://github.com/psycojoker/baron
* Baron documentation: https://baron.readthedocs.org

I hope that I have triggered your interest, and I'm very
interested in your feedback,

Have a nice day and thanks for your time,

PS: I've only been aware of the capabilities of lib2to3 for 2 months
and was very unhappy to discover it so late (I spent months
googling before deciding to start this project). I'll probably swap my
parser for lib2to3's in the future.

-- 

Laurent Peuch -- Bram

