[Python-ideas] Enabling access to the AST for Python code

Fri May 22 05:00:15 CEST 2015

LOn May 21, 2015, at 19:44, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> On Thu, May 21, 2015 at 06:51:34PM -0700, Andrew Barnert via Python-ideas wrote:
>>> On May 21, 2015, at 18:18, Ben Hoyt <benhoyt at gmail.com> wrote:
>>> 
>>> (I know that there's the "ast" module and ast.parse(), which can give
>>> you an AST given a *source string*, but that's not very convenient
>>> here.)
>> 
>> Why not? Python modules are distributed as source.
> 
> *Some* Python modules are distributed as source. Don't forget that 
> byte-code only modules are officially supported.
> 
> Functions may also be constructed dynamically, at runtime. Closures may 
> have source code available for them, but functions and methods 
> constructed with exec (such as those in namedtuples) do not.
> 
> Also, the interactive interpreter is a very powerful tool, but it 
> doesn't record the source code of functions you type into it.
> 
> So there are at least three examples where the source is not available 
> at all.

By comparison, code objects that don't carry around their AST including everything running in any version of Python except maybe a future version that'll be out in a year and a half, if this idea gets accepted, and probably only in CPython.

Plus, I'm pretty sure people would demand the ability to not waste memory and disk space on ASTs when they don't need them, so they still wouldn't be always available.

> Ben also talks about *convenience*: `func.ast` will always be 
> more convenient than:
> 
> import ast
> import parse
> ast.parse(inspect.getsource(func))
> 
> not to mention the wastefulness of parsing something which has already 
> been parsed before.

> On the other hand, keeping the ast around even when 
> it's not used wastes memory, so this is a classic time/space trade off.
> 
> 
>> You can pretty 
>> easily write an import hook to intercept module loading at the AST 
>> level and transform it however you want.
> 
> Let's have a look at yours then, that ought to only take a minute or 
> three :-) 

Does "import macropy" count? That only took me a second or three. :)

Certainly a _lot_ easier than hacking the CPython source, even for something as trivial as adding a new member to the code object and finding all the places to attach the AST.

> (That's my definition of "pretty easily".)
> 
> I think that the majority of Python programmers have no idea that you 
> can even write an import hook at all, let alone how to do it.

Sure, because they have no need to do so. But it's very easy to learn. Especially after the changes in 3.3 and again in 3.4.

During the discussion on Unicode operators that turned into a discussion on a Unicode empty set literal, I suggested an import hook, someone (possibly you?) challenged me to write one if it was so easy, and it took me under half an hour to learn the 3.4 system and implement one. (I'm sure it would be a lot faster this time. But probably not on my phone...) All that work to improve the import system really did pay off. 

By comparison, hacking in new syntax to CPython to play with operator sectioning yesterday took me about four hours.

And of course anyone can download and use my import hook to get the empty set literal in any standard Python 3.4 or later, but anyone who wants to use my operator sectioning hacks has to clone my fork and build and install a new interpreter.