[Python-Dev] Rough idea for adding introspection information for builtins

Larry Hastings larry at hastings.org
Tue Mar 19 05:45:09 CET 2013


The original impetus for Argument Clinic was adding introspection 
information for builtins--it seemed like any manual approach I came up 
with would push the builtins maintenance burden beyond the pale.

Assuming that we have Argument Clinic or something like it, we don't 
need to optimize for ease of use from the API end--we can optimize for 
data size.  So the approach writ large: store a blob of data associated 
with each entry point, as small as possible. Reconstitute the 
appropriate inspect.Signature on demand by reading that blob.

Where to store the data?  PyMethodDef is the obvious spot, but I think 
that structure is part of the stable ABI.  So we'd need a new 
PyMethodDefEx and that'd be a little tiresome.  Less violent to the ABI 
would be defining a new array of pointers-to-introspection-blobs, 
parallel to the PyMethodDef array, passed in via a new entry point.


On to the representation.  Consider the function

    def foo(arg, b=3, *, kwonly='a'):
         pass

I considered four approaches, each listed below along with its total 
size if it were stored as C static data.

1. A specialized bytecode format, something like pickle, like this:

    bytes([
        PARAMETER_START_LENGTH_3, 'a', 'r', 'g',
        PARAMETER_START_LENGTH_1, 'b', PARAMETER_DEFAULT_LENGTH_1, '3',
        KEYWORD_ONLY,
        PARAMETER_START_LENGTH_6, 'k', 'w', 'o', 'n', 'l', 'y',
        PARAMETER_DEFAULT_LENGTH_3, '\'', 'a', '\'',
        END
        ])

Length: 20 bytes.

2. Just use pickle--run inspect.signature() on a mocked-up function, 
pickle the result, and store that.  Length: 130 bytes. (Assume a 
two-byte size stored next to it.)

3. Store a string that, if eval'd, would produce the inspect.Signature.  
Length: 231 bytes.  (This could be made smaller if we could assume "from 
inspect import *" or "p = inspect.Parameter" or something, but it'd 
still be easily the heaviest.)

4. Store a string that looks like the Python declaration of the 
signature, and parse it (Nick's suggestion).  For foo above, this would 
be "(arg,b=3,*,kwonly='a')".  Length: 23 bytes.

Of those, Nick's suggestion seems best.  It's slightly bigger than the 
specialized bytecode format, but it's human-readable (and 
human-writable!), and it'd be the easiest to implement.
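
For anyone who wants to re-measure approaches 2 and 4, here is a rough sketch. It assumes a Python where Signature objects can be pickled (3.5 or later), and it counts one extra byte for the C NUL terminator on the string. The pickled size depends on the Python version and pickle protocol, so I'm not quoting a number for it.

```python
import inspect
import pickle


def foo(arg, b=3, *, kwonly='a'):
    pass


# Approach 4: the declaration string, plus one byte for a C NUL terminator.
declaration = "(arg,b=3,*,kwonly='a')"
declaration_size = len(declaration.encode("ascii")) + 1

# Approach 2: pickle the Signature object itself.
pickled_size = len(pickle.dumps(inspect.signature(foo)))

print("declaration string:", declaration_size, "bytes")
print("pickled Signature: ", pickled_size, "bytes")
```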


My first idea for implementation: add a "def x" to the front and 
": pass" to the end, then run it through ast.parse.  Iterate over the 
tree, converting parameters into inspect.Parameters and handling the 
return annotation if present.  Default values and annotations would be 
turned into values by ast.literal_eval.  (It wouldn't surprise me if 
there's a cleaner way to do it than the fake function definition; I'm 
not familiar with the ast module.)
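
A minimal sketch of that idea, not a finished implementation. The function name signature_from_declaration and the helper _value are made up for illustration, and the sketch assumes every default and annotation is something ast.literal_eval can handle. It ignores positional-only parameters entirely.

```python
import ast
import inspect


def _value(node, empty):
    """Turn an optional AST node into a Python value (hypothetical helper)."""
    return empty if node is None else ast.literal_eval(node)


def signature_from_declaration(text):
    """Build an inspect.Signature from a string like "(arg,b=3,*,kwonly='a')"."""
    # Wrap the declaration in a throwaway function so ast.parse accepts it.
    func = ast.parse("def x" + text + ": pass").body[0]
    args = func.args
    P = inspect.Parameter
    params = []

    # Defaults line up with the last len(defaults) positional parameters.
    first_default = len(args.args) - len(args.defaults)
    for i, a in enumerate(args.args):
        node = args.defaults[i - first_default] if i >= first_default else None
        params.append(P(a.arg, P.POSITIONAL_OR_KEYWORD,
                        default=_value(node, P.empty),
                        annotation=_value(a.annotation, P.empty)))

    if args.vararg is not None:
        params.append(P(args.vararg.arg, P.VAR_POSITIONAL))

    # kw_defaults has one entry per keyword-only parameter; None = no default.
    for a, node in zip(args.kwonlyargs, args.kw_defaults):
        params.append(P(a.arg, P.KEYWORD_ONLY,
                        default=_value(node, P.empty),
                        annotation=_value(a.annotation, P.empty)))

    if args.kwarg is not None:
        params.append(P(args.kwarg.arg, P.VAR_KEYWORD))

    return inspect.Signature(
        params,
        return_annotation=_value(func.returns, inspect.Signature.empty))


print(signature_from_declaration("(arg,b=3,*,kwonly='a')"))
# (arg, b=3, *, kwonly='a')
```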

We'd want one more mild hack: the DSL will support positional-only 
parameters, and inspect.Signature supports positional-only parameters, 
so it'd be nice to render that information.  But we can't represent 
that in Python syntax (or at least not yet!), so we can't let ast.parse 
see it.  My suggestion: run it through ast.parse, and if it throws a 
SyntaxError, see if the problem was a slash.  If it was, remove the 
slash, reprocess through ast.parse, and remember that all parameters 
are positional-only (and barf if there are kwonly, args, or kwargs).
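
Here is a rough sketch of the slash fallback, under a few assumptions. The name positional_only_signature is hypothetical, the "was it a slash?" test is just a substring check, and the slash is stripped with a simple regular expression that would misfire on a default value containing "/". Python 3.8 and later accept the slash as ordinary syntax, so there the SyntaxError branch is never taken and the sketch reads the parser's own posonlyargs list instead.

```python
import ast
import inspect
import re


def positional_only_signature(text):
    """Build a Signature from a declaration like "(a, b=2, /)"."""
    try:
        args = ast.parse("def x" + text + ": pass").body[0].args
        # Python 3.8+ understands the slash and fills in posonlyargs.
        names = getattr(args, "posonlyargs", [])
        leftovers = args.args
    except SyntaxError:
        if "/" not in text:
            raise
        # Remove the slash (and the comma before it), then parse again.
        stripped = re.sub(r",?\s*/\s*", "", text)
        args = ast.parse("def x" + stripped + ": pass").body[0].args
        names, leftovers = args.args, []

    # Barf if anything other than positional-only parameters is present.
    if leftovers or args.vararg or args.kwonlyargs or args.kwarg:
        raise ValueError("expected only positional-only parameters: " + text)

    P = inspect.Parameter
    first_default = len(names) - len(args.defaults)
    params = []
    for i, a in enumerate(names):
        default = P.empty
        if i >= first_default:
            default = ast.literal_eval(args.defaults[i - first_default])
        params.append(P(a.arg, P.POSITIONAL_ONLY, default=default))
    return inspect.Signature(params)


sig = positional_only_signature("(a, b=2, /)")
print([p.kind.name for p in sig.parameters.values()])
```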


Thoughts?


//arry/