[Python-Dev] The docstring hack for signature information has to go

Larry Hastings larry at hastings.org
Mon Feb 3 15:43:31 CET 2014

A quick summary of the context: currently in CPython 3.4, a builtin 
function can publish its "signature" as a specially encoded line at the 
top of its docstring.  CPython internally detects this line inside 
PyCFunctionObject.__doc__ and skips past it, and there's a new getter at 
PyCFunctionObject.__text_signature__ that returns just this line.  As an 
example, the signature for os.stat looks like this:

     sig=($module, path, *, dir_fd=None, follow_symlinks=True)

The convention is that if you provide this signature, your docstring 
shouldn't also start with a handwritten signature, as it did in 3.3 and 
earlier.  
help() on a callable displays the signature automatically if it can, so 
if you *also* had a handwritten signature, help() would show two 
signatures.  That would look dumb.
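
To make the mechanism concrete, here is a rough sketch in plain C of 
splitting a docstring that carries such a line.  This is not the actual 
CPython code: the function names are made up for illustration, and it 
assumes the line is marked only by a leading "sig=", that it ends at the 
first newline, and that the prefix is stripped from the returned text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed marker: the post only shows that the line begins with "sig=". */
#define SIG_PREFIX "sig="
#define SIG_PREFIX_LEN (sizeof(SIG_PREFIX) - 1)

/* Return nonzero if the docstring starts with an encoded signature line. */
int has_signature_line(const char *doc)
{
    assert(doc != NULL);
    return strncmp(doc, SIG_PREFIX, SIG_PREFIX_LEN) == 0;
}

/* Roughly what a __doc__ getter would do: return the text after the
   signature line.  Without such a line, return the docstring unchanged. */
const char *doc_without_signature(const char *doc)
{
    if (!has_signature_line(doc))
        return doc;
    const char *eol = strchr(doc, '\n');
    return eol ? eol + 1 : doc + strlen(doc);
}

/* Roughly what a __text_signature__ getter would do: copy the text
   between "sig=" and the end of the first line into buf.  Returns the
   number of characters copied, or 0 if there is no signature line. */
size_t copy_text_signature(const char *doc, char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);
    buf[0] = '\0';
    if (!has_signature_line(doc))
        return 0;
    const char *start = doc + SIG_PREFIX_LEN;
    size_t len = strcspn(start, "\n");
    if (len >= bufsize)
        len = bufsize - 1;      /* truncate rather than overflow */
    memcpy(buf, start, len);
    buf[len] = '\0';
    return len;
}
```

A docstring that still starts with a handwritten signature passes 
through doc_without_signature() untouched, which is how the doubled 
signature in help() would come about.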


So here's the problem.  Let's say you want to write an extension that 
will work with Python 3.3 and 3.4, using the stable ABI.  If you don't 
add this line, then in 3.4 you won't have introspection information, 
drat.  But if you *do* add this line, your docstring will look mildly 
stupid in 3.3, because it'll have this unsightly "sig=(" line at the 
top.  And it *won't* have a nice handwritten docstring.  (And if you 
added both a sig= signature *and* a handwritten signature, in 3.4 it 
would display both.  That would also look dumb.)

I can't figure out any way to salvage this "first line of the docstring" 
approach.  So I think we have to abandon it, and do this the hard way: 
extend the PyMethodDef structure.  I propose three different 
variations.  I prefer B, but I'm guessing Guido would prefer the YAGNI 
approach, which is A:

A: We create a PyMethodDefEx structure with an extra field: "const char 
*signature".  We add a new METH_SIGNATURE (maybe just METH_SIG?) flag to 
the flags, indicating that this is an extended structure.  When 
iterating over the PyMethodDefs, we know how far to advance the pointer 
based on this flag.

B: Same as A, but we add three unused pointers (void *reserved1 etc) to 
PyMethodDefEx to give us some room to grow.

C: Same as A, but we add two fields to PyMethodDefEx.  The second new 
field identifies the "version" of the structure, telling us its size 
somehow.  Like the lStructSize field of the OPENFILENAME structure in 
Win32.  I suspect YAGNI.
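
For illustration, variation A might look something like the sketch 
below.  It uses stand-in types so it compiles without Python.h; the 
field name ml_signature, the value of METH_SIGNATURE, and the helper 
functions are all assumptions, not a settled design.  B and C appear 
only as comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PyCFunction so the sketch compiles without Python.h. */
typedef void *(*CFunction)(void *self, void *args);

#define METH_VARARGS   0x0001
#define METH_SIGNATURE 0x1000   /* hypothetical flag value */

/* Same four fields as PyMethodDef. */
typedef struct {
    const char *ml_name;
    CFunction   ml_meth;
    int         ml_flags;
    const char *ml_doc;
} MethodDef;

/* Variation A: the existing fields plus one signature pointer. */
typedef struct {
    MethodDef   base;
    const char *ml_signature;
    /* Variation B would add: void *reserved1, *reserved2, *reserved3; */
    /* Variation C would add a field recording the structure's size.  */
} MethodDefEx;

/* Advance to the next entry; the flag tells us how big this one is. */
const MethodDef *next_def(const MethodDef *def)
{
    assert(def != NULL);
    size_t step = (def->ml_flags & METH_SIGNATURE) ? sizeof(MethodDefEx)
                                                   : sizeof(MethodDef);
    return (const MethodDef *)((const char *)def + step);
}

/* Return the signature string, or NULL for a plain entry. */
const char *def_signature(const MethodDef *def)
{
    assert(def != NULL);
    if (!(def->ml_flags & METH_SIGNATURE))
        return NULL;
    return ((const MethodDefEx *)def)->ml_signature;
}

/* Walk a table terminated by an entry whose ml_name is NULL. */
const MethodDef *find_def(const MethodDef *table, const char *name)
{
    for (; table->ml_name != NULL; table = next_def(table)) {
        if (strcmp(table->ml_name, name) == 0)
            return table;
    }
    return NULL;
}

/* Placeholder implementation used only to fill in the example table. */
void *noop(void *self, void *args) { (void)self; (void)args; return NULL; }

const MethodDefEx example_methods[] = {
    {{"stat",  noop, METH_VARARGS | METH_SIGNATURE, "Stat a path."}, "($module, path)"},
    {{"lstat", noop, METH_VARARGS | METH_SIGNATURE, "Stat a link."}, "($module, path)"},
    {{NULL, NULL, 0, NULL}, NULL}   /* sentinel */
};
```

Because the old fields come first, a pointer to a MethodDefEx can be 
handed to code that expects a MethodDef pointer, and the flag is what 
tells the iteration to step by the larger size.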


But that only fixes part of the problem.  Our theoretical extension that 
wants to be binary-compatible with 3.3 and 3.4 still has a problem: how 
can it support signatures?  It can't give PyMethodDefEx structures 
to 3.3; that will blow up.  But if it doesn't use PyMethodDefEx, it 
can't have signatures.

Solution: we write a function (which users would have to copy into their 
extension) that gives a PyMethodDefEx array to 3.4+, but converts it 
into a PyMethodDef array for 3.3.  The tricky part there: what do we do 
about the docstring?  The convention for builtins is to have the first 
line(s) contain a handwritten signature.  But you *don't* want that if 
you provide a signature, because help() will read the provided signature 
and automatically render that first line for you.

I can suggest four options here, and of these I like P best:

M: Don't do anything.  Docstrings with real signature information and a 
handwritten signature in the docstring will show two signatures in 3.4+; 
docstrings without any handwritten signature won't display their 
signature in help in 3.3.  (Best practice for modules compiled for 3.4+ 
is probably: skip the handwritten signature.  Users would have to do 
without in 3.3.)

N: Leave the handwritten signature in the docstring, then when 
registering for 3.4+ add a second flag called METH_33_COMPAT that means 
"when displaying help for this function, don't automatically generate 
that first line."

O: Have the handwritten signature in the docstring.  When registering 
the function for 3.3, have the PyMethodDef docstring point to it, 
starting at the signature.  When registering the function for 3.4+, have 
the docstring in the PyMethodDefEx point to the first byte after the 
handwritten signature.  Note that automatically skipping the signature 
with a heuristic is mildly complicated, so this may be hard to get right.

P: Have the handwritten signature in the docstring, and have separate 
static PyMethodDef and PyMethodDefEx arrays.  The PyMethodDef docstring 
points to the docstring like normal.  The PyMethodDefEx docstring field 
points to the first byte after the handwritten signature.  This makes 
the registration "function" very simple: if it's 3.3 or before, use the 
PyMethodDef array; if it's 3.4+, use the PyMethodDefEx array.  (Argument 
Clinic could theoretically automate coding some or all of this.)
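
A rough sketch of what P could look like, again with stand-in types so 
it compiles without Python.h.  Everything named here (the macros, the 
two tables, select_methods() and its major/minor arguments) is 
hypothetical; a real extension would get the running version from the 
interpreter and pass the chosen table to the normal module setup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PyCFunction so the sketch compiles without Python.h. */
typedef void *(*CFunction)(void *self, void *args);

#define METH_VARARGS   0x0001
#define METH_SIGNATURE 0x1000   /* hypothetical flag value */

typedef struct {
    const char *ml_name;
    CFunction   ml_meth;
    int         ml_flags;
    const char *ml_doc;
} MethodDef;                    /* same four fields as PyMethodDef */

typedef struct {
    MethodDef   base;
    const char *ml_signature;   /* hypothetical field from variation A */
} MethodDefEx;

/* Placeholder implementation used only to fill in the tables. */
void *stat_impl(void *self, void *args) { (void)self; (void)args; return NULL; }

/* One docstring, split so that the offset of the first byte after the
   handwritten signature is known at compile time. */
#define STAT_HANDWRITTEN "stat(path) -> stat result\n\n"
#define STAT_BODY        "Perform a stat system call on the given path."
const char stat_doc[] = STAT_HANDWRITTEN STAT_BODY;

/* For 3.3 and earlier: the docstring starts at the handwritten signature. */
const MethodDef methods_33[] = {
    {"stat", stat_impl, METH_VARARGS, stat_doc},
    {NULL, NULL, 0, NULL}
};

/* For 3.4+: the docstring starts at the first byte after it. */
const MethodDefEx methods_34[] = {
    {{"stat", stat_impl, METH_VARARGS | METH_SIGNATURE,
      stat_doc + sizeof(STAT_HANDWRITTEN) - 1},
     "($module, path)"},
    {{NULL, NULL, 0, NULL}, NULL}
};

/* The registration "function": pick a table from the running version. */
const void *select_methods(int major, int minor)
{
    assert(major >= 0 && minor >= 0);
    if (major > 3 || (major == 3 && minor >= 4))
        return methods_34;
    return methods_33;
}
```

Both tables share one copy of the docstring, so the only per-function 
cost is the second table entry and the signature string.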

It's late and my brain is only working so well.  I'd be interested in 
other approaches if people can suggest something good.

Sorry about the mess,
