[Python-Dev] The docstring hack for signature information has to go
larry at hastings.org
Mon Feb 3 15:43:31 CET 2014
A quick summary of the context: currently in CPython 3.4, a builtin
function can publish its "signature" as a specially encoded line at the
top of its docstring. CPython internally detects this line inside
PyCFunctionObject.__doc__ and skips past it, and there's a new getter at
PyCFunctionObject.__text_signature__ that returns just this line. As an
example, the signature for os.stat looks like this:
sig=($module, path, *, dir_fd=None, follow_symlinks=True)
The convention is, if you have this signature, your docstring shouldn't
start with a handwritten signature the way it did in 3.3 and before.
help() on a callable displays the signature automatically if it can, so
if you *also* had a handwritten signature, help() would show two
signatures. That would look dumb.
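To make the mechanism concrete, here is a minimal sketch in C of how such a first line could be split off from the rest of the docstring. This is not CPython's actual code: the helper names find_sig_text and skip_sig_line are hypothetical, and the only detail taken from the description above is that the line starts with "sig=(" and sits at the top of the docstring.

```c
#include <stddef.h>
#include <string.h>

#define SIG_PREFIX "sig=("
#define SIG_PREFIX_LEN (sizeof(SIG_PREFIX) - 1)

/* Hypothetical helper: if doc starts with a "sig=(" line, return a pointer
 * to the '(' that opens the signature text and store the length of that
 * text (up to, but not including, the newline) in *len.
 * Returns NULL when doc has no such line. */
static const char *find_sig_text(const char *doc, size_t *len)
{
    const char *start, *end;

    if (doc == NULL || strncmp(doc, SIG_PREFIX, SIG_PREFIX_LEN) != 0)
        return NULL;
    start = doc + SIG_PREFIX_LEN - 1;      /* points at '(' */
    end = strchr(start, '\n');
    *len = end ? (size_t)(end - start) : strlen(start);
    return start;
}

/* Hypothetical helper: return the docstring with the "sig=(" line skipped,
 * or doc itself when there is no such line. */
static const char *skip_sig_line(const char *doc)
{
    const char *newline;

    if (doc == NULL || strncmp(doc, SIG_PREFIX, SIG_PREFIX_LEN) != 0)
        return doc;
    newline = strchr(doc, '\n');
    /* No newline means the docstring is only the signature line. */
    return newline ? newline + 1 : doc + strlen(doc);
}
```

In this sketch, find_sig_text plays the role of the __text_signature__ getter and skip_sig_line plays the role of __doc__. A docstring without the special line passes through unchanged, which is also why a 3.3 interpreter that knows nothing about the convention would show the raw "sig=(" line.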
So here's the problem. Let's say you want to write an extension that
will work with Python 3.3 and 3.4, using the stable ABI. If you don't
add this line, then in 3.4 you won't have introspection information,
drat. But if you *do* add this line, your docstring will look mildly
stupid in 3.3, because it'll have this unsightly "sig=(" line at the
top. And it *won't* have a nice handwritten signature. (And if you
added both a sig= signature *and* a handwritten signature, in 3.4 it
would display both. That would also look dumb.)
I can't figure out any way to salvage this "first line of the docstring"
approach. So I think we have to abandon it, and do this the hard way:
extend the PyMethodDef structure. I propose three different
variations. I prefer B, but I'm guessing Guido would prefer the YAGNI
approach, which is A:
A: We create a PyMethodDefEx structure with an extra field: "const char
*signature". We add a new METH_SIGNATURE (maybe just METH_SIG?) flag to
the flags, indicating that this is an extended structure. When
iterating over the PyMethodDefs, we know how far to advance the pointer
based on this flag.
B: Same as A, but we add three unused pointers (void *reserved1 etc) to
PyMethodDefEx to give us some room to grow.
C: Same as A, but we add two fields to PyMethodDefEx. The second new
field identifies the "version" of the structure, telling us its size
somehow. Like the lStructSize field of the OPENFILENAME structure in
Win32. I suspect YAGNI.
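As a rough illustration of variation A, the sketch below uses stand-in types rather than the real Python.h definitions. Everything named Sketch* or SKETCH_* is hypothetical, the flag value is arbitrary, and the docstrings in the demo table are made up. It only shows how a per-entry flag could tell the iterating code how far to advance the pointer.

```c
#include <stddef.h>

/* Stand-in for the existing four-field method definition. */
typedef struct {
    const char *ml_name;
    void (*ml_meth)(void);      /* stand-in for the real function pointer */
    int ml_flags;
    const char *ml_doc;
} SketchMethodDef;

/* Variation A: the same fields plus one extra pointer. */
typedef struct {
    SketchMethodDef base;
    const char *signature;
} SketchMethodDefEx;

/* Hypothetical flag marking an entry as the extended structure. */
#define SKETCH_METH_SIGNATURE 0x1000

/* Advance to the next entry, using the flag to pick the stride. */
static const SketchMethodDef *next_def(const SketchMethodDef *def)
{
    size_t step = (def->ml_flags & SKETCH_METH_SIGNATURE)
                      ? sizeof(SketchMethodDefEx)
                      : sizeof(SketchMethodDef);
    return (const SketchMethodDef *)((const char *)def + step);
}

/* Return the signature of an extended entry, or NULL for a plain one. */
static const char *def_signature(const SketchMethodDef *def)
{
    if (def->ml_flags & SKETCH_METH_SIGNATURE)
        return ((const SketchMethodDefEx *)def)->signature;
    return NULL;
}

/* Count entries up to the sentinel whose ml_name is NULL. */
static size_t count_defs(const SketchMethodDef *defs)
{
    size_t n = 0;
    for (; defs->ml_name != NULL; defs = next_def(defs))
        n++;
    return n;
}

/* Illustrative table; names and docstrings are examples only. */
static const SketchMethodDefEx demo_methods[] = {
    {{"stat", NULL, SKETCH_METH_SIGNATURE, "Perform a stat system call."},
     "($module, path, *, dir_fd=None, follow_symlinks=True)"},
    {{"getpid", NULL, SKETCH_METH_SIGNATURE, "Return the current process id."},
     "($module)"},
    {{NULL, NULL, 0, NULL}, NULL}
};
```

Variation B would append the reserved pointers after signature, and variation C would add a size or version field. In both cases the stride computation in next_def is the only part of this sketch that would need to change.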
But that only fixes part of the problem. Our theoretical extension that
wants to be binary-compatible with 3.3 and 3.4 still has a problem: how
can it support signatures? It can't give PyMethodDefEx structures
to 3.3, which would blow up. But if it doesn't use PyMethodDefEx, it
can't have signatures.
Solution: we write a function (which users would have to copy into their
extension) that gives a PyMethodDefEx array to 3.4+, but converts it
into a PyMethodDef array for 3.3. The tricky part there: what do we do
about the docstring? The convention for builtins is to have the first
line(s) contain a handwritten signature. But you *don't* want that if
you provide a signature, because help() will read that signature and
automatically render this first line for you.
I can suggest four options here, and of these I like P best:
M: Don't do anything. Docstrings with real signature information and a
handwritten signature in the docstring will show two signatures in 3.4+,
docstrings without any handwritten signature won't display their
signature in help in 3.3. (Best practice for modules compiled for 3.4+
is probably: skip the handwritten signature. Users would have to do
without in 3.3.)
N: Leave the handwritten signature in the docstring, then when
registering for 3.4+ add a second flag called METH_33_COMPAT that means
"when displaying help for this function, don't automatically generate
that first line."
O: Have the handwritten signature in the docstring. When registering
the function for 3.3, have the PyMethodDef docstring point to it
starting at the signature. When registering the function for 3.4+, have
the docstring in the PyMethodDefEx point to the first byte after the
handwritten signature. Note that automatically skipping the signature
with a heuristic is mildly complicated, so this may be hard to get right.
P: Have the handwritten signature in the docstring, and have separate
static PyMethodDef and PyMethodDefEx arrays. The PyMethodDef docstring
points to the docstring like normal. The PyMethodDefEx docstring field
points to the first byte after the handwritten signature. This makes
the registration "function" very simple: if it's 3.3 or before, use the
PyMethodDef array, if it's 3.4+ use the PyMethodDefEx array. (Argument
Clinic could theoretically automate coding some or all of this.)
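Option P could be sketched roughly as follows. The types here are stand-ins, not the real Python.h definitions; the Compat* names, the flag value, the choose_method_table function, and the docstring text are all hypothetical. How the extension learns the running interpreter's version is left out, so the sketch simply takes a hex version number as a parameter.

```c
#include <stddef.h>

/* Stand-in for the existing structure (function pointer omitted). */
typedef struct {
    const char *ml_name;
    int ml_flags;
    const char *ml_doc;
} CompatMethodDef;

/* Stand-in for the proposed extended structure. */
typedef struct {
    const char *ml_name;
    int ml_flags;
    const char *ml_doc;
    const char *signature;
} CompatMethodDefEx;

#define COMPAT_METH_SIGNATURE 0x1000     /* hypothetical flag value */
#define COMPAT_HEX_3_4 0x03040000L       /* any 3.4 release or later */

/* One docstring, stored once: handwritten signature, then the body. */
#define STAT_HANDWRITTEN "stat(path, *, dir_fd=None, follow_symlinks=True)\n\n"
#define STAT_BODY "Perform a stat system call on the given path."
static const char stat_doc[] = STAT_HANDWRITTEN STAT_BODY;

/* For 3.3 and before: the docstring starts at the handwritten signature. */
static const CompatMethodDef methods_33[] = {
    {"stat", 0, stat_doc},
    {NULL, 0, NULL}
};

/* For 3.4+: the docstring starts at the first byte after the handwritten
 * signature, and the real signature lives in its own field. */
static const CompatMethodDefEx methods_34[] = {
    {"stat", COMPAT_METH_SIGNATURE,
     stat_doc + (sizeof(STAT_HANDWRITTEN) - 1),
     "($module, path, *, dir_fd=None, follow_symlinks=True)"},
    {NULL, 0, NULL, NULL}
};

/* The registration "function": pick a table based on the runtime version. */
static const void *choose_method_table(long runtime_hexversion)
{
    if (runtime_hexversion >= COMPAT_HEX_3_4)
        return methods_34;
    return methods_33;
}
```

Because both tables point into the same stat_doc array, the docstring text is stored only once. The offset is computed at compile time from the length of the handwritten part, so no runtime heuristic is needed to skip it, which is the main difference from option O.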
It's late and my brain is only working so well. I'd be interested in
other approaches if people can suggest something good.
Sorry about the mess,