[Python-Dev] The docstring hack for signature information has to go
larry at hastings.org
Tue Feb 4 02:29:06 CET 2014
On 02/03/2014 09:46 AM, Guido van Rossum wrote:
> Can you summarize why neither of the two schemes you tried so far worked?
In the first attempt, the signature looked like this:
    <name-of-function>(arguments)\n
The "(arguments)" part of the string was 100% compatible with Python
syntax. So much so that I didn't write my own parser. Instead, I would
take the whole line, strip off the \n, prepend "def ", append ": pass",
and pass the resulting string to ast.parse().
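A minimal sketch of that trick, for illustration only. The helper name and the sample line are made up here, and the real code in CPython may differ in detail:

```python
import ast

def parse_signature_line(line):
    """Parse a 'name(arguments)' line by wrapping it in a dummy def."""
    # Strip the trailing newline, then turn the line into a tiny function
    # definition that the regular Python parser can handle.
    source = "def " + line.rstrip("\n") + ": pass"
    func = ast.parse(source).body[0]
    # Return the function name and the names of its positional parameters.
    return func.name, [a.arg for a in func.args.args]

# Hypothetical signature line, not taken from a real docstring.
name, params = parse_signature_line("frob(x, y, z=None)\n")
```

The appeal is that no custom parser is needed: anything that is valid in a def statement is valid in the signature line.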
This had the advantage of looking great if the signature was not
mechanically separated from the rest of the docstring: it looked like
the old docstring with the handwritten signature on top.
The problem: false positives. This is also exactly the traditional
format for handwritten signatures. The function in C that mechanically
separated the signature from the rest of the docstring had a simple
heuristic: if the docstring started with "<name-of-function>(", it
assumed it had a valid signature and separated it from the rest of the
docstring. But most of the functions in CPython passed this test, which
resulted in complaints like "help(open) eats first line".
I opened an issue, writing a long impassioned plea to change this syntax.
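To illustrate the false positive, here is a rough Python rendering of that heuristic. The real separator is written in C; the function name, the "first line is the signature" rule, and the sample docstring below are assumptions made for this sketch:

```python
def split_signature(name, doc):
    """Mimic the heuristic: a docstring starting with 'name(' has a signature."""
    if not doc.startswith(name + "("):
        return None, doc
    # Assume the signature is the first line and the rest is the docstring.
    first_line, _, rest = doc.partition("\n")
    return first_line, rest

# A hypothetical handwritten docstring in the traditional style.
handwritten = "frob(x[, y]) -> result\n\nFrobnicate x."
sig, rest = split_signature("frob", handwritten)
```

The heuristic happily strips the first line here, even though "frob(x[, y]) -> result" was written for humans and is not a parseable signature.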
Which we did.
In the second attempt, the signature looked like this:
    sig=(arguments)\n
In other words, the same as the first attempt, but with "sig=" instead
of the name of the function. Since you never see docstrings that start
with "sig=" in the wild, the false positives dropped to zero.
I also took the opportunity to modify the signature slightly. Signatures
were a little inconsistent about whether they specified the "self"
parameter or not, so there were some complicated heuristics in
inspect.Signature about when to keep or omit the first argument. In the
new format I made this more explicit: if the first argument starts with
a dollar sign ("$"), that means "this is a special first argument" (self
for methods, module for module-level callables, type for class methods
and __new__). That removed all the guesswork from inspect.Signature;
now it works great. (In case you're wondering: I still use ast.parse to
parse the signature, I just strip out the "$" first.)
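Roughly, the second scheme could be sketched like this. The helper name, the placeholder function name, and the sample line are invented for the example; only the "sig=" prefix, the "$" marker, and the use of ast.parse come from the description above:

```python
import ast

def parse_sig_line(line):
    """Parse a 'sig=(arguments)' line; return None if it is not one."""
    if not line.startswith("sig=("):
        return None
    args = line[len("sig="):].rstrip("\n")
    # A leading "$" marks the special first argument (self, module, type).
    has_special_first = args.startswith("($")
    if has_special_first:
        args = args.replace("$", "", 1)
    # Same trick as before: wrap the arguments in a dummy def and parse it.
    func = ast.parse("def _placeholder" + args + ": pass").body[0]
    return has_special_first, [a.arg for a in func.args.args]

result = parse_sig_line("sig=($self, x, y=0)\n")
```

A handwritten docstring such as "frob(x) -> y" does not start with "sig=(", so it is left alone.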
I want to mention: we anticipate modifying the syntax further in 3.5,
adding square brackets around parameters to indicate "optional groups".
This all has caused no problems so far. But my panicky email last night
was me realizing a problem we may see down the road. To recap: if a
programmer writes a module using the binary ABI, in theory they can use
it with different Python versions without modification. If this
programmer added Python 3.4+ compatible signatures, they'd have to
insert this "sig=(" line at the top of their docstring. The downside:
Python 3.3 doesn't understand that this is a signature and would happily
display it to the user as part of help().
> How bad would it be if we decided to just live with it or if we added
> a new flag bit (only recognized by 3.4) to disambiguate corner-cases?
A new flag might solve the problem cheaply. Let's call it METH_SIG, set
in the flags portion of the PyMethodDef. It would mean "This docstring
contains a computer-readable signature". One could achieve source
compatibility with 3.3 easily by adding "#ifndef METH_SIG / #define
METH_SIG 0 / #endif"; the next version of 3.3 could add that itself. We
could then switch back to the original approach of
"<name-of-function>(", so the signature would look presentable when
displayed to the user. It would still have the funny dollar-sign, a la
"$self" or "$module" or "$type", but perhaps users could live with
that. This time, though, perhaps the end delimiter should be two
newlines in a row, so that we can text-wrap long signature lines to
enhance their readability if/when they get shown to users.
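As a sketch of how the flag and the blank-line delimiter might fit together (in Python for brevity; the real check would live in C). The numeric value of METH_SIG, the helper name, and the sample docstring are all assumptions, since none of this exists yet:

```python
METH_SIG = 0x1000  # hypothetical bit value, chosen only for this example

def split_flagged_docstring(flags, doc):
    """Separate the signature only when the method def carries METH_SIG."""
    if not flags & METH_SIG:
        # No flag: treat the whole docstring as handwritten text.
        return None, doc
    # The signature ends at the first blank line, so it may span several lines.
    wrapped, _, rest = doc.partition("\n\n")
    signature = " ".join(part.strip() for part in wrapped.splitlines())
    return signature, rest

doc = "frob($module, first_argument,\n     second_argument=None)\n\nFrobnicate things."
flagged = split_flagged_docstring(METH_SIG, doc)
unflagged = split_flagged_docstring(0, doc)
```

With the flag set, the wrapped signature is rejoined into one line; without it, the docstring is returned untouched, which is what a build defining METH_SIG as 0 would see.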
I have two caveats:
A: for binary compatibility, would Python 3.3 be allergic to this
unfamiliar flag in PyMethodDef? Or does it ignore flags it doesn't
explicitly look for?
B: I had to modify four (or was it five?) different types in Python to
add support for mechanically separating the __text_signature__. Although
all of them originally started with a PyMethodDef structure, I'm not
sure that all of them carry the "flags" parameter around with them. We
might have to add a "flags" to a couple of these. Fortunately I believe
they're all part of Py_LIMITED_API.