[Python-Dev] The docstring hack for signature information has to go
Guido van Rossum
guido at python.org
Tue Feb 4 05:19:26 CET 2014
Hmm... I liked the original scheme because it doesn't come out so badly if
some tool doesn't special-case the first line of the docstring at all. (I
have to fess up that I wrote such a tool for a limited case not too long
ago, and I wrote it to search for a blank line if the docstring starts with
<methodname> followed by '('.)
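The kind of tool described above might look something like the following. This is only a rough sketch: the function name `split_handwritten_signature` is hypothetical, and the choice to give up when no blank line is found is an assumption, not a description of the actual tool.

```python
def split_handwritten_signature(name, doc):
    """Split a docstring into (signature_block, body).

    Returns (None, doc) unchanged when the docstring does not start with
    "<name>(" or when no blank line separates a signature block.
    """
    if not doc or not doc.startswith(name + "("):
        return None, doc
    # Everything up to the first blank line is treated as the signature block.
    head, sep, tail = doc.partition("\n\n")
    if not sep:
        # Assumption: without a blank line we cannot tell where the
        # signature ends, so leave the docstring alone.
        return None, doc
    return head.strip(), tail.lstrip("\n")
```

A docstring that does not start with the method name passes through untouched, which is why this approach degrades gracefully for tools that know nothing about signatures.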
Adding a flag sounds harmless; all the code I could find that looks at the
flags just checks whether specific flags it knows about are set.
But why do you even need a flag? Reading issue 20075 where the complaint
started, it really feels that the change was an overreaction to a very
minimal problem. A few docstrings appear truncated. Big deal. We can
rewrite the ones that are reported as broken (either by adjusting the
docstring to not match the pattern or by adjusting it to match the pattern
better, depending on the case). Tons of docstrings contain incorrect info,
we just fix them when we notice the issue, we don't declare the language
On Mon, Feb 3, 2014 at 5:29 PM, Larry Hastings <larry at hastings.org> wrote:
> On 02/03/2014 09:46 AM, Guido van Rossum wrote:
> Can you summarize why neither of the two schemes you tried so far worked?
> In the first attempt, the signature looked like this:
> The "(arguments)" part of the string was 100% compatible with Python
> syntax. So much so that I didn't write my own parser. Instead, I would
> take the whole line, strip off the \n, prepend it with "def ", append it
> with ": pass", and pass in the resulting string to ast.parse().
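The "def ... : pass" trick described above can be sketched in a few lines of Python. The helper name `parse_signature_line` and the example signature are illustrative assumptions; this is not the code used in CPython.

```python
import ast


def parse_signature_line(line):
    """Parse a "name(arguments)" line by wrapping it in a dummy function."""
    # Strip the newline, prepend "def ", append ": pass", then let the
    # regular Python parser do the work.
    source = "def " + line.rstrip("\n") + ": pass"
    func = ast.parse(source).body[0]
    names = [a.arg for a in func.args.args]
    # Defaults are restricted to literals here so literal_eval can read them.
    defaults = [ast.literal_eval(d) for d in func.args.defaults]
    return func.name, names, defaults
```

For example, `parse_signature_line("open(file, mode='r', buffering=-1)\n")` yields the function name, the parameter names, and the literal default values without any hand-written parser.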
> This had the advantage of looking great if the signature was not
> mechanically separated from the rest of the docstring: it looked like the
> old docstring with the handwritten signature on top.
> The problem: false positives. This is also exactly the traditional format
> for handwritten signatures. The function in C that mechanically separated
> the signature from the rest of the docstring had a simple heuristic: if the
> docstring started with "<name-of-function>(", it assumed it had a valid
> signature and separated it from the rest of the docstring. But most of the
> functions in CPython passed this test, which resulted in complaints like
> "help(open) eats first line":
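A Python sketch of that heuristic shows how the false positives arise. The real separation was done in C; the function name `separate_signature` and the sample docstring are illustrative assumptions, not text taken from CPython.

```python
def separate_signature(name, doc):
    """Mimic the heuristic: a docstring starting with "<name>(" is assumed
    to begin with a machine-readable signature line, which is split off."""
    if doc.startswith(name + "("):
        first_line, _, rest = doc.partition("\n")
        return first_line, rest.lstrip("\n")
    return None, doc


# A traditional handwritten signature also starts with "<name>(", so the
# heuristic strips it even though nobody intended it for machine parsing.
handwritten = (
    "range([start,] stop[, step]) -> range object\n"
    "\n"
    "Return a virtual sequence of numbers."
)
signature, visible_doc = separate_signature("range", handwritten)
```

After the split, `visible_doc` no longer contains the first line, which is the kind of behaviour the "eats first line" complaints were about.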
> I opened an issue, writing a long impassioned plea to change this syntax:
> Which we did.
> In the second attempt, the signature looked like this:
> In other words, the same as the first attempt, but with "sig=" instead of
> the name of the function. Since you never see docstrings that start with
> "sig=" in the wild, the false positives dropped to zero.
> I also took the opportunity to modify the signature slightly. Signatures
> were a little inconsistent about whether they specified the "self"
> parameter or not, so there were some complicated heuristics in
> inspect.Signature about when to keep or omit the first argument. In the
> new format I made this more explicit: if the first argument starts with a
> dollar sign ("$"), that means "this is a special first argument" (self for
> methods, module for module-level callables, type for class methods and
> __new__). That removed all the guesswork from inspect.Signature; now it
> works great. (In case you're wondering: I still use ast.parse to parse the
> signature, I just strip out the "$" first.)
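The second scheme can be sketched the same way. This assumes the signature line has the form "sig=(...)" and that only the first parameter may carry a "$"; the helper name `parse_sig_line` and the dummy function name `_placeholder` are hypothetical, and the real inspect.Signature handling is not reproduced here.

```python
import ast


def parse_sig_line(line):
    """Parse a "sig=(...)" line; return (parameter_names, has_special_first).

    Returns None when the line does not use the "sig=" prefix.
    """
    line = line.rstrip("\n")
    if not line.startswith("sig=("):
        return None
    params = line[len("sig="):]          # e.g. "($self, x, y=0)"
    # A leading "$" marks a special first argument (self, module, or type).
    has_special_first = params.startswith("($")
    if has_special_first:
        params = "(" + params[2:]        # strip the "$" before parsing
    # "sig=" replaced the function name, so supply a dummy one for ast.parse.
    func = ast.parse("def _placeholder" + params + ": pass").body[0]
    return [a.arg for a in func.args.args], has_special_first
```

Because the "$" marker is explicit, the caller no longer has to guess whether the first parameter should be kept or dropped.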
> I want to mention: we anticipate modifying the syntax further in 3.5,
> adding square brackets around parameters to indicate "optional groups".
> This all has caused no problems so far. But my panicky email last night
> was me realizing a problem we may see down the road. To recap: if a
> programmer writes a module using the binary ABI, in theory they can use it
> with different Python versions without modification. If this programmer
> added Python 3.4+ compatible signatures, they'd have to insert this "sig=("
> line at the top of their docstring. The downside: Python 3.3 doesn't
> understand that this is a signature and would happily display it to the
> user as part of help().
> How bad would it be if we decided to just live with it or if we added a
> new flag bit (only recognized by 3.4) to disambiguate corner-cases?
> A new flag might solve the problem cheaply. Let's call it METH_SIG, set
> in the flags portion of the PyMethodDef. It would mean "This docstring
> contains a computer-readable signature". One could achieve source
> compatibility with 3.3 easily by adding "#ifndef METH_SIG / #define
> METH_SIG 0 / #endif"; the next version of 3.3 could add that itself. We
> could then switch back to the original approach of "<name-of-function>(",
> so the signature would look presentable when displayed to the user. It
> would still have the funny dollar-sign, a la "$self" or "$module" or
> "$type", but perhaps users could live with that. Though perhaps this time
> the end delimiter should be two newlines in a row, so that we can
> text-wrap long signature lines to enhance their readability if/when they
> get shown to users.
> I have two caveats:
> A: for binary compatibility, would Python 3.3 be allergic to this
> unfamiliar flag in PyMethodDef? Or does it ignore flags it doesn't
> explicitly look for?
> B: I had to modify four (or was it five?) different types in Python to add
> support for mechanically separating the __text_signature__. Although all
> of them originally started with a PyMethodDef structure, I'm not sure that
> all of them carry the "flags" parameter around with them. We might have to
> add a "flags" to a couple of these. Fortunately I believe they're all part
> of Py_LIMITED_API.
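Caveat A comes down to how an older reader of the flags field tests it. The Python sketch below contrasts two styles; it makes no claim about which style Python 3.3 actually uses. The bit values are illustrative, METH_SIG is the flag proposed above and its value here is an arbitrary assumption, and both reader functions are hypothetical.

```python
# Illustrative bit values only; METH_SIG is the proposed flag and its
# value here is an arbitrary assumption.
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002
METH_SIG = 0x0100


def reader_bit_test(flags):
    """Old reader that tests individual bits: unknown bits are ignored."""
    if flags & METH_KEYWORDS:
        return "varargs+keywords"
    if flags & METH_VARARGS:
        return "varargs"
    return "unknown"


def reader_exact_match(flags):
    """Old reader that compares the whole value: unknown bits break it."""
    known = {
        METH_VARARGS: "varargs",
        METH_VARARGS | METH_KEYWORDS: "varargs+keywords",
    }
    return known.get(flags, "unknown")


flags = METH_VARARGS | METH_SIG
```

With the extra bit set, the bit-testing reader still recognises the calling convention while the exact-match reader does not, so binary compatibility depends on which style every consumer of the flags follows.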
--Guido van Rossum (python.org/~guido)