[Doc-SIG] which characters to use for docstring markup

Sun, 08 Apr 2001 15:20:33 EDT

> But you counted single characters.  I grepped for '[A-Z]<' and found
> none that occurred in docstrings.  (The actual re should be
> r'\B[A-Z]<'; I believe the POD rules ask for a single upper case
> letter before the <.)

Well, presumably the occurrence of '[A-Z]{' will be comparably small.
However, it's not just the open delimiters that we have to worry about.
You can't include a close delimiter in a colored region.  For example,
if you want to put "x > y" in a "code" region, then you can't::

   C<x > y>

There's no way for the parser to know that the first ">" isn't a close
delimiter.  Similarly for bold, etc.  Also, there's a question of how
context-sensitive we want our delimiters to be.  It may confuse people
that they can say::

   x<y

but not::

   X<Y

I would be happier just reserving '{' and '}', and saying that they're
always delimiters when they occur in a paragraph (except possibly if 
backslashed somehow, but let's not get sidetracked into backslashing).
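
To make both points concrete, here is a minimal sketch.  The patterns
and function names are hypothetical stand-ins for whatever parser we
end up with, not a proposed implementation: one naive matcher for
C<...>, one for C{...} with braces reserved, and one check for the
"upper case letter followed by <" trigger.

```python
import re

# Hypothetical naive matcher for POD-style code markup: "C<", then
# everything up to the first ">".
ANGLE_CODE = re.compile(r'C<(.*?)>')

# Hypothetical matcher assuming '{' and '}' are reserved, so the
# contents can never contain a brace.
BRACE_CODE = re.compile(r'C\{([^{}]*)\}')

# The trigger discussed above: a single upper-case letter before "<".
ANGLE_OPEN = re.compile(r'[A-Z]<')


def angle_regions(text):
    """Return the contents of every C<...> region the naive matcher sees."""
    return ANGLE_CODE.findall(text)


def brace_regions(text):
    """Return the contents of every C{...} region."""
    return BRACE_CODE.findall(text)


def opens_region(text):
    """True if the text contains something that looks like an open delimiter."""
    return ANGLE_OPEN.search(text) is not None


print(angle_regions('C<x > y>'))   # ['x ']     (cut off at the first ">")
print(brace_regions('C{x > y}'))   # ['x > y']  (">" is just a character)
print(opens_region('x<y'))         # False      (plain text)
print(opens_region('X<Y'))         # True       (suddenly markup)
```

The brace version only works because nothing inside the region may
contain '{' or '}', which is exactly the reservation proposed above.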

> Now, there's one significant use of [A-Z]< that might trip us up: the
> regular expression syntax (?P<...>...).  I certainly could see this
> being useful in docstrings for methods that take a regular expression
> argument.

This may be important, but I see it as less important than the issues
of using < and > to mean less-than and greater-than.

> There's also one use of [A-Z]{: \N{...} means something in
> Unicode literal syntax.

I agree that there will be cases where any character gets used.  But
I would argue that, in these cases, we should either use literal 
blocks (do you really need to say "\N{...}" in a paragraph? Maybe..)
or use some sort of backslashing.  (But again, let's come back to the
discussion of backslashing later.)
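
For reference, a small sketch of the two collisions mentioned so far.
It uses the simple '[A-Z]<' grep pattern from earlier in the thread and
the analogous brace pattern; the names OPEN_ANGLE, OPEN_BRACE and
collisions are made up for this illustration.

```python
import re

# An upper-case letter followed by an open delimiter, for each
# candidate delimiter style.
OPEN_ANGLE = re.compile(r'[A-Z]<')
OPEN_BRACE = re.compile(r'[A-Z]\{')


def collisions(text):
    """Return which delimiter styles would misread this plain text as markup."""
    found = []
    if OPEN_ANGLE.search(text):
        found.append('angle')
    if OPEN_BRACE.search(text):
        found.append('brace')
    return found


# Named-group syntax in a regular expression trips the angle style.
print(collisions(r'(?P<name>...)'))   # ['angle']

# Unicode name escapes trip the brace style.
print(collisions(r'\N{...}'))         # ['brace']
```

Neither style is collision-free, so the choice comes down to which
collisions are rarer in running paragraph text.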

> I don't like `...`, because (a) it means something very specific in
> Python (and in the Unix shell), (b) it's hard to distinguish from
> '...' in some fonts, and (c) except for the `...` Python and shell
> notation, I expect ` to be closed with '.

I'm leaning more towards the L{...} syntax anyway.  Although I would
argue against (b) on the grounds that, if you're viewing it in unparsed
form, then you're viewing it in your source-code editor, and
presumably you chose a font for your source-code editor in which you
can distinguish "'" and "`", since they mean very different things
in Python.  Even if you did choose a different font for your docstring
comments, you'd still want to be able to distinguish "'x'" and "`x`"
when you read a doctest block, so presumably you'd pick a font in
which you can.

> > 4. You should keep in mind that any of these characters will be used
> >    in the docstring for *something* (well, actually, I was surprised
> >    to see a backspace in a docstring..).
> 
> Where?

In the sre module docstring, if I remember correctly.

> I still like C<code> and *multi word emph* better. :-)

I was thinking of these as mutually exclusive.  If we're going to use
C<code> or C{code} or whatever, we might as well use E<emph>.  No
need to go removing even more characters from docstring writers'
repertoires.

(Incidentally, multiword emph can be somewhat dangerous, especially
if you let people have docstrings like "*hi" and "bye*", where the
'*'s get parsed as normal asterisks.  People will get confused.  The
problem is made worse if *multi word emphs* can span lines.  But then,
if they can't, word-rewrapping won't work as expected, etc.)
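
A rough sketch of that trade-off, assuming a naive "everything between
two asterisks" rule (the function emph_spans and its span_lines flag
are hypothetical, just for illustration):

```python
import re


def emph_spans(text, span_lines):
    """Return the text of every *...* region found by a naive matcher.

    span_lines controls whether a region may cross a line break.
    """
    flags = re.DOTALL if span_lines else 0
    return re.findall(r'\*(.+?)\*', text, flags)


# Two literal asterisks that were never meant as markup.
literal = 'Matches names like "*hi" and\n"bye*" literally.'

# Real emphasis that an editor re-wrapped across two lines.
wrapped = 'This is *multi word\nemph* after re-wrapping.'

# Letting emphasis span lines swallows the literal asterisks...
print(emph_spans(literal, span_lines=True))    # ['hi" and\n"bye']
print(emph_spans(wrapped, span_lines=True))    # ['multi word\nemph']

# ...but forbidding it loses the re-wrapped emphasis.
print(emph_spans(literal, span_lines=False))   # []
print(emph_spans(wrapped, span_lines=False))   # []
```

Either setting gets one of the two docstrings wrong, which is the
confusion I'm worried about.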

-Edward