[Doc-SIG] Re: Attribute docstrings

Sun May 16 20:30:26 EDT 2004

Felix Wiemann wrote:
>> Another problem is that it's currently impossible to assign a new value
>> to an integer's docstring.
>> 
>>    >>> a = 5
>>    >>> a.__doc__ = "new docstring"
>> 
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in ?
>> AttributeError: 'int' object attribute '__doc__' is read-only

It's important to note that values and variables are fundamentally 
different things.  Strictly speaking, only *values* can have docstrings 
(i.e., description strings accessible via .__doc__).  It will *never* be 
possible to access variable docstrings via the variables themselves; as 
Beni noted, the only reasonable place to store description strings for 
variables (if we want them to be accessible via inspection) is in the 
enclosing object (either in its docstring or in some other attribute).
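
As a small illustration of the value/variable distinction (a sketch only; the names a and b are arbitrary), the docstring reached through an integer variable is the one defined for the int type, and it cannot be replaced through the variable:

```python
a = 5
b = 5

# Both names reach the same text: the docstring that belongs to the int type,
# not anything attached to the variables "a" or "b".
same_as_type = (a.__doc__ == int.__doc__)
same_as_each_other = (a.__doc__ == b.__doc__)

# Assigning through the variable fails; there is no per-variable slot to hold it.
try:
    a.__doc__ = "new docstring"
    assignment_error = None
except AttributeError as exc:
    assignment_error = exc
```

The exact wording of the error message varies between Python versions, so the sketch only records the exception.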

To avoid confusion, let's call these variable docstrings
"pseudo-docstrings", to distinguish them from real docstrings (which are
accessible via .__doc__).

Beni Cherniavsky wrote:
> For all these reasons, I propose the following extremely unmagical
> solution: all bare-string statements are simply concatenated together to
> form this scope's docstring.  

I've been going back and forth on whether I like this idea.  I agree 
with Felix that writing each variable name twice is somewhat ugly.

The main advantage of Beni's proposal is its simplicity.  A number of 
subtle issues came up when I was writing the pseudo-docstrings extractor 
for epydoc.  E.g., what do you do with...

     x,y = 1,2
     """docstring"""

     x = y = 0
     """docstring"""

     x[0] = 5
     """docstring"""

     x.y = 5
     """docstring"""

     x.y = z = 1,2
     """docstring"""

     self.x = 10
     """docstring"""

(If you're curious, epydoc will write a shared description for the first 
two cases, ignore the 3rd and 4th cases, and treat the last case as an 
instance variable, assuming it's in a class's __init__ and 'self' is 
the name of __init__'s first parameter.)
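
To make the rules above concrete, here is a rough sketch using the standard ast module. It is not epydoc's code: the function names target_names and extract_pseudo_docstrings are made up for this example, and the handling of mixed targets such as "x.y = z = 1,2" (document z, skip x.y) is my own assumption.

```python
import ast


def target_names(target, self_name=None):
    """Return the documentable names for one assignment target ([] to ignore it)."""
    if isinstance(target, ast.Name):
        return [target.id]
    if isinstance(target, (ast.Tuple, ast.List)):
        names = []
        for element in target.elts:
            names.extend(target_names(element, self_name))
        return names
    if (isinstance(target, ast.Attribute)
            and isinstance(target.value, ast.Name)
            and self_name is not None
            and target.value.id == self_name):
        # "self.x = ..." where self_name is the method's first parameter
        return [target.attr]
    return []  # subscripts and other attribute targets are ignored


def extract_pseudo_docstrings(body, self_name=None):
    """Map names to the bare string that directly follows their assignment."""
    docs = {}
    for stmt, following in zip(body, body[1:]):
        is_bare_string = (
            isinstance(following, ast.Expr)
            and isinstance(following.value, ast.Constant)
            and isinstance(following.value.value, str)
        )
        if isinstance(stmt, ast.Assign) and is_bare_string:
            for target in stmt.targets:
                for name in target_names(target, self_name):
                    docs[name] = following.value.value
    return docs
```

For a module you would pass ast.parse(source).body; for an __init__ method, pass the function node's body together with the name of its first parameter.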

Using Beni's proposal, the programmer can handle these cases as they see 
fit.  E.g.:

     """:Parameters x,y: docstring for x and y."""
     x,y = 1,2

> Well, not simply - at least a newline (or
> two?) should be inserted because most docstrings don't end with one.

You'd also have to magically deduce the indentation for the first line 
of the docstring.  And note that in the examples you gave, 
inspect.getdoc() will *not* give you what you want.  E.g., the 
pseudo-docstring:

     """c
         This is Aclass.c's docstring."""

would get translated into: "c\nThis is Aclass.c's docstring" (note the 
lack of indentation on the second line).  But leaving the indentation 
as-is will result in invalid ReST.  This seems like a potential 
show-stopper to me, because it would basically mean that you need to add 
an extra newline for each attribute docstring:

     """
     c
         This is Aclass.c's docstring."""

Besides looking ugly, this makes a 3-line docstring out of something 
that really should have been a one-liner.
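
The difference is easy to check with inspect.cleandoc(), which applies the same indentation clean-up that inspect.getdoc() uses. This is just a sketch; the variable names are arbitrary, and the strings are written at module level so the indentation is exactly what you see:

```python
import inspect

# The compact form: the definition body is indented relative to "c".
one_liner = """c
        This is Aclass.c's docstring."""

# The form with an extra leading newline.
with_leading_newline = """
    c
        This is Aclass.c's docstring."""

# cleandoc strips the first line and removes the common indentation
# of the remaining lines.
cleaned_one_liner = inspect.cleandoc(one_liner)
cleaned_with_newline = inspect.cleandoc(with_leading_newline)
```

In the compact form the first line is excluded when the common indentation is computed, so the second line loses all of its indentation; in the second form the relative indentation survives.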

> I had to pull the docstring for self.i out of `__init__`; this is
> somewhat unfortunate.  

If you made the rule "any bare string literal is appended to the 
containing namespace object (module or class) docstring," then you could 
put instance variable docstrings inside of __init__ methods.  (Note that 
it doesn't make any sense to document a function's variable, because 
functions aren't namespaces.)

> OTOH I'm not sure the documentation of the 
> attribute's *purpose* belongs with the initial assignment to it.  

But the good thing about pseudo-docstrings is that they let the 
*programmer* decide where it's appropriate to document variables. 
Sometimes this will be in the containing object's docstring, sometimes 
this will be next to one of the assignments.

Despite its appealing simplicity, my main problems with Beni's proposal 
are...
     - Duplication of variable names
     - It should be possible to write variable docstrings in one line
       (or less; see the pseudo-docstring for z, below).

For comparison, I'm currently planning to extend epydoc to accept 3 
formats for pseudo-docstrings:

     x = 12
     """pseudo-docstring for x"""

     #: pseudo-docstring for y
     y = 20

     z = 99     #: pseudo-docstring for z

Where the "#:" sequence is used to disambiguate pseudo-docstrings from 
normal comments (I picked this sequence because it looked reasonable to 
me, but I'd be open to changing it if there's a reason to).
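
The two comment formats could be picked up with the standard tokenize module, roughly as sketched below. This is not epydoc's implementation: extract_comment_docs and ASSIGN_RE are names invented for the example, and it only recognizes simple "name = value" assignments.

```python
import io
import re
import tokenize

# Assumption: only plain "name = ..." assignments are recognized.
ASSIGN_RE = re.compile(r"\s*([A-Za-z_]\w*)\s*=(?!=)")


def extract_comment_docs(source):
    """Map variable names to the text of their '#:' comments."""
    # Let tokenize find real comments, so "#:" inside a string is not mistaken for one.
    comments = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and tok.string.startswith("#:"):
            comments[tok.start[0]] = (tok.start[1], tok.string[2:].strip())

    docs = {}
    pending = []  # standalone "#:" lines waiting for the next assignment
    for lineno, line in enumerate(source.splitlines(), start=1):
        col, text = comments.get(lineno, (None, None))
        code = line if col is None else line[:col]
        match = ASSIGN_RE.match(code)
        if match:
            parts = pending + ([text] if text is not None else [])
            if parts:
                docs[match.group(1)] = " ".join(parts)
            pending = []
        elif text is not None and not code.strip():
            pending.append(text)
        else:
            pending = []  # a blank or unrelated line detaches earlier comments
    return docs
```

Consecutive "#:" lines directly above an assignment are joined with a space, and a blank line between the comment and the assignment detaches the comment; both choices are arbitrary.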

Of course, these pseudo-docstrings won't be accessible via inspection; 
but I don't see that as a huge problem, as long as...
   - The tools that need them (e.g., epydoc/pydoc) can get at them.
   - Programmers *also* have the option of documenting variables in
     the containing namespace object.

And if these formats become standard, then we can add a module to the 
stdlib to extract them.

-Edward