[Doc-SIG] Cross-reference proposal

Edward Welbourne <eddyw@lsl.co.uk>
Tue, 8 Feb 2000 16:00:19 +0000

>> Note that simple bare words are never interpreted as references.
> Why not?  Of course this may go wrong, if you use very common words
> like 'a', 'is', 'so' ... as identifiers.  But I don't believe, that
> this will happen too much.

umm ... I believe the reverse.
  * A string-processing function with an argument which decides whether
    to capitalise the string is almost certain to use the verb
    capitalise (in its Natural Language sense) in the course of its
    account of the argument which it will, naturally, call `capitalise'.
  * Many I/O routines use buffers whose size the caller can
    control via a parameter, usually called buffer; an account of whose
    meaning will routinely involve referring (in the NL sense) to the
    buffer whose size is controlled by the argument.
  * ...

It is *in general* Good Practice to name parameters (especially those
which the caller is expected to give in name=value form) using plain
words which fit close to the NL words relating to the purpose of the
parameter.  Any account of the purpose of the parameter is,
consequently, liable to use the NL sense of the word as well as using
the word to refer to the parameter.  IMO, this forces us to have some
equivalent of the *emphasis* markup.

As to *what* should serve in this role: well, it's going to enclose
things which are to be read as lumps of python code in the doc string,
within which any identifiers are to be hyperlinked to the definition of
the object referenced by that identifier; so we can't go using any
character which can appear in an inline-in-the-doc code fragment
(e.g. regarding 'quoted text' as a code fragment is unacceptable, because
my code fragment may wish to include a 'quoted string').

I don't think $, @ and ! are good candidates, but they could work;
however, how do folk feel about # as a marker ?  Since we're in a
doc-string, it doesn't have any special meaning; nor do I think it
sensible to include an end-of-line-comment in an inline-in-the-doc code
fragment (as opposed to a code fragment displayed using a Code: block or
>>>).  Thus #any.valid(python + ' code') is guaranteed(to, work)#.

Then we get:

        file ! string -- the name of the file to be opened.

        mode ! string -- the file access mode string ...  If #mode#'s
        first letter is #'r'#, #file# must exist.

        buffer ! int or None [None] -- size of the I/O buffer to use.  A
        size #< 0# indicates that no buffering should be done (raw I/O).
        If #buffer# is given as #None#, a `sensible' buffer size will be

and only treat words in the description (after --) as special if
explicitly marked as such.  Of course, since this is the Arguments:
block, treating the key (before !) as special comes naturally without
needing to mark it using #.

        IOError -- raised if #file# does not exist, #mode# is
        unrecognised or if #buffer#'s value is inappropriate for the
        given #file# and #mode#.

This fits the `easy to type' requirement; how do folk feel about `easy
to read' ?  Note that I, at least, find it has one bonus under `easy to
read': I know when a word is used in its NL sense and when it is used to
refer to something in the python code, which makes it easier to
understand what I read.
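
To make the #...# convention concrete, here is a minimal sketch of how a
doc-string tool might separate plain text from in-text code fragments.
The name split_fragments is hypothetical (nothing in the proposal names
such a function), and the sketch assumes a fragment never spans a line
break and never itself contains a # character.

```python
import re

# Assumption: a fragment is the shortest run of non-'#', non-newline
# characters enclosed between two '#' marks.
FRAGMENT = re.compile(r"#([^#\n]+)#")


def split_fragments(text):
    """Split doc-string text into ('text', s) and ('code', s) pieces."""
    parts = []
    pos = 0
    for match in FRAGMENT.finditer(text):
        if match.start() > pos:
            # Plain natural-language text before this fragment.
            parts.append(("text", text[pos:match.start()]))
        # The fragment itself, with the enclosing '#' marks removed.
        parts.append(("code", match.group(1)))
        pos = match.end()
    if pos < len(text):
        parts.append(("text", text[pos:]))
    return parts


if __name__ == "__main__":
    line = "If #mode#'s first letter is #'r'#, #file# must exist."
    for kind, value in split_fragments(line):
        print(kind, repr(value))
```

Only the pieces tagged 'code' would be candidates for parsing and
hyperlinking; everything tagged 'text' is left alone, so a bare word such
as buffer stays in its NL sense unless it is marked.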

As to creating cross-references: identifiers appearing inside #...# or
in the body of a Code: block can be sought using the namespace lookup
machinery python would use from within the body of the thing whose
doc-string we're parsing; where the lookup finds something with a
doc-string, it is natural to generate an href to that doc-string (or,
rather, what it will become when parsed).  This should only need to
happen for things marked as code.
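
As a rough illustration of that lookup, the sketch below parses a
fragment with the standard ast module, collects the identifiers in it,
and resolves each against the globals of the documented function and
then the builtins.  The names referenced_objects and linkable are
hypothetical, and the sketch makes simplifying assumptions: the owner is
a plain function, and its parameters and locals are not resolved (they
have no doc-string of their own to link to).

```python
import ast
import builtins
import inspect


def referenced_objects(fragment, owner):
    """Map identifiers used in `fragment` to the objects they name.

    `owner` is the function whose doc-string contains the fragment.
    Fragments that are not valid Python (e.g. '< 0') yield no names.
    """
    try:
        tree = ast.parse(fragment.strip())
    except SyntaxError:
        return {}
    namespace = getattr(owner, "__globals__", {})
    found = {}
    for node in ast.walk(tree):
        # String literals are Constant nodes, not Name nodes, so the
        # text inside quotes is never looked up.
        if isinstance(node, ast.Name):
            name = node.id
            if name in namespace:
                found[name] = namespace[name]
            elif hasattr(builtins, name):
                found[name] = getattr(builtins, name)
    return found


def linkable(found):
    """Keep only modules, classes and routines that carry a doc-string."""
    return {
        name: obj
        for name, obj in found.items()
        if (inspect.ismodule(obj) or inspect.isclass(obj)
            or inspect.isroutine(obj))
        and inspect.getdoc(obj)
    }
```

Names that resolve to nothing, or to objects without a doc-string, are
simply dropped, so no link is generated for them.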

Does this fit the requirement for clear and simple rules ?  Summary:

  * Within text (not within Code: &c. blocks), # is used as delimiter
    for (start and end of) in-text code fragments.

  * Code:, Example: and >>> blocks and in-text code fragments are
    parsed; identifiers within them are looked up as if by the
    interpreter when executing the suite which the doc-string begins; if
    these lookups yield something to which the generated docs can make
    an HREF, they do so.

Note that,

 * in #open('peter', 'r')#, peter does not appear as an identifier (it
   appears as a string) so no attempt is made to look peter up in any

 * use of #code(fragments)# is viable within comments, either in real
   python code (outside the doc-string) or in Code: &c. blocks.

Given the latter, it would make sense for the parse-and-href idiom to
ignore end-of-line comments in Code: &c. blocks, but to recognise code
fragments in such comments and subject these to parse-and-href.
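
One way this could look, sketched with the standard tokenize module:
identifiers are collected from the code proper, while end-of-line
comments are skipped except for any #...# fragments inside them.  The
function name scan_code_block is hypothetical, and the sketch assumes
the block is complete, tokenisable Python and that the # which opens the
comment is not itself a fragment delimiter.

```python
import io
import keyword
import re
import tokenize

# Same assumed fragment shape as for in-text fragments: no '#' or
# newline inside a fragment.
FRAGMENT = re.compile(r"#([^#\n]+)#")


def scan_code_block(source):
    """Return (identifiers, comment_fragments) for a block of code."""
    identifiers = []
    comment_fragments = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # An identifier in real code: a candidate for lookup.
            identifiers.append(tok.string)
        elif tok.type == tokenize.COMMENT:
            # Drop the '#' that starts the comment, then pick out any
            # #...# fragments; the rest of the comment is ignored.
            comment_fragments.extend(FRAGMENT.findall(tok.string[1:]))
    return identifiers, comment_fragments
```

The fragments returned from comments would then go through the same
parse-and-href step as fragments found in running text.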

> [Ka Ping's] rules sound a little bit complicated.
They are amenable to some simplification, though.

Actually, starting /* and ending */ would do fine ... the funny thing
is, in C, that */ would be the obvious way to *start* comments, since it
cannot appear in valid code.  As ever, K&R didn't do the obvious.