[Doc-SIG] docstring grammar
Wed, 01 Dec 1999 12:43:19 +0100
Edward Welbourne wrote:
> > Since [] is only used for lists in Python, we could
> > define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
> > raise an exception in case the enclosed reference cannot
> > be mapped to a symbol in the global namespace (note: no
> > whitespace, no commas) which either evaluates to a function,
> > method, module or reference object.
> umm ... hang on, two things seem stirred up here. The proposal I
> remember from ages ago and tried to echo has [token] and the token
> doesn't have to be intelligible to the python engine: elsewhere in the
> doc string, we'll have
> [token] reference text
> which the parsed docstring uses to decode each use of [token] that
> appeared in the docstring.
Right, but we extended the lookup notion to what David summarized
in a recent post:
I believe that the namespaces looked up are:
1) the local namespace of the docstring -- i.e., the set of keywords
defined in the "References" keyword block in the current docstring.
2) the global namespace of the docstrings -- i.e. the set of keywords
defined in the "References" keyword block in the MODULE docstring.
3) The global Python namespace for that module
4) Some namespace corresponding to builtins & unimported modules, yet
to be defined.
+ I would like to add:
The looked-up object will only be converted to a reference
if it is either an object having a doc string or a reference
object (these are created through the References: section).
If this condition is not met, either a warning is
issued or the [token] text is taken as is.
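As a minimal sketch of this lookup-and-convert rule (the Reference class,
the resolve() helper and the namespace arguments are hypothetical
illustrations, not part of the proposal):

```python
import builtins
import warnings

class Reference:
    """Hypothetical reference object, as created by a References: section."""
    def __init__(self, target):
        self.target = target

def resolve(token, *namespaces):
    """Look the token up in each namespace in order; convert it to a
    reference only if the hit has a doc string or already is a
    Reference object.  Otherwise warn and leave [token] as is."""
    for ns in namespaces:
        if token in ns:
            obj = ns[token]
            if isinstance(obj, Reference) or getattr(obj, '__doc__', None):
                return obj
            warnings.warn("[%s] does not resolve to a documented object"
                          % token)
            return None
    return None

# Lookup order as summarized above: local References block, module
# References block, module globals, builtins.
local_refs = {'RFC-822': Reference('rfc822')}
hit = resolve('RFC-822', local_refs, {}, globals(), vars(builtins))
```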
+ modify the RE to include hyphens:
Given the above, [None] would then either cause a warning or
be left in the doc string with no further magic applied.
Other uses of square brackets would have to include at least
one of the characters not allowed by the above RE, e.g. spaces.
This makes mixing [references] and [ code, examples ] very
simple and straightforward.
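To illustrate the distinction, here is the hyphen-extended RE at work
(the sample doc string text is made up):

```python
import re

# Proposed Reference RE, extended with hyphens as suggested above.
REFERENCE_RE = re.compile(r'\[[a-zA-Z0-9_.\-]+\]')

doc = "See [parse-docstring] and [None], but not [ code, examples ]."

# Only bracket groups without spaces or commas are picked up as
# references; '[ code, examples ]' contains characters outside the
# allowed set and is left alone.
tokens = REFERENCE_RE.findall(doc)
print(tokens)  # ['[parse-docstring]', '[None]']
```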
As always, the details of how to convert the reference to
markup should be left to a reference engine. We should focus
on tokenizing first and only then start thinking about
what to do with those tokens... e.g. automagically convert
them to HTML anchors or whatever.
AFAICT, we have these tokens and symbols:
A Keyword is a case-sensitive string which:
- starts a paragraph
- matches '^ *[a-zA-Z_]+[\-a-zA-Z_0-9]*: +'
(a Python identifier, with the addition of hyphens, ending
in a ':' and one or more spaces)
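A quick check of that Keyword RE against a few sample lines (the
keyword names are invented for the example):

```python
import re

# Proposed Keyword RE: an identifier-like name (hyphens allowed),
# ending in a colon and at least one space, at the start of a line.
KEYWORD_RE = re.compile(r'^ *[a-zA-Z_]+[\-a-zA-Z_0-9]*: +')

lines = [
    "References: see below",  # matches: plain keyword
    "  Known-Bugs: none",     # matches: indented, hyphenated keyword
    "not a keyword line",     # no match: no colon-terminated name
]
for line in lines:
    m = KEYWORD_RE.match(line)
    print(m.group().strip() if m else None)
```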
A Keyword Block is a paragraph of text starting with
a Keyword and followed by Single Line Text or a Text Block.
A Reference is a case-sensitive string which:
- matches '\[[a-zA-Z0-9_.-]+\]'
(lookup as indicated above is left to the reference engine)
Single Line Text is all remaining text on the current line.
A Text Block is a paragraph of indented text.
A Bullet Block is a paragraph of indented text using
a bullet character as the first non-whitespace character at the
start of the first line.
A line of text matching <RE for "name(args,kws) -> returns -- does">
One or more lines of whitespace text.
All Blocks may be nested (is this true?). Nesting is indicated by
indentation.
Anything missing?
Y2000: 30 days left
Python Pages: http://www.lemburg.com/python/