David Ascher wrote:
Assuming this analysis is correct, I am inclined to scrap my original proposal, shelve my current prototype code, and work instead with Jim Fulton and whoever else to discuss whether and how to modify StructuredText's format and code to extend it to the needs which the DOC-SIG expressed. The value in this approach is:
Cool! David, we would be glad to have this happen. We like StructuredText, and would love to see it widely used and available. And we have very limited time right now, which means we would be very happy to have someone like you/the python community making harmonious changes. I think jim and/or i could field your questions/suggestions, to try to keep us in agreement. Re the specifics in your message:
StructuredText as XML without a DTD:
Unlike the grammar which was discussed on the doc-sig, StructuredText does not allow us to specify special syntax for special kinds of text, such as: signature blocks/tooltip descriptions, doctest.py code, and keyworded paragraphs. This is a lack of feature rather than a flaw, and could maybe be fixed by adding a postprocessor which would be Python-internal-doc specific.
I'm not entirely clear about what you're referring to, but we're thinking about a slightly different structure for the StructuredText object that may admit this sort of thing. A base class would just be responsible for parsing paragraphs (including distinguishing bullet paragraphs that aren't newline separated - yay!), then a document-formatting class, maybe abstract, that recognizes generic document features like headers, emphasis, links, etc. The document-formatting class could be subclassed by a rendering class, e.g. an html-formatter or a docstring-formatter. It so happens Jim has an intern looking at doing this restructuring as a fun-ish (relief from C coding) side project. It's not clear we'll get it done, but it is clear we'd be happy to see it, and i think it would allow a whole lot more customizability.
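To make the three layers concrete, here is a minimal sketch of how they might stack. Every class, method, and pattern name below is hypothetical and chosen only for illustration; none of it is StructuredText's actual code or API.

```python
import html
import re


class ParagraphParser:
    """Base layer: only splits raw text into paragraphs."""

    # Hypothetical bullet cue: '-' or '*' followed by whitespace.
    BULLET = re.compile(r"^\s*[-*]\s+")

    def paragraphs(self, text):
        result, current = [], []
        for line in text.splitlines():
            if not line.strip():
                # A blank line closes the paragraph in progress.
                if current:
                    result.append(" ".join(current))
                    current = []
            elif self.BULLET.match(line) and current:
                # A bullet starts a new paragraph even without a blank line.
                result.append(" ".join(current))
                current = [line.strip()]
            else:
                current.append(line.strip())
        if current:
            result.append(" ".join(current))
        return result


class DocumentFormatter(ParagraphParser):
    """Middle layer: recognizes generic features, leaves output to subclasses."""

    EMPHASIS = re.compile(r"\*([^*\s][^*]*)\*")

    def format(self, text):
        return "\n".join(self.format_paragraph(p) for p in self.paragraphs(text))

    def format_paragraph(self, para):
        is_bullet = bool(self.BULLET.match(para))
        body = self.BULLET.sub("", para) if is_bullet else para
        body = self.EMPHASIS.sub(
            lambda m: self.emphasis(m.group(1)), self.escape(body)
        )
        return self.bullet(body) if is_bullet else self.paragraph(body)

    # Hooks that a rendering subclass is expected to override.
    def escape(self, s):
        return s

    def emphasis(self, s):
        raise NotImplementedError

    def paragraph(self, s):
        raise NotImplementedError

    def bullet(self, s):
        raise NotImplementedError


class HtmlRenderer(DocumentFormatter):
    """Rendering layer producing HTML fragments."""

    def escape(self, s):
        return html.escape(s, quote=False)

    def emphasis(self, s):
        return f"<em>{s}</em>"

    def paragraph(self, s):
        return f"<p>{s}</p>"

    def bullet(self, s):
        return f"<li>{s}</li>"


class PlainRenderer(DocumentFormatter):
    """Rendering layer for plain-text (docstring-like) output."""

    def emphasis(self, s):
        return s.upper()

    def paragraph(self, s):
        return s

    def bullet(self, s):
        return "  - " + s
```

With this split, the input "Intro with *stress*." followed directly by "* first" and "* second" lines parses into three paragraphs with no blank lines needed, and the same parse feeds either renderer.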
The use of single quotes to mark up inline code (as in 'x') can be surprising. Many current docstrings use 'x' to refer to the *string* containing the character x, not the variable x. In StructuredText, the quotes would disappear in the rendering. With practice, the current scheme could be used but users would have to learn to write '"x"' to have their intent carry through to the renderer. This is probably my biggest problem with StructuredText because I don't know how to fix it while maintaining compatibility. Related questions for Jim or Ken are 1) How does StructuredText parse ''? 2) How can one have a single quote in verbatim text?
I can't answer all your questions. We'd be interested to know what you suggest as an alternative - how would you indicate brief <code> passages? It's possible the generic formatter would be settable to one or the other, or it'd recognize both ticked-code and whatever you want and the renderer would choose which to transform to code, or we'd have an almost generic formatter which could be derived into different flavors of generic formatters, each of which did the respective thing for recognizing short <code> passages.
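As a rough illustration of a formatter that is "settable to one or the other", the sketch below switches between two inline-code conventions. The regular expressions, the function name, and the choice of backticks as the alternative marker are all assumptions made for this example; they are not StructuredText's real patterns, and nobody in this thread has proposed backticks.

```python
import re

# Hypothetical pattern for the ticked style: 'code' with no word
# character touching the quotes, so apostrophes in "don't" are ignored.
TICKED = re.compile(r"(?<!\w)'([^'\s](?:[^']*[^'\s])?)'(?!\w)")

# Hypothetical alternative marker, used here only as a stand-in.
BACKTICKED = re.compile(r"`([^`]+)`")


def mark_code(text, style="ticked"):
    """Wrap inline code in <code> tags using the selected convention."""
    pattern = TICKED if style == "ticked" else BACKTICKED
    return pattern.sub(r"<code>\1</code>", text)
```

In the ticked style, "pass 'x' to f" loses its quotes in the output, which is the surprise described above, and a writer who means the string has to type '"x"'. In the other style the single quotes pass through untouched.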
The tagging of underlined text with _'s is suboptimal. Underlines shouldn't be used from a typographic perspective (underlines were designed to be used in manuscripts to communicate to the typesetter that the text should be italicized -- no well-typeset book ever uses underlines), and conflict with double-underscored Python variable names (__init__ and the like), which would get truncated and underlined when that effect is not desired. Note that while *complete* markup would prevent that truncation ('__init__'), I think of docstring markups much like I think of type annotations -- they should be optional and above all do no harm. In this case the underline markup does harm.
Durn - i did that, and regret it. We're really not wedded to use of underscores for underlined text.
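The harm to names like __init__ is easy to reproduce with a deliberately naive rule. Both patterns below are hypothetical: the first only mimics the kind of behavior described above, and the second is one possible more conservative rule, not anything StructuredText implements.

```python
import re

# Naive rule: anything between two underscores is underlined.
NAIVE_UNDERLINE = re.compile(r"_([^_]+)_")

# More conservative rule: the underscores must not touch another word
# character (which includes '_'), so identifiers are left alone.
SAFER_UNDERLINE = re.compile(r"(?<!\w)_([^_\s][^_]*)_(?!\w)")


def underline(text, pattern):
    """Replace matches of the given underline pattern with <u> tags."""
    return pattern.sub(r"<u>\1</u>", text)
```

Under the naive rule, "call __init__ first" comes out as "call _<u>init</u>_ first": the name is both truncated and underlined. The conservative rule leaves it alone while still handling "an _important_ point".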
The requirement that a paragraph end with the word example or examples or :: goes against my natural style, as I often do not want such word or punctuation before a "displayed" paragraph. Furthermore, the spec currently doesn't say how the renderer is supposed to process the :: -- is it displayed as two colons, one, or none? If the two colons are not displayed by the renderer, then my objection is diminished, although I would have preferred a markup which is local to the paragraph which is affected, not the previous one (cut and paste errors follow too easily). In some versions in the past at least, both colons were displayed. I'll leave that as an open question, as additional markup could provide an alternative which would suit me.
[...]r, which does the actual rendering. I think we agree that it'd be nice to have the example cue in the text, rather than before it, but we don't have a suggestion of a natural/uncluttered way to do that. As for the current :: format, the double colon is supposed to be rendered as a double colon.
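The "cue in the previous paragraph" rule, and the cut-and-paste hazard that comes with it, can be sketched as follows. The cue pattern is my reading of the description in this message (a paragraph ending in "example", "examples", or "::"), and the function name is hypothetical.

```python
import re

# Assumed cue: the paragraph ends with '::' or the word example(s).
CUE = re.compile(r"(::|\bexamples?)\s*$", re.IGNORECASE)


def mark_examples(paragraphs):
    """Return (kind, text) pairs; the paragraph after a cue is an example."""
    out = []
    previous_was_cue = False
    for para in paragraphs:
        kind = "example" if previous_was_cue else "text"
        out.append((kind, para))
        # In this sketch an example paragraph never cues the next one.
        previous_was_cue = kind == "text" and bool(CUE.search(para))
    return out
```

Given "Use it like this::" followed by "x = f(1)", the second paragraph is marked as an example. Move "x = f(1)" elsewhere without its lead-in and it silently becomes ordinary text, because the markup lives in the neighboring paragraph.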
Missing features:
[references to entities other than URLs; define namespaces for reference targets]
Sounds cool. I think.-)
[list items without requiring blank lines]
Yay!
[tagged paragraphs]
Might '::' example-style paragraphs fit into this category? Maybe provisions for special handling of the tag cue (E.g., throwing away "Example:") would enable this to solve our examples problem.
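One way to picture a tagged paragraph with special handling of the cue is sketched below. The tag syntax (a single capitalized word followed by a colon), the function name, and the set of tags whose cue is thrown away are all assumptions for illustration.

```python
import re

# Assumed tag syntax: one capitalized word, a colon, then the body.
TAG = re.compile(r"^(?P<tag>[A-Z][A-Za-z]*):\s+(?P<body>.+)$", re.DOTALL)


def render_tagged(paragraph, discard_cue=("Example",)):
    """Return (tag, text_to_render); tag is None for untagged paragraphs."""
    m = TAG.match(paragraph)
    if m is None:
        return None, paragraph
    tag, body = m.group("tag"), m.group("body")
    if tag in discard_cue:
        # The cue only classifies the paragraph and is not rendered.
        return tag, body
    # Other tags keep their cue in the rendered text.
    return tag, f"{tag}: {body}"
```

Here the markup is local to the paragraph it affects: "Example: x = f(1)" carries its own cue, which is dropped on output, so moving the paragraph does not change how it is treated.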
Technical points:
[regex has been deprecated]
I think we would be ok with switchover to re if we had some kind of caching FormattedDocument, which may be in the works. But we use StructuredText enough in zope that performance is an issue - and re doesn't have the performance yet!

Whoops - gotta run! Looking forward to more on this...

Ken Manheimer
klm@digicool.com