From pearu@cens.ioc.ee Sun Dec 1 11:33:18 2002
From: pearu@cens.ioc.ee (Pearu Peterson)
Date: Sun, 1 Dec 2002 13:33:18 +0200 (EET)
Subject: [Doc-SIG] Including files
Message-ID:

Hi,

I am using Docutils in creating a reference manual (for the F2PY project)
and found that it would be nice if rst formatted text files could have
some support for including other rst formatted text files as well as
various source codes to the document. The equivalent hooks in LaTeX
would be \input{..} and \verbatiminput{..} commands.

Typical usage cases:

* If a rst-text file tends to become very large (in the sense that its
  printed version has, say, 20 or more pages) then factoring it to
  different files would ease maintaining such documents.

* Including example source codes. Currently one has to maintain two
  copies of source codes, one as a source file and one typed (copied)
  into the rst-text file.

I am not sure what would be appropriate Docutils hooks for emulating
LaTeX \input or \verbatiminput commands but maybe something like the
following:

* Using ::

    .. input:: filename

  in a rst-text file is equivalent to a situation as if the
  ``.. input:: filename`` part is replaced by the contents of
  ``filename``, possibly taking into account also indentation level.

  Possible variations for ``input``::

    file insert include fileinput ..

* Using ::

    .. verbatim:: filename

  is equivalent to including the contents of ``filename`` as a literal
  block to the current rst-text file.

  Possible variations for ``verbatim``::

    source verbatiminput ..

What do you think?

Pearu

From goodger@python.org Sun Dec 1 15:51:25 2002
From: goodger@python.org (David Goodger)
Date: Sun, 01 Dec 2002 10:51:25 -0500
Subject: [Doc-SIG] Including files
In-Reply-To:
Message-ID:

Pearu Peterson wrote:
> What do you think?

The "include" directive is already implemented. For included
reStructuredText source files use::

    .. include:: file.txt

For literal block inclusions (example code, etc.), use::

    .. include:: module.py
       :literal:

Make sure you're using the latest code from CVS or the snapshot. See
for details.

--
David Goodger
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/

From pearu@cens.ioc.ee Sun Dec 1 16:04:41 2002
From: pearu@cens.ioc.ee (Pearu Peterson)
Date: Sun, 1 Dec 2002 18:04:41 +0200 (EET)
Subject: [Doc-SIG] Including files
In-Reply-To:
Message-ID:

On Sun, 1 Dec 2002, David Goodger wrote:

> Pearu Peterson wrote:
> > What do you think?
>
> The "include" directive is already implemented.

Thanks!

Pearu

From goodger@python.org Thu Dec 5 03:16:59 2002
From: goodger@python.org (David Goodger)
Date: Wed, 04 Dec 2002 22:16:59 -0500
Subject: [Doc-SIG] looking for prior art
Message-ID:

I have begun work on a Python source Reader component for Docutils. I
expect the work to go slowly, as there is lots to absorb, much earlier
work to study and learn from, and little spare time to devote. I'm
trying to keep it as simple as possible, mostly for my own benefit
(lest my brain explode).

I've looked over the HappyDoc code and Tony "Tibs" Ibbs' PySource
prototype. HappyDoc uses the stdlib "parser" module to parse Python
modules into abstract syntax trees (ASTs), but that seems difficult and
fragile, the ASTs being so low-level. Tibs' prototype uses the much
higher-level ASTs built by the stdlib "compiler" module, which are much
easier to understand. I've decided to use the "compiler" module also.

My first stumbling block is in parsing assignments. I want to extract
the right-hand side (RHS) of assignments straight from the source. In
his prototype, Tibs rebuilds the RHS from the AST, but that seems
rather roundabout and the results may not match the source perfectly
(equivalent, but not character-for-character).
I think using the "tokenize" module in parallel with "compiler" may allow the code to extract the raw RHS text, as well as other raw text that doesn't make it verbatim to the AST. So, is there any prior art out there? Any pointers or advice? -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From bac@OCF.Berkeley.EDU Thu Dec 5 07:04:26 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Wed, 4 Dec 2002 23:04:26 -0800 (PST) Subject: [Doc-SIG] looking for prior art In-Reply-To: References: Message-ID: [David Goodger] > So, is there any prior art out there? Any pointers or advice? > How does PyChecker do it? I would guess by reading the bytecode, but you never know. I would guess using regexes would be the best if you just want to read the source. The ``tokenize`` module has all the regexes and they might be available independently from the methods in the module. -Brett From guido@python.org Thu Dec 5 09:02:09 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 05 Dec 2002 04:02:09 -0500 Subject: [Doc-SIG] looking for prior art In-Reply-To: Your message of "Wed, 04 Dec 2002 23:04:26 PST." References: Message-ID: <200212050902.gB5929m11117@pcp02138704pcs.reston01.va.comcast.net> > [David Goodger] > > > So, is there any prior art out there? Any pointers or advice? [Brett Cannon] > How does PyChecker do it? I would guess by reading the bytecode, but you > never know. It reads the bytecode, but PyChecker 2.0 will read the source. The bytecode is often hard to use; it also changes between versions. > I would guess using regexes would be the best if you just want to read the > source. The ``tokenize`` module has all the regexes and they might be > available independently from the methods in the module. I recommend using the tokenizer module directly. 
See pyclbr.py (in current CVS; it used to have its own set of regexps) for an example of how to do this. --Guido van Rossum (home page: http://www.python.org/~guido/) From doug@hellfly.net Thu Dec 5 14:04:47 2002 From: doug@hellfly.net (Doug Hellmann) Date: Thu, 5 Dec 2002 09:04:47 -0500 Subject: [Doc-SIG] looking for prior art In-Reply-To: References: Message-ID: <200212051404.JAA08933@branagan.hellfly.net> On Wednesday 04 December 2002 10:16 pm, David Goodger wrote: > > I've looked over the HappyDoc code and Tony "Tibs" Ibbs' PySource > prototype. HappyDoc uses the stdlib "parser" module to parse Python modules > into abstract syntax trees (ASTs), but that seems difficult and fragile, > the ASTs being so low-level. Tibs' prototype uses the much higher-level > ASTs built by the stdlib "compiler" module, which are much easier to > understand. I've decided to use the "compiler" module also. I'm pretty sure HappyDoc was written before the compiler module was generally available, but I'm not sure. I've only had to make a few minor modifications to it in the past, since the language syntax hasn't evolved that far. I'm working on a major overhaul of HappyDoc anyway, so now might be the time to rewrite the parsing stuff to use the compiler module. If you're interested in collaborating, let me know. Doug From goodger@python.org Fri Dec 6 02:45:14 2002 From: goodger@python.org (David Goodger) Date: Thu, 05 Dec 2002 21:45:14 -0500 Subject: [Doc-SIG] looking for prior art In-Reply-To: <200212051404.JAA08933@branagan.hellfly.net> Message-ID: Doug Hellmann wrote: > I'm pretty sure HappyDoc was written before the compiler module was > generally available I suspected as much. Either that, or you're a glutton for punishment ;-) > I've only had to make a few minor modifications to it in the past, > since the language syntax hasn't evolved that far. That's good to know. Still, the parser.suite() approach seems a lot harder. 
> I'm working on a major overhaul of HappyDoc anyway, so now might be
> the time to rewrite the parsing stuff to use the compiler module.
> If you're interested in collaborating, let me know.

I am, definitely. What I'd like to do is to take a module, read in
the text, run it through the module parser (using compiler.py and
tokenize.py) and produce a high-level AST full of nodes that are
interesting from an auto-documentation standpoint. For example, given
this module (x.py)::

    # comment

    """Docstring"""

    """Additional docstring"""

    __docformat__ = 'reStructuredText'

    a = 1
    """Attribute docstring"""

    class C(Super):

        """C's docstring"""

        class_attribute = 1
        """class_attribute's docstring"""

        def __init__(self, text=None):
            """__init__'s docstring"""

            self.instance_attribute = (text * 7
                                       + ' whaddyaknow')
            """instance_attribute's docstring"""


    def f(x, y=a*5, *args):
        """f's docstring"""
        return [x + item for item in args]

    f.function_attribute = 1
    """f.function_attribute's docstring"""

The module parser should produce a high-level AST, something like this
(in pseudo-XML_)::

    comment
    Docstring (I'll leave out the lineno's)
    Additional docstring
    'reStructuredText'
    1
    Attribute docstring
    C's docstring
    1
    class_attribute's docstring
    __init__'s docstring
    (text * 7 + ' whaddyaknow')
    class_attribute's docstring
    f's docstring
    1
    f.function_attribute's docstring

compiler.parse() provides most of what's needed for this AST. I think
that "tokenize" can be used to get the rest, and all that's left is to
hunker down and figure out how. We can determine the line number from
the compiler.parse() AST, and a get_rhs(lineno) method would provide
the rest.

The Docutils Python reader component will transform this AST into a
Python-specific doctree, and then a `stylist transform`_ would further
transform it into a generic doctree. Namespaces will have to be
compiled for each of the scopes, but I'm not certain at what stage of
processing.
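
The get_rhs(lineno) idea can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the Docutils module
parser: it uses the modern stdlib "ast" module where the thread uses
"compiler" (which no longer exists in Python 3), the helper name
get_rhs and the sample SOURCE are made up for the example, and it only
handles simple assignments whose first "=" is the assignment operator.

```python
import ast
import io
import tokenize

# Sample input, invented for this sketch.
SOURCE = '''\
a = 1
b = (a * 7
     + 2)   # trailing comment
'''


def get_rhs(source, lineno):
    """Return the raw source text to the right of the first "=" in the
    logical line starting at `lineno`, or None if there is none."""
    lines = source.splitlines(keepends=True)
    seen_equals = False
    start = end = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.start[0] < lineno:
            continue                      # token belongs to an earlier line
        if tok.type == tokenize.NEWLINE:
            break                         # end of the logical line
        if tok.type in (tokenize.COMMENT, tokenize.NL):
            continue                      # not part of the expression
        if not seen_equals:
            seen_equals = tok.type == tokenize.OP and tok.string == "="
            continue
        if start is None:
            start = tok.start             # first token of the RHS
        end = tok.end                     # last token seen so far
    if start is None:
        return None
    (srow, scol), (erow, ecol) = start, end
    if srow == erow:
        return lines[srow - 1][scol:ecol]
    # Multi-line RHS: slice the first and last rows, keep the middle whole.
    return (lines[srow - 1][scol:]
            + "".join(lines[srow:erow - 1])
            + lines[erow - 1][:ecol])


# The AST supplies the line numbers; tokenize supplies the raw text.
for node in ast.parse(SOURCE).body:
    if isinstance(node, ast.Assign):
        print(node.lineno, repr(get_rhs(SOURCE, node.lineno)))
```

The AST says where each assignment starts, and the token stream gives
back the text exactly as written, line break included. On Python 3.8
and later, ast.get_source_segment() offers a similar result directly
from an AST node.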
It's very important to keep all docstring processing out of this, so that it's a completely generic and not tool-specific. For an overview see: http://docutils.sf.net/pep-0258.html#python-source-reader For very preliminary code see: http://docutils.sf.net/docutils/readers/python/moduleparser.py For tests and example output see: http://docutils.sf.net/test/test_readers/test_python/test_parser.py I have also made some simple scripts to make "compiler", "parser", and "tokenize" output easier to read. They use input from the test_parser.py module above. See showast, showparse, and showtok in: http://docutils.sf.net/test/test_readers/test_python/ .. _pseudo-XML: http://docutils.sf.net/spec/doctree.html#pseudo-xml .. _stylist transform: http://docutils.sf.net/spec/pep-0258.html#stylist-transforms -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From doug@hellfly.net Fri Dec 6 13:27:49 2002 From: doug@hellfly.net (Doug Hellmann) Date: Fri, 6 Dec 2002 08:27:49 -0500 Subject: [Doc-SIG] looking for prior art In-Reply-To: References: Message-ID: <200212061327.IAA10440@branagan.hellfly.net> On Thursday 05 December 2002 9:45 pm, David Goodger wrote: > Doug Hellmann wrote: > > I'm pretty sure HappyDoc was written before the compiler module was > > generally available > > I suspected as much. Either that, or you're a glutton for punishment > ;-) Well, I didn't say that wasn't true. :-) I actually started with some sample code included in the Python source distribution, so it wasn't too hard to extend it and come up with a useful parser. > > I've only had to make a few minor modifications to it in the past, > > since the language syntax hasn't evolved that far. > > That's good to know. Still, the parser.suite() approach seems a lot > harder. 
If you're starting from scratch, I would definitely recommend trying the compiler module first. > > I'm working on a major overhaul of HappyDoc anyway, so now might be > > the time to rewrite the parsing stuff to use the compiler module. > > If you're interested in collaborating, let me know. > > I am, definitely. What I'd like to do is to take a module, read in > the text, run it through the module parser (using compiler.py and > tokenize.py) and produce a high-level AST full of nodes that are > interesting from an auto-documentation standpoint. For example, given > this module (x.py):: [...] > compiler.parse() provides most of what's needed for this AST. I think > that "tokenize" can be used to get the rest, and all that's left is to > hunker down and figure out how. We can determine the line number from > the compiler.parse() AST, and a get_rhs(lineno) method would provide > the rest. Does compiler include comments? I had to write a separate parser to pull comments out. > The Docutils Python reader component will transform this AST into a > Python-specific doctree, and then a `stylist transform`_ would further > transform it into a generic doctree. Namespaces will have to be > compiled for each of the scopes, but I'm not certain at what stage of > processing. Why perform all of those transformations? Why not go from the AST to a generic doctree? Or, even from the AST to the final output? > It's very important to keep all docstring processing out of this, so > that it's a completely generic and not tool-specific. Definitely. 
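
On the question of comments: the stdlib "tokenize" module reports
comments as tokens of their own, so they can be collected without a
separate parser. A minimal sketch, in which the helper name
extract_comments and the sample text are assumptions made for the
example:

```python
import io
import tokenize


def extract_comments(source):
    """Return (line number, comment text) pairs for every comment."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tok.start[0], tok.string)
            for tok in tokens
            if tok.type == tokenize.COMMENT]


sample = "# comment\na = 1  # inline note\n"
for lineno, text in extract_comments(sample):
    print(lineno, text)
```

The line numbers make it possible to match each comment against
whatever structure a higher-level parser reports for the same lines.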
Doug From mwh@python.net Fri Dec 6 13:41:11 2002 From: mwh@python.net (Michael Hudson) Date: 06 Dec 2002 13:41:11 +0000 Subject: [Doc-SIG] looking for prior art In-Reply-To: Doug Hellmann's message of "Fri, 6 Dec 2002 08:27:49 -0500" References: <200212061327.IAA10440@branagan.hellfly.net> Message-ID: <2my973b6yw.fsf@starship.python.net> Doug Hellmann writes: > If you're starting from scratch, I would definitely recommend trying the > compiler module first. Amen. [...] > Does compiler include comments? No. tokenize.py does, though. I don't know how hard it would be to turn the output of tokenize.py into something like the output of compiler/transformer.py, but with comments. SPARK may be your friend... Cheers, M. -- Two things I learned for sure during a particularly intense acid trip in my own lost youth: (1) everything is a trivial special case of something else; and, (2) death is a bunch of blue spheres. -- Tim Peters, 1 May 1998 From goodger@python.org Sat Dec 7 02:47:58 2002 From: goodger@python.org (David Goodger) Date: Fri, 06 Dec 2002 21:47:58 -0500 Subject: [Doc-SIG] looking for prior art In-Reply-To: <200212061327.IAA10440@branagan.hellfly.net> Message-ID: Doug Hellmann wrote: > Does compiler include comments? I had to write a separate parser to > pull comments out. As Michael said, no. That's another reason for using compiler and tokenize in parallel. >> The Docutils Python reader component will transform this AST into a >> Python-specific doctree, and then a `stylist transform`_ would >> further transform it into a generic doctree. Namespaces will have >> to be compiled for each of the scopes, but I'm not certain at what >> stage of processing. > > Why perform all of those transformations? Why not go from the AST > to a generic doctree? Or, even from the AST to the final output? I want the docutils.readers.python.moduleparser.parse_module() function to produce a standard documentation-oriented AST that can be used by any tool. 
We can develop it together without having to compromise on the rest of our design (i.e., HappyDoc doesn't have to be made to work like Docutils, and vice-versa). It would be a higher-level version of what compiler.py provides. The Python reader component transforms this generic AST into a Python-specific doctree (it knows about modules, classes, functions, etc.), but this is specific to Docutils and cannot be used by HappyDoc or others. The stylist transform does the final layout, converting Python-specific structures ("class" sections, etc.) into a generic doctree using primitives (tables, sections, lists, etc.). This generic doctree does *not* know about Python structures any more. The advantage is that this doctree can be handed off to any of the output writers to create any output format we like. The latter two transforms are separate because I want to be able to have multiple independent layout styles (multiple runtime-selectable "stylist transforms"). Each of the existing tools (HappyDoc, pydoc, epydoc, Crystal, etc.) has its own fixed format. I personally don't like the tables-based format produced by these tools, and I'd like to be able to customize the format easily. That's the goal of stylist transforms, which are independent from the Reader component itself. One stylist transform could produce HappyDoc-like output, another could produce output similar to module docs in the Python library reference manual, and so on. It's for exactly this reason: >> It's very important to keep all docstring processing out of this, >> so that it's a completely generic and not tool-specific. ... but it goes past docstring processing. It's also important to keep style decisions and tool-specific data transforms out of this module parser. 
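
The idea of several runtime-selectable stylist transforms might be
pictured like this. Everything here is hypothetical and invented for
illustration (the ClassDoc node, both stylist functions, and the tuple
representation of the generic doctree); none of it is Docutils API.

```python
from dataclasses import dataclass, field


@dataclass
class ClassDoc:
    """Hypothetical Python-specific node (not a real Docutils class)."""
    name: str
    docstring: str
    methods: list = field(default_factory=list)


def sectioned_stylist(node):
    """Lay a class out as a section holding a paragraph and a list."""
    return ("section", node.name,
            [("paragraph", node.docstring),
             ("bullet_list", list(node.methods))])


def table_stylist(node):
    """Lay the same class out as rows of a two-column table."""
    return ("table",
            [("Class", node.name),
             ("Description", node.docstring),
             ("Methods", ", ".join(node.methods))])


# The layout style is chosen by name at run time.
STYLISTS = {"sections": sectioned_stylist, "table": table_stylist}


def apply_stylist(node, style):
    """Turn a Python-specific node into generic primitives."""
    return STYLISTS[style](node)


doc = ClassDoc("C", "C's docstring", ["__init__"])
print(apply_stylist(doc, "sections"))
print(apply_stylist(doc, "table"))
```

Both results contain only generic primitives (section, paragraph,
list, table), so an output writer needs no knowledge of Python
classes, while the choice of layout stays outside the parser.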
-- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From doug@hellfly.net Sat Dec 7 13:46:28 2002 From: doug@hellfly.net (Doug Hellmann) Date: Sat, 7 Dec 2002 08:46:28 -0500 Subject: [Doc-SIG] looking for prior art In-Reply-To: References: Message-ID: <200212071346.IAA12102@branagan.hellfly.net> On Friday 06 December 2002 9:47 pm, David Goodger wrote: > Doug Hellmann wrote: > > > > Why perform all of those transformations? Why not go from the AST > > to a generic doctree? Or, even from the AST to the final output? > > I want the docutils.readers.python.moduleparser.parse_module() > function to produce a standard documentation-oriented AST that can be > used by any tool. We can develop it together without having to > compromise on the rest of our design (i.e., HappyDoc doesn't have to > be made to work like Docutils, and vice-versa). It would be a > higher-level version of what compiler.py provides. That part makes sense. > The Python reader component transforms this generic AST into a > Python-specific doctree (it knows about modules, classes, functions, > etc.), but this is specific to Docutils and cannot be used by HappyDoc > or others. The stylist transform does the final layout, converting > Python-specific structures ("class" sections, etc.) into a generic > doctree using primitives (tables, sections, lists, etc.). This > generic doctree does *not* know about Python structures any more. The > advantage is that this doctree can be handed off to any of the output > writers to create any output format we like. Ah. I handled that differently in HappyDoc. Instead of building another data structure, I set up the API for the formatters to have methods that do things like start/end a (sub)section, start/end a list, etc. 
The primary implementation is an HTML formatter that produces tables,
but there are other formatters. The docset is then responsible for
calling the right formatter method when it wants it.

Having the docset and formatter separate makes things more complicated
than I expected, so in HappyDoc 3.0 there will just be one plugin
system. There is a new scanner which walks the input directory building
a tree of scanned files, doing optional special processing for each
based on mimetype. For text/x-python files, the file is parsed and
information about classes, etc. is extracted. The output formatter
walks the resulting tree, also doing mimetype-based processing for each
file. HTML and image files will be copied from input to output. Text
files are converted using the docstring converter, and the parse
results from Python modules are used to generate new HTML output files.

I've got the scanner done, and am working on the output formatter code
now.

Doug

From fantasai@escape.com Sat Dec 14 04:48:21 2002
From: fantasai@escape.com (fantasai)
Date: Fri, 13 Dec 2002 23:48:21 -0500
Subject: [Doc-SIG] reST block quotes
Message-ID: <3DFAB815.70501@escape.com>

hmm.. it's been a while.

ok, so there's a problem with the blockquote syntax
that was one of the first things I noticed: The syntax
relies exclusively on indentation.

This means one can't use indentation for other things--like
indenting sections to make the document structure easier to grasp.

The other problem is that attributions don't seem to
be recognized. It would be nice to put uris in HTML's
'cite' attribute and mark up just regular attributions
as such.

  | An Attribution identifies the source to whom a
  | BlockQuote or Epigraph is ascribed.

  -- http://www.docbook.org/tdg/en/html/attribution.html

So, I'd like to have reST take something like that
(URIs might need something to distinguish them from,
say, people) and translate it into appropriate markup.

An option could require the pipe quoting or another
symbol (e.g.
'>') and just treat indented blocks as regular text. (I'd gotten the symbol-quoting part to work last year, but ran into some trouble with the attribution (I was using a different syntax) and put everything aside for later.) So, what do you think? ~fantasai From goodger@python.org Sat Dec 14 15:22:41 2002 From: goodger@python.org (David Goodger) Date: Sat, 14 Dec 2002 10:22:41 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: <3DFAB815.70501@escape.com> Message-ID: fantasai wrote: > ok, so there's a problem with the blockquote syntax > that was one of the first things I noticed: The syntax > relies exclusively on indentation. That's not a problem, that's a *feature*. :) > This means one can't use indentation for other things--like > indenting sections to make the document structure easier to grasp. Not true. Indentation is used in many local contexts, such as list items. In the documentation philosophy embodied in reStructuredText, section structure through indentation is a *misfeature*. See "Section Structure via Indentation" in , and "Questions & Answers", item 3, in . > The other problem is that attributions don't seem to > be recognized. It would be nice to put uris in HTML's > 'cite' attribute and mark up just regular attributions > as such. > > | An Attribution identifies the source to whom a > | BlockQuote or Epigraph is ascribed. > > -- http://www.docbook.org/tdg/en/html/attribution.html I don't see this as a problem either. It's new functionality. What would the result look like? > So, I'd like to have reST take something like that > (URIs might need something to distinguish them from, > say, people) and translate it into appropriate markup. What would that be? > An option could require the pipe quoting or another > symbol (e.g. '>') and just treat indented blocks as > regular text. 
> > (I'd gotten the symbol-quoting part to work last year, > but ran into some trouble with the attribution (I was > using a different syntax) and put everything aside for > later.) > > So, what do you think? I think this may be an appropriate use of a directive. Directives offer an easy way to experiment with new features without requiring new general syntax. If a feature is useful enough and has appropriate and unambiguous syntax, it could become a general feature. Please flesh out the spec more. -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From fantasai@escape.com Sat Dec 14 20:11:55 2002 From: fantasai@escape.com (fantasai) Date: Sat, 14 Dec 2002 15:11:55 -0500 Subject: [Doc-SIG] reST block quotes References: Message-ID: <3DFB908B.1030208@escape.com> David Goodger wrote: > fantasai wrote: >> >>The [blockquote] syntax relies exclusively on indentation. >> >>This means one can't use indentation for other things--like >>indenting sections to make the document structure easier to grasp. > > Not true. Indentation is used in many local contexts, such as list > items. In the documentation philosophy embodied in reStructuredText, > section structure through indentation is a *misfeature*. See "Section > Structure via Indentation" in > , and "Questions & > Answers", item 3, in . I *know* that, and that's why I'm particularly glad reST doesn't use indentation for structure like STX does. But because of the blockquote syntax, it doesn't let me use indentation as text formatting. Document structure is easier to see if the sections are indented. I would like to be able to indent sections of reST without triggering *any* markup construct. >>The other problem is that attributions don't seem to >>be recognized. It would be nice to put uris in HTML's >>'cite' attribute and mark up just regular attributions >>as such. 
>> >> | An Attribution identifies the source to whom a >> | BlockQuote or Epigraph is ascribed. >> >> -- http://www.docbook.org/tdg/en/html/attribution.html > > > I don't see this as a problem either. It's new functionality. I apologize for my inappropriate use of English vocabulary. > What would the result look like?
An Attribution identifies the source to whom a BlockQuote or Epigraph is ascribed.
>>An option could require the pipe quoting or another
>>symbol (e.g. '>') and just treat indented blocks as
>>regular text.
>>
> I think this may be an appropriate use of a directive.

Think what may be an appropriate use of a directive?
Attribution recognition or quoted blockquote recognition
or not recognizing purely indented blocks as blockquotes?

> Please flesh out the spec more.

An indented block in which each line begins with the same
sequence of spaces+(>, |, #) is recognized as a blockquote.
It may be optionally followed by blank lines and an attribution.
The attribution begins with two dashes and a space (-- ) which
must be indented at least to the preceding blockquote's quote
character. The attribution may be multiple lines, but must be
indented at least three spaces from the first dash.

  # quoted text
  # quoted text

  -- attribution attribution
     attribution attribution

If a line in the attribution consists entirely of opening
and closing angle brackets with a sequence of URI characters
in between, the line is taken out of the attribution text and
the URI sequence is put in the blockquote's 'cite' attribute.
All other attribution content is parsed as inline content and
placed in the attribution element, which is a child of the
blockquote.

In HTML, the attribution corresponds to
attribution text
So, this:

  | An Attribution identifies the source to whom a
  | BlockQuote or Epigraph is ascribed.

  -- DocBook: The Definitive Guide

would result in this:
An Attribution identifies the source to whom a BlockQuote or Epigraph is ascribed.
DocBook: The Definitive Guide
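
One possible reading of the spec above, as a rough sketch rather than
a reST implementation. The function name parse_quoted_block and the
returned dictionary are assumptions made for the example, and the
indentation requirements on the attribution are deliberately not
checked.

```python
import re

# A quoted line: optional spaces, one quote character, then the text.
QUOTE_LINE = re.compile(r"^(\s*[>|#]) ?(.*)$")
# An attribution line made up of nothing but <URI>.
URI_LINE = re.compile(r"^<(\S+)>$")


def parse_quoted_block(text):
    """Split text into quote body, attribution and cite URI."""
    lines = text.splitlines()
    quote, prefix, i = [], None, 0
    # Every quoted line must start with the same spaces+symbol prefix.
    while i < len(lines):
        match = QUOTE_LINE.match(lines[i])
        if not match or (prefix is not None and match.group(1) != prefix):
            break
        prefix = match.group(1)
        quote.append(match.group(2))
        i += 1
    if not quote:
        return None
    # Blank lines may separate the quote from its attribution.
    while i < len(lines) and not lines[i].strip():
        i += 1
    attribution, cite = [], None
    if i < len(lines) and lines[i].lstrip().startswith("-- "):
        parts = [lines[i].lstrip()[3:]]
        parts += [line.strip() for line in lines[i + 1:]]
        for part in parts:
            if not part:
                break                      # a blank line ends the attribution
            uri = URI_LINE.match(part)
            if uri:
                cite = uri.group(1)        # goes to the 'cite' attribute
            else:
                attribution.append(part)
    return {"quote": " ".join(quote),
            "attribution": " ".join(attribution) or None,
            "cite": cite}


example = """\
| An Attribution identifies the source to whom a
| BlockQuote or Epigraph is ascribed.

-- DocBook: The Definitive Guide
   <http://www.docbook.org/tdg/en/html/attribution.html>
"""
print(parse_quoted_block(example))
```

The angle-bracketed line ends up as the cite value and the remaining
attribution lines are joined into the attribution text, which is the
split the spec describes.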
From goodger@python.org Sun Dec 15 00:38:46 2002 From: goodger@python.org (David Goodger) Date: Sat, 14 Dec 2002 19:38:46 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: <3DFB908B.1030208@escape.com> Message-ID: [fantasai] >>>The [blockquote] syntax relies exclusively on indentation. >>> >>>This means one can't use indentation for other things--like >>>indenting sections to make the document structure easier to grasp. [David Goodger] >> Not true. Indentation is used in many local contexts, such as list >> items. In the documentation philosophy embodied in reStructuredText, >> section structure through indentation is a *misfeature*. See "Section >> Structure via Indentation" in >> , and "Questions & >> Answers", item 3, in . [fantasai] > I *know* that, and that's why I'm particularly glad reST > doesn't use indentation for structure like STX does. But > because of the blockquote syntax, it doesn't let me use > indentation as text formatting. Document structure is > easier to see if the sections are indented. I would like > to be able to indent sections of reST without triggering > *any* markup construct. So you'd like to be able to turn off block quote recognition? It could be done, probably without too much pain, but care would have to be taken with other uses of indentation (list items, definition lists, etc.). >>> The other problem is that attributions don't seem to >>> be recognized. It would be nice to put uris in HTML's >>> 'cite' attribute and mark up just regular attributions >>> as such. >>> >>> | An Attribution identifies the source to whom a >>> | BlockQuote or Epigraph is ascribed. >>> >>> -- http://www.docbook.org/tdg/en/html/attribution.html >> >> I don't see this as a problem either. It's new functionality. > > I apologize for my inappropriate use of English vocabulary. :) >> What would the result look like? > >
> An Attribution identifies the source to whom a BlockQuote > or Epigraph is ascribed. >
I tried looking at this code in a browser, MSIE 5.1.4/MacOS. Not
state of the art, but the best I have at hand. The "cite" attribute
didn't actually do anything. What is it supposed to do? (How is a
user agent supposed to render a "cite" attribute?)

>>> An option could require the pipe quoting or another
>>> symbol (e.g. '>') and just treat indented blocks as
>>> regular text.
>>>
>> I think this may be an appropriate use of a directive.
>
> Think what may be an appropriate use of a directive?
> Attribution recognition or quoted blockquote recognition
> or not recognizing purely indented blocks as blockquotes?

Question: are the first two related to the last? If so, how?

Any of those could be, but I was specifically referring to attribution
recognition and possibly quoted blockquote recognition.

Something similar to the "quoted blockquote recognition" idea has
already been documented, although as a literal block alternative; see
and search for "per-line quoting".

A "quoted-blockquote" directive could easily be constructed:

    Some ordinary text.

    .. quoted-blockquote::

       | Block quote text
       | goes here.

(Although I'm not sure if this is what you mean, or has any value.)

The "attribution recognition" idea could be done with a "cite" (or
whatever) directive, something like this:

    Some ordinary text.

        A block quote.

        .. cite:: This is some citation text, ending with a URI.

A "cite" directive might only be valid inside a block quote, and would
add a "cite" attribute to the block quote element itself. If it was
useful or popular enough, it could grow special syntax. I don't know
if "--" at the beginning of the paragraph is enough though; I already
use that style and would be surprised if hyperlinks after "--"
disappeared from the rendered form.

As for turning off indentation->blockquotes, that could be a
pragma-type directive, but would require some changes to the parser to
support it. I'm not convinced of its usefulness. Can you provide some
use cases?
>> Please flesh out the spec more. Thank you. > So, this: > > | An Attribution identifies the source to whom a > | BlockQuote or Epigraph is ascribed. > > -- DocBook: The Definitive Guide > > > would result in this: > >
> An Attribution identifies the source to whom a BlockQuote > or Epigraph is ascribed. >
DocBook: The Definitive Guide
>
The markup seems problematic to me. There are two separate constructs
there, which aren't obviously related. What if there's a quoted block
without an attribution? What about an attribution without a quoted
block? If they were joined into one construct, it would be easier to
digest:

    | An Attribution identifies the source to whom a
    | BlockQuote or Epigraph is ascribed.
    |
    | -- DocBook: The Definitive Guide
    |

This syntax would make a block quote difficult to maintain, almost as
bad as a grid table. I don't see its value. Use cases?

--
David Goodger
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/

From fantasai@escape.com Sun Dec 15 04:47:23 2002
From: fantasai@escape.com (fantasai)
Date: Sat, 14 Dec 2002 23:47:23 -0500
Subject: [Doc-SIG] reST block quotes
References:
Message-ID: <3DFC095B.5040009@escape.com>

David Goodger wrote:
> [fantasai]
>
> So you'd like to be able to turn off block quote recognition? It
> could be done, probably without too much pain, but care would have to
> be taken with other uses of indentation (list items, definition
> lists, etc.).

Certainly.

>>>What would the result look like?
>>
>>
>> An Attribution identifies the source to whom a BlockQuote >> or Epigraph is ascribed. >>
> > I tried looking at this code in a browser, MSIE 5.1.4/MacOS. Not > state of the art, but the best I have at hand. The "cite" attribute > didn't actually do anything. What is it supposed to do? (How is a > user agent supposed to render a "cite" attribute?) What the UA is supposed to do isn't really specified. Mozilla provides access to it through the Properties item on the context menu. The URI can, however, also be made available with stylesheets and/or scripting. >>>>An option could require the pipe quoting or another >>>>symbol (e.g. '>') and just treat indented blocks as >>>>regular text. >>> >>>I think this may be an appropriate use of a directive. >> >>Think what may be an appropriate use of a directive? >>Attribution recognition or quoted blockquote recognition >>or not recognizing purely indented blocks as blockquotes? > > Question: are the first two related to the last? If so, how? Disabling blockquotes altogether isn't a good idea, so if one were to disable indent block -> blockquote, one should provide an alternative syntax. Recognizing an attribution syntax is independent of either of those. > Something similar to the "quoted blockquote recognition" idea > has already been documented, although as a literal block > alternative; see and > search for "per-line quoting". It wouldn't interfere, as that requires a literal block start sequence. That literal block example, btw, really should be handled as two blockquotes, one inside the other, since that's what it *is*. > A "quoted-blockquote" directive could easily be constructed: > > Some ordinary text. > > .. quoted-blockquote:: > > | Block quote text > | goes here. > > (Although I'm not sure if this is what you mean, or has any value.) Adding a directive like that defeats the purpose. I might as well just write .. blockquote:: Block quote text goes here. There's no need for a symbol because it's already distinguished from a merely indented block. 
The quoting syntax I'm using, though, is very common and so it's non-intrusive as well as intuitive and unambiguous. (It also allows cut & paste from emails without modification.) > A "cite" directive might only be valid inside a block quote, and would > add a "cite" attribute to the block quote element itself. If it was > useful or popular enough, it could grow special syntax. I don't know > if "--" at the beginning of the paragraph is enough though; I already > use that style and would be surprised if hyperlinks after "--" > disappeared from the rendered form. Why would hyperlinks disappear? Inline markup is recognized after the "-- ". > As for turning off indentation->blockquotes, that could be a > pragma-type directive, but would require some changes to the parser to > support it. I'm not convinced of its usefulness. Can you provide > some use cases? Yeah. I just hand-converted an HTML file to plaintext today to post to a mailing list. I indented every section underneath its header. e.g. Heading paragraph Subheading paragraph paragraph Subheading example paragraph I would like to be able to do that in an reST doc. >> | An Attribution identifies the source to whom a >> | BlockQuote or Epigraph is ascribed. >> >> -- DocBook: The Definitive Guide >> >> >>would result in this: >> >>
>> <blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
>> An Attribution identifies the source to whom a BlockQuote
>> or Epigraph is ascribed.
>> DocBook: The Definitive Guide
>> </blockquote>
> > The markup seems problematic to me. There are two separate constructs > there, which aren't obviously related. What if there's a quoted block > without an attribution? If there's no attribution, it doesn't get an attribution. Parsing continues as usual. > What about an attribution without a quoted block? No special treatment. It will be handled as it is now. > If they were joined into one construct, it would be easier to > digest: > > | An Attribution identifies the source to whom a > | BlockQuote or Epigraph is ascribed. > | > | -- DocBook: The Definitive Guide > | That could be construed as quoting the citation. That is, the quoted text has an attribution, and you're quoting that attribution. I think this would actually be more difficult to parse, because one would have to know whether this "attribution" is the last block of the blockquote to determine whether or not it gets parsed as an attribution. ~fantasai From goodger@python.org Sun Dec 15 15:16:57 2002 From: goodger@python.org (David Goodger) Date: Sun, 15 Dec 2002 10:16:57 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: <3DFC095B.5040009@escape.com> Message-ID: [fantasai] >>> Think what may be an appropriate use of a directive? >>> Attribution recognition or quoted blockquote recognition >>> or not recognizing purely indented blocks as blockquotes? [David Goodger] >> Question: are the first two related to the last? If so, how? [fantasai] > Disabling blockquotes altogether isn't a good idea, so > if one were to disable indent block -> blockquote, one > should provide an alternative syntax. So, if the "diabling blockquotes" proposal is *not* accepted, the alternative syntax wouldn't be required? If that's not the case, please justify them independently. I see potential value for quoted blockquote recognition in an email context (more below), but not for disabling ordinary blockquotes. 
>> Something similar to the "quoted blockquote recognition" idea >> has already been documented, although as a literal block >> alternative; see and >> search for "per-line quoting". > > It wouldn't interfere, as that requires a literal block > start sequence. That literal block example, btw, really > should be handled as two blockquotes, one inside the > other, since that's what it *is*. Take another look: """Generalize the "literal block" construct to allow blocks with a per-line quoting to avoid indentation?""" Last three words are the whole point. > The quoting syntax I'm using, though, is very common and so > it's non-intrusive as well as intuitive and unambiguous. > (It also allows cut & paste from emails without modification.) Since early on there's been talk about an "Email Reader", which would handle quoted segments of messages, signatures, and other email-specific constructs. Perhaps that's where effort should be directed? I wouldn't want to accept a proposal that was incompatible with an Email Reader (even a future potential Email Reader). >> A "cite" directive might only be valid inside a block quote, and >> would add a "cite" attribute to the block quote element itself. If >> it was useful or popular enough, it could grow special syntax. I >> don't know if "--" at the beginning of the paragraph is enough >> though; I already use that style and would be surprised if >> hyperlinks after "--" disappeared from the rendered form. > > Why would hyperlinks disappear? Inline markup is recognized > after the "-- ". If the hyperlink is subsumed into the
<blockquote>'s "cite" attribute, in all but the most cutting-edge browsers (if then) it's as good as invisible. If something so drastic is happening to data, I think the markup should be much more explicit and distinctive. >> As for turning off indentation->blockquotes, that could be a >> pragma-type directive, but would require some changes to the parser >> to support it. I'm not convinced of its usefulness. Can you >> provide some use cases? > > Yeah. I just hand-converted an HTML file to plaintext > today to post to a mailing list. I indented every section > underneath its header. e.g. > > Heading > > paragraph > > Subheading > > paragraph > > paragraph > > Subheading > > example > > paragraph > > I would like to be able to do that in an reST doc. You can do that with current reST; you would end up with nested block quotes. I don't get it. Can you show me an example of how you'd like to apply this concept to reStructuredText sources? >>> | An Attribution identifies the source to whom a >>> | BlockQuote or Epigraph is ascribed. >>> >>> -- DocBook: The Definitive Guide >>> >>> >>> would result in this: >>> >>>
>>> <blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
>>> An Attribution identifies the source to whom a BlockQuote
>>> or Epigraph is ascribed.
>>> DocBook: The Definitive Guide
>>> </blockquote>
>> >> The markup seems problematic to me. There are two separate >> constructs there, which aren't obviously related. ... >> What about an attribution without a quoted block? > > No special treatment. It will be handled as it is now. How is it handled now? >> If they were joined into one construct, it would be easier to >> digest: >> >> | An Attribution identifies the source to whom a >> | BlockQuote or Epigraph is ascribed. >> | >> | -- DocBook: The Definitive Guide >> | > > That could be construed as quoting the citation. That is, > the quoted text has an attribution, and you're quoting > that attribution. The block quote part could require quote char + whitespace, and the attribution could omit the whitespace. That would disambiguate it: | An Attribution identifies the source to whom a | BlockQuote or Epigraph is ascribed. | |-- DocBook: The Definitive Guide | But I still don't see the value of the attribution proposal, as opposed to simply rendering the attribution as an ordinary paragraph. I'd like to see a concrete example where the results would be *useful*. HTML's
<blockquote cite="..."> doesn't seem to have universal support. How should attributions be handled for current browsers (back to Netscape 4)? Even more fundamentally, how should attributions be marked up in the Docutils internal doctree? I.e., what changes to the "block_quote" element in spec/docutils.dtd? > I think this would actually be more > difficult to parse, because one would have to know > whether this "attribution" is the last block of the > blockquote to determine whether or not it gets parsed > as an attribution. That's easy to know. What's not so easy (programmatically or to the human eye) is to link two successive elements that don't have a logical containing context. -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From cben@techunix.technion.ac.il Sun Dec 15 17:13:39 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Sun, 15 Dec 2002 19:13:39 +0200 (IST) Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: On 2002-12-15, David Goodger wrote: > I wouldn't want to accept a proposal that was incompatible > with an Email Reader (even a future potential Email Reader). >[snip] > > after the "-- ". Then note that "-- " is the standard signature separator. Since then it's alone on the line, this is not an issue, just a point to document. Also note that I use " -- " for a long dash -- probably a LaTeX-induced habit; I saw some other people writing so. Stupid word wrapping can well put "-- " at the beginning of a line in running text. Again not an issue, just document that "-- " must come after an empty line (?). About email reading, also note that ">>> " becomes ambiguous between doctest blocks and some email clients that compact nested "> " quoting by omitting the spaces. And while we are there, how about "initials> " quoting? Also the "On Someday, Random Writer wrote:" is probably an attribution too.
Now how do you handle a quote that's broken in the middle and resumed? Add to that nesting... >[snip] > You can do that with current reST; you would end up with nested block > quotes.I don't get it. Can you show me an example of how you'd like > to apply this concept to reStructuredText sources? > I think the proposal is similar to the complaints of C coders coming to Python -- they want indentation to have no meaning so they can make it reflect the program's structure but *in the way they like it*. I'm not saying that the request should be rejected based on this analogy and Python's rejection of such requests (maybe it would be nicer in reST). If you ask my personal opinion, I'm quite happy with reST's current style (perhaps modulo allowing indented bulleted lists instead of empty lines but I'm not settled on it). -- Beni Cherniavsky From goodger@python.org Sun Dec 15 20:46:22 2002 From: goodger@python.org (David Goodger) Date: Sun, 15 Dec 2002 15:46:22 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: Beni Cherniavsky wrote: ... a bunch of email-related details omitted Yes, there are many thorny issues wrt Email context. That's why I don't want to make any premature decisions, and why I invite anyone who's interested to look into the issues. > If you ask my personal opinion, I'm quite happy with reST's current > style (perhaps modulo allowing indented bulleted lists instead of > empty lines but I'm not settled on it). Not following you. Can you elaborate please? 
-- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From lists@morpheus.demon.co.uk Sun Dec 15 17:30:00 2002 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Sun, 15 Dec 2002 17:30:00 +0000 Subject: [Doc-SIG] reST block quotes References: <3DFC095B.5040009@escape.com> Message-ID: David Goodger writes: >> Subheading >> >> example >> >> paragraph >> >> I would like to be able to do that in an reST doc. > > You can do that with current reST; you would end up with nested > block quotes. I don't get it. Can you show me an example of how > you'd like to apply this concept to reStructuredText sources? Sorry to butt in. I think the OP's point is that he wants to be able to indent the source text, for readability of the source, *without* having any effect on markup. So, for example:: Section 1 --------- This is section 1. I have indented it simply so that it shows up more clearly in the source text. I am *not* looking for a blockquote construct in the processed output. I'm not entirely sure I agree with the suggestion, but I do sympathise with it - I often use indentation in plain text postings, and it's vey rarely for something I'd call a "block quote". But I haven't analysed my "natural tendencies" against reST conventions, to be sure of this. > HTML's
<blockquote cite="..."> doesn't seem to have universal > support. How should attributions be handled for current browsers > (back to Netscape 4)? In IE6 on Windows 2000, the cite attribute seems to completely disappear. I can't get it displayed, no matter what I do. Given this fact, I'd avoid the cite construct like the plague - it involves a serious risk of information just "disappearing". Paul. -- This signature intentionally left blank From cben@techunix.technion.ac.il Sun Dec 15 23:24:34 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Mon, 16 Dec 2002 01:24:34 +0200 (IST) Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: On 2002-12-15, David Goodger wrote: > Beni Cherniavsky wrote: > ... a bunch of email-related details omitted > > Yes, there are many thorny issues wrt Email context. That's why I > don't want to make any premature decisions, and why I invite anyone > who's interested to look into the issues. > All right, I'll start thinking about the emails I write "how will this interact with rST?" ;-) Of course this is biased since I already use many parts of rST in emails. > > If you ask my personal opinion, I'm quite happy with reST's current > > style (perhaps modulo allowing indented bulleted lists instead of > > empty lines but I'm not settled on it). > > Not following you. Can you elaborate please? > I meant that: - I don't myself feel the need to free up indentation (but that doesn't mean that others don't). - I'm not entirely happy with making empty lines around lists. It takes too much real estate, especially if I make empty lines between items. So I don't, but then the list looks too isolated from the paragraph [1]_. - I understand that omitting the empty line, as is, would create an ambiguity. It's not even clear to my eyes. - Demanding an extra space before the bullet would remove the ambiguity, I think. Currently this means a list in a blockquote or a list in the definition of a definition list, depending on presence of empty line before.
Both are infrequently needed, so the empty comment hack looks acceptable to me (but I'm biased). - I'm sure I saw this discussed somewhere already but I can't find it in ``alternatives.txt`` [2]_... .. [1] This reminds me of a different concern I had. Some markup models (LaTeX and my brain ;-) think of paragraphs as logical beasts. A paragraph could contain a list - (like this) or other things (especially blockquotes) and then continue. There are three more combinations: - The thing is part of the previous paragraph, a new paragraph starts after it. - The thing can be a logical paragraph on its own. - The thing starts a new paragraph. Seems rare but consider a text where each quote is followed by some comments (as in emails). Most markup models (HTML, current rST) treat a paragraph as an atomic piece of text. Any other construct terminates it. But look at any book - it's not so! LaTeX renders a new paragraph indented and a resumed paragraph without indentation. Math formula "displays" are another example of things that could be part of a paragraph... So I want a way to represent the distinctions. - As you see, the space-before-bullet format allows expressing it for lists. However blockquotes are not discernible from definition lists then (if the paragraph above would be one-line). - I'm not sure how to solve it. Scanning the spec, it seems that only blockquotes create problems. Maybe some explicit blockquote-marking syntax is needed after all. This time an empty comment won't cut it. But I don't see a good one. Then maybe a definition list should be explicit. How about terminating each definition line with " --" (removed in the output)? - Just tried putting a list in a substitution:: Text |sub| text. .. |sub| replace:: - Foo - Bar Didn't work. (See, here I wanted the text-literal-text to form one logical paragraph).
I'm not sure it should work but it indicates the big issue -- the model that a paragraph contains no other elements must be abandoned to support this concept. .. [2] Unrelated question: when should I use literal text (``), interpreted text (`) and no quoting? - What's the red line between an identifier and a piece of Python code? If I refer to variable `foo` that's interpreted; if I refer to ``a() + b()``, that should probably be literal; what about `m.bar` where m is not a class or variable in current scope but conventionally stands for any "Matcher" object (there are many matcher classes) in some library I'm writing? - Should I put all filenames in literal quotes? To a human it's already discernible when there is an extension (foo.py) so I'm not sure. - Generally the docs (including the PEPs) need some more discussion on where actually to use interpreted text... -- Beni Cherniavsky From fantasai@escape.com Mon Dec 16 04:16:41 2002 From: fantasai@escape.com (fantasai) Date: Sun, 15 Dec 2002 23:16:41 -0500 Subject: [Doc-SIG] reST block quotes References: Message-ID: <3DFD53A9.4010503@escape.com> Beni Cherniavsky wrote: > > Most markup models (HTML, current rST) treat a paragraph as an atomic > piece of text. Any other construct terminates it. But look at any > book - it's not so! Just a note: DocBook allows for this. <para> can contain block-level content. http://www.docbook.org/tdg/en/html/para.html From goodger@python.org Mon Dec 16 04:35:40 2002 From: goodger@python.org (David Goodger) Date: Sun, 15 Dec 2002 23:35:40 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: [Beni Cherniavsky] > All right, I'll start thinking about the emails I write "how will > this interact with rST?" ;-) Of course this is biased since I > already use many parts of rST in emails. It's the parts of email messages not part of reStructuredText that are interesting.
[Beni Cherniavsky] >>> If you ask my personal opinion, I'm quite happy with reST's current >>> style (perhaps modulo allowing indented bulleted lists instead of >>> empty lines but I'm not settled on it). [David Goodger] >> Not following you. Can you elaborate please? [Beni Cherniavsky] > I meant that: > - I don't myself feel the need to free up indentation (but that > doesn't mean that others don't). > - I'm not entirely happy with making empty lines around lists. It > takes to much real estate, especially if I make empty lines > between items. Something's gotta give. For zero ambiguity (within reStructuredText's framework), blank lines before & after lists are essential. > So I don't but then the list looks too isolated > from the paragraph [1]_. > - I understand that omitting the empty line, as is, would create an > ambiguity. It's not even clear to my eyes. > - Demanding an extra space before the bullet would remove the > ambiguity, I think. Too subtle, IMHO. > Currently this means a list in a blockquote > or a list in the definition of a definition list, depending on > presense of empty line before. Definition list only in the case of a single line before the indent. > Both are infrequently needed, so > the empty comment hack looks acceptable to me (but I'm biased). Frequently enough. > .. [1] This reminds me of a different concern I had. Some markup > models (LaTeX and my brain ;-) think of paragraphs as logical beasts. reStructuredText (and Docutils) treat paragraphs as physical. It would be impossible to reliably infer logical paragraph semantics from plaintext sources. The debate over physical model (a paragraph is a block in the document flow) vs. logical model (paragraphs can contain lists and block quootes and equations and others) has been around for a long time and I don't see any resolution. Personally, I prefer the physical model, not least because it results in a much simpler DTD. The logical model opens up a big can of worms. 
> A paragraph could contain a list > - (like this) > or other things (especially blockquotes) and then continue. > There are three more combinations: > - The thing is part of the previous paragraph, a new paragraph > starts after it. > > - The thing can be a logical paragraph on its own. > > - The thing starts a new paragraph. I'm not sure what "the thing" is or where you're going with this. If the text of your message was meant as an example of what you're proposing, I find it very hard to follow the structure. > Seems rare but consider a text where each quote is followed by some > comments (as in emails). Not following you. > So I want a way to represent the disctinctions. Not worth the trouble IMHO. > - Just tried putting a list in a substitution:: > Text |sub| text. > > .. |sub| replace:: > - Foo > - Bar > Didn't work. Substitutions have to be phrase-level. I can't remember if the parser checks or not; if not, it'll go on the to-do. > (See, here I wanted the text-literal-text to form one logical > paragraph). I'm not sure it should work but it indicates the > big issue -- the model that a paragraph contains no other > elements must be abandoned to support this concept. And, as I said, I don't think it's worth the effort even if it were feasible (which I doubt). The computer doesn't really care that

Beginning of paragraph

  • item one
  • item two

continuation of paragraph.

is a single logical paragraph (not to mention *this* paragraph!). The human reader picks it up right away though. Only if there's a first-line paragraph indent would it matter to the reader. > .. [2] Unrelated question: when should I use literal text (``), > interpreted text (`) and no quoting? Interpreted text hasn't really been implemented yet. Its main client will be the Python Source Reader, which is a work in progress. > - What's the red line between an identifier and a piece of > Python code? If I refer to variable `foo` that's interpreted; > if I refer to ``a() + b()``, that should probably be literal; > what about `m.bar` where m is not a class or variable in > current scope but conventionally stands for any "Matcher" > object (there are many matcher classes) in some library I'm > writing? `m.bar` would generate a warning or error. The identifier must be in the current namespace, with possible exceptions for stdlib modules. In the case you're describing, I'd use inline literals ``m.bar``. > - Should I put all filenames in literal quotes? Up to you and your document's context. > - Generally the docs (including the PEPs) need some more > discussion on where actually to use interpreted text... When there is a use for them, the docs will discuss them. Until then, it would just be confusing. It already *is* confusing, some would say. -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@python.org Mon Dec 16 04:38:36 2002 From: goodger@python.org (David Goodger) Date: Sun, 15 Dec 2002 23:38:36 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: <3DFD53A9.4010503@escape.com> Message-ID: Beni Cherniavsky wrote: >> Most markup models (HTML, current rST) treat a paragraph as an atomic >> piece of text. Any other construct terminates it. But look at any >> book - it's not so! 
fantasai wrote: > Just a note: DocBook allows for this. <para> can contain > block-level content. > http://www.docbook.org/tdg/en/html/para.html And the result is a mess, IMO. Put a list inside a paragraph, and then you'll have paragraphs inside paragraphs! I don't consider a paragraph to be a recursive element. That's just my personal opinion though. -- David Goodger goodger@python.org From fantasai@escape.com Mon Dec 16 04:39:00 2002 From: fantasai@escape.com (fantasai) Date: Sun, 15 Dec 2002 23:39:00 -0500 Subject: [Doc-SIG] reST block quotes References: Message-ID: <3DFD58E4.1040005@escape.com> Beni Cherniavsky wrote: > > Then note that "-- " is the standard signature separator. Since then it's > alone on the line, this is not an issue, just a point to document. > > Also note that I use " -- " for a long dash -- probably a LaTeX-induced > habit; I saw some other people writing so. Stupid word wrapping can well > put "-- " at the beginning of a line in running text. Again not an issue, > just document that "-- " must come after an empty line (?). Would using three dashes solve these problems? > About email reading, also note that ">>> " becomes ambiguous between > doctest blocks and some email clients that compact nested "> " quoting by > omitting the spaces. Yes, that is true. That means either quoted blocks would have to be implemented as an option, defaulting to 'off' for backwards-compatibility, or at least one space must be required between quote characters. Requiring at least one space before the quote character might not be a bad idea. It improves readability IMO. > And while we are there, how about "initials> " quoting? That can be dealt with later. It's not nearly as common, and it's even less important for processing documents (as opposed to emails and newsgroup posts), which is what most reST files are. > Also the "On Someday, Random Writer wrote:" is probably an > attribution too.
It is, but it's not practical to parse that since people use so many different formats. It would have to be treated as a paragraph, which really isn't that bad. > Now how do you handle a quote that's broken in the middle and resumed? As multiple blockquotes. How would you do it with the current syntax? Come to think of it, the current syntax can't really handle nested blockquotes well, can it? Not if there's a quote at the beginning of another quote. ~fantasai From goodger@python.org Mon Dec 16 04:51:33 2002 From: goodger@python.org (David Goodger) Date: Sun, 15 Dec 2002 23:51:33 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: <3DFD58E4.1040005@escape.com> Message-ID: [Beni Cherniavsky] >> About email reading, also note that ">>> " becomes ambiguos between >> doctest blocks and some email clients that compact nested "> " >> quoting by omiting the spaces. [fantasai wrote] > Yes, that is true. That means either quoted blocks would > have to be implemented as an option, defaulting to 'off' > for backwards-compatability, or at least one space must > be required between quote characters. The Email Reader would probably have to disable doctest blocks unless explicitly requested. > Requiring at least one space before the quote character > might not be a bad idea. It improves readability IMO. Obvious from your message ;). My emailer doesn't do it that way though and it would be onerous to require it. > Come to think of it, the current syntax can't really > handle nested blockquotes well, can it? Not if there's > a quote at the beginning of another quote. Try it; works fine. 
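A rough way to see why a quote at the very start of another quote can still nest: if the whole indented block is gathered first and the *smallest* indentation in it is stripped, any deeper-indented paragraph is left indented and becomes an inner quote. The Python below is only a sketch under that assumption. ``quote_depths`` is a made-up helper, not Docutils code, and it ignores lists, definition lists, and tabs.

```python
def quote_depths(lines, depth=0):
    """Return (depth, text) pairs for each paragraph; depth 0 is body text.

    Hypothetical helper, not part of Docutils. Assumes space-only
    indentation and treats every indented block as a block quote.
    """
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():              # blank line: paragraph separator
            i += 1
        elif line.startswith(" "):        # start of an indented block
            j = i
            # The block runs until the next non-blank, unindented line.
            while j < len(lines) and (not lines[j].strip()
                                      or lines[j].startswith(" ")):
                j += 1
            block = lines[i:j]
            # Strip the smallest indentation found anywhere in the block,
            # so a deeper-indented first paragraph still nests one level in.
            indent = min(len(l) - len(l.lstrip(" "))
                         for l in block if l.strip())
            out.extend(quote_depths([l[indent:] for l in block], depth + 1))
            i = j
        else:                             # ordinary paragraph at this level
            j = i
            while (j < len(lines) and lines[j].strip()
                   and not lines[j].startswith(" ")):
                j += 1
            out.append((depth, " ".join(lines[i:j])))
            i = j
    return out


sample = """\
Body text.

        Quote that opens another quote.

    The outer quote continues here.

Back to body text."""

for level, text in quote_depths(sample.splitlines()):
    print(level, text)
```

In the sample, the first indented paragraph ends up at depth 2 and the one after it at depth 1, which is the "quote at the beginning of another quote" case asked about above.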
-- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From fantasai@escape.com Mon Dec 16 05:51:57 2002 From: fantasai@escape.com (fantasai) Date: Mon, 16 Dec 2002 00:51:57 -0500 Subject: [Doc-SIG] reST block quotes References: Message-ID: <3DFD69FD.8070302@escape.com> David Goodger wrote: > > The Email Reader would probably have to disable doctest blocks unless > explicitly requested. > >>Requiring at least one space before the quote character >>might not be a bad idea. It improves readability IMO. > > Obvious from your message ;). My emailer doesn't do it that way > though and it would be onerous to require it. It does solve the doctest problem, though. The requirement could be switched off with doctest blocks. We'll have a "Python documentation mode" and an "everything else" mode. ;) >>Come to think of it, the current syntax can't really >>handle nested blockquotes well, can it? Not if there's >>a quote at the beginning of another quote. > > Try it; works fine. Impressive. :) I don't think the source would be very easy to follow with a complicated set of quotes, but that is mostly an electronic message problem. Most documents don't even have two levels of nested quotes. ~fantasai From fantasai@escape.com Mon Dec 16 06:41:47 2002 From: fantasai@escape.com (fantasai) Date: Mon, 16 Dec 2002 01:41:47 -0500 Subject: [Doc-SIG] compact HTML output from Docutils In-Reply-To: References: Message-ID: <3DFD75AB.3040404@escape.com> David Goodger wrote: > > - Check for and omit

<p> tags in "simple" lists: list items contain > either a single paragraph, a nested simple list, or a paragraph > followed by a nested simple list. It would be more flexible for the author if you based the omission of

<p> tags in single-paragraph lists on whether the list is spaced out or not. For example, this:

- apples
- oranges
- pears

would not have paragraph tags whereas this:

- This is really a paragraph, even though it's
  the only block of content in the list item.

- A paragraph is the basic structural unit in
  prose.

would. Another option is to trigger

<p> tags only for multi-line paragraphs. > - Regardless of the above, in definitions, table cells, field bodies, > option descriptions, and list items, mark the first child with > 'class="first"' if it is a paragraph. The stylesheet sets the top > margin to 0 for these paragraphs. Have you tried using p:first-child? It would be nice to avoid cruft like 'class="first"'. ~fantasai From cben@techunix.technion.ac.il Mon Dec 16 10:57:59 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Mon, 16 Dec 2002 12:57:59 +0200 (IST) Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: On 2002-12-15, David Goodger wrote: > [Beni Cherniavsky] > > I meant that: > > - I don't myself feel the need to free up indentation (but that > > doesn't mean that others don't). > > - I'm not entirely happy with making empty lines around lists. It > > takes too much real estate, especially if I make empty lines > > between items. > > Something's gotta give. For zero ambiguity (within reStructuredText's > framework), blank lines before & after lists are essential. > > > So I don't but then the list looks too isolated > > from the paragraph [1]_. > > - I understand that omitting the empty line, as is, would create an > > ambiguity. It's not even clear to my eyes. > > - Demanding an extra space before the bullet would remove the > > ambiguity, I think. > > Too subtle, IMHO. > You have a point. This proposal would work best with logical paragraphs but I understand that's not going to happen. > > Currently this means a list in a blockquote > > or a list in the definition of a definition list, depending on > > presence of empty line before. > > Definition list only in the case of a single line before the indent. > > > Both are infrequently needed, so > > the empty comment hack looks acceptable to me (but I'm biased). > > Frequently enough. > I meant that blockquotes/definitions *which are bulleted lists* are infrequent. I propose to: 1.
Put aside the logical paragraphs (I was more or less convinced by the following arguments). 2. Allow the indented non-separated list as an equivalent syntax alternative. See what people end up using more. It's not ideal but I prefer it over the current. - Since logical paragraphs are rejected, this won't be ambiguous as a block quote (which keeps its separating lines). - It will be ambiguous sometimes with a definition list. If something is too subtle, it's the definition list (IMHO) so I propose to require some marker at the end of the definition term. - My proposed " --" marker is not too pretty:: Foo -- Bar Quux -- Quuux Maybe something else would be better. " ::=" isn't pretty either... - This allows for more than one line of definition terms. Could be abused then for e.g. Q&A which I'm not sure is good. > > .. [1] This reminds me of a different concern I had. Some markup > > models (LaTeX and my brain ;-) think of paragraphs as logical beasts. > > reStructuredText (and Docutils) treat paragraphs as physical. It > would be impossible to reliably infer logical paragraph semantics from > plaintext sources. That's the biggest problem. It would only be very clear if we require indented first lines in logical paragraph (and then they are limited to start with text). > The debate over physical model (a paragraph is a > block in the document flow) vs. logical model (paragraphs can contain > lists and block quotes and equations and others) has been around for > a long time and I don't see any resolution. OK, I'll search the archives. > Personally, I prefer the > physical model, not least because it results in a much simpler DTD. > The logical model opens up a big can of worms. > That's true. I want that can :) -- but only if it could be represented in a clean way. > > A paragraph could contain a list > > - (like this) > > or other things (especially blockquotes) and then continue.
> > There are three more combinations: > > - The thing is part of the previous paragraph, a new paragraph > > starts after it. > > > > - The thing can be a logical paragraph on its own. > > > > - The thing starts a new paragraph. > > I'm not sure what "the thing" is or where you're going with this. If > the text of your message was meant as an example of what you're > proposing, I find it very hard to follow the structure. > The "thing" is a list, blockquote, or any other construct nested inside the logical paragraph. Yes, I tried to write in a clever form-matches-content style. So the example is surely contrived, such combinations are rare. Nevertheless, you have a point - it's not very clear. The big problem is that it takes away too much of the inter-line spacing freedom. People won't observe it because indeed the human reader can see it anyway. > > Seems rare but consider a text where each quote is followed by some > > comments (as in emails). > > Not following you. > I was referring to a nested construct (non-text) starting the paragraph. > > So I want a way to represent the distinctions. > > Not worth the trouble IMHO. > Legitimate decision -- the trouble is big indeed ;-) > > - Just tried putting a list in a substitution:: > > Text |sub| text. > > > > .. |sub| replace:: > > - Foo > > - Bar > > Didn't work. > > Substitutions have to be phrase-level. I can't remember if the parser > checks or not; if not, it'll go on the to-do. > It correctly complains that a substitution must be a single paragraph. > > (See, here I wanted the text-literal-text to form one logical > > paragraph). I'm not sure it should work but it indicates the > > big issue -- the model that a paragraph contains no other > > elements must be abandoned to support this concept. > > And, as I said, I don't think it's worth the effort even if it were > feasible (which I doubt). The computer doesn't really care that > >

>     Beginning of paragraph
>       - item one
>       - item two
>     continuation of paragraph.
> > is a single logical paragraph (not to mention *this* paragraph!). > The human reader picks it up right away though. Only if there's a > first-line paragraph indent would it matter to the reader. > Good points. But that quite precludes rST as a good typesetting medium. I think also that the vertical spacing differs (para/list spacing smaller than inter-para). > > .. [2] Unrelated question: when should I use literal text (``), > > interpreted text (`) and no quoting? > > [snip] > > When there is a use for them, the docs will discuss them. Until > then, it would just be confusing. It already *is* confusing, some > would say. > That's why I asked :-) -- Beni Cherniavsky From cben@techunix.technion.ac.il Mon Dec 16 11:17:35 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Mon, 16 Dec 2002 13:17:35 +0200 (IST) Subject: [Doc-SIG] reST block quotes In-Reply-To: <3DFD58E4.1040005@escape.com> Message-ID: On 2002-12-15, fantasai wrote: > Beni Cherniavsky wrote: > > > > Then note that "-- " is the standard signature separator. Since then it's > > alone on the line, this is not an issue, just a point to document. > > > > Also note that I use " -- " for a long dash -- probably a LaTeX-induced > > habit; I saw some other people writing so. Stupid word wrapping can well > > put "-- " at the beginning of a line in running text. Again not an issue, > > just document that "-- " must come after an empty line (?). > > Would using three dashes solve these problems? > Yes but there is no big need. These are not real problems, they only make the recognition rules more subtle. > > About email reading, also note that ">>> " becomes ambiguous between > > doctest blocks and some email clients that compact nested "> " quoting by > > omitting the spaces. > > Yes, that is true. That means either quoted blocks would > have to be implemented as an option, defaulting to 'off' > for backwards-compatibility, or at least one space must > be required between quote characters.
> Another option, not completely automatic but easy to use: +## +## Any bogus quoting style is recognized as such by a line before and/or +## after the paragraph that contains only the quoting string (which must +## be non-alphabetic, I don't see a good way to accommodate "FOO> "). + Nested quotes are recognized, generalizing the current mechanism. + Trouble begins when breaking nested quotes (assume I wanted to place a non-quoted comment between ...). and Nested... -- they won't be recognized as nested. In such (all?) cases, demand a space between the quoting levels ("+ >#"). There is an ambiguity with lists => outlaw empty list items. Directives can be implemented for declaring certain quoting style to have some meaning (e.g. "# " == Python comments). > Requiring at least one space before the quote character > might not be a bad idea. It improves readability IMO. > But most mailers don't do it and manually converting is a huge pain. > > Also the "On Someday, Random Writer wrote:" is probably an > > attribution too. > > It is, but it's not practical to parse that since > people use so many different formats. It would have > to be treated as a paragraph, which really isn't > that bad. > Agreed. > > Now how do you handle a quote that's broken in the middle and resumed? > > As multiple blockquotes. How would you do it with the > current syntax? > OK. Just take care that different parts of an interrupted quote are at the same nesting level (space compaction is evil in this respect and should probably be outlawed). -- Beni Cherniavsky From goodger@python.org Tue Dec 17 02:36:37 2002 From: goodger@python.org (David Goodger) Date: Mon, 16 Dec 2002 21:36:37 -0500 Subject: [Doc-SIG] compact HTML output from Docutils In-Reply-To: <3DFD75AB.3040404@escape.com> Message-ID: fantasai wrote: > It would be more flexible for the author if you based the > omission of

<p> tags in single-paragraph lists on whether > the list is spaced out or not. For example, this: I'm happy with the current behavior, but that may have potential. I'll add it as a "to do?" item and await a patch. > Another option is to trigger <p>
tags only for multi-line > paragraphs. Too arbitrary IMO. > Have you tried using p:first-child? It would be nice to > avoid cruft like 'class="first"'. I did try it, but it didn't do what I needed. The 'class="first"' isn't added in all contexts, but selectively. A 'p:first-child' style isn't selective. I don't remember the details, but a lot of things were tried, and I feel the current code works best. -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@python.org Tue Dec 17 02:37:40 2002 From: goodger@python.org (David Goodger) Date: Mon, 16 Dec 2002 21:37:40 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: Beni Cherniavsky wrote: > I meant that blockquotes/definitions *which are bulleted lists* are > infrequent. And I understood it in that way. That situation is not rare; I've written definition list items where the definition is just a bullet list. Block quotes containing just a list are also quite common, depending on writing style, and are the subject of a "to-do?" item: * Allow for variant styles by interpreting indented lists as if they weren't indented? ... See . > I propose to: ... > 2. Allow the indented non-separated list as an equivalent syntax > alternative. See what people end up using more. It's not ideal > but I prefer it over the current. Sorry, I don't. There's not enough value to justify the cost. > - It will be ambiguous sometimes with a definition list. If > something is too subtle, it's the definition list (IMHO) so I > propose to require some marker at the end of the definition > term. I don't want to add complexity to a construct which currently works fine, just to enable a dubious optimization. > - This allows for more than one line of definition terms. > Could be abused then for e.g. Q&A which I'm not sure is > good.
There are related entries in the to-do: * Allow very long titles (on two or more lines)? * And for the sake of completeness, should definition list terms be allowed to be very long (two or more lines) also? They'll stay in the to-do list until somebody presents a good enough case for their implementation (and ideally, a patch as well). >> The debate over physical model (a paragraph is a block in the >> document flow) vs. logical model (paragraphs can contain lists and >> block quotes and equations and others) has been around for a long >> time and I don't see any resolution. > > OK, I'll search the archives. The debate I mentioned wasn't on this list (at least, not exclusively). It's a debate among document system designers that's been going on as long as there has been markup (XML, SGML, GML before it, perhaps others). DocBook chose one path, HTML and OpenOffice.org and Docutils another. >> Personally, I prefer the physical model, not least because it >> results in a much simpler DTD. The logical model opens up a big >> can of worms. > > That's true. I want that can :) -- but only if it could be > represented in a clean way. The can of worms is that the logical model *cannot* be represented in a clean way. > The big problem is that it takes away too much of the inter-line > spacing freedom. People won't observe it because indeed the human > reader can see it anyway. I think the current situation is a good compromise (StructuredText required blank lines between *each* list item!). I think blank lines help readability, and omitting them harms it. > Good points. But that quite precludes rST as a good typesetting > medium. reStructuredText is a limited medium. Every markup strikes a balance between readability and functionality; reStructuredText is heavy on readability. If your functionality needs are greater than what it provides, there are plenty of other choices out there. We can't have everything.
> I think also that the vertical spacing differs (para/list > spacing smaller than inter-para). If I understand you correctly, that's a rendering issue. I'm not sure I do understand correctly though; please elaborate more & include examples in future. -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From cben@techunix.technion.ac.il Tue Dec 17 13:18:52 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Tue, 17 Dec 2002 15:18:52 +0200 (IST) Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: [I'm on the list, no need to cc: me] On 2002-12-16, David Goodger wrote: > Beni Cherniavsky wrote: > > I meant that blockquotes/definitions *which are bulleted lists* are > > infrequent. > > And I understood it in that way. That situation is not rare; I've > written definition list items where the definition is just a bullet > list. Block quotes containing just a list are also quite common, > depending on writing style, and are the subject of a "to-do?" item: > Note that since the presence/absence of the empty lines is not going to become meaningful (I agree that logical paragraphs are unexpressible in rST), there will be no problem with blockquotes being lists. Still it would make the rules even more subtle. The comment by Edward D. Loper (in http://mail.python.org/pipermail/doc-sig/2001-April/001793.html): > No indentation is necessary. I suggest that if there *is* > indentation, an alternate interpretation is possible. When I read them, *I* don't interpret them differently (as an uninitiated reader). is correct even now. At least "every indented thing is a blockquote" (modulo ::) is a simple rule to understand... Consider the indented list idea more or less withdrawn. > * Allow for variant styles by interpreting indented lists as if > they weren't indented? ... > > See .
> That's where I saw the idea before! :-) > > I propose to: > ... > > 2. Allow the indented non-separated list as an equivalent syntax > > alternative. See what people end up using more. It's not ideal > > but I prefer it over the current. > > Sorry, I don't. There's not enough value to justify the cost. > All right, I'll defer to your judgment. > >> The debate over physical model (a paragraph is a block in the > >> document flow) vs. logical model (paragraphs can contain lists and > >> block quotes and equations and others) has been around for a long > >> time and I don't see any resolution. > > > > OK, I'll search the archives. > > The debate I mentioned wasn't on this list (at least, not > exclusively). It's a debate among document system designers that's > been going on as long as there has been markup (XML, SGML, GML before > it, perhaps others). DocBook chose one path, HTML and OpenOffice.org > and Docutils another. > Oh, that's too big to read ;-). But I can figure it out from the resulting formats. DocBook's choice is obvious -- the logical model is richer. > The can of worms is that the logical model *cannot* be represented in > a clean way. > Accepted. > > Good points. But that quite precludes rST as a good typesetting > > medium. > > reStructuredText is a limited medium. Every markup strikes a balance > between readability and functionality; reStructuredText is heavy on > readability. If your functionality needs are greater than what it > provides, there are plenty of other choices out there. We can't have > everything. > Of course. Just remembered reading somebody hoping to write his next book in rST, so I thought that would be possible. Quite naturally, it can't really be expected (except for an rST write -> convert -> polish layout in target format)... > > I think also that the vertical spacing differs (para/list > > spacing smaller than inter-para).
> > If I understand you correctly, that's a rendering issue. I'm not sure > I do understand correctly though; please elaborate more & include > examples in future. > LaTeX pseudo-screenshot (unconfirmed, memory/imagination mix ;)::

    Standalone logical paragraph.
                                        (1)
    * List that's outside the paragraph.
                                        (2)
    * Another item.
                                        (3)
    Another logical paragraph, containing:
                                        (4)
    * this list,
                                        (5)
    and continuing here.

The vertical spacings (1) and (3) are spaces between separate logical paragraphs [1]_. The spacings (4) and (5) are around a list inside a paragraph. These need not be equal and determining which spacing to use is impossible without knowing logical paragraph structure. .. [1] this is not that simple in LaTeX which uses first-line indent with no extra vertical space between paragraphs. So (1) and (3) are forced by the minimal vertical spacing in a list and it might indeed be equal to (4) and (5). However other typesetting styles might have a difference (they most probably will, if they don't use first-line indentation). -- Beni Cherniavsky From akuchlin@mems-exchange.org Tue Dec 17 13:43:28 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 17 Dec 2002 08:43:28 -0500 Subject: [Doc-SIG] Adding new inline markup? Message-ID: RST only supports a few different inline markup notations, such as * and ** for emphasis, ` for interpreted things, &c. For my application I'd like to add some more inline markups, such as /cited text/. Should I expect to be able to subclass Inliner in order to add new notations? Right now that's rather messy. Inliner.dispatch is a dictionary mapping symbols to handler methods, but the large regular expression stored as Inliner.parts doesn't take this into account. First, is there some other, better, way of adding new inline notations? Second, if not, should it be possible by subclassing Inliner?
A possible way of handling this would be to add an internal method _get_initial_pattern() to the Inliner class that synthesized the 'parts' regular expression, using self.dispatch.keys() to match all the listed inline markups. Inliner is fairly complicated, though, so maybe there are additional changes that would be necessary. --amk (www.amk.ca) It was astonishing the number of useless things people found to do. -- R.H. Barlow and H.P. Lovecraft, "The Night Ocean" From cben@techunix.technion.ac.il Tue Dec 17 13:51:15 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Tue, 17 Dec 2002 15:51:15 +0200 (IST) Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: On 2002-12-16, Beni Cherniavsky wrote: > On 2002-12-15, fantasai wrote: > > > Beni Cherniavsky wrote: > > > Requiring at least one space before the quote character > > might not be a bad idea. It improves readability IMO. > > > But most mailers don't do it and manually converting is a huge pain. > > >> Also the "On Someday, Random Writer wrote:" is probably an > >> attribution too. > > > > It is, but it's not practical to parse that since > > people use so many different formats. It would have > > to be treated as a paragraph, which really isn't > > that bad. > > > Agreed. > The more I think about an email reader the harder it looks. There are two possible goals: - Provide a slightly modified rST syntax that would make writing rST in emails more convenient (and maybe other goodies like processing the headers). This is surely a goal. - Simplify as much as possible conversions of mail written by rST-unaware people, or people writing half-rST mails (I do, especially when mailing without relation to Python). This goal is very desired but can conflict with the first goal. First, what's inconvenient with writing rST in email as it is now? Nothing critical (except the quoting) but there are little issues here and there.
One that I constantly experience is that indenting things in Pine is so inconvenient. True, I should switch to an external editor but I probably prefer to accumulate the pain until it'll itch enough to write my own with proper rST support (either an emacs mode or something standalone). Don't hold your breath. I do think that some indentation demands could be relaxed, in particular literal blocks should be allowed in column 0 (i.e. even with negative indent!). That would simplify cut-and-paste, in and out of the mail (e.g. literal Python code can't be easily pasted into the interpreter prompt if indented; doctest style is even worse). The second goal opens a can of worms -- free text is too free to parse automatically and needs corrections. A good rST editor could ease the editing but another possible approach is allowing many modes in the reader, so you mark a whole message according to the style it's written in. In the long run this is only useful if a normalizing rST->rST conversion is implemented... BTW, maybe a generic syntax diagram parser [generator] would be useful to rapidly experiment with rST syntax variations. Getting automatic "indent/reduce conflict" reporting would be very cool :-). -- Beni Cherniavsky From aahz@pythoncraft.com Tue Dec 17 15:22:58 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 17 Dec 2002 10:22:58 -0500 Subject: [Doc-SIG] Adding new inline markup? In-Reply-To: References: Message-ID: <20021217152258.GA20148@panix.com> [This is probably better handled on docutils-developers, but I'll let David make that decision.] On Tue, Dec 17, 2002, Andrew Kuchling wrote: > > RST only supports a few different inline markup notations, such as * > and ** for emphasis, ` for interpreted things, &c. For my application > I'd like to add some more inline markups, such as /cited text/. Why can't you use ` for cited text?
Remember that you're allowed to have different kinds of interpreted text:: :cite:`cited text` -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "To me vi is Zen. To use vi is to practice zen. Every command is a koan. Profound to the user, unintelligible to the uninitiated. You discover truth everytime you use it." --reddy@lion.austin.ibm.com From akuchlin@mems-exchange.org Tue Dec 17 15:50:09 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 17 Dec 2002 10:50:09 -0500 Subject: [Doc-SIG] Adding new inline markup? In-Reply-To: <20021217152258.GA20148@panix.com> References: <20021217152258.GA20148@panix.com> Message-ID: <20021217155009.GA16023@ute.mems-exchange.org> On Tue, Dec 17, 2002 at 10:22:58AM -0500, Aahz wrote: >Why can't you use ` for cited text? Remember that you're allowed to >have different kinds of interpreted text:: > :cite:`cited text` Oh, I wasn't aware of that! Thanks! It would be marginally easier for the intended audience if a simpler notation like /cited/ was possible, but I can live with using the role notation, and it means I don't need to pick more typographic symbols for everything. (%per se% for foreign text, @DARPA@ for acronyms, ad nauseam...) --amk (www.amk.ca) LaTeX2HTML is pain. -- Fred Drake in a documentation checkin message, 14 Mar 2000 From aahz@pythoncraft.com Tue Dec 17 17:38:08 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 17 Dec 2002 12:38:08 -0500 Subject: [Doc-SIG] Adding new inline markup? In-Reply-To: <20021217155009.GA16023@ute.mems-exchange.org> References: <20021217152258.GA20148@panix.com> <20021217155009.GA16023@ute.mems-exchange.org> Message-ID: <20021217173808.GA11636@panix.com> On Tue, Dec 17, 2002, Andrew Kuchling wrote: > On Tue, Dec 17, 2002 at 10:22:58AM -0500, Aahz wrote: >> >>Why can't you use ` for cited text? Remember that you're allowed to >>have different kinds of interpreted text:: >> :cite:`cited text` > > Oh, I wasn't aware of that! Thanks! 
It would be marginally easier > for the intended audience if a simpler notation like /cited/ was > possible, but I can live with using the role notation, and it means I > don't need to pick more typographic symbols for everything. (%per se% > for foreign text, @DARPA@ for acronyms, ad nauseam...) For acronyms use | (pipe), assuming you want it expanded. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "To me vi is Zen. To use vi is to practice zen. Every command is a koan. Profound to the user, unintelligible to the uninitiated. You discover truth everytime you use it." --reddy@lion.austin.ibm.com From goodger@python.org Wed Dec 18 00:53:33 2002 From: goodger@python.org (David Goodger) Date: Tue, 17 Dec 2002 19:53:33 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: Message-ID: Beni Cherniavsky wrote: > [I'm on the list, no need to cc: me] Just a common courtesy on lists. I *will not remember* to remove your address in future. Might as well give up now ;) > Just remembered reading somebody hoping to write his next book in > rST, so I thought that would be possible. That would be Aahz. As far as I know, he still is using Docutils for his book, via OpenOffice.org (confirm/deny, Aahz?). > Quite naturally, it can't really be expected (except for an rST > write -> convert -> polish layout in target format)... Depends on the complexity of the document. I wouldn't be surprised if some tweaking were necessary. However Docutils is still young and flexible. >>> I think also that the vertical spacing differs (para/list >>> spacing smaller than inter-para). >> >> If I understand you correctly, that's a rendering issue. I'm not >> sure I do understand correctly though; please elaborate more & >> include examples in future. >> > LaTeX pseudo-screenshot (unconfirmed, memory/imagination mix ;):: Thanks for the explanation. Indeed, a rendering issue.
-- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@python.org Wed Dec 18 00:54:42 2002 From: goodger@python.org (David Goodger) Date: Tue, 17 Dec 2002 19:54:42 -0500 Subject: [Doc-SIG] Adding new inline markup? In-Reply-To: <20021217173808.GA11636@panix.com> Message-ID: [Andrew Kuchling] > RST only supports a few different inline markup notations, such as * > and ** for emphasis, ` for interpreted things, &c. For my > application I'd like to add some more inline markups, such as /cited > text/. [Aahz] > [This is probably better handled on docutils-developers, but I'll > let David make that decision.] No biggie. Overlap is inevitable. > Why can't you use ` for cited text? Remember that you're allowed to > have different kinds of interpreted text:: > > :cite:`cited text` [Andrew Kuchling] > Oh, I wasn't aware of that! Thanks! It would be marginally easier > for the intended audience if a simpler notation like /cited/ was > possible, but I can live with using the role notation, and it means > I don't need to pick more typographic symbols for everything. (%per > se% for foreign text, @DARPA@ for acronyms, ad nauseam...) The intention of interpreted text roles is to allow new inline descriptive markup, with the simultaneous advantage and disadvantage of being explicit. If your application has one "main" role, that can be the default (i.e. no explicit role required, just `backquotes`). This area hasn't been explored much nor has any support code been written. For example, I'm not sure when to validate roles and process the interpreted text: in the parser, in the reader, or in a transform. It could be that the "interpreted" element may disappear from the Docutils internal doctree, just as the "directive" element did. [Aahz] > For acronyms use | (pipe), assuming you want it expanded. 
Pipes are used for |substitutions|, which are like inline directives, allowing graphics and arbitrary constructs within text. Replacing an acronym with its full text is one application. See . [Andrew Kuchling] > Should I expect to be able to subclass Inliner in order to add new > notations? If necessary, yes, but it hasn't been necessary yet so that functionality hasn't been added (XP's "add no functionality before its time"). > Right now that's rather messy. Inliner.dispatch is a > dictionary mapping symbols to handler methods, but the large regular > expression stored as Inliner.parts doesn't take this into account. ... > A possible way of handling this would be to add an internal method > _get_initial_pattern() to the Inliner class that synthesized the > 'parts' regular expression, using self.dispatch.keys() to match all > the listed inline markups. Way ahead of you ;). Look again, and you'll see that ``Inliner.parts`` isn't a regexp, it's a data structure that's used to synthesize a regexp. ``Inliner.patterns.initial`` is built by the ``build_regexp`` function (which see for a description of the data structure). This issue *has* come up before, WRT embedded URIs, and although the support wasn't used for that, it did simplify the regular expression (you should've seen it before!). A subclass should be able to extend (or replace) this data structure and re-synthesize the regexp. > Inliner is fairly complicated, though, so maybe there are additional > changes that would be necessary. Probably :). Limitations are often discovered when the code is exercised in novel and interesting ways. 
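The idea described above, a nested data structure from which the start-string regexp is synthesized and which a subclass could extend before re-synthesizing, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Docutils code: ``build_pattern``, ``extend``, ``BASE_PARTS``, and the ``(name, prefix, suffix, parts)`` tuple layout are hypothetical names chosen for the example, and the table of start-strings is a toy subset of the markup mentioned in the thread.

```python
import re


def build_pattern(definition):
    """Synthesize a regexp source string from a nested definition.

    `definition` is a hypothetical (name, prefix, suffix, parts) tuple;
    each entry in `parts` is either a regexp fragment (str) or another
    definition tuple, which is expanded recursively.
    """
    name, prefix, suffix, parts = definition
    alternatives = []
    for part in parts:
        if isinstance(part, tuple):
            alternatives.append(build_pattern(part))
        else:
            alternatives.append(part)
    # All alternatives end up in one named group, joined with "|".
    return "%s(?P<%s>%s)%s" % (prefix, name, "|".join(alternatives), suffix)


def extend(definition, extra_parts):
    """Return a copy of `definition` with extra alternatives appended."""
    name, prefix, suffix, parts = definition
    return (name, prefix, suffix, list(parts) + list(extra_parts))


# Toy table of start-strings (an assumption for illustration only).
# Longer strings come first so "**" is tried before "*".
BASE_PARTS = ("start", "", "", [r"\*\*", r"\*", r"``", r"`"])

base_re = re.compile(build_pattern(BASE_PARTS))

# A "subclass" would extend the data and re-synthesize the regexp,
# here adding "/" for the /cited text/ notation from the thread.
extended_re = re.compile(build_pattern(extend(BASE_PARTS, [r"/"])))

if __name__ == "__main__":
    print(base_re.pattern)
    print(extended_re.search("/cited text/").group("start"))
```

Keeping the alternatives as data rather than as one hand-written regexp means an extension only has to append an entry and rebuild; a matching handler would still need to be registered in whatever dispatch table the parser uses.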
-- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@python.org Wed Dec 18 00:55:56 2002 From: goodger@python.org (David Goodger) Date: Tue, 17 Dec 2002 19:55:56 -0500 Subject: [Doc-SIG] Email Reader (was Re: reST block quotes) In-Reply-To: Message-ID: Beni Cherniavsky wrote: > The more I think about an email reader the harder it looks. :) > I do think that some indentation demands could be relaxed, in > particular literal blocks should be allowed in column 0 (i.e. even > with negative indent!). How would you know when they end? > The second goal opens a can of worms -- free text is too free to > parse automatically and needs corrections. I suspect that an actual application for an Email Reader must present itself before we can make these decisions. IOW, what's the use case? I don't know that there's much value in supporting email in general, but I do know there would be much pain. > BTW, maybe a generic syntax diagram parser [generator] would be > useful to rapidly experiment with rST syntax variations. Sounds cool, and non-trivial :) > Getting automatic "indent/reduce conflict" reporting would be very > cool :-). Again, not following you. Not enough words! -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From aahz@pythoncraft.com Wed Dec 18 04:20:28 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 17 Dec 2002 23:20:28 -0500 Subject: [Doc-SIG] reST block quotes In-Reply-To: References: Message-ID: <20021218042027.GA1679@panix.com> On Tue, Dec 17, 2002, David Goodger wrote: > Beni Cherniavsky wrote: >> >> Just remembered reading somebody hoping to write his next book in >> rST, so I thought that would be posible. > > That would be Aahz. 
As far as I know, he still is using Docutils for > his book, via OpenOffice.org (confirm/deny, Aahz?). Yup, still moving forward, though more slowly than I'd like on the content side. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "To me vi is Zen. To use vi is to practice zen. Every command is a koan. Profound to the user, unintelligible to the uninitiated. You discover truth everytime you use it." --reddy@lion.austin.ibm.com From cben@techunix.technion.ac.il Wed Dec 18 17:01:08 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Wed, 18 Dec 2002 19:01:08 +0200 (IST) Subject: [Doc-SIG] Re: Email Reader (was Re: reST block quotes) In-Reply-To: Message-ID: On 2002-12-17, David Goodger wrote: > Beni Cherniavsky wrote: > > I do think that some indentation demands could be relaxed, in > > particular literal blocks should be allowed in column 0 (i.e. even > > with negative indent!). > > How would you know when they end? > Forgot to say, until the first blank line. Crude but can simplify typing in many cases of short literal fragments... > > The second goal opens a can of worms -- free text is too free to > > parse automatically and needs corrections. > > I suspect that an actual application for an Email Reader must present > itself before we can makethese decisions.IOW, what's the use case? > I don't know that there's much value in supporting email in general, > but I do know there would be much pain. > :) Interesting that I didn't think of it before. I have no good application in my head :). THe only one is taking interesting half-rST emails I write and easily converting them to standalone valid rST articles. This could be done with a sloppy/bendable rST reader fed to an rST writer, to normilize the text... > > BTW, maybe a generic syntax diagram parser [generator] would be > > useful to rapidly experiment with rST syntax variations. > > Sounds cool, and non-trivial :) > True. 
I do have a scheme for only inline markup and bullets, which would define the markup->xml mappings on the spot (one-time or scoped, with attribute merging). Something like::

==== Heading ==== {| A paragraph. |} {- - List item with *emphasized* text. :: *em* - Another item containing `a link`_. :: `"http://target/url"`_ -}
This is largely inspired by rST's style of putting long markup data outside of the text (e.g. link targets, substitutions). I already think I know how to interpret this, I just need to find time and write it :). Then I will have a very handy xml typing notation. > > Getting automatic "indent/reduce conflict" reporting would be very > > cool :-). > > Again, not following you. Not enough words! > That was supposed to be a joking reference to Yacc's "shift/reduce conflict" messages... -- Beni Cherniavsky From goodger@python.org Thu Dec 19 01:17:08 2002 From: goodger@python.org (David Goodger) Date: Wed, 18 Dec 2002 20:17:08 -0500 Subject: [Doc-SIG] Python Reader module parser now usable Message-ID: The first part of the Docutils Python Source Reader component is in a usable form: . It takes a module's text (a string) and converts it into a documentation-oriented tree. Assignments/attributes, functions, classes, and methods are all converted. Arbitrarily complex right-hand sides of assignments (including default parameter values) are supported by parsing tokens from tokenize.py. Comments are not handled yet, and namespaces are not computed. There's a list of open issues at the end of the module docstring. I've also added a "showdoc" script to test/test_readers/test_python which processes input from the test_parser.py module, or stdin, depending on how it's called. Please play with these; any input is welcome. Here's a sample.
Given this module as input:: # comment """Docstring""" """Additional docstring""" __docformat__ = 'reStructuredText' a = 1 """Attribute docstring""" class C(Super): """C's docstring""" class_attribute = 1 """class_attribute's docstring""" def __init__(self, text=None): """__init__'s docstring""" self.instance_attribute = (text * 7 + ' whaddyaknow') """instance_attribute's docstring""" def f(x, # parameter x y=a*5, # parameter y *args): # parameter args """f's docstring""" return [x + item for item in args] f.function_attribute = 1 """f.function_attribute's docstring""" Here's the output tree, with objects converted to the pseudo-XML form I'm fond of:: Docstring Additional docstring 'reStructuredText' 1 Attribute docstring C's docstring 1 class_attribute's docstring __init__'s docstring None (text * 7 + ' whaddyaknow') instance_attribute's docstring f's docstring a * 5 1 f.function_attribute's docstring -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From b.fallenstein@gmx.de Thu Dec 19 17:52:52 2002 From: b.fallenstein@gmx.de (Benja Fallenstein) Date: Thu, 19 Dec 2002 18:52:52 +0100 Subject: [Doc-SIG] Email Reader (was Re: reST block quotes) In-Reply-To: References: Message-ID: <3E020774.6010104@gmx.de> Hi David, hi Beni, David Goodger wrote: >Beni Cherniavsky wrote: > > >>The second goal opens a can of worms -- free text is too free to >>parse automatically and needs corrections. >> >> > >I suspect that an actual application for an Email Reader must present >itself before we can make these decisions. IOW, what's the use case? >I don't know that there's much value in supporting email in general, >but I do know there would be much pain. 
> I've been idly pondering ReST email before, so I have my own ideas about this :-) I don't think that parsing arbitrary incoming emails as ReST makes much sense-- it'd be like running all .txt files on my hard disk through the ReST tools. I wouldn't want to use this on any emails but those explicitly written as ReST. The applications I envision are two flavors of email user agent (for reading & writing email): - A text-based client, which would provide tools for composing ReST e-mails, most notably a syntax validator (so that you don't accidentally send emails that cannot be read because they aren't ReST); this would send emails as text/vnd.python.rst or so. - A graphical client, which would render text/vnd.python.rst e-mails graphically, and allow for graphical composing of ReST e-mails, like Mozilla's HTML email composer. This would finally allow me to use hyperlinks, variable-width fonts, italics etc.pp. in composing & reading e-mail, without worrying about unreadable font size settings, people complaining about HTML mail bloat, people turning off HTML mail (like me :) ), or readability on text terminals; instead, I'd know that any email client would get something it can read, w/o bloat, and people with a graphical ReST mail reader would see the mail the way I typed it (modulo preferences like font or text size). The difficult question is how to do quoting; I think (contrary to what some people seem to be saying here) that quoting should not generally be done through literal blocks-- I think quoting of non-ReST mails should be, but quoting of ReST mails should preserve the formatting of the quoted mail. Also I think that quoting should be done through the commonplace ">" syntax; this is simply the most wide-spread variant, and forcing something else down people's throats seems wrong. To disambiguate from doctest blocks, choosing the ">" + space syntax as required for ReSTmail seems good.
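A minimal sketch of that disambiguation, assuming doctest lines are recognised by the interactive prompt ``>>>`` and quote lines by ``>`` followed by a space. The function name ``line_kind`` is hypothetical and is not part of Docutils:

```python
def line_kind(line):
    """Classify one line as 'doctest', 'quote' or 'text'.

    Assumption: a doctest line starts with the ">>> " prompt (or is a
    bare ">>>"), and an email quote line starts with "> " exactly.
    """
    if line.startswith(">>> ") or line.rstrip() == ">>>":
        return "doctest"
    if line.startswith("> "):
        return "quote"
    return "text"


if __name__ == "__main__":
    for example in [">>> print 1", "> quoted text", "> > nested quote", ">no space"]:
        print(repr(example), "->", line_kind(example))
```

Because the space is mandatory, a doctest prompt can never be mistaken for a quote; the cost is that quoting styles without the space (``>text``, ``>>text``) fall through as ordinary text.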
To discriminate between literal and non-literal quotings, I'd suggest: I wrote: > foo > bar for non-literal quoting, and I wrote:: > foo > bar for literal quoting. More to the point, the rules would be as follows: - A block where each line is prefixed by "> " (angle bracket + space) is *quoted*. - A block where all but the first line is quoted by "> " is *quoted with source given*. - If the first line in a "quoted with source given" block ends in a double colon, this quoted block and all following quoted blocks *without* source given would also be literal blocks. - If the first line does not end in a double colon, this quoted block and all following ones w/o source would just be quoted, not literal blocks. - (If there's a quoted block w/o source, and there is no quoted block with source above, that block would be just quoted, not literal.) Here's an example: === snip === > Blabla Foo wrote:: > Bla, *baz* > Foo Then, bar wrote: > Foo, *blaabla*! === /snip === This could then become something like: === snip === Blabla Foo wrote: Bla, *baz* Foo Then, bla wrote: Foo, blablaa! === /snip === - Benja From cben@techunix.technion.ac.il Thu Dec 19 18:57:53 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Thu, 19 Dec 2002 20:57:53 +0200 (IST) Subject: [Doc-SIG] Email Reader (was Re: reST block quotes) In-Reply-To: <3E020774.6010104@gmx.de> Message-ID: On 2002-12-19, Benja Fallenstein wrote: > I've been idly pondering ReST email before, so I have my own ideas about > this :-) > > I don't think that parsing arbitrary incoming emails as ReST makes much > sense-- it'd be like running all .txt files on my hard disk through the > ReST tools. I wouldn't want to use this on any emails but those > explicitly written as ReST. > True. But problems can arise with: 1. A thread where some people write rST and some don't. The quoted illegal parts can easily spoil the rST context for the rest.
Literal quoting for non-rST parts doesn't solve this entirely because the rST messages quoted by the non rST parts lose their rST information... 2. Sloppy mail. I don't see myself writing 100% correct rST all the time and I don't want it pushed down my throat. If that's the choice, I'd be more happy with writing rST as I see fit and using only the eyeball reader ;-). 3. Once mail is sent, it generally can't be edited (since it's already archived and people already have it in their mailboxes). I'm not sure if anything can be done about it but it's a peculiarity of email that should be remembered. Something like treating every paragraph that doesn't parse as literal would solve some of the issues. > The applications I envision are two flavors of email user agent (for > reading & writing email): > *Breakage ahead*: your list is broken by line folding. Unexpected line folding in the places where you least want it is all too popular with email tool writers ;-(. Good rST that you write will end up broken for some readers sooner or later. The "lazy indentation" ideas from the to-do list could help here a bit. > - A text-based client, which would provide tools for composing ReST > e-mails, most notably a syntax validator (so that you don't accidentally > send emails that cannot be read because they aren't ReST); this would > send emails as text/vnd.python.rst or so. I want such a tool. I feel more need for editing features than for validation (all right, one day there will be an emacs mode). Of course it would reflow paragraphs instead of broken line folding :-). I'm not sure I like text/vnd.python.rst. I don't want another text/html. Most mail readers will probably complain that they don't know how to read it. Sending identical content both as plain text and rST is more stupid than sending text version of html mail :). I think it should be marked as plain text and be autodetected simply by validating.
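To make the labelling question concrete, here is a hedged sketch using Python's standard ``email`` package (its ``EmailMessage`` API is much newer than this thread). The subtype ``vnd.python.rst`` is the unregistered name floated in the discussion, and ``build_message`` and ``display_mode`` are hypothetical helper names. The fallback branch models the behaviour a reader is expected to show for an unknown ``text/*`` subtype, namely displaying it as plain text:

```python
from email.message import EmailMessage

# Unregistered subtype proposed in the thread; treat it as a placeholder.
RST_SUBTYPE = "vnd.python.rst"


def build_message(body, subtype=RST_SUBTYPE):
    """Build a single-part text message labelled text/<subtype>."""
    msg = EmailMessage()
    msg["Subject"] = "reST mail sketch"
    msg.set_content(body, subtype=subtype)
    return msg


def display_mode(msg):
    """Decide how a reader would show the message.

    'rst'   -- the reader knows the subtype and renders the markup
    'plain' -- any other text/* subtype is shown as plain text
    'other' -- not a text part at all
    """
    if msg.get_content_maintype() != "text":
        return "other"
    if msg.get_content_subtype() == RST_SUBTYPE:
        return "rst"
    return "plain"


if __name__ == "__main__":
    msg = build_message("Some *emphasized* text.\n")
    print(msg.get_content_type(), "->", display_mode(msg))
```

Autodetection by validating, the alternative suggested above, would instead label the message ``text/plain`` and leave the decision to the receiving side.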
> - A graphical client, which would render text/vnd.python.rst e-mails > graphically, and allow for graphical composing of ReST e-mails, like > Mozilla's HTML email composer. > > This would finally allow me to use hyperlinks, variable-width fonts, > italics etc.pp. in composing & reading e-mail, without worrying about > unreadable font size settings, people complaining about HTML mail bloat, > people turning off HTML mail (like me :) ), or readability on text > terminals; instead, I'd know that any email client would get something > it can read, w/o bloat, and people with a graphical ReST mail reader > would see the mail the way I typed it (modulo preferences like font or > text size). > > The difficult question is how to do quoting; I think (contrary to what > some people seem to be saying here) that quoting should not generally be > done through literal blocks-- I think quoting of non-ReST mails should > be, but quoting of ReST mails should preserve the formatting of the > quoted mail. Who argued that? IIRC, all the discussion assumed that the quoted block can be literal or not depending on presence of ``::``, similarly to the way it currently behaves with indented blocks. > Also I think that quoting should be done through the > commonplace ">" syntax; this is simply the most wide-spread variant, and > forcing something else down people's throats seems wrong. To > disambiguate from doctest blocks, choosing the ">" + space syntax as > required for ReSTmail seems good. > > To discriminate between literal and non-literal quotings, I'd suggest: > > I wrote: > > foo > > bar > > for non-literal quoting, and > > I wrote:: > > foo > > bar > > for literal quoting. > > More to the point, the rules would be as follows: > > - A block where each line is prefixed by "> " (angle bracket + space) is > *quoted*. > - A block where all but the first line is quoted by "> " is *quoted with > source given*.
> - If the first line in a "quoted with source given" block ends in a > double colon, this quoted block and all following quoted blocks > *without* source given would also be literal blocks. > - If the first line does not end in a double colon, this quoted block > and all following ones w/o source would just be quoted, not literal blocks. > - (If there's a quoted block w/o source, and there is no quoted block > with source above, that block would be just quoted, not literal.) > I think I like this. -- Beni Cherniavsky From b.fallenstein@gmx.de Thu Dec 19 21:51:26 2002 From: b.fallenstein@gmx.de (Benja Fallenstein) Date: Thu, 19 Dec 2002 22:51:26 +0100 Subject: [Doc-SIG] Email Reader (was Re: reST block quotes) In-Reply-To: References: Message-ID: <3E023F5E.7090009@gmx.de> Hiya, Beni Cherniavsky wrote: >On 2002-12-19, Benja Fallenstein wrote: > > >>I've been idly pondering ReST email before, so I have my own ideas about >>this :-) >> >>I don't think that parsing arbitrary incoming emails as ReST makes much >>sense-- it'd be like running all .txt files on my hard disk through the >>ReST tools. I wouldn't want to use this on any emails but those >>explicitly written as ReST. >> >> >> >True. But problems can arise with: > >1. A thread where some people write rST and some don't. The quoted > illegal parts can easily spoil the rST context for the rest. Literal > quoting for non-rST parts doesn't solve this entirely because the rST > messages quoted by the non rST parts lose their rST information... > Yes, but somehow I'm inclined to think that wouldn't be all that bad... my feeling is that trying to guess the quoted ReST from a non-ReST mail is simply too hard and error-prone to be worthwhile... >2. Sloppy mail. I don't see myself writing 100% correct rST all the time > and I don't want it pushed down my throat. If that's the choice, I'd > be more happy with writing rST as I see fit and using only the eyeball > reader ;-). > Hmmmmm...
I can sympathize, but I'm not sure how to deal with it. I have a dislike for guesswork on the part of the parser... Maybe what's needed is a 'ReST Tidy' (like w3c's HTML Tidy): A program taking sloppy ReST input and trying to make it into real ReST. -- Could this run on the sender's computer, so that they can see what the output'll look like? >3. Once mail is sent, it generally can't be edited (since it's already > archived and people already have it in their mailboxes). I'm not sure > if anything can be done about it but it's a peculiarity of email that > should be remembered. > What's the issue here? (BTW, email is quite often edited when being quoted :) ) >Something like treating every paragraph that doesn't parse as literal >would solve some of the issues. > This could potentially be done by the ReST Tidy tool... OTOH, aren't the paragraphs where you actually used some ReST-like markup most likely to contain 'sloppiness'? >>The applications I envision are two flavors of email user agent (for >>reading & writing email): >> >> >> >*Breakage ahead*: your list is broken by line folding. Unexpected line >folding in the places where you least want it is all too popular with >email tool writers ;-(. Good rST that you write will end up broken for >some readers sooner or later. The "lazy indentation" ideas from the to-do >list could help here a bit. > Yes, but so is 'text/plain; format=flowed' [RFC2646]. It gets around the problem by requiring (IIRC) that compliant user agents shall not fold lines, but reflow instead-- ReST mail could do that too. (Mozilla supports it. It seems to work in most cases, and the remaining ones can be glossed over.) >>- A text-based client, which would provide tools for composing ReST >>e-mails, most notably a syntax validator (so that you don't accidentally >>send emails that cannot be read because they aren't ReST); this would >>send emails as text/vnd.python.rst or so. >> >> > >I want such a tool.
I feel more need for editing features than for >validation (all right, one day there will be an emacs mode). Of course it >would reflow paragraphs instead of broken line folding :-). > Yup :-) >I'm not sure I like text/vnd.python.rst. I don't want another text/html. >Most mail readers will probably complain that they don't know how to read >it. > Hmm. I wouldn't suspect this, but I haven't tried. (The reason I would suspect it should work is that the RFCs say very clearly that you have to treat it as text/plain, and that's not so hard to implement. But of course we know how good standards compliance is in most systems... Does anybody have experience in how mail readers treat unknown text/* formats?) >Sending identical content both as plain text and rST is more stupid >than sending text version of html mail :). I think it should be marked as >plain text and be autodetected simply by validating. > Don't like it. ReST is *not* plain text to my mind. I don't ".. note::" things in plain text... Besides, I think text/vnd.python.rst is exactly how MIME types are supposed to work: Everybody who knows it treats it specially, everybody who doesn't sees it as plain text. That was the idea behind text/enriched and text/html-- except that SGML isn't very readable and people complained about the angle brackets in the e-mails they received... >>The difficult question is how to do quoting; I think (contrary to what >>some people seem to be saying here) that quoting should not generally be >>done through literal blocks-- I think quoting of non-ReST mails should >>be, but quoting of ReST mails should preserve the formatting of the >>quoted mail. >> >> > >Who argued that? IIRC, all the discussion assumed that the quoted block >can be literal or not depending on presence of ``::``, similarly to the >way it currently behaves with indented blocks. > Ok, I probably misunderstood something. Sorry.
>>More to the point, the rules would be as follows: >> >>- A block where each line is prefixed by "> " (angle bracket + space) is >>*quoted*. >>- A block where all but the first line is quoted by "> " is *quoted with >>source given*. >>- If the first line in a "quoted with source given" block ends in a >>double colon, this quoted block and all following quoted blocks >>*without* source given would also be literal blocks. >>- If the first line does not end in a double colon, this quoted block >>and all following ones w/o source would just be quoted, not literal blocks. >>- (If there's a quoted block w/o source, and there is no quoted block >>with source above, that block would be just quoted, not literal.) >> >> >> >I think I like this. > > :-) - Benja From cben@techunix.technion.ac.il Sat Dec 21 21:40:20 2002 From: cben@techunix.technion.ac.il (Beni Cherniavsky) Date: Sat, 21 Dec 2002 23:40:20 +0200 (IST) Subject: [Doc-SIG] Email Reader (was Re: reST block quotes) In-Reply-To: <3E023F5E.7090009@gmx.de> Message-ID: On 2002-12-19, Benja Fallenstein wrote: > > Hiya, > > Beni Cherniavsky wrote: > > >1. A thread where some people write rST and some don't. The quoted > > illegal parts can easily spoil the rST context for the rest. Literal > > quoting for non-rST parts doesn't solve this entirely because the rST > > messages quoted by the non rST parts lose their rST information... > > [Ooops, pine's quoting of indented text is broken. Sorry :-] > Yes, but somehow I'm inclined to think that wouldn't be all that bad... > my feeling is that trying to guess the quoted ReST from a non-ReST mail > is simply too hard and error-prone to be worthwhile... > > >2. Sloppy mail. I don't see myself writing 100% correct rST all the time > > and I don't want it pushed down my throat. If that's the choice, I'd > > be more happy with writing rST as I see fit and using only the eyeball > > reader ;-). > > > Hmmmmm... I can sympathize, but I'm not sure how to deal with it.
I have > a dislike for guesswork on the part of the parser... Maybe what's needed > is a 'ReST Tidy' (like w3c's HTML Tidy): A program taking sloppy ReST > input and trying to make it into real ReST. -- Could this run on the > sender's computer, so that they can see what the output'll look like? > +1 on rST Tidy. > >3. Once mail is sent, it generally can't be edited (since it's already > > archived and people already have it in their mailboxes). I'm not sure > > if anything can be done about it but it's a peculiarity of email that > > should be remembered. > > > What's the issue here? > Just that after broken rST is emitted, it's generally too late to tidy it :-). Thinking of it again, I see your validate-when-sending point of view. I agree - I would usually want to validate to avoid inconvenience for readers. > (BTW, email is quite often edited when being quoted :) ) > But you can't fix the master which gets quoted afresh in new branches of the thread... > >*Breakage ahead*: your list is broken by line folding. Unexpected line > >folding in the places where you least want it is all too popular with > >email tool writers ;-(. Good rST that you write will end up broken for > >some readers sooner or later. The "lazy indentation" ideas from the to-do > >list could help here a bit. > > > Yes, but so is 'text/plain; format=flowed' [RFC2646]. It gets around the > problem by requiring (IIRC) that compliant user agents shall not fold > lines, but reflow instead-- ReST mail could do that too. (Mozilla > supports it. It seems to work in most cases, and the remaining ones can > be glossed over.) > Wouldn't it break every single literal block containing Python code, too? > >I'm not sure I like text/vnd.python.rst. I don't want another text/html. > >Most mail readers will probably complain that they don't know how to read > >it. > > > > Hmm. I wouldn't suspect this, but I haven't tried.
(The reason I would > suspect it should work is that the RFCs say very clearly that you have > to treat it as text/plain, and that's not so hard to implement. But of > course we know how good standards compliance is in most systems... Does > anybody have experience in how mail readers treat unknown text/* formats?) > OK, you've obviously read more RFCs than I. Glad to know it should work. > >Sending identical content both as plain text and rST is more stupid > >than sending text version of html mail :). I think it should be marked as > >plain text and be autodetected simply by validating. > > > Don't like it. ReST is *not* plain text to my mind. I don't ".. note::" > things in plain text... > I was only suggesting that because I didn't know any text/... should work. -- Beni Cherniavsky From b.fallenstein@gmx.de Sun Dec 22 19:28:51 2002 From: b.fallenstein@gmx.de (Benja Fallenstein) Date: Sun, 22 Dec 2002 20:28:51 +0100 Subject: [Doc-SIG] Email Reader (was Re: reST block quotes) In-Reply-To: References: Message-ID: <3E061273.3010306@gmx.de> Beni Cherniavsky wrote: >On 2002-12-19, Benja Fallenstein wrote: > > >>Beni Cherniavsky wrote: >> >>>3. Once mail is sent, it generally can't be edited (since it's already >>>archived and people already have it in their mailboxes). I'm not sure >>>if anything can be done about it but it's a peculiarity of email that >>>should be remembered. >>> >>> >>> >>What's the issue here? >> >> >> >Just that after broken rST is emitted, it's generally too late to >tidy it :-). > Ah, ok, I see your point now. > Thinking of it again, I see your validate-when-sending point >of view. I agree - I would usually want to validate to avoid >inconvenience for readers. > Cool, sounds like we are in agreement here :-) :-) >>(BTW, email is quite often edited when being quoted :) ) >> >> >> >But you can't fix the master which gets quoted afresh in new branches of >the thread... > Right.
>>>*Breakage ahead*: your list is broken by line folding. Unexpected line >>>folding in the places where you least want it is all too popular with >>>email tool writers ;-(. Good rST that you write will end up broken for >>>some readers sooner or later. The "lazy indentation" ideas from the to-do >>>list could help here a bit. >>> >>> >>> >>Yes, but so is 'text/plain; format=flowed' [RFC2646]. It gets around the >>problem by requiring (IIRC) that compliant user agents shall not fold >>lines, but reflow instead-- ReST mail could do that too. (Mozilla >>supports it. It seems to work in most cases, and the remaining ones can >>be glossed over.) >> >> >> >Wouldn't it break every single literal block containing Python code, too? > Hmmm. I guess the 'right' way to handle this would be not to reflow literal blocks, and not to line-break them either (since the mail reader would know about ReST, it would be able to take care of this correctly). I think that the SMTP infrastructure generally doesn't add additional linebreaks, so this should work-- if some server breaks the lines, of course, that would destroy the ReST formatting, but I think they don't. Again, the point is that the mail reader must handle ReST correctly... The RFCs allow lines of up to 1000 characters, but recommend lines up to 80 characters because many mail readers show these better. I guess that literal blocks with >80 chars/line (or literal blocks inside quoted text etc.) are good cases for using >80 chars. >>>I'm not sure I like text/vnd.python.rst. I don't want another text/html. >>>Most mail readers will probably complain that they don't know how to read >>>it. >>> >>> >>> >>Hmm. I wouldn't suspect this, but I haven't tried. (The reason I would >>suspect it should work is that the RFCs say very clearly that you have >>to treat it as text/plain, and that's not so hard to implement. But of >>course we know how good standards compliance is in most systems...
Does >>anybody have experience in how mail readers treat unknown text/* formats?) >> >> >> >OK, you've obviously read more RFCs than I. Glad to know it should work. > Ok :-) So far, our discussion suggests that we'd need: - a ReST Tidy - an extension of the ReST specification, for "> " quoting - a specification of ReST email: MIME type, how to handle reflowing when replying, possibly other issues if they come up - an email reader implementing the above. ReST Tidy would obviously also have applications outside this context. I think I may like to use ">" when quoting emails in ReST. - Benja
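The "> " quoting rules proposed earlier in the thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an extension that exists in Docutils: blocks are taken to be separated by blank lines, a bare ``>`` is accepted as an empty quoted line, and the names ``classify_blocks``, ``_is_quoted`` and ``_unquote`` are hypothetical.

```python
import re

QUOTE_PREFIX = "> "


def _is_quoted(line):
    # Assumption: a bare ">" counts as an empty quoted line.
    return line.startswith(QUOTE_PREFIX) or line.rstrip() == ">"


def _unquote(line):
    # Strip one level of "> "; a bare ">" becomes an empty line.
    return line[len(QUOTE_PREFIX):] if line.startswith(QUOTE_PREFIX) else ""


def classify_blocks(text):
    """Return a list of (kind, source, lines) tuples.

    kind is 'plain', 'quoted' or 'literal'; source is the attribution
    line of a "quoted with source given" block, otherwise None.
    """
    results = []
    literal = False  # set by the most recent attribution line
    for block in re.split(r"\n[ \t]*\n", text.strip("\n")):
        lines = block.split("\n")
        if all(_is_quoted(line) for line in lines):
            # Quoted without source: inherits the current literal state.
            kind = "literal" if literal else "quoted"
            results.append((kind, None, [_unquote(line) for line in lines]))
        elif len(lines) > 1 and all(_is_quoted(line) for line in lines[1:]):
            # Quoted with source given: "::" on the first line means literal.
            source = lines[0]
            literal = source.rstrip().endswith("::")
            kind = "literal" if literal else "quoted"
            results.append((kind, source, [_unquote(line) for line in lines[1:]]))
        else:
            results.append(("plain", None, lines))
    return results


if __name__ == "__main__":
    sample = (
        "> Blabla\n\n"
        "Foo wrote::\n> Bla, *baz*\n\n"
        "> Foo\n\n"
        "Then, bar wrote:\n> Foo, *blaabla*!\n"
    )
    for kind, source, lines in classify_blocks(sample):
        print(kind, source, lines)
```

Plain paragraphs do not reset the literal state here, since the rules only mention attribution lines as changing it; a real specification would have to settle that point, along with nested quoting.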