From pearu@cens.ioc.ee  Sun Dec  1 11:33:18 2002
From: pearu@cens.ioc.ee (Pearu Peterson)
Date: Sun, 1 Dec 2002 13:33:18 +0200 (EET)
Subject: [Doc-SIG] Including files
Message-ID: <Pine.LNX.4.21.0212011239580.7382-100000@cens.kybi>

Hi,

I am using Docutils to create a reference manual (for the F2PY project) and
found that it would be nice if rst-formatted text files had some
support for including other rst-formatted text files, as well as
source code, in the document. The equivalent hooks in LaTeX would be the
\input{..} and \verbatiminput{..} commands.

Typical use cases:

* If an rst-text file becomes very large (in the sense that
  its printed version has, say, 20 or more pages), then splitting it into
  several files would make the document easier to maintain.

* Including example source code. Currently one has to maintain two copies
  of the source, one as a source file and one typed (copied) into
  the rst-text file.

I am not sure what the appropriate Docutils hooks for emulating the
LaTeX \input and \verbatiminput commands would be, but maybe something
like the following:

* Using

  ::

    .. input:: filename

  in an rst-text file is equivalent to replacing
  the ``.. input:: filename`` part with the contents of
  ``filename``, possibly also taking the indentation level into account.
  Possible variations for ``input``::

    file
    insert
    include
    fileinput
    ..

* Using

  ::

    .. verbatim:: filename
 
  is equivalent to including the contents of ``filename`` as a literal
  block in the current rst-text file.
  Possible variations for ``verbatim``::

    source
    verbatiminput
    ..
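
The semantics proposed above could be sketched in Python roughly as
follows. This is only an illustration of the idea, not Docutils code:
the function name ``expand`` and the ``read_file`` callable are
hypothetical, and the four-space literal-block indent is an assumption.

```python
import re

# Hypothetical sketch of the proposed semantics; this is not Docutils code.
DIRECTIVE = re.compile(
    r"^(?P<indent>\s*)\.\. (?P<name>input|verbatim):: (?P<path>\S+)\s*$"
)


def expand(text, read_file):
    """Expand ``.. input::`` and ``.. verbatim::`` lines in *text*.

    *read_file* is any callable mapping a filename to its contents.
    """
    out = []
    for line in text.splitlines():
        match = DIRECTIVE.match(line)
        if match is None:
            out.append(line)
            continue
        indent = match.group("indent")
        body = read_file(match.group("path")).splitlines()
        if match.group("name") == "verbatim":
            # Emit a literal block: "::", a blank line, then the file
            # contents indented one level deeper than the directive.
            out.extend([indent + "::", ""])
            indent += "    "
        # Re-indent every included line to the directive's level.
        out.extend((indent + item) if item else "" for item in body)
    return "\n".join(out) + "\n"
```

Passing ``read_file`` in as a parameter keeps the sketch independent of
the file system; a real implementation would also have to resolve
relative paths and guard against recursive inclusion.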

What do you think?

Pearu


From goodger@python.org  Sun Dec  1 15:51:25 2002
From: goodger@python.org (David Goodger)
Date: Sun, 01 Dec 2002 10:51:25 -0500
Subject: [Doc-SIG] Including files
In-Reply-To: <Pine.LNX.4.21.0212011239580.7382-100000@cens.kybi>
Message-ID: <BA0F9A2B.2C7BE%goodger@python.org>

Pearu Peterson wrote:
> What do you think?

The "include" directive is already implemented.  For included
reStructuredText source files use::

    .. include:: file.txt

For literal block inclusions (example code, etc.), use::

    .. include:: module.py
       :literal:

Make sure you're using the latest code from CVS or the snapshot.  See
<http://docutils.sf.net/spec/rst/directives.html> for details.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From pearu@cens.ioc.ee  Sun Dec  1 16:04:41 2002
From: pearu@cens.ioc.ee (Pearu Peterson)
Date: Sun, 1 Dec 2002 18:04:41 +0200 (EET)
Subject: [Doc-SIG] Including files
In-Reply-To: <BA0F9A2B.2C7BE%goodger@python.org>
Message-ID: <Pine.LNX.4.21.0212011758050.443-100000@cens.kybi>

On Sun, 1 Dec 2002, David Goodger wrote:

> Pearu Peterson wrote:
> > What do you think?
> 
> The "include" directive is already implemented.

Thanks!

Pearu


From goodger@python.org  Thu Dec  5 03:16:59 2002
From: goodger@python.org (David Goodger)
Date: Wed, 04 Dec 2002 22:16:59 -0500
Subject: [Doc-SIG] looking for prior art
Message-ID: <BA142F5A.2CAC0%goodger@python.org>

I have begun work on a Python source Reader component for Docutils.  I
expect the work to go slowly, as there is lots to absorb, much earlier work
to study and learn from, and little spare time to devote.  I'm trying to
keep it as simple as possible, mostly for my own benefit (lest my brain
explode).

I've looked over the HappyDoc code and Tony "Tibs" Ibbs' PySource prototype.
HappyDoc uses the stdlib "parser" module to parse Python modules into
abstract syntax trees (ASTs), but that seems difficult and fragile, the ASTs
being so low-level.  Tibs' prototype uses the much higher-level ASTs built
by the stdlib "compiler" module, which are much easier to understand.  I've
decided to use the "compiler" module also.

My first stumbling block is in parsing assignments.  I want to extract the
right-hand side (RHS) of assignments straight from the source.  In his
prototype, Tibs rebuilds the RHS from the AST, but that seems rather
roundabout and the results may not match the source perfectly (equivalent,
but not character-for-character).  I think using the "tokenize" module in
parallel with "compiler" may allow the code to extract the raw RHS text, as
well as other raw text that doesn't make it verbatim to the AST.
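
The mismatch between a rebuilt RHS and the source text can be shown with
a small sketch. It assumes a modern Python, where the stdlib ``ast``
module stands in for the "compiler" module discussed here (``compiler``
is not available in Python 3) and ``ast.unparse`` (Python 3.9+) plays
the role of rebuilding the expression from the tree.

```python
import ast

# The right-hand side exactly as it appears in the source file.
ORIGINAL_RHS = "(text * 7\n     + ' whaddyaknow')"
SOURCE = "attr = " + ORIGINAL_RHS + "\n"

# Rebuild the expression from the syntax tree (requires Python 3.9+).
assign = ast.parse(SOURCE).body[0]
rebuilt_rhs = ast.unparse(assign.value)

# The two forms evaluate to the same value ...
namespace = {"text": "ab"}
same_value = eval(rebuilt_rhs, namespace) == eval(ORIGINAL_RHS, namespace)

# ... but the layout (parentheses, line break) is lost in the rebuild.
same_text = rebuilt_rhs == ORIGINAL_RHS
```

Here ``same_value`` is true while ``same_text`` is false, which is the
"equivalent, but not character-for-character" situation described above.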

So, is there any prior art out there?  Any pointers or advice?



From bac@OCF.Berkeley.EDU  Thu Dec  5 07:04:26 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Wed, 4 Dec 2002 23:04:26 -0800 (PST)
Subject: [Doc-SIG] looking for prior art
In-Reply-To: <BA142F5A.2CAC0%goodger@python.org>
References: <BA142F5A.2CAC0%goodger@python.org>
Message-ID: <Pine.SOL.4.50.0212042302520.12797-100000@death.OCF.Berkeley.EDU>

[David Goodger]

> So, is there any prior art out there?  Any pointers or advice?
>

How does PyChecker do it?  I would guess by reading the bytecode, but you
never know.

I would guess using regexes would be the best if you just want to read the
source.  The ``tokenize`` module has all the regexes and they might be
available independently from the methods in the module.

-Brett


From guido@python.org  Thu Dec  5 09:02:09 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Dec 2002 04:02:09 -0500
Subject: [Doc-SIG] looking for prior art
In-Reply-To: Your message of "Wed, 04 Dec 2002 23:04:26 PST."
 <Pine.SOL.4.50.0212042302520.12797-100000@death.OCF.Berkeley.EDU>
References: <BA142F5A.2CAC0%goodger@python.org>
 <Pine.SOL.4.50.0212042302520.12797-100000@death.OCF.Berkeley.EDU>
Message-ID: <200212050902.gB5929m11117@pcp02138704pcs.reston01.va.comcast.net>

> [David Goodger]
> 
> > So, is there any prior art out there?  Any pointers or advice?

[Brett Cannon]
> How does PyChecker do it?  I would guess by reading the bytecode, but you
> never know.

It reads the bytecode, but PyChecker 2.0 will read the source.  The
bytecode is often hard to use; it also changes between versions.

> I would guess using regexes would be the best if you just want to read the
> source.  The ``tokenize`` module has all the regexes and they might be
> available independently from the methods in the module.

I recommend using the tokenize module directly.  See pyclbr.py (in
current CVS; it used to have its own set of regexps) for an example of
how to do this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From doug@hellfly.net  Thu Dec  5 14:04:47 2002
From: doug@hellfly.net (Doug Hellmann)
Date: Thu, 5 Dec 2002 09:04:47 -0500
Subject: [Doc-SIG] looking for prior art
In-Reply-To: <BA142F5A.2CAC0%goodger@python.org>
References: <BA142F5A.2CAC0%goodger@python.org>
Message-ID: <200212051404.JAA08933@branagan.hellfly.net>

On Wednesday 04 December 2002 10:16 pm, David Goodger wrote:
>
> I've looked over the HappyDoc code and Tony "Tibs" Ibbs' PySource
> prototype. HappyDoc uses the stdlib "parser" module to parse Python modules
> into abstract syntax trees (ASTs), but that seems difficult and fragile,
> the ASTs being so low-level.  Tibs' prototype uses the much higher-level
> ASTs built by the stdlib "compiler" module, which are much easier to
> understand.  I've decided to use the "compiler" module also.

I'm pretty sure HappyDoc was written before the compiler module was generally
available, but I can't say for certain.  I've only had to make a few minor
modifications to it in the past, since the language syntax hasn't evolved 
that far.  I'm working on a major overhaul of HappyDoc anyway, so now might 
be the time to rewrite the parsing stuff to use the compiler module.  If 
you're interested in collaborating, let me know.

Doug


From goodger@python.org  Fri Dec  6 02:45:14 2002
From: goodger@python.org (David Goodger)
Date: Thu, 05 Dec 2002 21:45:14 -0500
Subject: [Doc-SIG] looking for prior art
In-Reply-To: <200212051404.JAA08933@branagan.hellfly.net>
Message-ID: <BA157968.2CB81%goodger@python.org>

Doug Hellmann wrote:
> I'm pretty sure HappyDoc was written before the compiler module was
> generally available

I suspected as much.  Either that, or you're a glutton for punishment
;-)

> I've only had to make a few minor modifications to it in the past,
> since the language syntax hasn't evolved that far.

That's good to know.  Still, the parser.suite() approach seems a lot
harder.

> I'm working on a major overhaul of HappyDoc anyway, so now might be
> the time to rewrite the parsing stuff to use the compiler module.
> If you're interested in collaborating, let me know.

I am, definitely.  What I'd like to do is to take a module, read in
the text, run it through the module parser (using compiler.py and
tokenize.py) and produce a high-level AST full of nodes that are
interesting from an auto-documentation standpoint.  For example, given
this module (x.py)::

    # comment

    """Docstring"""

    """Additional docstring"""

    __docformat__ = 'reStructuredText'

    a = 1
    """Attribute docstring"""

    class C(Super):

        """C's docstring"""

        class_attribute = 1
        """class_attribute's docstring"""

        def __init__(self, text=None):
            """__init__'s docstring"""

            self.instance_attribute = (text * 7
                                       + ' whaddyaknow')
            """instance_attribute's docstring"""


    def f(x, y=a*5, *args):
        """f's docstring"""
        return [x + item for item in args]

    f.function_attribute = 1
    """f.function_attribute's docstring"""

The module parser should produce a high-level AST, something like this
(in pseudo-XML_)::

    <Module filename="x.py">
        <Comment lineno=1>
            comment
        <Docstring lineno=3>
            Docstring
        <Docstring lineno=...>           (I'll leave out the lineno's)
            Additional docstring
        <Attribute name="__docformat__">
            <Expression>
                'reStructuredText'
        <Attribute name="a">
            <Expression>
                1
            <Docstring>
                Attribute docstring
        <Class name="C" inheritance="Super">
            <Docstring>
                C's docstring
            <Attribute name="class_attribute">
                <Expression>
                    1
                <Docstring>
                    class_attribute's docstring
            <Method name="__init__" argnames=['self', ('text', 'None')]>
                <Docstring>
                    __init__'s docstring
                <Attribute name="instance_attribute" instance=True>
                    <Expression>
                        (text * 7
                         + ' whaddyaknow')
                    <Docstring>
                        instance_attribute's docstring
        <Function name="f" argnames=['x', ('y', 'a*5'), 'args']
                  varargs=True>
            <Docstring>
                f's docstring
            <Attribute name="function_attribute">
                <Expression>
                    1
                <Docstring>
                    f.function_attribute's docstring

compiler.parse() provides most of what's needed for this AST.  I think
that "tokenize" can be used to get the rest, and all that's left is to
hunker down and figure out how.  We can determine the line number from
the compiler.parse() AST, and a get_rhs(lineno) method would provide
the rest.
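
A ``get_rhs(lineno)`` helper along these lines might look like the
sketch below. It is an assumption-laden illustration, not the
moduleparser.py code: it uses the modern ``ast`` module in place of
"compiler" to find line numbers, the function names and signatures are
hypothetical, and it treats everything after the first ``=`` token of
the logical line as the RHS.

```python
import ast
import io
import tokenize

SAMPLE = (
    "a = 1\n"
    "attr = (text * 7\n"
    "        + ' whaddyaknow')  # trailing comment\n"
)


def get_rhs(source, lineno):
    """Return the raw right-hand side of the assignment starting on *lineno*.

    Simplification: everything after the first ``=`` token of the logical
    line counts as the RHS, so ``a = b = 1`` yields ``b = 1``.
    """
    lines = source.splitlines(keepends=True)
    start = end = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.start[0] < lineno:
            continue  # still before the requested line
        if tok.type in (tokenize.NEWLINE, tokenize.ENDMARKER):
            break  # end of the logical line
        if start is None:
            if tok.type == tokenize.OP and tok.string == "=":
                start = tok.end  # RHS begins right after the "="
        elif tok.type not in (tokenize.COMMENT, tokenize.NL):
            end = tok.end  # position after the last real token so far
    if start is None or end is None:
        raise ValueError(f"no assignment found on line {lineno}")
    (srow, scol), (erow, ecol) = start, end
    if srow == erow:
        text = lines[srow - 1][scol:ecol]
    else:
        middle = "".join(lines[srow:erow - 1])
        text = lines[srow - 1][scol:] + middle + lines[erow - 1][:ecol]
    return text.strip()


def assignments(source):
    """Map each assigned name to its raw RHS, using ast for line numbers."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            found[node.targets[0].id] = get_rhs(source, node.lineno)
    return found
```

Slicing the original lines between token positions, rather than joining
token strings, is what preserves the source layout; the trailing comment
is excluded because comment tokens never advance the end position.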

The Docutils Python reader component will transform this AST into a
Python-specific doctree, and then a `stylist transform`_ would further
transform it into a generic doctree.  Namespaces will have to be
compiled for each of the scopes, but I'm not certain at what stage of
processing.

It's very important to keep all docstring processing out of this, so
that it's completely generic and not tool-specific.

For an overview see:

    http://docutils.sf.net/pep-0258.html#python-source-reader

For very preliminary code see:

    http://docutils.sf.net/docutils/readers/python/moduleparser.py

For tests and example output see:

    http://docutils.sf.net/test/test_readers/test_python/test_parser.py

I have also made some simple scripts to make "compiler", "parser", and
"tokenize" output easier to read.  They use input from the
test_parser.py module above.  See showast, showparse, and showtok in:

    http://docutils.sf.net/test/test_readers/test_python/

.. _pseudo-XML: http://docutils.sf.net/spec/doctree.html#pseudo-xml
.. _stylist transform:
   http://docutils.sf.net/spec/pep-0258.html#stylist-transforms



From doug@hellfly.net  Fri Dec  6 13:27:49 2002
From: doug@hellfly.net (Doug Hellmann)
Date: Fri, 6 Dec 2002 08:27:49 -0500
Subject: [Doc-SIG] looking for prior art
In-Reply-To: <BA157968.2CB81%goodger@python.org>
References: <BA157968.2CB81%goodger@python.org>
Message-ID: <200212061327.IAA10440@branagan.hellfly.net>

On Thursday 05 December 2002 9:45 pm, David Goodger wrote:
> Doug Hellmann wrote:
> > I'm pretty sure HappyDoc was written before the compiler module was
> > generally available
>
> I suspected as much.  Either that, or you're a glutton for punishment
> ;-)

Well, I didn't say that wasn't true.  :-)  I actually started with some 
sample code included in the Python source distribution, so it wasn't too hard 
to extend it and come up with a useful parser.

> > I've only had to make a few minor modifications to it in the past,
> > since the language syntax hasn't evolved that far.
>
> That's good to know.  Still, the parser.suite() approach seems a lot
> harder.

If you're starting from scratch, I would definitely recommend trying the 
compiler module first.

> > I'm working on a major overhaul of HappyDoc anyway, so now might be
> > the time to rewrite the parsing stuff to use the compiler module.
> > If you're interested in collaborating, let me know.
>
> I am, definitely.  What I'd like to do is to take a module, read in
> the text, run it through the module parser (using compiler.py and
> tokenize.py) and produce a high-level AST full of nodes that are
> interesting from an auto-documentation standpoint.  For example, given
> this module (x.py)::

[...]

> compiler.parse() provides most of what's needed for this AST.  I think
> that "tokenize" can be used to get the rest, and all that's left is to
> hunker down and figure out how.  We can determine the line number from
> the compiler.parse() AST, and a get_rhs(lineno) method would provide
> the rest.

Does compiler include comments?  I had to write a separate parser to pull 
comments out.

> The Docutils Python reader component will transform this AST into a
> Python-specific doctree, and then a `stylist transform`_ would further
> transform it into a generic doctree.  Namespaces will have to be
> compiled for each of the scopes, but I'm not certain at what stage of
> processing.

Why perform all of those transformations?  Why not go from the AST to a 
generic doctree?  Or, even from the AST to the final output?

> It's very important to keep all docstring processing out of this, so
> that it's completely generic and not tool-specific.

Definitely.

Doug


From mwh@python.net  Fri Dec  6 13:41:11 2002
From: mwh@python.net (Michael Hudson)
Date: 06 Dec 2002 13:41:11 +0000
Subject: [Doc-SIG] looking for prior art
In-Reply-To: Doug Hellmann's message of "Fri, 6 Dec 2002 08:27:49 -0500"
References: <BA157968.2CB81%goodger@python.org> <200212061327.IAA10440@branagan.hellfly.net>
Message-ID: <2my973b6yw.fsf@starship.python.net>

Doug Hellmann <doug@hellfly.net> writes:

> If you're starting from scratch, I would definitely recommend trying the 
> compiler module first.

Amen.

[...]
> Does compiler include comments? 

No.  tokenize.py does, though.

I don't know how hard it would be to turn the output of tokenize.py
into something like the output of compiler/transformer.py, but with
comments.  SPARK may be your friend...
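
As a minimal sketch of the tokenize side, the stdlib ``tokenize`` module
reports each comment as its own token together with its position. The
helper name ``extract_comments`` is hypothetical, and the code uses the
Python 3 form of the tokenize API.

```python
import io
import tokenize


def extract_comments(source):
    """Return (lineno, text) pairs for every comment in *source*."""
    readline = io.StringIO(source).readline
    return [
        (tok.start[0], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.COMMENT
    ]
```

The line numbers are what would let a tool attach each comment to the
nearest node of a separately built syntax tree.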

Cheers,
M.

-- 
  Two things I learned for sure during a particularly intense acid
  trip in my own lost youth: (1) everything is a trivial special case
  of something else; and, (2) death is a bunch of blue spheres.
                                             -- Tim Peters, 1 May 1998


From goodger@python.org  Sat Dec  7 02:47:58 2002
From: goodger@python.org (David Goodger)
Date: Fri, 06 Dec 2002 21:47:58 -0500
Subject: [Doc-SIG] looking for prior art
In-Reply-To: <200212061327.IAA10440@branagan.hellfly.net>
Message-ID: <BA16CB8D.2CC31%goodger@python.org>

Doug Hellmann wrote:
> Does compiler include comments?  I had to write a separate parser to
> pull comments out.

As Michael said, no.  That's another reason for using compiler and
tokenize in parallel.

>> The Docutils Python reader component will transform this AST into a
>> Python-specific doctree, and then a `stylist transform`_ would
>> further transform it into a generic doctree.  Namespaces will have
>> to be compiled for each of the scopes, but I'm not certain at what
>> stage of processing.
> 
> Why perform all of those transformations?  Why not go from the AST
> to a generic doctree?  Or, even from the AST to the final output?

I want the docutils.readers.python.moduleparser.parse_module()
function to produce a standard documentation-oriented AST that can be
used by any tool.  We can develop it together without having to
compromise on the rest of our design (i.e., HappyDoc doesn't have to
be made to work like Docutils, and vice-versa).  It would be a
higher-level version of what compiler.py provides.

The Python reader component transforms this generic AST into a
Python-specific doctree (it knows about modules, classes, functions,
etc.), but this is specific to Docutils and cannot be used by HappyDoc
or others.  The stylist transform does the final layout, converting
Python-specific structures ("class" sections, etc.) into a generic
doctree using primitives (tables, sections, lists, etc.).  This
generic doctree does *not* know about Python structures any more.  The
advantage is that this doctree can be handed off to any of the output
writers to create any output format we like.

The latter two transforms are separate because I want to be able to
have multiple independent layout styles (multiple runtime-selectable
"stylist transforms").  Each of the existing tools (HappyDoc, pydoc,
epydoc, Crystal, etc.) has its own fixed format.  I personally don't
like the tables-based format produced by these tools, and I'd like to
be able to customize the format easily.  That's the goal of stylist
transforms, which are independent from the Reader component itself.
One stylist transform could produce HappyDoc-like output, another
could produce output similar to module docs in the Python library
reference manual, and so on.
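
The split between a Python-specific tree and runtime-selectable layouts
can be pictured with a toy sketch. All names here (``PyNode``,
``sectioned_stylist``, ``list_stylist``, ``STYLISTS``, ``render``) are
hypothetical and are not the Docutils API; the "generic doctree" is
reduced to a flat list of tuples for brevity.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; none of this is the Docutils API.


@dataclass
class PyNode:
    """A Python-specific node: it still knows it is a class, function, ..."""
    kind: str
    name: str
    docstring: str = ""
    children: list = field(default_factory=list)


def sectioned_stylist(node, depth=1):
    """Lay each Python object out as a titled section."""
    generic = [("section", depth, f"{node.kind} {node.name}")]
    if node.docstring:
        generic.append(("paragraph", node.docstring))
    for child in node.children:
        generic.extend(sectioned_stylist(child, depth + 1))
    return generic


def list_stylist(node, depth=0):
    """Lay the same tree out as a nested list instead."""
    generic = [("list_item", depth, f"{node.name}: {node.docstring}")]
    for child in node.children:
        generic.extend(list_stylist(child, depth + 1))
    return generic


# The layout is chosen at run time; the input tree does not change.
STYLISTS = {"sections": sectioned_stylist, "list": list_stylist}


def render(tree, style):
    """Turn a Python-specific tree into generic primitives."""
    return STYLISTS[style](tree)
```

The output of either stylist contains only generic primitives, so any
writer could consume it without knowing about Python structures.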

It's for exactly this reason:

>> It's very important to keep all docstring processing out of this,
>> so that it's completely generic and not tool-specific.

... but it goes past docstring processing.  It's also important to
keep style decisions and tool-specific data transforms out of this
module parser.



From doug@hellfly.net  Sat Dec  7 13:46:28 2002
From: doug@hellfly.net (Doug Hellmann)
Date: Sat, 7 Dec 2002 08:46:28 -0500
Subject: [Doc-SIG] looking for prior art
In-Reply-To: <BA16CB8D.2CC31%goodger@python.org>
References: <BA16CB8D.2CC31%goodger@python.org>
Message-ID: <200212071346.IAA12102@branagan.hellfly.net>

On Friday 06 December 2002 9:47 pm, David Goodger wrote:
> Doug Hellmann wrote:
> >
> > Why perform all of those transformations?  Why not go from the AST
> > to a generic doctree?  Or, even from the AST to the final output?
>
> I want the docutils.readers.python.moduleparser.parse_module()
> function to produce a standard documentation-oriented AST that can be
> used by any tool.  We can develop it together without having to
> compromise on the rest of our design (i.e., HappyDoc doesn't have to
> be made to work like Docutils, and vice-versa).  It would be a
> higher-level version of what compiler.py provides.

That part makes sense.

> The Python reader component transforms this generic AST into a
> Python-specific doctree (it knows about modules, classes, functions,
> etc.), but this is specific to Docutils and cannot be used by HappyDoc
> or others.  The stylist transform does the final layout, converting
> Python-specific structures ("class" sections, etc.) into a generic
> doctree using primitives (tables, sections, lists, etc.).  This
> generic doctree does *not* know about Python structures any more.  The
> advantage is that this doctree can be handed off to any of the output
> writers to create any output format we like.

Ah.  I handled that differently in HappyDoc.  Instead of building another 
data structure, I set up the API for the formatters to have methods that do 
things like start/end a (sub)section, start/end a list, etc.  The primary 
implementation is an HTML formatter that produces tables, but there are other 
formatters.  The docset is then responsible for calling the right formatter 
method when it wants it.  Having the docset and formatter separate makes 
things more complicated than I expected, so in HappyDoc 3.0 there will just 
be one plugin system.  
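
A formatter API of this start/end style can be sketched as below. The
class and method names are hypothetical and are not HappyDoc's actual
interface, and the HTML produced is simplified to headings and lists
rather than tables.

```python
import html


class Formatter:
    """Hypothetical formatter interface driven by start/end calls."""

    def start_section(self, title):
        raise NotImplementedError

    def end_section(self):
        raise NotImplementedError

    def start_list(self):
        raise NotImplementedError

    def list_item(self, text):
        raise NotImplementedError

    def end_list(self):
        raise NotImplementedError


class HTMLFormatter(Formatter):
    """Collect HTML fragments in ``self.out`` as the calls arrive."""

    def __init__(self):
        self.out = []
        self.depth = 0

    def start_section(self, title):
        self.depth += 1
        self.out.append(f"<h{self.depth}>{html.escape(title)}</h{self.depth}>")

    def end_section(self):
        self.depth -= 1

    def start_list(self):
        self.out.append("<ul>")

    def list_item(self, text):
        self.out.append(f"<li>{html.escape(text)}</li>")

    def end_list(self):
        self.out.append("</ul>")


def write_class_summary(formatter, class_name, method_names):
    """Play the role of the docset: decide what to emit, and when."""
    formatter.start_section(class_name)
    formatter.start_list()
    for name in method_names:
        formatter.list_item(name)
    formatter.end_list()
    formatter.end_section()
```

No intermediate tree is built: the caller drives the output directly,
which is the main difference from the doctree approach quoted above.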

There is a new scanner which walks the input directory building a tree of 
scanned files, doing optional special processing for each based on mimetype.  
For text/x-python files, the file is parsed and information about classes, 
etc. is extracted.  The output formatter walks the resulting tree, also 
doing mimetype-based processing for each file.  HTML and image files will be 
copied from input to output.  Text files are converted using the docstring 
converter, and the parse results from Python modules are used to generate new 
HTML output files.
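
The mimetype-based dispatch described above might be sketched like this.
It is an illustration only: the handler names, the ``HANDLERS`` table
and the ``plan`` function are hypothetical, and the handlers merely
record what would be done instead of touching any files.

```python
import mimetypes

# Register the type explicitly so the sketch does not depend on the
# platform's MIME table.
mimetypes.add_type("text/x-python", ".py")


def copy_file(path):
    return ("copy", path)


def convert_text(path):
    return ("convert-with-docstring-converter", path)


def document_module(path):
    return ("parse-and-generate-html", path)


# Exact types first; "major/*" entries act as a fallback.
HANDLERS = {
    "text/x-python": document_module,
    "text/plain": convert_text,
    "text/html": copy_file,
    "image/*": copy_file,
}


def plan(paths):
    """Return the action each recognised file would receive."""
    actions = []
    for path in paths:
        mimetype, _encoding = mimetypes.guess_type(path)
        if mimetype is None:
            continue  # unknown file type: skip it
        major = mimetype.split("/")[0]
        handler = HANDLERS.get(mimetype) or HANDLERS.get(major + "/*")
        if handler is not None:
            actions.append(handler(path))
    return actions
```

Keeping the table separate from the walking code means a new file type
only needs a new entry, not a change to the scanner.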

I've got the scanner done, and am working on the output formatter code now. 

Doug


From fantasai@escape.com  Sat Dec 14 04:48:21 2002
From: fantasai@escape.com (fantasai)
Date: Fri, 13 Dec 2002 23:48:21 -0500
Subject: [Doc-SIG] reST block quotes
Message-ID: <3DFAB815.70501@escape.com>

hmm.. it's been a while.

ok, so there's a problem with the blockquote syntax
that was one of the first things I noticed: The syntax
relies exclusively on indentation. This means one can't
use indentation for other things--like indenting
sections to make the document structure easier to grasp.

The other problem is that attributions don't seem to
be recognized. It would be nice to put uris in HTML's
'cite' attribute and mark up just regular attributions
as such.

  | An Attribution identifies the source to whom a
  | BlockQuote or Epigraph is ascribed.

  -- http://www.docbook.org/tdg/en/html/attribution.html

So, I'd like to have reST take something like that
(URIs might need something to distinguish them from,
say, people) and translate it into appropriate markup.
An option could require the pipe quoting or another
symbol (e.g. '>') and just treat indented blocks as
regular text.

(I'd gotten the symbol-quoting part to work last year,
but ran into some trouble with the attribution (I was
using a different syntax) and put everything aside for
later.)

So, what do you think?

~fantasai


From goodger@python.org  Sat Dec 14 15:22:41 2002
From: goodger@python.org (David Goodger)
Date: Sat, 14 Dec 2002 10:22:41 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <3DFAB815.70501@escape.com>
Message-ID: <BA20B6EF.2D13D%goodger@python.org>

fantasai wrote:
> ok, so there's a problem with the blockquote syntax
> that was one of the first things I noticed: The syntax
> relies exclusively on indentation.

That's not a problem, that's a *feature*. :)

> This means one can't use indentation for other things--like
> indenting sections to make the document structure easier to grasp.

Not true.  Indentation is used in many local contexts, such as list
items.  In the documentation philosophy embodied in reStructuredText,
section structure through indentation is a *misfeature*.  See "Section
Structure via Indentation" in
<http://docutils.sf.net/spec/rst/problems.html>, and "Questions &
Answers", item 3, in <http://docutils.sf.net/spec/pep-0258.html>.

> The other problem is that attributions don't seem to
> be recognized. It would be nice to put uris in HTML's
> 'cite' attribute and mark up just regular attributions
> as such.
> 
>   | An Attribution identifies the source to whom a
>   | BlockQuote or Epigraph is ascribed.
> 
>   -- http://www.docbook.org/tdg/en/html/attribution.html

I don't see this as a problem either.  It's new functionality.  What
would the result look like?

> So, I'd like to have reST take something like that
> (URIs might need something to distinguish them from,
> say, people) and translate it into appropriate markup.

What would that be?

> An option could require the pipe quoting or another
> symbol (e.g. '>') and just treat indented blocks as
> regular text.
> 
> (I'd gotten the symbol-quoting part to work last year,
> but ran into some trouble with the attribution (I was
> using a different syntax) and put everything aside for
> later.)
> 
> So, what do you think?

I think this may be an appropriate use of a directive.  Directives
offer an easy way to experiment with new features without requiring
new general syntax.  If a feature is useful enough and has appropriate
and unambiguous syntax, it could become a general feature.

Please flesh out the spec more.



From fantasai@escape.com  Sat Dec 14 20:11:55 2002
From: fantasai@escape.com (fantasai)
Date: Sat, 14 Dec 2002 15:11:55 -0500
Subject: [Doc-SIG] reST block quotes
References: <BA20B6EF.2D13D%goodger@python.org>
Message-ID: <3DFB908B.1030208@escape.com>

David Goodger wrote:
 > fantasai wrote:
 >>
 >>The [blockquote] syntax relies exclusively on indentation.
 >>
 >>This means one can't use indentation for other things--like
 >>indenting sections to make the document structure easier to grasp.
 >
 > Not true.  Indentation is used in many local contexts, such as list
 > items.  In the documentation philosophy embodied in reStructuredText,
 > section structure through indentation is a *misfeature*.  See "Section
 > Structure via Indentation" in
 > <http://docutils.sf.net/spec/rst/problems.html>, and "Questions &
 > Answers", item 3, in <http://docutils.sf.net/spec/pep-0258.html>.

I *know* that, and that's why I'm particularly glad reST
doesn't use indentation for structure like STX does. But
because of the blockquote syntax, it doesn't let me use
indentation as text formatting. Document structure is
easier to see if the sections are indented. I would like
to be able to indent sections of reST without triggering
*any* markup construct.

 >>The other problem is that attributions don't seem to
 >>be recognized. It would be nice to put uris in HTML's
 >>'cite' attribute and mark up just regular attributions
 >>as such.
 >>
 >>  | An Attribution identifies the source to whom a
 >>  | BlockQuote or Epigraph is ascribed.
 >>
 >>  -- http://www.docbook.org/tdg/en/html/attribution.html
 >
 >
 > I don't see this as a problem either.  It's new functionality.

I apologize for my inappropriate use of English vocabulary.

  > What would the result look like?

<blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
    An Attribution identifies the source to whom a BlockQuote
    or Epigraph is ascribed.
</blockquote>

 >>An option could require the pipe quoting or another
 >>symbol (e.g. '>') and just treat indented blocks as
 >>regular text.
 >>
 > I think this may be an appropriate use of a directive.

Think what may be an appropriate use of a directive?
Attribution recognition or quoted blockquote recognition
or not recognizing purely indented blocks as blockquotes?

 > Please flesh out the spec more.

    An indented block in which each line begins with the
    same sequence of spaces+(>, |, #) is recognized as
    a blockquote.

    It may be optionally followed by blank lines and an
    attribution.

    The attribution begins with two dashes and a space
    (-- ) which must be indented at least to the preceding
    blockquote's quote character.

    The attribution may be multiple lines, but must be indented
    at least three spaces from the first dash.

     # quoted text
     # quoted text

     -- attribution attribution
        attribution attribution

    If a line in the attribution consists entirely of
    opening and closing angle brackets with a sequence of
    URI characters in between, the line is taken out of
    the attribution text and the URI sequence is put in
    the blockquote's 'cite' attribute.

    All other attribution content is parsed as inline
    content and placed in the attribution element, which
    is a child of the blockquote. In HTML, the attribution
    corresponds to

    <address class="attribution">attribution text</address>

So, this:

   | An Attribution identifies the source to whom a
   | BlockQuote or Epigraph is ascribed.

   -- DocBook: The Definitive Guide
      <http://www.docbook.org/tdg/en/html/attribution.html>

would result in this:

<blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
    An Attribution identifies the source to whom a BlockQuote
    or Epigraph is ascribed.
    <address class="attribution">DocBook: The Definitive Guide</address>
</blockquote>
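
The rules above could be prototyped outside the reST parser roughly as
follows. This is a hedged sketch of the proposal, not Docutils code: the
names ``parse_quote`` and ``to_html`` are hypothetical, it handles a
single quoted block, and it joins the quoted lines into one paragraph.

```python
import html
import re

# A quoted line: optional spaces, one of > | #, then the text.
QUOTE_LINE = re.compile(r"^(?P<prefix> *[>|#]) ?(?P<text>.*)$")
# An attribution line consisting solely of <URI>.
URI_LINE = re.compile(r"^<(?P<uri>[^<>\s]+)>$")


def parse_quote(lines):
    """Parse one quoted block and its optional attribution."""
    quote, prefix, i = [], None, 0
    while i < len(lines):
        match = QUOTE_LINE.match(lines[i])
        # Every line must start with the same spaces + quote character.
        if match is None or prefix not in (None, match.group("prefix")):
            break
        prefix = match.group("prefix")
        quote.append(match.group("text").strip())
        i += 1
    if prefix is None:
        raise ValueError("expected a line starting with >, | or #")
    while i < len(lines) and not lines[i].strip():
        i += 1  # optional blank lines before the attribution
    attribution, cite = [], None
    if i < len(lines):
        stripped = lines[i].lstrip(" ")
        dash = len(lines[i]) - len(stripped)
        # "-- " must be indented at least to the quote character.
        if stripped.startswith("-- ") and dash >= len(prefix) - 1:
            pieces = [stripped[3:]]
            i += 1
            continuation = " " * (dash + 3)
            while (i < len(lines) and lines[i].strip()
                   and lines[i].startswith(continuation)):
                pieces.append(lines[i])
                i += 1
            for piece in pieces:
                piece = piece.strip()
                uri = URI_LINE.match(piece)
                if uri:
                    cite = uri.group("uri")  # goes to the cite attribute
                elif piece:
                    attribution.append(piece)
    return {"quote": " ".join(quote), "cite": cite,
            "attribution": " ".join(attribution)}


def to_html(parsed):
    """Render the parsed quote in the HTML form proposed above."""
    cite = ""
    if parsed["cite"]:
        cite = ' cite="%s"' % html.escape(parsed["cite"])
    out = ["<blockquote%s>" % cite,
           "    " + html.escape(parsed["quote"], quote=False)]
    if parsed["attribution"]:
        out.append('    <address class="attribution">%s</address>'
                   % html.escape(parsed["attribution"], quote=False))
    out.append("</blockquote>")
    return "\n".join(out)
```

Fed the DocBook example above as a list of lines, ``to_html`` yields a
``blockquote`` with the URI in ``cite`` and the remaining attribution
text in an ``address`` element, except that the quoted text comes out on
a single line.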


From goodger@python.org  Sun Dec 15 00:38:46 2002
From: goodger@python.org (David Goodger)
Date: Sat, 14 Dec 2002 19:38:46 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <3DFB908B.1030208@escape.com>
Message-ID: <BA213945.2D1D0%goodger@python.org>

[fantasai]
>>>The [blockquote] syntax relies exclusively on indentation.
>>>
>>>This means one can't use indentation for other things--like
>>>indenting sections to make the document structure easier to grasp.

[David Goodger]
>> Not true.  Indentation is used in many local contexts, such as list
>> items.  In the documentation philosophy embodied in reStructuredText,
>> section structure through indentation is a *misfeature*.  See "Section
>> Structure via Indentation" in
>> <http://docutils.sf.net/spec/rst/problems.html>, and "Questions &
>> Answers", item 3, in <http://docutils.sf.net/spec/pep-0258.html>.

[fantasai]
> I *know* that, and that's why I'm particularly glad reST
> doesn't use indentation for structure like STX does. But
> because of the blockquote syntax, it doesn't let me use
> indentation as text formatting. Document structure is
> easier to see if the sections are indented. I would like
> to be able to indent sections of reST without triggering
> *any* markup construct.

So you'd like to be able to turn off block quote recognition?  It
could be done, probably without too much pain, but care would have to
be taken with other uses of indentation (list items, definition
lists, etc.).

>>> The other problem is that attributions don't seem to
>>> be recognized. It would be nice to put uris in HTML's
>>> 'cite' attribute and mark up just regular attributions
>>> as such.
>>>
>>>  | An Attribution identifies the source to whom a
>>>  | BlockQuote or Epigraph is ascribed.
>>>
>>>  -- http://www.docbook.org/tdg/en/html/attribution.html
>>
>> I don't see this as a problem either.  It's new functionality.
> 
> I apologize for my inappropriate use of English vocabulary.

:)

>> What would the result look like?
> 
> <blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
>     An Attribution identifies the source to whom a BlockQuote
>     or Epigraph is ascribed.
> </blockquote>

I tried looking at this code in a browser, MSIE 5.1.4/MacOS.  Not
state of the art, but the best I have at hand.  The "cite" attribute
didn't actually do anything.  What is it supposed to do?  (How is a
user agent supposed to render a "cite" attribute?)

>>> An option could require the pipe quoting or another
>>> symbol (e.g. '>') and just treat indented blocks as
>>> regular text.
>>>
>> I think this may be an appropriate use of a directive.
> 
> Think what may be an appropriate use of a directive?
> Attribution recognition or quoted blockquote recognition
> or not recognizing purely indented blocks as blockquotes?

Question: are the first two related to the last?  If so, how?

Any of those could be, but I was specifically referring to attribution
recognition and possibly quoted blockquote recognition.  Something
similar to the "quoted blockquote recognition" idea has already been
documented, although as a literal block alternative; see
<http://docutils.sf.net/spec/notes.html> and search for "per-line
quoting".

A "quoted-blockquote" directive could easily be constructed:

    Some ordinary text.
    
    .. quoted-blockquote::
    
        | Block quote text
        | goes here.

(Although I'm not sure if this is what you mean, or has any value.)

The "attribution recognition" idea could be done with a "cite" (or
whatever) directive, something like this:

    Some ordinary text.
    
        A block quote.
        
        .. cite:: This is some citation text, ending with a URI.
           <http://www.example.org/>

A "cite" directive might only be valid inside a block quote, and would
add a "cite" attribute to the block quote element itself.  If it was
useful or popular enough, it could grow special syntax.  I don't know
if "--" at the beginning of the paragraph is enough though; I already
use that style and would be surprised if hyperlinks after "--"
disappeared from the rendered form.

As for turning off indentation->blockquotes, that could be a
pragma-type directive, but would require some changes to the parser to
support it.  I'm not convinced of its usefulness.  Can you provide
some use cases?

>> Please flesh out the spec more.

Thank you.

> So, this:
> 
>    | An Attribution identifies the source to whom a
>    | BlockQuote or Epigraph is ascribed.
> 
>    -- DocBook: The Definitive Guide
>       <http://www.docbook.org/tdg/en/html/attribution.html>
> 
> would result in this:
> 
> <blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
>     An Attribution identifies the source to whom a BlockQuote
>     or Epigraph is ascribed.
>     <address class="attribution">DocBook: The Definitive Guide</address>
> </blockquote>

The markup seems problematic to me.  There are two separate constructs
there, which aren't obviously related.  What if there's a quoted block
without an attribution?  What about an attribution without a quoted
block?  If they were joined into one construct, it would be easier to
digest:

    | An Attribution identifies the source to whom a
    | BlockQuote or Epigraph is ascribed.
    |
    | -- DocBook: The Definitive Guide
    |    <http://www.docbook.org/tdg/en/html/attribution.html>

This syntax would make a block quote difficult to maintain, almost as
bad as a grid table.  I don't see its value.  Use cases?

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From fantasai@escape.com  Sun Dec 15 04:47:23 2002
From: fantasai@escape.com (fantasai)
Date: Sat, 14 Dec 2002 23:47:23 -0500
Subject: [Doc-SIG] reST block quotes
References: <BA213945.2D1D0%goodger@python.org>
Message-ID: <3DFC095B.5040009@escape.com>

David Goodger wrote:
> [fantasai]
>
> So you'd like to be able to turn off block quote recognition?  It
> could be done, probably without too much pain, but care would have to
> be taken with other uses of indentation (list items, definition
> lists, etc.).

Certainly.

>>>What would the result look like?
>>
>><blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
>>    An Attribution identifies the source to whom a BlockQuote
>>    or Epigraph is ascribed.
>></blockquote>
> 
> I tried looking at this code in a browser, MSIE 5.1.4/MacOS.  Not
> state of the art, but the best I have at hand.  The "cite" attribute
> didn't actually do anything.  What is it supposed to do?  (How is a
> user agent supposed to render a "cite" attribute?)

What the UA is supposed to do isn't really specified. Mozilla
provides access to it through the Properties item on the
context menu. The URI can, however, also be made available
with stylesheets and/or scripting.

>>>>An option could require the pipe quoting or another
>>>>symbol (e.g. '>') and just treat indented blocks as
>>>>regular text.
>>>
>>>I think this may be an appropriate use of a directive.
>>
>>Think what may be an appropriate use of a directive?
>>Attribution recognition or quoted blockquote recognition
>>or not recognizing purely indented blocks as blockquotes?
> 
> Question: are the first two related to the last?  If so, how?

Disabling blockquotes altogether isn't a good idea, so
if one were to disable indent block -> blockquote, one
should provide an alternative syntax.

Recognizing an attribution syntax is independent of
either of those.

 > Something similar to the "quoted blockquote recognition" idea
 > has already been documented, although as a literal block
 > alternative; see <http://docutils.sf.net/spec/notes.html> and
 > search for "per-line quoting".

It wouldn't interfere, as that requires a literal block
start sequence. That literal block example, btw, really
should be handled as two blockquotes, one inside the
other, since that's what it *is*.

> A "quoted-blockquote" directive could easily be constructed:
> 
>     Some ordinary text.
>     
>     .. quoted-blockquote::
>     
>         | Block quote text
>         | goes here.
> 
> (Although I'm not sure if this is what you mean, or has any value.)

Adding a directive like that defeats the purpose. I might as
well just write

.. blockquote::
      Block quote text goes here. There's no need for a symbol
      because it's already distinguished from a merely indented
      block.

The quoting syntax I'm using, though, is very common and so
it's non-intrusive as well as intuitive and unambiguous.
(It also allows cut & paste from emails without modification.)

> A "cite" directive might only be valid inside a block quote, and would
> add a "cite" attribute to the block quote element itself.  If it was
> useful or popular enough, it could grow special syntax.  I don't know
> if "--" at the beginning of the paragraph is enough though; I already
> use that style and would be surprised if hyperlinks after "--"
> disappeared from the rendered form.

Why would hyperlinks disappear? Inline markup is recognized
after the "-- ".

> As for turning off indentation->blockquotes, that could be a
> pragma-type directive, but would require some changes to the parser to
> support it.  I'm not convinced of its usefulness.  Can you provide
> some use cases?

Yeah. I just hand-converted an HTML file to plaintext
today to post to a mailing list. I indented every section
underneath its header. e.g.

Heading

   paragraph

   Subheading

     paragraph

     paragraph

   Subheading

     example

     paragraph

I would like to be able to do that in an reST doc.

>>   | An Attribution identifies the source to whom a
>>   | BlockQuote or Epigraph is ascribed.
>>
>>   -- DocBook: The Definitive Guide
>>      <http://www.docbook.org/tdg/en/html/attribution.html>
>>
>>would result in this:
>>
>><blockquote cite="http://www.docbook.org/tdg/en/html/attribution.html">
>>    An Attribution identifies the source to whom a BlockQuote
>>    or Epigraph is ascribed.
>>    <address class="attribution">DocBook: The Definitive Guide</address>
>></blockquote>
> 
> The markup seems problematic to me.  There are two separate constructs
> there, which aren't obviously related.  What if there's a quoted block
> without an attribution?

If there's no attribution, it doesn't get an attribution.
Parsing continues as usual.

>  What about an attribution without a quoted block? 

No special treatment. It will be handled as it is now.
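
A rough sketch of the three cases above (quote plus attribution, quote
alone, attribution alone), in case it helps the discussion. Nothing here
is Docutils code: ``split_blocks``, ``is_quoted``, ``parse_attribution``
and ``render`` are hypothetical names, and the HTML is only the output
proposed earlier in this thread, not anything a current writer emits.

```python
import html
import re


def split_blocks(text):
    """Split text into paragraphs (lists of stripped lines) on blank lines."""
    blocks, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def is_quoted(block):
    """A block is a quoted block if every line starts with the pipe marker."""
    return all(line == "|" or line.startswith("| ") for line in block)


def parse_attribution(block):
    """Return (label, uri) for a paragraph starting with '-- ', else None."""
    joined = " ".join(block)
    if not joined.startswith("-- "):
        return None
    match = re.match(r"-- (.*?)\s*(?:<([^<>\s]+)>)?$", joined)
    return match.group(1), match.group(2)


def render(text):
    """Render the proposed syntax as HTML (assumed output format)."""
    blocks = split_blocks(text)
    out = []
    i = 0
    while i < len(blocks):
        block = blocks[i]
        if is_quoted(block):
            quote = " ".join(line[2:] for line in block if line != "|")
            following = blocks[i + 1] if i + 1 < len(blocks) else None
            attribution = parse_attribution(following) if following else None
            if attribution:
                label, uri = attribution
                cite = f' cite="{html.escape(uri)}"' if uri else ""
                out.append(
                    f"<blockquote{cite}>{html.escape(quote)}"
                    f'<address class="attribution">{html.escape(label)}</address>'
                    "</blockquote>"
                )
                i += 2  # the attribution paragraph was consumed by the quote
                continue
            out.append(f"<blockquote>{html.escape(quote)}</blockquote>")
        else:
            # Includes a "-- ..." paragraph with no quoted block before it:
            # no special treatment, it stays an ordinary paragraph.
            out.append(f"<p>{html.escape(' '.join(block))}</p>")
        i += 1
    return "\n".join(out)
```

The attribution is only looked for in the paragraph immediately after a
quoted block, which is the one design choice that matters here.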

> If they were joined into one construct, it would be easier to
> digest:
> 
>     | An Attribution identifies the source to whom a
>     | BlockQuote or Epigraph is ascribed.
>     |
>     | -- DocBook: The Definitive Guide
>     |    <http://www.docbook.org/tdg/en/html/attribution.html>

That could be construed as quoting the citation. That is,
the quoted text has an attribution, and you're quoting
that attribution. I think this would actually be more
difficult to parse, because one would have to know
whether this "attribution" is the last block of the
blockquote to determine whether or not it gets parsed
as an attribution.

~fantasai


From goodger@python.org  Sun Dec 15 15:16:57 2002
From: goodger@python.org (David Goodger)
Date: Sun, 15 Dec 2002 10:16:57 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <3DFC095B.5040009@escape.com>
Message-ID: <BA220718.2D223%goodger@python.org>

[fantasai]
>>> Think what may be an appropriate use of a directive?
>>> Attribution recognition or quoted blockquote recognition
>>> or not recognizing purely indented blocks as blockquotes?

[David Goodger]
>> Question: are the first two related to the last?  If so, how?

[fantasai]
> Disabling blockquotes altogether isn't a good idea, so
> if one were to disable indent block -> blockquote, one
> should provide an alternative syntax.

So, if the "disabling blockquotes" proposal is *not* accepted, the
alternative syntax wouldn't be required?  If that's not the case,
please justify them independently.  I see potential value for quoted
blockquote recognition in an email context (more below), but not for
disabling ordinary blockquotes.

>> Something similar to the "quoted blockquote recognition" idea
>> has already been documented, although as a literal block
>> alternative; see <http://docutils.sf.net/spec/notes.html> and
>> search for "per-line quoting".
> 
> It wouldn't interfere, as that requires a literal block
> start sequence. That literal block example, btw, really
> should be handled as two blockquotes, one inside the
> other, since that's what it *is*.

Take another look: """Generalize the "literal block" construct to
allow blocks with a per-line quoting to avoid indentation?"""  Last
three words are the whole point.

> The quoting syntax I'm using, though, is very common and so
> it's non-intrusive as well as intuitive and unambiguous.
> (It also allows cut & paste from emails without modification.)

Since early on there's been talk about an "Email Reader", which would
handle quoted segments of messages, signatures, and other
email-specific constructs.  Perhaps that's where effort should be
directed?  I wouldn't want to accept a proposal that was incompatible
with an Email Reader (even a future potential Email Reader).

>> A "cite" directive might only be valid inside a block quote, and
>> would add a "cite" attribute to the block quote element itself.  If
>> it was useful or popular enough, it could grow special syntax.  I
>> don't know if "--" at the beginning of the paragraph is enough
>> though; I already use that style and would be surprised if
>> hyperlinks after "--" disappeared from the rendered form.
> 
> Why would hyperlinks disappear? Inline markup is recognized
> after the "-- ".

If the hyperlink is subsumed into the <blockquote>'s "cite" attribute,
in all but the most cutting-edge browsers (if then) it's as good as
invisible.  If something so drastic is happening to data, I think the
markup should be much more explicit and distinctive.

>> As for turning off indentation->blockquotes, that could be a
>> pragma-type directive, but would require some changes to the parser
>> to support it.  I'm not convinced of its usefulness.  Can you
>> provide some use cases?
> 
> Yeah. I just hand-converted an HTML file to plaintext
> today to post to a mailing list. I indented every section
> underneath its header. e.g.
> 
> Heading
> 
>    paragraph
> 
>    Subheading
> 
>      paragraph
> 
>      paragraph
> 
>    Subheading
> 
>      example
> 
>      paragraph
> 
> I would like to be able to do that in an reST doc.

You can do that with current reST; you would end up with nested block
quotes.  I don't get it.  Can you show me an example of how you'd like
to apply this concept to reStructuredText sources?

>>>    | An Attribution identifies the source to whom a
>>>    | BlockQuote or Epigraph is ascribed.
>>> 
>>>    -- DocBook: The Definitive Guide
>>>       <http://www.docbook.org/tdg/en/html/attribution.html>
>>>
>>> would result in this:
>>>
>>> <blockquote
>>> cite="http://www.docbook.org/tdg/en/html/attribution.html">
>>>     An Attribution identifies the source to whom a BlockQuote
>>>     or Epigraph is ascribed.
>>>     <address class="attribution">DocBook: The Definitive
>>>     Guide</address>
>>> </blockquote>
>> 
>> The markup seems problematic to me.  There are two separate
>> constructs there, which aren't obviously related.  ...
>>  What about an attribution without a quoted block?
> 
> No special treatment. It will be handled as it is now.

How is it handled now?

>> If they were joined into one construct, it would be easier to
>> digest:
>> 
>>     | An Attribution identifies the source to whom a
>>     | BlockQuote or Epigraph is ascribed.
>>     |
>>     | -- DocBook: The Definitive Guide
>>     |    <http://www.docbook.org/tdg/en/html/attribution.html>
> 
> That could be construed as quoting the citation. That is,
> the quoted text has an attribution, and you're quoting
> that attribution.

The block quote part could require quote char + whitespace, and the
attribution could omit the whitespace.  That would disambiguate it:

    | An Attribution identifies the source to whom a
    | BlockQuote or Epigraph is ascribed.
    |
    |-- DocBook: The Definitive Guide
    |   <http://www.docbook.org/tdg/en/html/attribution.html>
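
A minimal sketch of that rule, assuming the attribution runs from the
first "|--" line to the end of the block. ``split_quote`` is a
hypothetical helper written for this message, not parser code:

```python
def split_quote(lines):
    """Split a pipe-quoted block into (quote_text, attribution_or_None).

    "| text" (pipe + whitespace) is quote text; "|--" (no whitespace)
    starts the attribution, which continues to the end of the block.
    """
    quote, attribution = [], []
    in_attribution = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("|"):
            raise ValueError(f"not a quoted line: {raw!r}")
        body = line[1:]
        if body.startswith("--"):
            in_attribution = True
            text = body[2:].strip()
        else:
            text = body.strip()
        if not text:
            continue  # a bare "|" line only separates paragraphs
        (attribution if in_attribution else quote).append(text)
    return " ".join(quote), (" ".join(attribution) or None)
```

With this rule a line such as "| -- DocBook" stays part of the quoted
text, because whitespace follows the pipe.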

But I still don't see the value of the attribution proposal, as
opposed to simply rendering the attribution as an ordinary paragraph.
I'd like to see a concrete example where the results would be
*useful*.  HTML's <blockquote cite="..."> doesn't seem to have
universal support.  How should attributions be handled for current
browsers (back to Netscape 4)?

Even more fundamentally, how should attributions be marked up in the
Docutils internal doctree?  I.E., what changes to the "block_quote"
element in spec/docutils.dtd?

> I think this would actually be more
> difficult to parse, because one would have to know
> whether this "attribution" is the last block of the
> blockquote to determine whether or not it gets parsed
> as an attribution.

That's easy to know.  What's not so easy (programmatically or to the
human eye) is to link two successive elements that don't have a
logical containing context.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From cben@techunix.technion.ac.il  Sun Dec 15 17:13:39 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Sun, 15 Dec 2002 19:13:39 +0200 (IST)
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <BA220718.2D223%goodger@python.org>
Message-ID: <Pine.GSO.4.44_heb2.09.0212151851400.11885-100000@techunix.technion.ac.il>

On 2002-12-15, David Goodger wrote:

> I wouldn't want to accept a proposal that was incompatible
> with an Email Reader (even a future potential Email Reader).
>[snip]
> > after the "-- ".

Then note that "-- " is the standard signature separator.  Since then it's
alone on the line, this is not an issue, just a point to document.

Also note that I use " -- " for a long dash -- probably a LaTeX-induced
habit; I saw some other people writing so.  Stupid word wrapping can well
put "-- " at the beginning of a line in running text.  Again not an issue,
just document that "-- " must come after an empty line (?).
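
To make the two points concrete, here is a small sketch of how a reader
might tell the cases apart. ``classify_dash_line`` is a made-up name, and
treating the first line of the input as "after an empty line" is my own
assumption:

```python
def classify_dash_line(lines, index):
    """Classify lines[index] under the rules suggested above.

    - "-- " alone on the line: signature separator.
    - "-- text" right after an empty line: attribution candidate.
    - anything else (e.g. a wrapped long dash): ordinary text.
    """
    line = lines[index]
    if line == "-- ":
        return "signature-separator"
    if line.startswith("-- "):
        after_blank = index == 0 or not lines[index - 1].strip()
        return "attribution" if after_blank else "text"
    return "text"
```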

About email reading, also note that ">>> " becomes ambiguous between
doctest blocks and some email clients that compact nested "> " quoting by
omitting the spaces.  And while we are there, how about "initials> "
quoting?  Also the "On Someday, Random Writer wrote:" is probably an
attribution too.  Now how do you handle a quote that's broken in the
middle and resumed?  Add to that nesting...

>[snip]
> You can do that with current reST; you would end up with nested block
> quotes.  I don't get it.  Can you show me an example of how you'd like
> to apply this concept to reStructuredText sources?
>
I think the proposal is similar to the complaints of C coders coming to
Python -- they want indentation to have no meaning so they can make it
reflect the program's structure but *in the way they like it*.  I'm not
saying that the request should be rejected based on this analogy and
Python's rejection of such requests (maybe it would be nicer in reST).
If you ask my personal opinion, I'm quite happy with reST's current style
(perhaps modulo allowing indented bulleted lists instead of empty lines
but I'm not settled on it).

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From goodger@python.org  Sun Dec 15 20:46:22 2002
From: goodger@python.org (David Goodger)
Date: Sun, 15 Dec 2002 15:46:22 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212151851400.11885-100000@techunix.technion.ac.il>
Message-ID: <BA22544D.2D22B%goodger@python.org>

Beni Cherniavsky wrote:
... a bunch of email-related details omitted

Yes, there are many thorny issues wrt Email context.  That's why I
don't want to make any premature decisions, and why I invite anyone
who's interested to look into the issues.

> If you ask my personal opinion, I'm quite happy with reST's current
> style (perhaps modulo allowing indented bulleted lists instead of
> empty lines but I'm not settled on it).

Not following you.  Can you elaborate please?

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From lists@morpheus.demon.co.uk  Sun Dec 15 17:30:00 2002
From: lists@morpheus.demon.co.uk (Paul Moore)
Date: Sun, 15 Dec 2002 17:30:00 +0000
Subject: [Doc-SIG] reST block quotes
References: <3DFC095B.5040009@escape.com> <BA220718.2D223%goodger@python.org>
Message-ID: <n2m-g.adj78a1z.fsf@morpheus.demon.co.uk>

David Goodger <goodger@python.org> writes:

>>    Subheading
>> 
>>      example
>> 
>>      paragraph
>> 
>> I would like to be able to do that in an reST doc.
>
> You can do that with current reST; you would end up with nested
> block quotes.  I don't get it.  Can you show me an example of how
> you'd like to apply this concept to reStructuredText sources?

Sorry to butt in. I think the OP's point is that he wants to be able
to indent the source text, for readability of the source, *without*
having any effect on markup. So, for example::

    Section 1
    ---------

        This is section 1. I have indented it simply so that it shows
        up more clearly in the source text. I am *not* looking for a
        blockquote construct in the processed output.

I'm not entirely sure I agree with the suggestion, but I do sympathise
with it - I often use indentation in plain text postings, and it's very
rarely for something I'd call a "block quote". But I haven't analysed
my "natural tendencies" against reST conventions, to be sure of this.

> HTML's <blockquote cite="..."> doesn't seem to have universal
> support.  How should attributions be handled for current browsers
> (back to Netscape 4)?

In IE6 on Windows 2000, the cite attribute seems to completely
disappear. I can't get it displayed, no matter what I do. Given this
fact, I'd avoid the cite construct like the plague - it involves a
serious risk of information just "disappearing".

Paul.
-- 
This signature intentionally left blank


From cben@techunix.technion.ac.il  Sun Dec 15 23:24:34 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Mon, 16 Dec 2002 01:24:34 +0200 (IST)
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <BA22544D.2D22B%goodger@python.org>
Message-ID: <Pine.GSO.4.44_heb2.09.0212152327300.16160-100000@techunix.technion.ac.il>

On 2002-12-15, David Goodger wrote:

> Beni Cherniavsky wrote:
> ... a bunch of email-related details omitted
>
> Yes, there are many thorny issues wrt Email context.  That's why I
> don't want to make any premature decisions, and why I invite anyone
> who's interested to look into the issues.
>
All right, I'll start thinking about the emails I write "how will this
interact with rST?" ;-)  Of course this is biased since I already use many
parts of rST in emails.

> > If you ask my personal opinion, I'm quite happy with reST's current
> > style (perhaps modulo allowing indented bulleted lists instead of
> > empty lines but I'm not settled on it).
>
> Not following you.  Can you elaborate please?
>
I meant that:
- I don't myself feel the need to free up indentation (but that doesn't
  mean that others don't).
- I'm not entirely happy with making empty lines around lists.  It takes
  to much real estate, especially if I make empty lines between items.  So
  I don't but then the list looks too isolated from the paragraph [1]_.
  - I understand that omitting the empty line, as is, would create an
    ambiguity.  It's not even clear to my eyes.
     - Demanding an extra space before the bullet would remove the
       ambiguity, I think.  Currently this means a list in a blockquote
       or a list in the definition of a definition list, depending on
       presence of empty line before.  Both are infrequently needed, so
       the empty comment hack looks acceptable to me (but I'm biased).
  - I'm sure I saw this discussed somewhere already but I can't find it in
    ``alternatives.txt`` [2]_...

.. [1] This reminds me of a different concern I had.  Some markup models
   (LaTeX and my brain ;-) think of paragraphs as logical beasts.  A
   paragraph could contain a list
    - (like this)
   or other things (especially blockquotes) and then continue.  There are
   three more combinations:
    - The thing is part of the previous paragraph, a new paragraph starts
      after it.

   - The thing can be a logical paragraph on its own.

    - The thing starts a new paragraph.
   Seems rare but consider a text where each quote is followed by some
   comments (as in emails).

   Most markup models (HTML, current rST) treat a paragraph as an atomic
   piece of text.  Any other construct terminates it.  But look at any
   book - it's not so!  LaTeX renders a new paragraph indented and a
   resumed paragraph without indentation.  Math formula "displays" are
   another example for things that could be part of a paragraph...

   So I want a way to represent the distinctions.
    - As you see, the space-before-bullet format allows to express it for
      lists.  However blockquotes are not discernable from definition
      lists then
         (if the paragraph above would be one-line).

      - I'm not sure how to solve it.  Scanning the spec, it seems that
        only blockquotes create problems.  Maybe some explicit
        blockquote-marking syntax is needed after all.  This time an empty
        comment won't cut it.  But I don't see a good one.  Then maybe a
        definition list should be explicit.  How about terminating each
        definition line with " --" (removed in the output)?

    - Just tried putting a list in a substitution::
          Text |sub| text.

          .. |sub| replace::
             - Foo
             - Bar
      Didn't work.  (See, here I wanted the text-literal-text to form one
      logical paragraph).  I'm not sure it should work but it indicates
      the big issue -- the model that a paragraph contains no other
      elements must be abandoned to support this concept.

.. [2] Unrelated question: when should I use literal text (``),
   interpreted text (`) and no quoting?
    - What's the red line between an identifier and a piece of Python
      code?  If I refer to variable `foo` that's interpreted; if I refer
      to ``a() + b()``, that should probably be literal; what about
      `m.bar` where m is not a class or variable in current scope but
      conventionally stands for any "Matcher" object (there are many
      matcher classes) in some library I'm writing?
    - Should I put all filenames in literal quotes?  To a human it's
      already discernible when there is an extension (foo.py) so I'm not
      sure.
    - Generally the docs (including the PEPs) need some more discussion on
      where actually to use interpreted text...

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From fantasai@escape.com  Mon Dec 16 04:16:41 2002
From: fantasai@escape.com (fantasai)
Date: Sun, 15 Dec 2002 23:16:41 -0500
Subject: [Doc-SIG] reST block quotes
References: <Pine.GSO.4.44_heb2.09.0212152327300.16160-100000@techunix.technion.ac.il>
Message-ID: <3DFD53A9.4010503@escape.com>

Beni Cherniavsky wrote:
 >
>    Most markup models (HTML, current rST) treat a paragraph as an atomic
>    piece of text.  Any other construct terminates it.  But look at any
>    book - it's not so!

Just a note: DocBook allows for this. <para> can contain
block-level content.
   http://www.docbook.org/tdg/en/html/para.html


From goodger@python.org  Mon Dec 16 04:35:40 2002
From: goodger@python.org (David Goodger)
Date: Sun, 15 Dec 2002 23:35:40 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212152327300.16160-100000@techunix.technion.ac.il>
Message-ID: <BA22C24B.2D258%goodger@python.org>

[Beni Cherniavsky]
> All right, I'll start thinking about the emails I write "how will
> this interact with rST?" ;-)  Of course this is biased since I
> already use many parts of rST in emails.

It's the parts of email messages not part of reStructuredText that
are interesting.

[Beni Cherniavsky]
>>> If you ask my personal opinion, I'm quite happy with reST's current
>>> style (perhaps modulo allowing indented bulleted lists instead of
>>> empty lines but I'm not settled on it).

[David Goodger]
>> Not following you.  Can you elaborate please?

[Beni Cherniavsky]
> I meant that:
> - I don't myself feel the need to free up indentation (but that
>   doesn't mean that others don't).
> - I'm not entirely happy with making empty lines around lists.  It
>   takes too much real estate, especially if I make empty lines
>   between items.

Something's gotta give.  For zero ambiguity (within reStructuredText's
framework), blank lines before & after lists are essential.

>   So I don't but then the list looks too isolated
>   from the paragraph [1]_.
>   - I understand that omitting the empty line, as is, would create an
>     ambiguity.  It's not even clear to my eyes.
>      - Demanding an extra space before the bullet would remove the
>        ambiguity, I think.

Too subtle, IMHO.

>        Currently this means a list in a blockquote
>        or a list in the definition of a definition list, depending on
>        presence of empty line before.

Definition list only in the case of a single line before the indent.

>        Both are infrequently needed, so
>        the empty comment hack looks acceptable to me (but I'm biased).

Frequently enough.

> .. [1] This reminds me of a different concern I had.  Some markup
> models (LaTeX and my brain ;-) think of paragraphs as logical beasts.

reStructuredText (and Docutils) treat paragraphs as physical.  It
would be impossible to reliably infer logical paragraph semantics from
plaintext sources.  The debate over physical model (a paragraph is a
block in the document flow) vs. logical model (paragraphs can contain
lists and block quotes and equations and others) has been around for
a long time and I don't see any resolution.  Personally, I prefer the
physical model, not least because it results in a much simpler DTD.
The logical model opens up a big can of worms.

>    A paragraph could contain a list
>     - (like this)
>    or other things (especially blockquotes) and then continue.
>    There are three more combinations:
>     - The thing is part of the previous paragraph, a new paragraph
>       starts after it.
> 
>    - The thing can be a logical paragraph on its own.
> 
>     - The thing starts a new paragraph.

I'm not sure what "the thing" is or where you're going with this.  If
the text of your message was meant as an example of what you're
proposing, I find it very hard to follow the structure.

>    Seems rare but consider a text where each quote is followed by some
>    comments (as in emails).

Not following you.

>    So I want a way to represent the distinctions.

Not worth the trouble IMHO.

>     - Just tried putting a list in a substitution::
>           Text |sub| text.
> 
>           .. |sub| replace::
>              - Foo
>              - Bar
>       Didn't work.

Substitutions have to be phrase-level.  I can't remember if the parser
checks or not; if not, it'll go on the to-do.

>       (See, here I wanted the text-literal-text to form one logical
>       paragraph).  I'm not sure it should work but it indicates the
>       big issue -- the model that a paragraph contains no other
>       elements must be abandoned to support this concept.

And, as I said, I don't think it's worth the effort even if it were
feasible (which I doubt).  The computer doesn't really care that

    <p>Beginning of paragraph</p>
    <ul>
        <li>item one</li>
        <li>item two</li>
    </ul>
    <p>continuation of paragraph.</p>

is a single logical paragraph (not to mention *this* paragraph!).
The human reader picks it up right away though.  Only if there's a
first-line paragraph indent would it matter to the reader.

> .. [2] Unrelated question: when should I use literal text (``),
>    interpreted text (`) and no quoting?

Interpreted text hasn't really been implemented yet.  Its main client
will be the Python Source Reader, which is a work in progress.

>     - What's the red line between an identifier and a piece of
>       Python code?  If I refer to variable `foo` that's interpreted;
>       if I refer to ``a() + b()``, that should probably be literal;
>       what about `m.bar` where m is not a class or variable in
>       current scope but conventionally stands for any "Matcher"
>       object (there are many matcher classes) in some library I'm
>       writing?

`m.bar` would generate a warning or error.  The identifier must be in
the current namespace, with possible exceptions for stdlib modules.
In the case you're describing, I'd use inline literals ``m.bar``.

>     - Should I put all filenames in literal quotes?

Up to you and your document's context.

>     - Generally the docs (including the PEPs) need some more
>       discussion on where actually to use interpreted text...

When there is a use for them, the docs will discuss them.  Until
then, it would just be confusing.  It already *is* confusing, some
would say.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From goodger@python.org  Mon Dec 16 04:38:36 2002
From: goodger@python.org (David Goodger)
Date: Sun, 15 Dec 2002 23:38:36 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <3DFD53A9.4010503@escape.com>
Message-ID: <BA22C2FB.2D259%goodger@python.org>

Beni Cherniavsky wrote:
>> Most markup models (HTML, current rST) treat a paragraph as an atomic
>> piece of text.  Any other construct terminates it.  But look at any
>> book - it's not so!

fantasai wrote:
> Just a note: DocBook allows for this. <para> can contain
> block-level content.
>     http://www.docbook.org/tdg/en/html/para.html

And the result is a mess, IMO.  Put a list inside a paragraph, and then
you'll have paragraphs inside paragraphs!  I don't consider a paragraph
to be a recursive element.  That's just my personal opinion though.

-- David Goodger    goodger@python.org


From fantasai@escape.com  Mon Dec 16 04:39:00 2002
From: fantasai@escape.com (fantasai)
Date: Sun, 15 Dec 2002 23:39:00 -0500
Subject: [Doc-SIG] reST block quotes
References: <Pine.GSO.4.44_heb2.09.0212151851400.11885-100000@techunix.technion.ac.il>
Message-ID: <3DFD58E4.1040005@escape.com>

Beni Cherniavsky wrote:
  >
 > Then note that "-- " is the standard signature separator.  Since then it's
 > alone on the line, this is not an issue, just a point to document.
 >
 > Also note that I use " -- " for a long dash -- probably a LaTeX-induced
 > habit; I saw some other people writing so.  Stupid word wrapping can well
 > put "-- " at the beginning of a line in running text.  Again not an issue,
 > just document that "-- " must come after an empty line (?).

Would using three dashes solve these problems?

 > About email reading, also note that ">>> " becomes ambiguous between
 > doctest blocks and some email clients that compact nested "> " quoting by
 > omitting the spaces.

Yes, that is true. That means either quoted blocks would
have to be implemented as an option, defaulting to 'off'
for backwards-compatibility, or at least one space must
be required between quote characters.
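
A minimal sketch of the second option, in Python.  It is not taken from
Docutils; the names ``QUOTE_LEVEL``, ``split_quote_prefix`` and
``is_doctest_line`` are hypothetical, and the rule assumed here is that
every quote level is a ">" followed by at least one space (or the end of
the line).  Under that assumption a compacted ">>> " never counts as three
quote levels, so it stays available as a doctest prompt:

```python
import re

# Assumed rule: one quote level is ">" followed by one or more spaces,
# or ">" alone at the end of the line.
QUOTE_LEVEL = re.compile(r">(?: +|$)")


def split_quote_prefix(line):
    """Return (depth, rest): the quote depth and the text after the quoting."""
    depth = 0
    pos = 0
    while True:
        match = QUOTE_LEVEL.match(line, pos)
        if match is None:
            break
        depth += 1
        pos = match.end()
    return depth, line[pos:]


def is_doctest_line(text):
    """True if the (already unquoted) text starts with a doctest prompt."""
    return text.startswith(">>> ")
```

With this rule, "> > > text" has depth 3, while ">>> print(1)" has depth 0
and is left for the doctest check.  Mail written by clients that compact
nested quotes would have to be re-spaced first.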

Requiring at least one space before the quote character
might not be a bad idea. It improves readability IMO.

 > And while we are there, how about "initials> " quoting?

That can be dealt with later. It's not nearly as common,
and it's even less important for processing documents
(as opposed to emails and newsgroup posts), which is what
most reST files are.

 > Also the "On Someday, Random Writer wrote:" is probably an
 > attribution too.

It is, but it's not practical to parse that since
people use so many different formats. It would have
to be treated as a paragraph, which really isn't
that bad.

 > Now how do you handle a quote that's broken in the middle and resumed?

As multiple blockquotes. How would you do it with the
current syntax?

Come to think of it, the current syntax can't really
handle nested blockquotes well, can it? Not if there's
a quote at the beginning of another quote.

~fantasai


From goodger@python.org  Mon Dec 16 04:51:33 2002
From: goodger@python.org (David Goodger)
Date: Sun, 15 Dec 2002 23:51:33 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <3DFD58E4.1040005@escape.com>
Message-ID: <BA22C604.2D25D%goodger@python.org>

[Beni Cherniavsky]
>> About email reading, also note that ">>> " becomes ambiguous between
>> doctest blocks and some email clients that compact nested "> "
>> quoting by omitting the spaces.

[fantasai wrote]
> Yes, that is true. That means either quoted blocks would
> have to be implemented as an option, defaulting to 'off'
> for backwards-compatibility, or at least one space must
> be required between quote characters.

The Email Reader would probably have to disable doctest blocks unless
explicitly requested.

> Requiring at least one space before the quote character
> might not be a bad idea. It improves readability IMO.

Obvious from your message ;).  My emailer doesn't do it that way
though and it would be onerous to require it.

> Come to think of it, the current syntax can't really
> handle nested blockquotes well, can it? Not if there's
> a quote at the beginning of another quote.

Try it; works fine.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From fantasai@escape.com  Mon Dec 16 05:51:57 2002
From: fantasai@escape.com (fantasai)
Date: Mon, 16 Dec 2002 00:51:57 -0500
Subject: [Doc-SIG] reST block quotes
References: <BA22C604.2D25D%goodger@python.org>
Message-ID: <3DFD69FD.8070302@escape.com>

David Goodger wrote:
 >
> The Email Reader would probably have to disable doctest blocks unless
> explicitly requested.
> 
>>Requiring at least one space before the quote character
>>might not be a bad idea. It improves readability IMO.
> 
> Obvious from your message ;).  My emailer doesn't do it that way
> though and it would be onerous to require it.

It does solve the doctest problem, though. The requirement
could be switched off with doctest blocks. We'll have
a "Python documentation mode" and an "everything else"
mode. ;)

>>Come to think of it, the current syntax can't really
>>handle nested blockquotes well, can it? Not if there's
>>a quote at the beginning of another quote.
> 
> Try it; works fine.

Impressive. :)

I don't think the source would be very easy to follow
with a complicated set of quotes, but that is mostly
an electronic message problem. Most documents don't
even have two levels of nested quotes.

~fantasai


From fantasai@escape.com  Mon Dec 16 06:41:47 2002
From: fantasai@escape.com (fantasai)
Date: Mon, 16 Dec 2002 01:41:47 -0500
Subject: [Doc-SIG] compact HTML output from Docutils
In-Reply-To: <B9638EDE.26271%goodger@users.sourceforge.net>
References: <B9638EDE.26271%goodger@users.sourceforge.net>
Message-ID: <3DFD75AB.3040404@escape.com>

David Goodger wrote:
>
> - Check for and omit <p> tags in "simple" lists: list items contain
>   either a single paragraph, a nested simple list, or a paragraph
>   followed by a nested simple list. 

It would be more flexible for the author if you based the
omission of <p> tags in single-paragraph lists on whether
the list is spaced out or not. For example, this:

   - apples
   - oranges
   - pears

would not have paragraph tags whereas this:

   - This is really a paragraph, even though it's the only
     block of content in the list item.

   - A paragraph is the basic structural unit in prose.

would.

Another option is to trigger <p> tags only for multi-line
paragraphs.
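
A rough Python sketch of the "spaced out" test, assuming a flat bullet list
given as source text.  The function name ``is_spaced_out`` is made up for
illustration and nothing here reflects how the Docutils HTML writer
actually decides; nested lists are deliberately not handled:

```python
def is_spaced_out(list_source):
    """Return True if a blank line separates two items of a flat bullet list."""
    seen_item = False      # have we passed at least one bullet item?
    blank_pending = False  # was the previous line blank (after an item)?
    for line in list_source.strip("\n").splitlines():
        stripped = line.strip()
        if not stripped:
            blank_pending = seen_item
            continue
        if stripped.startswith(("- ", "* ", "+ ")):
            if seen_item and blank_pending:
                return True
            seen_item = True
        blank_pending = False
    return False


tight = "- apples\n- oranges\n- pears\n"
loose = (
    "- This is really a paragraph, even though it's the only\n"
    "  block of content in the list item.\n"
    "\n"
    "- A paragraph is the basic structural unit in prose.\n"
)
```

Here ``is_spaced_out(tight)`` is false (no <p> tags) and
``is_spaced_out(loose)`` is true (keep the <p> tags).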

> - Regardless of the above, in definitions, table cells, field bodies,
>   option descriptions, and list items, mark the first child with
>   'class="first"' if it is a paragraph.  The stylesheet sets the top
>   margin to 0 for these paragraphs.

Have you tried using p:first-child? It would be nice to
avoid cruft like 'class="first"'.

~fantasai


From cben@techunix.technion.ac.il  Mon Dec 16 10:57:59 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Mon, 16 Dec 2002 12:57:59 +0200 (IST)
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <BA22C24B.2D258%goodger@python.org>
Message-ID: <Pine.GSO.4.44_heb2.09.0212161158460.7733-100000@techunix.technion.ac.il>

On 2002-12-15, David Goodger wrote:

> [Beni Cherniavsky]
> > I meant that:
> > - I don't myself feel the need to free up indentation (but that
> > doesn't mean that others don't).
> > - I'm not entirely happy with making empty lines around lists.  It
> > takes too much real estate, especially if I make empty lines
> > between items.
>
> Something's gotta give.  For zero ambiguity (within reStructuredText's
> framework), blank lines before & after lists are essential.
>
> > So I don't but then the list looks too isolated
> > from the paragraph [1]_.
> > - I understand that omitting the empty line, as is, would create an
> >   ambiguity.  It's not even clear to my eyes.
> >    - Demanding an extra space before the bullet would remove the
> >      ambiguity, I think.
>
> Too subtle, IMHO.
>
You have a point.  This proposal would work best with logical paragraphs
but I understand that's not going to happen.

> >      Currently this means a list in a blockquote
> >      or a list in the definition of a definition list, depending on
> >      presence of empty line before.
>
> Definition list only in the case of a single line before the indent.
>
> >      Both are infrequently needed, so
> >      the empty comment hack looks acceptable to me (but I'm biased).
>
> Frequently enough.
>
I meant that blockquotes/definitions *which are bulleted lists* are
infrequent.

I propose to:
 1. Put aside the logical paragraphs (I was more or less convinced by
    following arguments).
 2. Allow the indented non-separated list as an equivalent syntax
    alternative.  See what people end up using more.  It's not ideal but I
    prefer it over the current.
     - Since logical paragraphs are rejected, this won't be ambiguous as a
       block quote (which keeps its separating lines).
     - It will be ambiguous sometimes with a definition list.  If something
       is too subtle, it's the definition list (IMHO) so I propose to
       require some marker at the end of the definition term.
        - My proposed " --" marker is not too pretty::

              Foo --
                Bar

              Quux --
                Quuux

          Maybe something other would be better.  " ::=" isn't pretty
          either...

        - This allows for more than one line of definition terms.  Could
          be abused then for e.g. Q&A which I'm not sure is good.

> > .. [1] This reminds me of a different concern I had.  Some markup
> > models (LaTeX and my brain ;-) think of paragraphs as logical beasts.
>
> reStructuredText (and Docutils) treat paragraphs as physical.  It
> would be impossible to reliably infer logical paragraph semantics from
> plaintext sources.

That's the biggest problem.  It would only be very clear if we require
indented first lines in logical paragraphs (and then they are limited to
start with text).

> The debate over physical model (a paragraph is a
> block in the document flow) vs. logical model (paragraphs can contain
> lists and block quotes and equations and others) has been around for
> a long time and I don't see any resolution.

OK, I'll search the archives.

> Personally, I prefer the
> physical model, not least because it results in a much simpler DTD.
> The logical model opens up a big can of worms.
>
That's true.  I want that can :) -- but only if it could be represented in
a clean way.

> >  A paragraph could contain a list
> >   - (like this)
> >  or other things (especially blockquotes) and then continue.
> >  There are three more combinations:
> >   - The thing is part of the previous paragraph, a new paragraph
> >     starts after it.
> >
> >  - The thing can be a logical paragraph on its own.
> >
> >   - The thing starts a new paragraph.
>
> I'm not sure what "the thing" is or where you're going with this.  If
> the text of your message was meant as an example of what you're
> proposing, I find it very hard to follow the structure.
>
The "thing" is a list, blockquote, or any other construct nested inside
the logical paragraph.  Yes, I tried to write in clever
form-matches-content style.  So the example is surely contrived, such
combinations are rare.  Nevertheless, you have a point - it's not very
clear.

The big problem is that it takes away too much of the inter-line spacing
freedom.  People won't observe it because indeed the human reader can see
it anyway.

> >  Seems rare but consider a text where each quote is followed by some
> >  comments (as in emails).
>
> Not following you.
>
I was referring to a nested construct (non-text) starting the paragraph.

> >  So I want a way to represent the distinctions.
>
> Not worth the trouble IMHO.
>
Legitimate decision -- the trouble is big indeed ;-)

> >   - Just tried putting a list in a substitution::
> >         Text |sub| text.
> >
> >         .. |sub| replace::
> >          - Foo
> >            - Bar
> >     Didn't work.
>
> Substitutions have to be phrase-level.  I can't remember if the parser
> checks or not; if not, it'll go on the to-do.
>
It correctly complains that a substitution must be a single paragraph.

> >     (See, here I wanted the text-literal-text to form one logical
> >     paragraph).  I'm not sure it should work but it indicates the
> >     big issue -- the model that a paragraph contains no other
> >     elements must be abandoned to support this concept.
>
> And, as I said, I don't think it's worth the effort even if it were
> feasible (which I doubt).  The computer doesn't really care that
>
>   <p>Beginning of paragraph</p>
>   <ul>
>       <li>item one</li>
>       <li>item two</li>
>   </ul>
>   <p>continuation of paragraph.</p>
>
> is a single logical paragraph (not to mention *this* paragraph!).
> The human reader picks it up right away though.  Only if there's a
> first-line paragraph indent would it matter to the reader.
>
Good points.  But that quite precludes rST as a good typesetting medium.
I think also that the vertical spacing differs (para/list spacing smaller
than inter-para).

> > .. [2] Unrelated question: when should I use literal text (``),
> >  interpreted text (`) and no quoting?
>
> [snip]
>
> When there is a use for them, the docs will discuss them.  Until
> then, it would just be confusing.  It already *is* confusing, some
> would say.
>
That's why I asked :-)

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From cben@techunix.technion.ac.il  Mon Dec 16 11:17:35 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Mon, 16 Dec 2002 13:17:35 +0200 (IST)
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <3DFD58E4.1040005@escape.com>
Message-ID: <Pine.GSO.4.44_heb2.09.0212161259440.7733-100000@techunix.technion.ac.il>

On 2002-12-15, fantasai wrote:

> Beni Cherniavsky wrote:
> >
>  > Then note that "-- " is the standard signature separator.  Since then it's
>  > alone on the line, this is not an issue, just a point to document.
>  >
>  > Also note that I use " -- " for a long dash -- probably a LaTeX-induced
>  > habit; I saw some other people writing so.  Stupid word wrapping can well
>  > put "-- " at the beginning of a line in running text.  Again not an issue,
>  > just document that "-- " must come after an empty line (?).
>
> Would using three dashes solve these problems?
>
Yes but there is no big need.  These are not real problems, they only make
the recognition rules more subtle.

>  > About email reading, also note that ">>> " becomes ambiguous between
>  > doctest blocks and some email clients that compact nested "> " quoting by
>  > omitting the spaces.
>
> Yes, that is true. That means either quoted blocks would
> have to be implemented as an option, defaulting to 'off'
> for backwards-compatibility, or at least one space must
> be required between quote characters.
>
Another option, not completely automatic but easy to use:

 +##
 +## Any bogus quoting style is recognized as such by a line before and/or
 +## after the paragraph that contains only the quoting string (which must
 +## be non-alphabetic, I don't see a good way to accommodate "FOO> ").
 + Nested quotes are recognized, generalizing the current mechanism.
 +

Trouble begins when breaking nested quotes (assume I wanted to place a
non-quoted comment between the "...)." line and the "Nested..." line) -- they
won't be recognized as nested.  In such (all?) cases, demand a space between
the quoting levels ("+ >#").

There is an ambiguity with lists => outlaw empty list items.

Directives can be implemented for declaring a certain quoting style to
have some meaning (e.g. "# " == Python comments).

> Requiring at least one space before the quote character
> might not be a bad idea. It improves readability IMO.
>
But most mailers don't do it and manually converting is a huge pain.

>  > Also the "On Someday, Random Writer wrote:" is probably an
>  > attribution too.
>
> It is, but it's not practical to parse that since
> people use so many different formats. It would have
> to be treated as a paragraph, which really isn't
> that bad.
>
Agreed.

> > Now how do you handle a quote that's broken in the middle and resumed?
>
> As multiple blockquotes. How would you do it with the
> current syntax?
>
OK.  Just take care that different parts of an interrupted quote are at
the same nesting level (space compaction is evil in this respect and should
probably be outlawed).

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From goodger@python.org  Tue Dec 17 02:36:37 2002
From: goodger@python.org (David Goodger)
Date: Mon, 16 Dec 2002 21:36:37 -0500
Subject: [Doc-SIG] compact HTML output from Docutils
In-Reply-To: <3DFD75AB.3040404@escape.com>
Message-ID: <BA23F7E4.2D37E%goodger@python.org>

fantasai wrote:
> It would be more flexible for the author if you based the
> omission of <p> tags in single-paragraph lists on whether
> the list is spaced out or not. For example, this:

I'm happy with the current behavior, but that may have potential.
I'll add it as a "to do?" item and await a patch.

> Another option is to trigger <p> tags only for multi-line
> paragraphs.

Too arbitrary IMO.

> Have you tried using p:first-child? It would be nice to
> avoid cruft like 'class="first"'.

I did try it, but it didn't do what I needed.  The 'class="first"'
isn't added in all contexts, but selectively.  A 'p:first-child' style
isn't selective.  I don't remember the details, but a lot of things
were tried, and I feel the current code works best.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From goodger@python.org  Tue Dec 17 02:37:40 2002
From: goodger@python.org (David Goodger)
Date: Mon, 16 Dec 2002 21:37:40 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212161158460.7733-100000@techunix.technion.ac.il>
Message-ID: <BA23F823.2D37F%goodger@python.org>

Beni Cherniavsky wrote:
> I meant that blockquotes/definitions *which are bulleted lists* are
> infrequent.

And I understood it in that way.  That situation is not rare; I've
written definition list items where the definition is just a bullet
list.  Block quotes containing just a list are also quite common,
depending on writing style, and are the subject of a "to-do?" item:

    * Allow for variant styles by interpreting indented lists as if
      they weren't indented? ...

See <http://docutils.sf.net/spec/notes.html>.

> I propose to:
...
>  2. Allow the indented non-separated list as an equivalent syntax
>     alternative.  See what people end up using more.  It's not ideal
>     but I prefer it over the current.

Sorry, I don't.  There's not enough value to justify the cost.

>      - It will be ambiguous sometimes with a definition list.  If
>        something is too subtle, it's the definition list (IMHO) so I
>        propose to require some marker at the end of the definition
>        term.

I don't want to add complexity to a construct which currently works
fine, just to enable a dubious optimization.

>         - This allows for more than one line of definition terms.
>           Could be abused then for e.g. Q&A which I'm not sure is
>           good.

There are related entries in the to-do:

    * Allow very long titles (on two or more lines)?

    * And for the sake of completeness, should definition list terms
      be allowed to be very long (two or more lines) also?

They'll stay in the to-do list until somebody presents a good enough
case for their implementation (and ideally, a patch as well).

>> The debate over physical model (a paragraph is a block in the
>> document flow) vs. logical model (paragraphs can contain lists and
>> block quotes and equations and others) has been around for a long
>> time and I don't see any resolution.
> 
> OK, I'll search the archives.

The debate I mentioned wasn't on this list (at least, not
exclusively).  It's a debate among document system designers that's
been going on as long as there has been markup (XML, SGML, GML before
it, perhaps others).  DocBook chose one path, HTML and OpenOffice.org
and Docutils another.

>> Personally, I prefer the physical model, not least because it
>> results in a much simpler DTD.  The logical model opens up a big
>> can of worms.
>
> That's true.  I want that can :) -- but only if it could be
> represented in a clean way.

The can of worms is that the logical model *cannot* be represented in
a clean way.

> The big problem is that it takes away too much of the inter-line
> spacing freedom.  People won't observe it because indeed the human
> reader can see it anyway.

I think the current situation is a good compromise (StructuredText
required blank lines between *each* list item!).  I think blank lines
help readability, and omitting them harms it.

> Good points.  But that quite precludes rST as a good typesetting
> medium.

reStructuredText is a limited medium.  Every markup strikes a balance
between readability and functionality; reStructuredText is heavy on
readability.  If your functionality needs are greater than what it
provides, there are plenty of other choices out there.  We can't have
everything.

> I think also that the vertical spacing differs (para/list
> spacing smaller than inter-para).

If I understand you correctly, that's a rendering issue.  I'm not sure
I do understand correctly though; please elaborate more & include
examples in future.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From cben@techunix.technion.ac.il  Tue Dec 17 13:18:52 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Tue, 17 Dec 2002 15:18:52 +0200 (IST)
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <BA23F823.2D37F%goodger@python.org>
Message-ID: <Pine.GSO.4.44_heb2.09.0212171431421.2450-100000@techunix.technion.ac.il>

[I'm on the list, no need to cc: me]

On 2002-12-16, David Goodger wrote:

> Beni Cherniavsky wrote:
> > I meant that blockquotes/definitions *which are bulleted lists* are
> > infrequent.
>
> And I understood it in that way.  That situation is not rare; I've
> written definition list items where the definition is just a bullet
> list.  Block quotes containing just a list are also quite common,
> depending on writing style, and are the subject of a "to-do?" item:
>
Note that since the presence/absence of the empty lines is not going to
become meaningful (I agree that logical paragraphs are unexpressible in
rST), there will be no problem with blockquotes being lists.  Still it would
make the rules even more subtle.  The comment by Edward D. Loper (in
http://mail.python.org/pipermail/doc-sig/2001-April/001793.html):

  > No indentation is necessary. I suggest that if there *is*
  > indentation, an alternate interpretation is possible.

  When I read them, *I* don't interpret them differently (as an
  uninitiated reader).

is correct even now.  At least "every indented thing is a blockquote"
(modulo ::) is a simple rule to understand...  Consider the indented list
idea more or less withdrawn.

>   * Allow for variant styles by interpreting indented lists as if
>     they weren't indented? ...
>
> See <http://docutils.sf.net/spec/notes.html>.
>
That's where I saw the idea before! :-)

> > I propose to:
> ...
> >  2. Allow the indented non-separated list as an equivalent syntax
> >     alternative.  See what people end up using more.  It's not ideal
> >     but I prefer it over the current.
>
> Sorry, I don't.  There's not enough value to justify the cost.
>
All right, I'll defer to your judgment.

> >> The debate over physical model (a paragraph is a block in the
> >> document flow) vs. logical model (paragraphs can contain lists and
> >> block quotes and equations and others) has been around for a long
> >> time and I don't see any resolution.
> >
> > OK, I'll search the archives.
>
> The debate I mentioned wasn't on this list (at least, not
> exclusively).  It's a debate among document system designers that's
> been going on as long as there has been markup (XML, SGML, GML before
> it, perhaps others).  DocBook chose one path, HTML and OpenOffice.org
> and Docutils another.
>
Oh, that's too big to read ;-).  But I can figure it out from the resulting
formats.  DocBook's choice is obvious -- the logical model is richer.

> The can of worms is that the logical model *cannot* be represented in
> a clean way.
>
Accepted.

> > Good points.  But that quite precludes rST as a good typesetting
> > medium.
>
> reStructuredText is a limited medium.  Every markup strikes a balance
> between readability and functionality; reStructuredText is heavy on
> readability.  If your functionality needs are greater than what it
> provides, there are plenty of other choices out there.  We can't have
> everything.
>
Of course.  Just remembered reading somebody hoping to write his next book
in rST, so I thought that would be possible.  Quite naturally, it can't
really be expected (except for an rST write -> convert -> polish layout in
target format)...

> > I think also that the vertical spacing differs (para/list
> > spacing smaller than inter-para).
>
> If I understand you correctly, that's a rendering issue.  I'm not sure
> I do understand correctly though; please elaborate more & include
> examples in future.
>
LaTeX pseudo-screenshot (unconfirmed, memory/imagination mix ;)::

        Standalone logical
      paragraph.
  (1)
          * List that's outside the paragraph.
  (2)
          * Another item.
  (3)
        Another logical paragraph,
      containing:
  (4)
          * this list,
  (5)
      and continuing here.

The vertical spacings (1) and (3) are spaces between separate logical
paragraphs [1]_.  The spacings (4) and (5) are around a list inside a
paragraph.  These need not be equal and determining which spacing to use
is impossible without knowing logical paragraph structure.

.. [1] this is not that simple in LaTeX which uses first-line indent with
   no extra vertical space between paragraphs.  So (1) and (3) are forced
   by the minimal vertical spacing in a list and it might indeed be equal
   to (4) and (5).  However other typesetting styles might have a
   difference (they most probably will, if they don't use first-line
   indentation).


-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From akuchlin@mems-exchange.org  Tue Dec 17 13:43:28 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 17 Dec 2002 08:43:28 -0500
Subject: [Doc-SIG] Adding new inline markup?
Message-ID: <E18OI0O-000459-00@ute.mems-exchange.org>

RST only supports a few different inline markup notations, such as *
and ** for emphasis, ` for interpreted things, &c.  For my application
I'd like to add some more inline markups, such as /cited text/.

Should I expect to be able to subclass Inliner in order to add new
notations?  Right now that's rather messy.  Inliner.dispatch is a
dictionary mapping symbols to handler methods, but the large regular
expression stored as Inliner.parts doesn't take this into account.  

First, is there some other, better, way of adding new inline notations?

Second, if not, should it be possible by subclassing Inliner?  A
possible way of handling this would be to add an internal method
_get_initial_pattern() to the Inliner class that synthesized the
'parts' regular expression, using self.dispatch.keys() to match all the
listed inline markups.  Inliner is fairly complicated, though, so
maybe there are additional changes that would be necessary.
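
To make the idea concrete, here is a toy sketch in Python.  It is not the
Docutils Inliner: ``SketchInliner``, ``CitingInliner`` and
``first_markup`` are invented for illustration, and the real recognition
rules for inline markup start-strings are considerably stricter than the
single lookahead used here.  The only point shown is a
``_get_initial_pattern()`` method that builds the regular expression from
the keys of ``self.dispatch``, so a subclass only has to add an entry:

```python
import re


class SketchInliner:
    """Toy stand-in for an inline-markup parser (not Docutils code)."""

    def __init__(self):
        # start-string -> handler name, in the spirit of Inliner.dispatch
        self.dispatch = {
            "**": "strong",
            "*": "emphasis",
            "``": "literal",
            "`": "interpreted",
        }

    def _get_initial_pattern(self):
        # Longest start-strings first, so "**" is tried before "*".
        starts = sorted(self.dispatch, key=len, reverse=True)
        alternatives = "|".join(re.escape(s) for s in starts)
        # Simplification: a start-string must be followed by non-whitespace.
        return re.compile(r"(?P<start>%s)(?=\S)" % alternatives)

    def first_markup(self, text):
        """Return the handler name for the first start-string, or None."""
        match = self._get_initial_pattern().search(text)
        if match is None:
            return None
        return self.dispatch[match.group("start")]


class CitingInliner(SketchInliner):
    """Subclass adding the hypothetical /cited text/ notation."""

    def __init__(self):
        super().__init__()
        self.dispatch["/"] = "cited"
```

With this, ``CitingInliner().first_markup("see /cited text/ here")`` gives
``"cited"`` while the base class returns ``None`` for the same text.  A
slash is common in ordinary prose (paths, "and/or"), so a real
implementation would need stricter context rules than this sketch has.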

--amk                                                             (www.amk.ca)
It was astonishing the number of useless things people found to do.
      -- R.H. Barlow and H.P. Lovecraft, "The Night Ocean"


From cben@techunix.technion.ac.il  Tue Dec 17 13:51:15 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Tue, 17 Dec 2002 15:51:15 +0200 (IST)
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212161259440.7733-100000@techunix.technion.ac.il>
Message-ID: <Pine.GSO.4.44_heb2.09.0212171522520.2450-100000@techunix.technion.ac.il>

On 2002-12-16, Beni Cherniavsky wrote:

> On 2002-12-15, fantasai wrote:
>
> > Beni Cherniavsky wrote:
>
> > Requiring at least one space before the quote character
> > might not be a bad idea. It improves readability IMO.
> >
> But most mailers don't do it and manually converting is a huge pain.
>
> >> Also the "On Someday, Random Writer wrote:" is probably an
> >> attribution too.
> >
> > It is, but it's not practical to parse that since
> > people use so many different formats. It would have
> > to be treated as a paragraph, which really isn't
> > that bad.
> >
> Agreed.
>
The more I think about an email reader the harder it looks.  There are two
possible goals:

- Provide a slightly modified rST syntax that would make writing rST
  in emails more convenient (and maybe other goodies like processing the
  headers).  This is surely a goal.

- Simplify as much as possible conversions of mail written by rST-unaware
  people, or people writing half-rST mails (I do, especially when mailing
  without relation to Python).  This goal is very desired but can conflict
  with the first goal.

First, what's inconvenient with writing rST in email as it is now?
Nothing critical (except the quoting) but there are little issues here and
there.

One that I constantly experience is that indenting things in Pine is so
inconvenient.  True, I should switch to an external editor but I probably
prefer to accumulate the pain until it'll itch enough to write my own with
proper rST support (either an emacs mode or something standalone).  Don't
hold your breath.

I do think that some indentation demands could be relaxed, in particular
literal blocks should be allowed in column 0 (i.e. even with negative
indent!).  That would simplify cut-and-paste, in and out of the mail (e.g.
literal Python code can't be easily pasted into the interpreter prompt if
indented; doctest style is even worse).

The second goal opens a can of worms -- free text is too free to parse
automatically and needs corrections.  A good rST editor could ease the
editing but another possible approach is allowing many modes in the
reader, so you mark a whole message according to the style it's written in.
In the long run this is only useful if a normalizing rST->rST conversion is
implemented...

BTW, maybe a generic syntax diagram parser [generator] would be useful to
rapidly experiment with rST syntax variations.  Getting automatic
"indent/reduce conflict" reporting would be very cool :-).

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From aahz@pythoncraft.com  Tue Dec 17 15:22:58 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 17 Dec 2002 10:22:58 -0500
Subject: [Doc-SIG] Adding new inline markup?
In-Reply-To: <E18OI0O-000459-00@ute.mems-exchange.org>
References: <E18OI0O-000459-00@ute.mems-exchange.org>
Message-ID: <20021217152258.GA20148@panix.com>

[This is probably better handled on docutils-developers, but I'll let
David make that decision.]

On Tue, Dec 17, 2002, Andrew Kuchling wrote:
>
> RST only supports a few different inline markup notations, such as *
> and ** for emphasis, ` for interpreted things, &c.  For my application
> I'd like to add some more inline markups, such as /cited text/.

Why can't you use ` for cited text?  Remember that you're allowed to
have different kinds of interpreted text::

    :cite:`cited text`
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"To me vi is Zen.  To use vi is to practice zen.  Every command is a
koan.  Profound to the user, unintelligible to the uninitiated.  You
discover truth everytime you use it."  --reddy@lion.austin.ibm.com


From akuchlin@mems-exchange.org  Tue Dec 17 15:50:09 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 17 Dec 2002 10:50:09 -0500
Subject: [Doc-SIG] Adding new inline markup?
In-Reply-To: <20021217152258.GA20148@panix.com>
References: <E18OI0O-000459-00@ute.mems-exchange.org> <20021217152258.GA20148@panix.com>
Message-ID: <20021217155009.GA16023@ute.mems-exchange.org>

On Tue, Dec 17, 2002 at 10:22:58AM -0500, Aahz wrote:
>Why can't you use ` for cited text?  Remember that you're allowed to
>have different kinds of interpreted text::
>    :cite:`cited text`

Oh, I wasn't aware of that!  Thanks!  It would be marginally easier
for the intended audience if a simpler notation like /cited/ was
possible, but I can live with using the role notation, and it means I
don't need to pick more typographic symbols for everything.  (%per se%
for foreign text, @DARPA@ for acronyms, ad nauseam...)

--amk                                                             (www.amk.ca)
LaTeX2HTML is pain.
      -- Fred Drake in a documentation checkin message, 14 Mar 2000


From aahz@pythoncraft.com  Tue Dec 17 17:38:08 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 17 Dec 2002 12:38:08 -0500
Subject: [Doc-SIG] Adding new inline markup?
In-Reply-To: <20021217155009.GA16023@ute.mems-exchange.org>
References: <E18OI0O-000459-00@ute.mems-exchange.org> <20021217152258.GA20148@panix.com> <20021217155009.GA16023@ute.mems-exchange.org>
Message-ID: <20021217173808.GA11636@panix.com>

On Tue, Dec 17, 2002, Andrew Kuchling wrote:
> On Tue, Dec 17, 2002 at 10:22:58AM -0500, Aahz wrote:
>>
>>Why can't you use ` for cited text?  Remember that you're allowed to
>>have different kinds of interpreted text::
>>    :cite:`cited text`
> 
> Oh, I wasn't aware of that!  Thanks!  It would be marginally easier
> for the intended audience if a simpler notation like /cited/ was
> possible, but I can live with using the role notation, and it means I
> don't need to pick more typographic symbols for everything.  (%per se%
> for foreign text, @DARPA@ for acronyms, ad nauseam...)

For acronyms use | (pipe), assuming you want it expanded.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/


From goodger@python.org  Wed Dec 18 00:53:33 2002
From: goodger@python.org (David Goodger)
Date: Tue, 17 Dec 2002 19:53:33 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212171431421.2450-100000@techunix.technion.ac.il>
Message-ID: <BA25313B.2D3E8%goodger@python.org>

Beni Cherniavsky wrote:
> [I'm on the list, no need to cc: me]

Just a common courtesy on lists.  I *will not remember* to remove your
address in future.  Might as well give up now ;)

> Just remembered reading somebody hoping to write his next book in
> rST, so I thought that would be possible.

That would be Aahz.  As far as I know, he still is using Docutils for
his book, via OpenOffice.org (confirm/deny, Aahz?).

> Quite naturally, it can't really be expected (except for an rST
> write -> convert -> polish layout in target format)...

Depends on the complexity of the document.  I wouldn't be surprised if
some tweaking were necessary.  However Docutils is still young and
flexible.

>>> I think also that the vertical spacing differs (para/list
>>> spacing smaller than inter-para).
>>
>> If I understand you correctly, that's a rendering issue.  I'm not
>> sure I do understand correctly though; please elaborate more &
>> include examples in future.
>>
> LaTeX pseudo-screenshot (unconfirmed, memory/imagination mix ;)::

Thanks for the explanation.  Indeed, a rendering issue.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From goodger@python.org  Wed Dec 18 00:54:42 2002
From: goodger@python.org (David Goodger)
Date: Tue, 17 Dec 2002 19:54:42 -0500
Subject: [Doc-SIG] Adding new inline markup?
In-Reply-To: <20021217173808.GA11636@panix.com>
Message-ID: <BA253181.2D3E9%goodger@python.org>

[Andrew Kuchling]
> RST only supports a few different inline markup notations, such as *
> and ** for emphasis, ` for interpreted things, &c.  For my
> application I'd like to add some more inline markups, such as /cited
> text/.

[Aahz]
> [This is probably better handled on docutils-developers, but I'll
> let David make that decision.]

No biggie.  Overlap is inevitable.

> Why can't you use ` for cited text?  Remember that you're allowed to
> have different kinds of interpreted text::
> 
>     :cite:`cited text`

[Andrew Kuchling]
> Oh, I wasn't aware of that!  Thanks!  It would be marginally easier
> for the intended audience if a simpler notation like /cited/ was
> possible, but I can live with using the role notation, and it means
> I don't need to pick more typographic symbols for everything.  (%per
> se% for foreign text, @DARPA@ for acronyms, ad nauseam...)

The intention of interpreted text roles is to allow new inline
descriptive markup, with the simultaneous advantage and disadvantage
of being explicit.  If your application has one "main" role, that can
be the default (i.e. no explicit role required, just `backquotes`).
This area hasn't been explored much nor has any support code been
written.  For example, I'm not sure when to validate roles and process
the interpreted text: in the parser, in the reader, or in a transform.
It could be that the "interpreted" element may disappear from the
Docutils internal doctree, just as the "directive" element did.

[Aahz]
> For acronyms use | (pipe), assuming you want it expanded.

Pipes are used for |substitutions|, which are like inline directives,
allowing graphics and arbitrary constructs within text.  Replacing an
acronym with its full text is one application.  See
<http://docutils.sf.net/spec/rst/reStructuredText.html>.

[Andrew Kuchling]
> Should I expect to be able to subclass Inliner in order to add new
> notations?

If necessary, yes, but it hasn't been necessary yet so that
functionality hasn't been added (XP's "add no functionality before its
time").

> Right now that's rather messy.  Inliner.dispatch is a
> dictionary mapping symbols to handler methods, but the large regular
> expression stored as Inliner.parts doesn't take this into account.
...
> A possible way of handling this would be to add an internal method
> _get_initial_pattern() to the Inliner class that synthesized the
> 'parts' regular expression, using self.dispatch.keys() to match all
> the listed inline markups.

Way ahead of you ;).  Look again, and you'll see that
``Inliner.parts`` isn't a regexp, it's a data structure that's used to
synthesize a regexp.  ``Inliner.patterns.initial`` is built by the
``build_regexp`` function (which see for a description of the data
structure).  This issue *has* come up before, WRT embedded URIs, and
although the support wasn't used for that, it did simplify the regular
expression (you should've seen it before!).

A subclass should be able to extend (or replace) this data structure
and re-synthesize the regexp.
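
As a rough illustration of that approach (a sketch only, not the actual
Docutils source: the names ``build_pattern``, ``BASE_PARTS`` and ``extend``
are hypothetical, and the real data structure is the one documented in
``build_regexp``), a nested table of fragments can be turned into one
regexp, so that adding a notation such as /cited text/ becomes a change to
the data rather than to a hand-written expression:

```python
import re


def build_pattern(definition):
    """Turn (name, prefix, suffix, parts) into a regexp string.

    Each part is either a regexp fragment (str) or a nested definition
    tuple of the same shape.  This layout is an assumption made for the
    sketch, not a copy of the Docutils structure.
    """
    name, prefix, suffix, parts = definition
    alternatives = [
        build_pattern(part) if isinstance(part, tuple) else part
        for part in parts
    ]
    return "%s(?P<%s>%s)%s" % (prefix, name, "|".join(alternatives), suffix)


# Hypothetical base table: "**" is listed before "*" so it is tried first.
BASE_PARTS = ("start", "", "", [
    r"\*\*",  # strong
    r"\*",    # emphasis
    r"`",     # interpreted text
])


def extend(definition, extra_parts):
    """Return a copy of the definition with extra alternatives appended."""
    name, prefix, suffix, parts = definition
    return (name, prefix, suffix, list(parts) + list(extra_parts))


# A subclass would extend the data, then re-synthesize the regexp.
CITED_PARTS = extend(BASE_PARTS, [r"/"])

base_re = re.compile(build_pattern(BASE_PARTS))
cited_re = re.compile(build_pattern(CITED_PARTS))
```

Here ``base_re`` ignores the slashes in ``a /cited/ b`` while ``cited_re``
matches the first ``/`` as a start-string; everything after that point
(finding the end-string, building the node) is left out of the sketch.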

> Inliner is fairly complicated, though, so maybe there are additional
> changes that would be necessary.

Probably :).  Limitations are often discovered when the code is
exercised in novel and interesting ways.

-- 
David Goodger  <goodger@python.org>


From goodger@python.org  Wed Dec 18 00:55:56 2002
From: goodger@python.org (David Goodger)
Date: Tue, 17 Dec 2002 19:55:56 -0500
Subject: [Doc-SIG] Email Reader (was Re: reST block quotes)
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212171522520.2450-100000@techunix.technion.ac.il>
Message-ID: <BA2531CB.2D3EA%goodger@python.org>

Beni Cherniavsky wrote:
> The more I think about an email reader the harder it looks.

:)

> I do think that some indentation demands could be relaxed, in
> particular literal blocks should be allowed in column 0 (i.e. even
> with negative indent!).

How would you know when they end?

> The second goal opens a can of worms -- free text is too free to
> parse automatically and needs corrections.

I suspect that an actual application for an Email Reader must present
itself before we can make these decisions.  IOW, what's the use case?
I don't know that there's much value in supporting email in general,
but I do know there would be much pain.

> BTW, maybe a generic syntax diagram parser [generator] would be
> useful to rapidly experiment with rST syntax variations.

Sounds cool, and non-trivial :)

> Getting automatic "indent/reduce conflict" reporting would be very
> cool :-).

Again, not following you.  Not enough words!

-- 
David Goodger  <goodger@python.org>


From aahz@pythoncraft.com  Wed Dec 18 04:20:28 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 17 Dec 2002 23:20:28 -0500
Subject: [Doc-SIG] reST block quotes
In-Reply-To: <BA25313B.2D3E8%goodger@python.org>
References: <Pine.GSO.4.44_heb2.09.0212171431421.2450-100000@techunix.technion.ac.il> <BA25313B.2D3E8%goodger@python.org>
Message-ID: <20021218042027.GA1679@panix.com>

On Tue, Dec 17, 2002, David Goodger wrote:
> Beni Cherniavsky wrote:
>>
>> Just remembered reading somebody hoping to write his next book in
>> rST, so I thought that would be possible.
> 
> That would be Aahz.  As far as I know, he still is using Docutils for
> his book, via OpenOffice.org (confirm/deny, Aahz?).

Yup, still moving forward, though more slowly than I'd like on the
content side.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/


From cben@techunix.technion.ac.il  Wed Dec 18 17:01:08 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Wed, 18 Dec 2002 19:01:08 +0200 (IST)
Subject: [Doc-SIG] Re: Email Reader (was Re: reST block quotes)
In-Reply-To: <BA2531CB.2D3EA%goodger@python.org>
Message-ID: <Pine.GSO.4.44_heb2.09.0212181837290.29141-100000@techunix.technion.ac.il>

On 2002-12-17, David Goodger wrote:

> Beni Cherniavsky wrote:
> > I do think that some indentation demands could be relaxed, in
> > particular literal blocks should be allowed in column 0 (i.e. even
> > with negative indent!).
>
> How would you know when they end?
>
Forgot to say, until the first blank line.  Crude but can simplify typing
in many cases of short literal fragments...

> > The second goal opens a can of worms -- free text is too free to
> > parse automatically and needs corrections.
>
> I suspect that an actual application for an Email Reader must present
> itself before we can make these decisions.  IOW, what's the use case?
> I don't know that there's much value in supporting email in general,
> but I do know there would be much pain.
>
:)  Interesting that I didn't think of it before.  I have no good
application in my head :).  The only one is taking interesting half-rST
emails I write and easily converting them to standalone valid rST
articles.  This could be done with a sloppy/bendable rST reader fed to an
rST writer, to normalize the text...

> > BTW, maybe a generic syntax diagram parser [generator] would be
> > useful to rapidly experiment with rST syntax variations.
>
> Sounds cool, and non-trivial :)
>
True.  I do have a scheme for only inline markup and bullets, which would
define the markup->xml mappings on the spot (one-time or scoped, with
attribute merging).  Something like::

<section :: ====heading==== `a href=`__ {|p|} {-ul-} {.ol.} - li>
==== Heading ====
{| A paragraph. |}
{-
   - List item with *emphasized* text.
     :: *em*
   - Another item
     containing `a link`_.
     :: `"http://target/url"`_
-}
</section>

This is largely inspired by rST's style of putting long markup data
outside of the text (e.g. link targets, substitutions).  I already think I
know how to interpret this, I just need to find time and write it :).
Then I will have a very handy xml typing notation.

> > Getting automatic "indent/reduce conflict" reporting would be very
> > cool :-).
>
> Again, not following you.  Not enough words!
>
That was supposed to be a joking reference to Yacc's "shift/reduce
conflict" messages...

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From goodger@python.org  Thu Dec 19 01:17:08 2002
From: goodger@python.org (David Goodger)
Date: Wed, 18 Dec 2002 20:17:08 -0500
Subject: [Doc-SIG] Python Reader module parser now usable
Message-ID: <BA268843.2D57F%goodger@python.org>

The first part of the Docutils Python Source Reader component is in a
usable form:
<http://docutils.sf.net/docutils/readers/python/moduleparser.py>.  It
takes a module's text (a string) and converts it into a
documentation-oriented tree.  Assignments/attributes, functions,
classes, and methods are all converted.  Arbitrarily complex
right-hand sides of assignments (including default parameter values)
are supported by parsing tokens from tokenize.py.  Comments are not
handled yet, and namespaces are not computed.  There's a list of open
issues at the end of the module docstring.

I've also added a "showdoc" script to test/test_readers/test_python
which processes input from the test_parser.py module, or stdin,
depending on how it's called.  Please play with these; any input is
welcome.

Here's a sample.  Given this module as input::

    # comment

    """Docstring"""

    """Additional docstring"""

    __docformat__ = 'reStructuredText'

    a = 1
    """Attribute docstring"""

    class C(Super):

        """C's docstring"""

        class_attribute = 1
        """class_attribute's docstring"""

        def __init__(self, text=None):
            """__init__'s docstring"""

            self.instance_attribute = (text * 7
                                       + ' whaddyaknow')
            """instance_attribute's docstring"""


    def f(x,                            # parameter x
          y=a*5,                        # parameter y
          *args):                       # parameter args
        """f's docstring"""
        return [x + item for item in args]

    f.function_attribute = 1
    """f.function_attribute's docstring"""

Here's the output tree, with objects converted to the pseudo-XML form
I'm fond of::

    <Module filename="<stdin>">
        <Docstring>
            Docstring
        <Docstring lineno="5">
            Additional docstring
        <Attribute lineno="7" name="__docformat__">
            <Expression lineno="7">
                'reStructuredText'
        <Attribute lineno="9" name="a">
            <Expression lineno="9">
                1
            <Docstring lineno="10">
                Attribute docstring
        <Class bases="Super" lineno="12" name="C">
            <Docstring lineno="12">
                C's docstring
            <Attribute lineno="16" name="class_attribute">
                <Expression lineno="16">
                    1
                <Docstring lineno="17">
                    class_attribute's docstring
            <Method lineno="19" name="__init__">
                <Docstring lineno="19">
                    __init__'s docstring
                <ParameterList lineno="19">
                    <Parameter lineno="19" name="self">
                    <Parameter lineno="19" name="text">
                        <Default lineno="19">
                            None
                <Attribute lineno="22" name="self.instance_attribute">
                    <Expression lineno="22">
                        (text * 7 + ' whaddyaknow')
                    <Docstring lineno="24">
                        instance_attribute's docstring
        <Function lineno="27" name="f">
            <Docstring lineno="27">
                f's docstring
            <ParameterList lineno="27">
                <Parameter lineno="27" name="x">
                <Parameter lineno="27" name="y">
                    <Default lineno="27">
                        a * 5
                <ExcessPositionalArguments lineno="27" name="args">
        <Attribute lineno="33" name="f.function_attribute">
            <Expression lineno="33">
                1
            <Docstring lineno="34">
                f.function_attribute's docstring

-- 
David Goodger  <goodger@python.org>


From b.fallenstein@gmx.de  Thu Dec 19 17:52:52 2002
From: b.fallenstein@gmx.de (Benja Fallenstein)
Date: Thu, 19 Dec 2002 18:52:52 +0100
Subject: [Doc-SIG] Email Reader (was Re: reST block quotes)
In-Reply-To: <BA2531CB.2D3EA%goodger@python.org>
References: <BA2531CB.2D3EA%goodger@python.org>
Message-ID: <3E020774.6010104@gmx.de>

Hi David, hi Beni,

David Goodger wrote:

>Beni Cherniavsky wrote:
>  
>
>>The second goal opens a can of worms -- free text is too free to
>>parse automatically and needs corrections.
>>    
>>
>
>I suspect that an actual application for an Email Reader must present
>itself before we can make these decisions.  IOW, what's the use case?
>I don't know that there's much value in supporting email in general,
>but I do know there would be much pain.
>

I've been idly pondering ReST email before, so I have my own ideas about 
this :-)

I don't think that parsing arbitrary incoming emails as ReST makes much 
sense-- it'd be like running all .txt files on my harddisk through the 
ReST tools. I wouldn't want to use this on any emails but those 
explicitly written as ReST.

The applications I envision are two flavors of email user agent (for 
reading & writing email):

- A text-based client, which would provide tools for composing ReST 
e-mails, most notably a syntax validator (so that you don't accidentally 
send emails that cannot be read because they aren't ReST); this would 
send emails as text/vnd.python.rst or so.
- A graphical client, which would render text/vnd.python.rst e-mails 
graphically, and allow for graphical composing of ReST e-mails, like 
Mozilla's HTML email composer.

This would finally allow me to use hyperlinks, variable-width fonts, 
italics etc.pp. in composing & reading e-mail, without worrying about 
unreadable font size settings, people complaining about HTML mail bloat, 
people turning off HTML mail (like me :) ), or readability on text 
terminals; instead, I'd know that any email client would get something 
it can read, w/o bloat, and people with a graphical ReST mail reader 
would see the mail the way I typed it (modulo preferences like font or 
text size).

The difficult question is how to do quoting; I think (contrary to what 
some people seem to be saying here) that quoting should not generally be 
done through literal blocks-- I think quoting of non-ReST mails should 
be, but quoting of ReST mails should preserve the formatting of the 
quoted mail. Also I think that quoting should be done through the 
commonplace ">" syntax; this is simply the most wide-spread variant, and 
forcing something else down people's throats seems wrong. To
disambiguate from doctest blocks, choosing the ">" + space syntax as
required for ReSTmail seems good.

To discriminate between literal and non-literal quotings, I'd suggest:

I wrote:
 > foo
 > bar

for non-literal quoting, and

I wrote::
 > foo
 > bar

for literal quoting.

More to the point, the rules would be as follows:

- A block where each line is prefixed by "> " (angle bracket + space) is 
*quoted*.
- A block where all but the first line is quoted by "> " is *quoted with 
source given*.
- If the first line in a "quoted with source given" block ends in a 
double colon, this quoted block and all following quoted blocks 
*without* source given would also be literal blocks.
- If the first line does not end in a double colon, this quoted block 
and all following ones w/o source would just be quoted, not literal blocks.
- (If there's a quoted block w/o source, and there is no quoted block 
with source above, that block would be just quoted, not literal.)

Here's an example:

=== snip ===
 > Blabla

Foo wrote::
 > Bla, *baz*
 > Foo

Then, bar wrote:
 > Foo, *blaabla*!
=== /snip ===

This could then become something like:

=== snip ===
<quoted>
<quote>
Blabla
</quote>
</quoted>
<quoted>
<source>
Foo wrote:
</source>
<quote>
<literal>
Bla, *baz*
Foo
</literal>
</quote>
</quoted>
<quoted>
<source>
Then, bar wrote:
</source>
<quote>
Foo, <em>blaabla</em>!
</quote>
</quoted>
=== /snip ===
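
The rules above can be sketched in a few lines of Python.  This is an
illustration under stated assumptions, not an existing Docutils reader:
the function names are hypothetical, blocks are taken to be runs of
non-blank lines, and leading whitespace before the ">" is tolerated
because the example lines start with a space.

```python
def split_blocks(text):
    """Split text into blocks of consecutive non-blank lines."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def is_quoted(line):
    # Assumption: a bare ">" counts as an empty quoted line.
    stripped = line.lstrip()
    return stripped.startswith("> ") or stripped == ">"


def unquote(line):
    """Drop the leading whitespace and the two-character "> " prefix."""
    return line.lstrip()[2:]


def classify(text):
    """Return one (kind, source, literal, lines) tuple per block."""
    results = []
    literal = False  # no block with a source seen yet: plain quoting
    for block in split_blocks(text):
        if all(is_quoted(line) for line in block):
            # Quoted without source: inherits the current literal state.
            results.append(
                ("quoted", None, literal, [unquote(l) for l in block]))
        elif (len(block) > 1 and not is_quoted(block[0])
              and all(is_quoted(line) for line in block[1:])):
            # Quoted with source given: "::" on the source line decides.
            source = block[0].rstrip()
            literal = source.endswith("::")
            results.append(
                ("quoted", source, literal, [unquote(l) for l in block[1:]]))
        else:
            results.append(("text", None, False, block))
    return results


EXAMPLE = """\
 > Blabla

Foo wrote::
 > Bla, *baz*
 > Foo

Then, bar wrote:
 > Foo, *blaabla*!
"""
```

``classify(EXAMPLE)`` yields three quoted blocks, of which only the one
introduced by ``Foo wrote::`` is marked literal.  Whether an intervening
paragraph of ordinary text should reset the literal state is not covered
by the rules; the sketch leaves the state unchanged.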

- Benja


From cben@techunix.technion.ac.il  Thu Dec 19 18:57:53 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Thu, 19 Dec 2002 20:57:53 +0200 (IST)
Subject: [Doc-SIG] Email Reader (was Re: reST block quotes)
In-Reply-To: <3E020774.6010104@gmx.de>
Message-ID: <Pine.GSO.4.44_heb2.09.0212192017370.14994-100000@techunix.technion.ac.il>

On 2002-12-19, Benja Fallenstein wrote:

> I've been idly pondering ReST email before, so I have my own ideas about
> this :-)
>
> I don't think that parsing arbitrary incoming emails as ReST makes much
> sense-- it'd be like running all .txt files on my harddisk through the
> ReST tools. I wouldn't want to use this on any emails but those
> explicitly written as ReST.
>
True.  But problems can arise with:

1. A thread where some people write rST and some don't.  The quoted
   illegal parts can easily spoil the rST context for the rest.  Literal
   quoting for non-rST parts doesn't solve this entirely because the rST
   messages quoted by the non rST parts lose their rST information...

2. Sloppy mail.  I don't see myself writing 100% correct rST all the time
   and I don't want it pushed down my throat.  If that's the choice, I'd
   be more happy with writing rST as I see fit and using only the eyeball
   reader ;-).

3. Once mail is sent, it generally can't be edited (since it's already
   archived and people already have it in their mailboxes).  I'm not sure
   if anything can be done about it but it's a peculiarity of email that
   should be remembered.

Something like treating every paragraph that doesn't parse as literal
would solve some of the issues.

> The applications I envision are two flavors of email user agent (for
> reading & writing email):
>
*Breakage ahead*: your list is broken by line folding.  Unexpected line
folding in the places where you least want it is all too popular with
email tool writers ;-(.  Good rST that you write will end up broken for
some readers sooner or later.  The "lazy indentation" ideas from the to-do
list could help here a bit.

> - A text-based client, which would provide tools for composing ReST
> e-mails, most notably a syntax validator (so that you don't accidentally
> send emails that cannot be read because they aren't ReST); this would
> send emails as text/vnd.python.rst or so.

I want such a tool.  I feel more need for editing features than for
validation (all right, one day there will be an emacs mode).  Of course it
would reflow paragraphs instead of broken line folding :-).

I'm not sure I like text/vnd.python.rst.  I don't want another text/html.
Most mail readers will probably complain that they don't know how to read
it.  Sending identical content both as plain text and rST is more stupid
than sending text version of html mail :).  I think it should be marked as
plain text and be autodetected simply by validating.

> - A graphical client, which would render text/vnd.python.rst e-mails
> graphically, and allow for graphical composing of ReST e-mails, like
> Mozilla's HTML email composer.
>
> This would finally allow me to use hyperlinks, variable-width fonts,
> italics etc.pp. in composing & reading e-mail, without worrying about
> unreadable font size settings, people complaining about HTML mail bloat,
> people turning off HTML mail (like me :) ), or readability on text
> terminals; instead, I'd know that any email client would get something
> it can read, w/o bloat, and people with a graphical ReST mail reader
> would see the mail the way I typed it (modulo preferences like font or
> text size).
>
> The difficult question is how to do quoting; I think (contrary to what
> some people seem to be saying here) that quoting should not generally be
> done through literal blocks-- I think quoting of non-ReST mails should
> be, but quoting of ReST mails should preserve the formatting of the
> quoted mail.

Who argued that?  IIRC, all the discussion assumed that the quoted block
can be literal or not depending on presence of ``::``, similarly to the
way it currently behaves with indented blocks.

> Also I think that quoting should be done through the
> commonplace ">" syntax; this is simply the most wide-spread variant, and
> forcing something else down people's throats seems wrong. To
> disambiguate from doctest blocks, choosing the ">" + space syntax as
> required for ReSTmail seems good.
>
> To discriminate between literal and non-literal quotings, I'd suggest:
>
> I wrote:
>  > foo
>  > bar
>
> for non-literal quoting, and
>
> I wrote::
>  > foo
>  > bar
>
> for literal quoting.
>
> More to the point, the rules would be as follows:
>
> - A block where each line is prefixed by "> " (angle bracket + space) is
> *quoted*.
> - A block where all but the first line is quoted by "> " is *quoted with
> source given*.
> - If the first line in a "quoted with source given" block ends in a
> double colon, this quoted block and all following quoted blocks
> *without* source given would also be literal blocks.
> - If the first line does not end in a double colon, this quoted block
> and all following ones w/o source would just be quoted, not literal blocks.
> - (If there's a quoted block w/o source, and there is no quoted block
> with source above, that block would be just quoted, not literal.)
>
I think I like this.

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From b.fallenstein@gmx.de  Thu Dec 19 21:51:26 2002
From: b.fallenstein@gmx.de (Benja Fallenstein)
Date: Thu, 19 Dec 2002 22:51:26 +0100
Subject: [Doc-SIG] Email Reader (was Re: reST block quotes)
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212192017370.14994-100000@techunix.technion.ac.il>
References: <Pine.GSO.4.44_heb2.09.0212192017370.14994-100000@techunix.technion.ac.il>
Message-ID: <3E023F5E.7090009@gmx.de>

Hiya,

Beni Cherniavsky wrote:

>On 2002-12-19, Benja Fallenstein wrote:
>  
>
>>I've been idly pondering ReST email before, so I have my own ideas about
>>this :-)
>>
>>I don't think that parsing arbitrary incoming emails as ReST makes much
>>sense-- it'd be like running all .txt files on my harddisk through the
>>ReST tools. I wouldn't want to use this on any emails but those
>>explicitly written as ReST.
>>
>>    
>>
>True.  But problems can arise with:
>
>1. A thread where some people write rST and some don't.  The quoted
>   illegal parts can easily spoil the rST context for the rest.  Literal
>   quoting for non-rST parts doesn't solve this entirely because the rST
>   messages quoted by the non rST parts lose their rST information...
>

Yes, but somehow I'm inclined to think that wouldn't be all that bad... 
my feeling is that trying to guess the quoted ReST from a non-ReST mail 
is simply too hard and error-prone to be worthwhile...

>2. Sloppy mail.  I don't see myself writing 100% correct rST all the time
>   and I don't want it pushed down my throat.  If that's the choice, I'd
>   be more happy with writing rST as I see fit and using only the eyeball
>   reader ;-).
>

Hmmmmm... I can sympathize, but I'm not sure how to deal with it. I have 
a dislike for guesswork on the part of the parser... Maybe what's needed 
is a 'ReST Tidy' (like w3c's HTML Tidy): A program taking sloppy ReST 
input and trying to make it into real ReST. -- Could this run on the 
sender's computer, so that they can see what the output'll look like?

>3. Once mail is sent, it generally can't be edited (since it's already
>   archived and people already have it in their mailboxes).  I'm not sure
>   if anything can be done about it but it's a peculiarity of email that
>   should be remembered.
>

What's the issue here?

(BTW, email is quite often edited when being quoted :) )

>Something like treating every paragraph that doesn't parse as literal
>would solve some of the issues.
>

This could potentially be done by the ReST Tidy tool... OTOH, aren't the 
paragraphs where you actually used some ReST-like markup most likely to 
contain 'sloppiness'?

>>The applications I envision are two flavors of email user agent (for
>>reading & writing email):
>>
>>    
>>
>*Breakage ahead*: your list is broken by line folding.  Unexpected line
>folding in the places where you least want it is all too popular with
>email tool writers ;-(.  Good rST that you write will end up broken for
>some readers sooner or later.  The "lazy indentation" ideas from the to-do
>list could help here a bit.
>

Yes, but so is 'text/plain; format=flowed' [RFC2646]. It gets around the 
problem by requiring (IIRC) that compliant user agents shall not fold 
lines, but reflow instead-- ReST mail could do that too. (Mozilla 
supports it. It seems to work in most cases, and the remaining ones can 
be glossed over.)
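
A minimal sketch of the receiving side, assuming the RFC 2646 convention
that a line ending in a space is a soft break to be joined with the next
line, that a leading space is space-stuffing to be removed, and that the
signature separator "-- " is never treated as flowed.  Quoted (">") lines
are deliberately not handled, and the name ``unflow`` is hypothetical:

```python
def unflow(text):
    """Join soft-wrapped lines of an unquoted format=flowed body.

    Returns a list of logical lines (paragraphs); blank lines are kept
    as empty strings.
    """
    paragraphs = []
    current = ""
    for line in text.split("\n"):
        if line.startswith(" "):
            line = line[1:]  # undo space-stuffing
        if line.endswith(" ") and line != "-- ":
            current += line  # soft break: keep joining
        else:
            paragraphs.append(current + line)  # hard break ends the paragraph
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs
```

A ReST mail reader could run something like this before parsing, so that
list items and paragraphs folded in transit are rejoined before
indentation is examined.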

>>- A text-based client, which would provide tools for composing ReST
>>e-mails, most notably a syntax validator (so that you don't accidentally
>>send emails that cannot be read because they aren't ReST); this would
>>send emails as text/vnd.python.rst or so.
>>    
>>
>
>I want such a tool.  I feel more need for editing features than for
>validation (all right, one day there will be an emacs mode).  Of course it
>would reflow paragraphs instead of broken line folding :-).
>

Yup :-)

>I'm not sure I like text/vnd.python.rst.  I don't want another text/html.
>Most mail readers will probably complain that they don't know how to read
>it.  
>

Hmm. I wouldn't suspect this, but I haven't tried. (The reason I would 
suspect it should work is that the RFCs say very clearly that you have 
to treat it as text/plain, and that's not so hard to implement. But of 
course we know how good standards compliance is in most systems... Does 
anybody have experience in how mail readers treat unknown text/* formats?)

>Sending identical content both as plain text and rST is more stupid
>than sending text version of html mail :).  I think it should be marked as
>plain text and be autodetected simply by validating.
>

Don't like it. ReST is *not* plain text to my mind. I don't ".. note::" 
things in plain text...

Besides, I think text/vnd.python.rst is exactly how MIME types are 
supposed to work: Everybody who knows it treats it specially, everybody 
who doesn't sees it as plain text. That was the idea behind 
text/enriched and text/html-- except that SGML isn't very readable and 
people complained about the angle brackets in the e-mails they received...

>>The difficult question is how to do quoting; I think (contrary to what
>>some people seem to be saying here) that quoting should not generally be
>>done through literal blocks-- I think quoting of non-ReST mails should
>>be, but quoting of ReST mails should preserve the formatting of the
>>quoted mail.
>>    
>>
>
>Who argued that?  IIRC, all the discussion assumed that the quoted block
>can be literal or not depending on presence of ``::``, similarly to the
>way it currently behaves with indented blocks.
>

Ok, I probably misunderstood something. Sorry.

>>More to the point, the rules would be as follows:
>>
>>- A block where each line is prefixed by "> " (angle bracket + space) is
>>*quoted*.
>>- A block where all but the first line is quoted by "> " is *quoted with
>>source given*.
>>- If the first line in a "quoted with source given" block ends in a
>>double colon, this quoted block and all following quoted blocks
>>*without* source given would also be literal blocks.
>>- If the first line does not end in a double colon, this quoted block
>>and all following ones w/o source would just be quoted, not literal blocks.
>>- (If there's a quoted block w/o source, and there is no quoted block
>>with source above, that block would be just quoted, not literal.)
>>
>>    
>>
>I think I like this.
>  
>

:-)
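
The quoting rules quoted above could be sketched as a small classifier.
This is only an illustration under some assumptions that are not part of
the proposal: blocks are separated by blank lines, a bare ">" line counts
as quoted, and a block of ordinary text does not reset the literal state.
The function name ``classify_blocks`` is made up.

```python
def classify_blocks(text):
    """Return one (kind, literal) pair per blank-line-separated block.

    kind is "quoted", "quoted-with-source" or "text".
    """
    results = []
    literal = False  # set by the most recent "quoted with source given" block
    for block in text.strip("\n").split("\n\n"):
        lines = block.split("\n")
        # Assumption: a bare ">" (empty quoted line) also counts as quoted.
        quoted = [line.startswith("> ") or line == ">" for line in lines]
        if all(quoted):
            # No source line: inherit the state from the last sourced block
            # (False if there was none above).
            results.append(("quoted", literal))
        elif len(lines) > 1 and not quoted[0] and all(quoted[1:]):
            # First line is the source; a trailing "::" makes it literal.
            literal = lines[0].rstrip().endswith("::")
            results.append(("quoted-with-source", literal))
        else:
            results.append(("text", False))
    return results


sample = "Alice wrote::\n> x = 1\n\n> y = 2\n\nBob wrote:\n> hi\n"
for kind, is_literal in classify_blocks(sample):
    print(kind, is_literal)
```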

- Benja


From cben@techunix.technion.ac.il  Sat Dec 21 21:40:20 2002
From: cben@techunix.technion.ac.il (Beni Cherniavsky)
Date: Sat, 21 Dec 2002 23:40:20 +0200 (IST)
Subject: [Doc-SIG] Email Reader (was Re: reST block quotes)
In-Reply-To: <3E023F5E.7090009@gmx.de>
Message-ID: <Pine.GSO.4.44_heb2.09.0212212329270.20403-100000@techunix.technion.ac.il>

On 2002-12-19, Benja Fallenstein wrote:

>
> Hiya,
>
> Beni Cherniavsky wrote:
>
> >1. A thread where some people write rST and some don't.  The quoted
> > illegal parts can easily spoil the rST context for the rest.  Literal
> > quoting for non-rST parts doesn't solve this entirely because the rST
> > messages quoted by the non rST parts lose their rST information...
> >
[Ooops, pine's quoting of indented text is broken.  Sorry :-]

> Yes, but somehow I'm inclined to think that wouldn't be all that bad...
> my feeling is that trying to guess the quoted ReST from a non-ReST mail
> is simply too hard and error-prone to be worthwhile...
>
> >2. Sloppy mail.  I don't see myself writing 100% correct rST all the time
> > and I don't want it pushed down my throat.  If that's the choice, I'd
> > be more happy with writing rST as I see fit and using only the eyeball
> > reader ;-).
> >
> Hmmmmm... I can sympathize, but I'm not sure how to deal with it. I have
> a dislike for guesswork on the part of the parser... Maybe what's needed
> is a 'ReST Tidy' (like w3c's HTML Tidy): A program taking sloppy ReST
> input and trying to make it into real ReST. -- Could this run on the
> sender's computer, so that they can see what the output'll look like?
>
+1 on rST Tidy.

> >3. Once mail is sent, it generally can't be edited (since it's already
> > archived and people already have it in their mailboxes).  I'm not sure
> > if anything can be done about it but it's a peculiarity of email that
> > should be remembered.
> >
> What's the issue here?
>
Just that after broken rST is emitted, it's generally too late to
tidy it :-).  Thinking of it again, I see your validate-when-sending point
of view.  I agree - I would usually want to validate to avoid
inconvenience for readers.

> (BTW, email is quite often edited when being quoted :) )
>
But you can't fix the master which gets quoted afresh in new branches of
the thread...

> >*Breakage ahead*: your list is broken by line folding.  Unexpected line
> >folding in the places where you least want it is all too popular with
> >email tool writers ;-(.  Good rST that you write will end up broken for
> >some readers sooner or later.  The "lazy indentation" ideas from the to-do
> >list could help here a bit.
> >
> Yes, but so is 'text/plain; format=flowed' [RFC2646]. It gets around the
> problem by requiring (IIRC) that compliant user agents shall not fold
> lines, but reflow instead-- ReST mail could do that too. (Mozilla
> supports it. It seems to work in most cases, and the remaining ones can
> be glossed over.)
>
Wouldn't it break every single literal block containing Python code, too?

> >I'm not sure I like text/vnd.python.rst.  I don't want another text/html.
> >Most mail readers will probably complain that they don't know how to read
> >it.
> >
>
> Hmm. I wouldn't suspect this, but I haven't tried. (The reason I would
> suspect it should work is that the RFCs say very clearly that you have
> to treat it as text/plain, and that's not so hard to implement. But of
> course we know how good standards compliance is in most systems... Does
> anybody have experience in how mail readers treat unknown text/* formats?)
>
OK, you've obviously read more RFCs than I.  Glad to know it should work.

> >Sending identical content both as plain text and rST is more stupid
> >than sending text version of html mail :).  I think it should be marked as
> >plain text and be autodetected simply by validating.
> >
> Don't like it. ReST is *not* plain text to my mind. I don't ".. note::"
> things in plain text...
>
I was only suggesting that because I didn't know any text/... should work.

-- 
Beni Cherniavsky <cben@tx.technion.ac.il>


From b.fallenstein@gmx.de  Sun Dec 22 19:28:51 2002
From: b.fallenstein@gmx.de (Benja Fallenstein)
Date: Sun, 22 Dec 2002 20:28:51 +0100
Subject: [Doc-SIG] Email Reader (was Re: reST block quotes)
In-Reply-To: <Pine.GSO.4.44_heb2.09.0212212329270.20403-100000@techunix.technion.ac.il>
References: <Pine.GSO.4.44_heb2.09.0212212329270.20403-100000@techunix.technion.ac.il>
Message-ID: <3E061273.3010306@gmx.de>

Beni Cherniavsky wrote:

>On 2002-12-19, Benja Fallenstein wrote:
>  
>
>>Beni Cherniavsky wrote:
>>
>>>3. Once mail is sent, it generally can't be edited (since it's already
>>>archived and people already have it in their mailboxes).  I'm not sure
>>>if anything can be done about it but it's a peculiarity of email that
>>>should be remembered.
>>>
>>>      
>>>
>>What's the issue here?
>>
>>    
>>
>Just that after broken rST is emitted, it's generally too late to
>tidy it :-).
>

Ah, ok, I see your point now.

>  Thinking of it again, I see your validate-when-sending point
>of view.  I agree - I would usually want to validate to avoid
>inconvenience for readers.
>

Cool, sounds like we are in agreement here :-) :-)

>>(BTW, email is quite often edited when being quoted :) )
>>
>>    
>>
>But you can't fix the master which gets quoted afresh in new branches of
>the thread...
>

Right.

>>>*Breakage ahead*: your list is broken by line folding.  Unexpected line
>>>folding in the places where you least want it is all too popular with
>>>email tool writers ;-(.  Good rST that you write will end up broken for
>>>some readers sooner or later.  The "lazy indentation" ideas from the to-do
>>>list could help here a bit.
>>>
>>>      
>>>
>>Yes, but so is 'text/plain; format=flowed' [RFC2646]. It gets around the
>>problem by requiring (IIRC) that compliant user agents shall not fold
>>lines, but reflow instead-- ReST mail could do that too. (Mozilla
>>supports it. It seems to work in most cases, and the remaining ones can
>>be glossed over.)
>>
>>    
>>
>Wouldn't it break every single literal block containing Python code, too?
>

Hmmm. I guess the 'right' way to handle this would be not to reflow 
literal blocks, and not to line-break them either (since the mail reader 
would know about ReST, it would be able to take care of this correctly). 
I think that the SMTP infrastructure generally doesn't add additional 
linebreaks, so this should work-- if some server breaks the lines, of 
course, that would destroy the ReST formatting, but I think they don't. 
Again, the point is that the mail reader must handle ReST correctly...

The RFCs allow lines of up to 1000 characters, but recommend lines up to 
80 characters because many mail readers show these better. I guess that 
literal blocks with >80 chars/line (or literal blocks inside quoted text 
etc.) are good cases for using >80 chars.
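
A rough sketch of what "reflow, but leave literal blocks alone" could look
like, using only the standard ``textwrap`` module. It is deliberately
naive: it assumes top-level paragraphs separated by blank lines, treats
any indented block following a paragraph that ends in "::" as literal, and
does not understand lists, directives or "> " quoting. The name ``reflow``
and the default width are arbitrary.

```python
import textwrap


def reflow(text, width=72):
    """Re-wrap ordinary paragraphs; pass literal blocks through untouched."""
    out = []
    in_literal = False
    for block in text.strip("\n").split("\n\n"):
        lines = block.split("\n")
        indented = all(line.startswith(" ") for line in lines)
        if in_literal and indented:
            # Literal block: keep line breaks and over-long lines as they are.
            out.append(block)
            continue
        # Ordinary paragraph: undo any folding, then wrap to the new width.
        joined = " ".join(line.strip() for line in lines)
        out.append(textwrap.fill(joined, width=width,
                                 break_long_words=False,
                                 break_on_hyphens=False))
        # A paragraph ending in "::" introduces a literal block.
        in_literal = block.rstrip().endswith("::")
    return "\n\n".join(out)


folded = "Some code that\nwas folded::\n\n    def f(x):\n        return x\n"
print(reflow(folded, width=60))
```

A real reader would of course work on the parsed document rather than on
blank-line-separated chunks; this only shows the idea.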

>>>I'm not sure I like text/vnd.python.rst.  I don't want another text/html.
>>>Most mail readers will probably complain that they don't know how to read
>>>it.
>>>
>>>      
>>>
>>Hmm. I wouldn't suspect this, but I haven't tried. (The reason I would
>>suspect it should work is that the RFCs say very clearly that you have
>>to treat it as text/plain, and that's not so hard to implement. But of
>>course we know how good standards compliance is in most systems... Does
>>anybody have experience in how mail readers treat unknown text/* formats?)
>>
>>    
>>
>OK, you've obviously read more RFCs than I.  Glad to know it should work.
>

Ok :-)

So far, our discussion suggests that we'd need:
- a ReST Tidy
- an extension of the ReST specification, for "> " quoting
- a specification of ReST email: MIME type, how to handle reflowing when 
replying, possibly other issues if they come up
- an email reader implementing the above

ReST Tidy would obviously also have applications outside this context. I 
think I may like to use ">" when quoting emails in ReST.

- Benja