[Doc-SIG] Lazy paragraph indentation

David Goodger dgoodger@bigfoot.com
Fri, 20 Jul 2001 18:25:32 -0400


I've had some time to respond to individual comments:

on 2001-07-19 10:24 PM, Garth T Kidd (garth@deadlybloodyserious.com) wrote:
> .. The spec supports nested block quotes, right?

Yes. So? I don't get your point.

> It sure looks to me like a requirement for a switchable mode in the
> parser. Different applications can choose different defaults.

This is workable. If you can come up with consistent, unambiguous, safe
rules for lazy indentation, then Wikis and other apps could use the lazy
variant. 

> Or, the
> parser could attempt to automatically figure it out.

That's a dangerous path. Explicit is better.

> You point out ambiguity in your example of a badly wrapped paragraph
> containing the bullet selector::
> 
>   - This is list item 1. Here's a formula: "x = x
>   - 1".
>   - Here's list item 2. Sure looks like item 3 though.

I think intervening blank lines are an absolute requirement for lazy
indentation. So the example would be like this::

    - This is list item 1. Here's a formula: "x = x
    - 1".

    - Here's list item 2. Sure looks like item 3 though.

If *only* lazy indentation is used, no problem. If the parser tries to infer
the author's style, it would mistakenly infer strict indentation.
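
To make the difference concrete, here is a minimal sketch (not the actual parser) of how the two modes would split the example above. The function name ``split_items``, the ``lazy`` flag, and the bullet set are assumptions made for illustration; the example text is placed at column zero, and a non-bullet line is simply treated as a continuation of the current item.

```python
import re

# Assumed bullet characters, each followed by a space.
BULLET = re.compile(r"^[-+*] ")


def split_items(text, lazy):
    """Split a flat bullet list into item strings.

    lazy=True:  a bullet opens a new item only at the start of the text
                or after a blank line.
    lazy=False: every line that starts with a bullet opens a new item.
    """
    items = []
    after_blank = True
    for line in text.splitlines():
        if not line.strip():
            after_blank = True
            continue
        is_bullet = BULLET.match(line) is not None
        if (is_bullet and (after_blank or not lazy)) or not items:
            # Start a new item, dropping the bullet if there is one.
            items.append(line[2:] if is_bullet else line)
        else:
            # Continuation line: join it to the current item.
            items[-1] += " " + line
        after_blank = False
    return items


example = (
    '- This is list item 1. Here\'s a formula: "x = x\n'
    '- 1".\n'
    "\n"
    "- Here's list item 2. Sure looks like item 3 though.\n"
)

lazy_items = split_items(example, lazy=True)
strict_items = split_items(example, lazy=False)
```

With ``lazy=True`` the sketch yields two items and the formula survives intact; with ``lazy=False`` the same text yields three items, which is the misreading described above.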

[Garth lists workarounds:]
>  * Manually wrap it closer to column zero::

Yes, but we are trying to avoid surprises when accidental bad wrapping takes
place. The user doesn't always have control. My email client wraps my
paragraphs, even if I don't want it to.

>  * Use a different bullet::

Change the example to "x = (x + 1) * 3 - 2" (all possible bullets included),
and this workaround won't always work.
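
As a rough illustration of why switching bullets does not help, the sketch below lists every wrap point in that formula that would leave a bullet-like marker at the start of the next line. The helper name ``risky_wraps`` and the bullet set ``-+*`` are assumptions for the example only.

```python
BULLETS = "-+*"  # assumed bullet characters
formula = "x = (x + 1) * 3 - 2"


def risky_wraps(text):
    """Return the continuation lines that would look like list items
    if the text were wrapped at each space in turn."""
    risky = []
    for index, char in enumerate(text):
        if char != " ":
            continue
        rest = text[index + 1:]
        # A bullet character followed by a space looks like a list item.
        if len(rest) >= 2 and rest[0] in BULLETS and rest[1] == " ":
            risky.append(rest)
    return risky


continuations = risky_wraps(formula)
```

Each of the three bullet characters turns up at the start of one possible continuation line, so no single choice of bullet avoids the problem.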

>    Implication: a rule in the parser that says that blank lines
>    are required between adjacent but different lists at the same
>    indentation level, even if lazy paragraph formatting is turned
>    on.

My parser actually does this. I'll add mention of it to the spec.
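
A minimal sketch of such a check, assuming flat lists at column zero. How the real parser detects or reports this is not described here, and the names ``bullet_of`` and ``missing_blank_lines`` are hypothetical.

```python
BULLETS = "-+*"  # assumed bullet characters


def bullet_of(line):
    """Return the bullet character that starts the line, or None."""
    if len(line) >= 2 and line[0] in BULLETS and line[1] == " ":
        return line[0]
    return None


def missing_blank_lines(lines):
    """Return 1-based numbers of lines where the bullet character changes
    without an intervening blank line."""
    problems = []
    current = None  # bullet of the list in progress; None after a blank line
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            current = None
            continue
        bullet = bullet_of(line)
        if bullet and current and bullet != current:
            problems.append(number)
        if bullet:
            current = bullet
    return problems


sample = ["- one", "- two", "* three", "", "+ four"]
flagged = missing_blank_lines(sample)
```

In the sample, only line 3 is flagged: the ``*`` list starts directly after the ``-`` list, while the ``+`` list is properly separated by a blank line.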

>  * Use an inline literal::
> 
>      - This is list item 1. Here's a formula: ``x = x
>      - 1``.
>      - Here's list item 2, as the parser considers the second
>      line in this example part of the literal started in line 1.

Although not explicitly stated in the spec (yet), the way I've implemented
the parser is to do line/block parsing first, then inline markup parsing
afterwards (standalone URI parsing last). So in the case above, the "- 1``."
would be recognized as a new list item before being examined for inline
literals. The "\``x = x" at the end of the first line would generate a
warning, "Inline literal start-string without end-string."
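
The ordering can be sketched in two small passes. This is an illustration under stated assumptions, not the parser's code: ``block_pass`` and ``inline_pass`` are hypothetical names, only ``-`` bullets are handled, and the start-string rule (preceded by whitespace or start of text, followed by non-whitespace) is a simplification. The warning text is the one quoted above.

```python
import re

WARNING = "Inline literal start-string without end-string."

# Simplified assumption: a start-string is `` at the start of the text or
# after whitespace, and immediately followed by a non-whitespace character.
START = re.compile(r"(?<!\S)``(?=\S)")


def block_pass(lines):
    """First pass: every line starting with "- " opens a new list item."""
    items = []
    for line in lines:
        if line.startswith("- "):
            items.append(line[2:])
        elif items:
            items[-1] += " " + line.strip()
    return items


def inline_pass(text):
    """Second pass: look for inline literals inside one block's text."""
    warnings = []
    position = 0
    while True:
        match = START.search(text, position)
        if match is None:
            break
        end = text.find("``", match.end())
        if end == -1:
            warnings.append(WARNING)
            break
        position = end + 2
    return warnings


lines = [
    "- This is list item 1. Here's a formula: ``x = x",
    "- 1``.",
    "- Here's list item 2.",
]
items = block_pass(lines)
warnings_per_item = [inline_pass(item) for item in items]
```

Because the block pass runs first, the text is already split into three items before any inline literal is looked for, so the literal opened in the first item can never be closed by the second.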

> There are hundreds of billions [1]_ of frustrated Wiki users out there
> pounding their heads against the Wiki markup syntax, and almost as many
> ZWiki users ripping their hair out because StructuredText is just as bad
> or worse. Telling them we're not going to throw them a line and rescue
> them from shark infested water because they might get our precious rope
> wet seems a tad... stingy.

I'm all in favor of throwing them a line. But (to extend your analogy
further) I want the line to be strong and well anchored, so they don't get
tangled up in it and drown. :-)

> * The *user* might be a little confused for a moment.
> 
> The user is going to spend a lot of time confused regardless.

Confusion is OK, as long as it stems from ignorance; education/experience
fixes that. Confusion stemming from surprising (even if *very occasionally*
surprising) side-effects of the markup, that's not acceptable.

> I'm wary of insisting upon serious inconvenience to a large segment of
> the user population for [3]_ to save inconvenience to the occasional
> user who stumbles across the edge case of a list item that happens to
> have a list delimiter just after the wrap column.

In putting together these specs and the parser software, I've always kept
this in mind: 

   If it can go wrong, it will.

Writing the spec and implementing the parser, I've tried to avoid surprises
and ambiguity wherever possible. If avoidance is not possible, then the
possible surprises have to be minimized, explicitly documented, and warned of
by the parser. Also, there has to be an "out" or workaround (which is where
backslash-escapes come in handy).

> More glibly put: two out of three ain't bad. I think they'll cope. :)

You're a programmer. Imagine if Python had funny edge cases. Would you
*cope*? Or would you scream bloody murder? Out of respect for the eventual
users of reStructuredText, we can't allow *any* surprises.

It will be great if you can come up with a consistent indentation-minimized
syntax; I'm all for it. All you need to do is devise an alternative
representation of hierarchical structures, one that doesn't use indentation
or begin/end markers. If it *does* use begin/end markers, we'll call it
something else ;-), and start another parser component project for it.