[Doc-SIG] Lazy paragraph indentation
Garth T Kidd
Sat, 21 Jul 2001 19:04:20 +1000
> > It sure looks to me like a requirement for a switchable mode in the
> > parser. Different applications can choose different defaults.
> This is workable. If you can come up with consistent, unambiguous,
> rules for lazy indentation, then Wikis and other apps could use the
> > Or, the parser could attempt to automatically figure it out.
> That's a dangerous path. Explicit is better.
This quote is out of order because it's more important:
> Although not explicitly stated in the spec (yet), the way
> I've implemented the parser is to do line/block parsing first,
> then inline markup parsing afterwards (standalone URI parsing last).
"Sorry, changing the parser order is just too hard at this stage to
relax the requirement for blank lines between list entries in lazy mode"
is a perfectly reasonable argument in favour of that requirement, and
I'm entirely happy to accept it. _
.. _ You have no idea how frustrated my fiancée gets when, after
attempting to justify a decision with several sadly illogical _
arguments in its favour and listening to me patiently dissect
and dismiss each one, she discovers that "I feel like it, okay?"
was all that she needed to say.
Well, maybe you have a slight idea. :)
One of these days, I'm going to clue up and ask right after the
first one whether she just feels like it. It'll save a lot of
time and angst. Similarly, I should have asked up front whether
the implementation of my proposal was going to be difficult.
.. _ No, this is not an attempt to slyly call your arguments
illogical. Misguided, much Frowned upon by God, and if not
abandoned sure to lead to your Eternal Damnation in Hell,
but not illogical by any shake of the stick. :)
The now sadly irrelevant argument in favour of a less strict lazy mode
Summarizing the issue of badly wrapped lists and lazy mode:
* We're either in lazy mode, or not. No automatic selection. Cool.
* The following example is still contentious::
- This is list item 1. Here's a formula: "x = x
- 1".
- Here's list item 2. Sure looks like item 3 though.
* The strict approach to the example:
* A parser permitting lazy indentation without insisting upon
blank lines would interpret the above as "three lists", and
* A human reader strictly reading the specification would
reach a similar conclusion, but
* That's obviously not what the user intended when they wrote ::
- This is list item 1. Here's a formula: "x = x - 1".
before their editor badly wrapped the line.
* We could call this disconnect "ambiguity",
* In the parser world, "ambiguity" is a bad word,
* Therefore blank lines **must** be insisted upon between list items
in lazy mode.
* Arguments in favour of being more forgiving:
*Ambiguity ain't always that ambiguous*:
The kind of ambiguity we're most worried about is *circumstances
for which the parser's behaviour is undefined*. The parser needs
to be able to consistently make a decision, and programmers
implementing parsers need to be able to make a decision.
This clearly isn't such a case. The user will be typing a list.
When they see the results, they'll mutter dark words about the
stupid editor their company insist they use, and they'll fix
the markup somehow (see below).
If asked "hey, do you consider what just happened ambiguous?",
I don't imagine many users would reply in the affirmative.
They explicitly typed something. Their editor explicitly stuffed
it up. The parser explicitly interpreted the text, and the user
explicitly said expletives and explicitly fixed the problem.
Any confusion in the user's mind when seeing the output will
disappear when the system sends them their text back for editing
and they see what their text editor did.
*Consider the user impact*:
This kind of a strict "never suffer ambiguity to live" attitude
imposes a heavy burden on the user every time they use a list
(probably quite often) in order to save them from something
untoward that might happen to them only once a year, if ever.
A comparison might be made to money handling. If your current
cash register techniques occasionally allow minor mistakes to be
made, you could well lose hundreds of dollars per year.
Insisting that all totals are manually verified by a supervisor
will save those hundreds of dollars, but cost tens of thousands
in additional salary. Moreover, all of your customers might
abandon your store because they're sick of the hassle.
*Users can avoid the problem very, very easily*:
Any user aware that their editor wraps lines for them, and
aware that a copy of the list delimiter unfortunately wrapped
to the beginning of the line will cause the parser to start
a new list item, will do one of the following:
* Manually wrap such a long item well before the wrap point::
- This is list item 1.
Here's a formula: "x = x - 1".
- Here's list item 2.
* Choose a different list delimiter.
* Use literals (assuming the parser is changed so that literals
bind harder than the beginning of list items).
* Drop into strict mode temporarily: _ ::
- This is list item 1, which contains a formula that
I'm not sure will wrap appropriately, so I'm going
to drop into strict mode and manually wrap each and
every line well before the wrap point.
Anyway, here's the formula: "x = x - 1".
- Here's list item 2.
I suspect the first two will be slightly more popular. :)
Any user waking up regularly dripping with sweat because of
recurring nightmares about having to go back and fix their
markup will, I think, go to the effort of finding an editor
that will write their markup for them.
*What would the user choose?*:
Given a choice between the following:
* a *strict* mode that insists that users manually wrap each
and every line well before their editor's wrap point *and*
manually indent those lines as well,
* a *strictly lazy* mode that relaxes the requirements for
manual wrapping and indentation but insists upon blank lines
between all list items, and
* a hypothetical *bloody lazy* _ mode that doesn't insist
upon those blank lines but that requires users to consider
editor wrap points when putting list delimiters in the middle
of list items,
I somewhat suspect that many users would end up being bloody
lazy. Certainly, if bloody laziness were the default, I
sincerely doubt that many people would bother switching to a
stricter mode, even if they got caught out once or twice.
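
A minimal sketch of how the two lazy modes would treat the badly
wrapped example, under stated assumptions: it is not the actual parser,
``split_items`` and ``require_blank_lines`` are hypothetical names, and
only flush-left lines with a single bullet character are handled.

```python
import re

# Hypothetical bullet pattern: "-", "*" or "+" followed by one space.
BULLET = re.compile(r"^[-*+] ")

# The example after an editor has wrapped the formula onto a new line.
WRAPPED = (
    '- This is list item 1. Here\'s a formula: "x = x\n'
    '- 1".\n'
    "- Here's list item 2. Sure looks like item 3 though.\n"
)

# The same text with a blank line between the real list items.
SPACED = (
    '- This is list item 1. Here\'s a formula: "x = x\n'
    '- 1".\n'
    "\n"
    "- Here's list item 2.\n"
)


def split_items(text, require_blank_lines):
    """Split flush-left list text into items.

    require_blank_lines=True  -> "strictly lazy": a bullet starts a new item
                                 only at the start or after a blank line.
    require_blank_lines=False -> "bloody lazy": every line that begins with
                                 a bullet starts a new item.
    """
    items = []
    previous_blank = True  # treat the start of the text like a blank line
    for line in text.splitlines():
        if not line.strip():
            previous_blank = True
            continue
        is_bullet = BULLET.match(line) is not None
        if is_bullet and (previous_blank or not require_blank_lines):
            items.append(line[2:])       # new item, bullet stripped
        elif items:
            items[-1] += " " + line      # lazy continuation of current item
        else:
            items.append(line)           # text before any bullet
        previous_blank = False
    return items


bloody_lazy = split_items(WRAPPED, require_blank_lines=False)
strictly_lazy = split_items(SPACED, require_blank_lines=True)
```

Here ``bloody_lazy`` holds three items, because the wrapped ``- 1".``
looks like a bullet, while ``strictly_lazy`` holds two, with the formula
rejoined as ``"x = x - 1".`` in the first.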
.. _ There's the `Queen's English` again.
.. _ Well, there's an example of a parser directive, if we need
> Yes, but we are trying to avoid surprises when accidental bad
> wrapping takes place. The user doesn't always have control.
> My email client wraps my paragraphs, even if I don't want it to.
Well, exactly, but there's nothing wrong with surprises if the user can
figure out how to respond to the surprise. Users are going to be
stuffing up quite often, will be surprised to see that what they did
didn't work, and will look at their markup again and maybe refer to the
specification to figure out what happened and what to do about it.
If we're not worried about that (leading to directives like: "users must
never write their own markup, but must use an editor that doesn't let
them make mistakes"), why are we worried about this wrapping and list
The user has enough control over the wrapping to force a wrap earlier
than the editor would, which is more than s/he needs to either dodge or
fix the problem.
> > * The *user* might be a little confused for a moment.
> > The user is going to spend a lot of time confused regardless.
> Confusion is OK, as long as it stems from ignorance;
> education/experience fixes that. Confusion stemming from surprising
> (even if *very occasionally* surprising) side-effects of the markup,
> that's not acceptable.
Call it a side-effect of the editor. If anyone gets particularly
detail-oriented and angst-ridden about the whole thing, direct them to
the list archives (of which I'm sure I'm going to be sufficiently
embarrassed), point out that it's all my fault, and give them my email
> Writing the spec and implementing the parser, I've tried to
> avoid surprises and ambiguity wherever possible. If avoidance is
> not possible, then the possible surprises have to be minimized,
> explicitly documented, and warned of by the parser. Also, there
> has to be an "out" or workaround (which is where
> backslash-escapes come in handy).
Let's say that it were impossible to insist on the blank lines for
non-technical reasons (the managing director hates them). I think the
possible surprises are minimal, I'll write the documentation, I'll try
and figure out a way to warn about the situation (spotting a broken
literal is the easiest way until we climb into the ordered list
rat-hole), and there's an easy out. Close enough?
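
One way the "broken literal" warning might look, as a rough sketch only:
it assumes literals are delimited by double quotes, as in the formula
example, and that the parser has already split the text into items. The
name ``broken_literal_warnings`` is hypothetical, not part of any
existing parser.

```python
def broken_literal_warnings(items):
    """Warn when a quote opened in one item appears to close in the next.

    An odd number of double quotes in two adjacent items suggests that a
    quoted literal was split by a wrapped line that began with a bullet.
    """
    warnings = []
    for index in range(len(items) - 1):
        opens = items[index].count('"') % 2 == 1
        closes = items[index + 1].count('"') % 2 == 1
        if opens and closes:
            warnings.append(
                f"items {index + 1} and {index + 2} may be one badly wrapped item"
            )
    return warnings


# Items as a lazy parser without blank-line checks would see the example.
suspect = [
    'This is list item 1. Here\'s a formula: "x = x',
    '1".',
    "Here's list item 2. Sure looks like item 3 though.",
]
messages = broken_literal_warnings(suspect)
```

The heuristic only flags the case; it does not try to repair the list,
so the user still decides how to fix the markup.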
> > More glibly put: two out of three ain't bad. I think
> > they'll cope. :)
> You're a programmer. Imagine if Python had funny edge cases. Would
> you *cope*? Or would you scream bloody murder?
Python surprises me every week. Then I figure out that my editor broke
the indentation. I fix what my editor broke, and keep working. I cope.
> Out of respect for the eventual users of reStructuredText, we can't
> allow *any* surprises.
We're doing it for your own good!
Out of respect for people already suffering crummy editors, I'm trying
to cut them as many breaks as I can. Users who absolutely cannot stand
surprises can always turn on strictness or strict laziness, eh?
It just occurred to me that I've spent more time discussing this than I
could possibly have spent as a user swearing about needing to put blank
lines in. Sorry about that.
I'm mainly worried about people cutting and pasting mail into their web
browser (it'll happen). Saving them the effort of breaking the bullet
lists apart seems like a fair thing.
> It will be great if you can come up with a consistent
> indentation-minimized syntax; I'm all for it.
Still working on it!
Oh, the shame: a Python programmer trying to figure out how to avoid
> All you need to do is devise an alternative representation of
> hierarchical structures, one that doesn't use indentation
> or begin/end markers. If it *does* use begin/end markers,
> we'll call it something else ;-), and start another parser
> component project for it.
If it had to use begin and end markers, we may as well write it in