[Doc-SIG] formalizing StructuredText

Edward D. Loper edloper@gradient.cis.upenn.edu
Thu, 22 Mar 2001 11:54:49 EST


> > I guess that seems reasonable.  Within paragraphs, do you collapse
> > multiple spaces into one space?
> 
> No, within lines spaces are (very carefully) left untouched, just in
> case.

That seems inconsistent with stripping trailing whitespace.  I want
to make sure that things look "the same" whether you view them in
text or in HTML.  So it seems like if you print out a STpy string
in vt100 mode, with a 130-character-wide screen, it should collapse
spaces & re-word-wrap, just like HTML/LaTeX would.  Of course, you
could say that that's just a presentation issue.  But I think that
for consistency, we should either:

    1. Preserve spaces within lines *and* trailing whitespace.
       Specify that display tools (even plaintext ones) should treat 
       spaces as soft, and should word-wrap.
    2. Remove leading/trailing whitespace, and collapse sequences
       of spaces to single spaces.  Specify that display tools 
       should word-wrap.  (Of course, you don't collapse spaces in
       literal regions, inline regions, literal blocks, or python
       test blocks.)

(Unless, of course, you can give me a good reason why we *should*
preserve sequences of spaces.)
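
To make option 2 concrete, here is a minimal sketch of what I have in
mind.  The function names are made up for illustration (they are not
part of STpy), and the sketch deliberately ignores literal regions,
inline regions, literal blocks, and python test blocks, which would
have to be pulled out before normalizing:

```python
import textwrap


def normalize_paragraph(text):
    # Hypothetical helper: drop leading/trailing whitespace and collapse
    # every run of whitespace (spaces and line breaks) to a single space.
    return " ".join(text.split())


def display(text, width=130):
    # Hypothetical plaintext display tool: treat spaces as soft and
    # re-word-wrap the normalized paragraph to the given width.
    return textwrap.fill(normalize_paragraph(text), width=width)
```

With this, the same paragraph comes out identically no matter how it
was spaced or wrapped in the source, which is the "looks the same in
text and in HTML" property I'm after.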

> > > > > >     "the following is not a url":<hi>
> > >
> > > That's right. In this instance.
> >
> > So does it get rendered as is (i.e., with two quote signs, one colon
> > sign, a less than sign, and a greater than sign)?
> 
> That's up to the renderer. But seriously, it gets *stored* as a node of
> the DOM tree which has the text within quotes (i.e., the quotes are not
> preserved) as its text, and the URL as its 'url' attribute. Thus the ST
> markup (the double quotes and the colon) are not remembered.

But "<hi>" doesn't match the url pattern, so presumably it doesn't
even get detected by the href-finding-regexp?  As I understand it,
you can say::

  "This is a test": of StructuredText

and it will be rendered (in HTML) as::

  "This is a test": of StructuredText

and not as::

  <a href="">This is a test</a> of StructuredText
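
Roughly, I'm assuming the detection works something like the sketch
below.  The pattern is my own guess for illustration, not the actual
STpy regexp: quoted text, a colon, and then something that starts
with a recognizable URL scheme.

```python
import re

# Assumed pattern (hypothetical, not the real STpy regexp): "text":url
# where url must begin with a known scheme.
HREF_RE = re.compile(
    r'"([^"]+)":((?:https?|ftp)://[^\s<>"]+|mailto:[^\s<>"]+)'
)


def find_links(text):
    # Return (link text, url) pairs for every match in the string.
    return [(m.group(1), m.group(2)) for m in HREF_RE.finditer(text)]
```

Under that assumption, neither '"the following is not a url":<hi>'
nor '"This is a test": of StructuredText' produces a link, so the
quotes and colon are left as ordinary text.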

> > The markup-nesting problem doesn't actually seem that difficult to me,
> > in principle.  I propose that we allow anything to nest
> > within anything,
> > with the restrictions:
> >   1. nothing can nest inside a literal, inline, or href url
> 
> Agreed. But please don't call it an 'href url' - that's an HTML term!
> 
> >   2. nothing can nest within itself (even with intervening levels)
> 
> Pragmatically has to be true, with non-differentiated start and end
> quotes.

It doesn't *have* to be true.  In principle we could allow::

  *This **is *no* good** for me*

But I don't think we should.

> These two seem to me to be the sane minimum, and thus sensible.

So we'll stick with that for now.
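
For what it's worth, the two restrictions are easy to check on a
parsed tree.  Here is a rough sketch, assuming each region is a
(kind, children) pair whose children are plain strings or further
regions; the kind names and the function are hypothetical, not
anything in the current implementation:

```python
# Kinds that may not contain any nested markup (restriction 1).
LEAF_ONLY = {"literal", "inline", "url"}


def check_nesting(node, ancestors=()):
    # Return a list of violations of the two nesting restrictions.
    kind, children = node
    errors = []
    if kind in ancestors:
        # Restriction 2: nothing nests within itself, even with
        # intervening levels.
        errors.append(f"{kind} nested within itself")
    for child in children:
        if isinstance(child, str):
            continue
        if kind in LEAF_ONLY:
            errors.append(f"{child[0]} nested inside {kind}")
        errors.extend(check_nesting(child, ancestors + (kind,)))
    return errors
```

The "no good" example above, parsed as emph containing strong
containing emph, would be flagged by the second check.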

> > Also, spaces must come between * and ** delimiters, so you
> > can't say ***this***.
> 
> Ah, but there's no reason you shouldn't be able to *say **this***, for
> instance (it's quite unambiguous).

But I thought that regions had to be ended by valid punctuation or
space?  Does '*' count as valid punctuation, then?  (Of course,
I expect your regexps to change significantly when you try to do
nesting...)

But from a more abstract point of view, I think that '***' will end
up being too confusing.  I don't think it's unreasonable to require
that people *say **this** *.  At the very least, it seems much
easier to read (for those who aren't intimately familiar with ST,
i.e., our entire user base :) )

But I guess that if we are to allow it, I think '***' should only
be allowed to mean "close both strong and emph" or "open both
strong and emph".  So you shouldn't be able to say::

  *Too***confusing**

to mean::

  *Too* **confusing**

But just to be clear, I don't think we should allow it at all. :)
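
If we do disallow it, the check is trivial.  A minimal sketch (the
function name is made up, and it does not skip literal regions, which
a real implementation would have to do):

```python
import re

# Any run of three or more asterisks is treated as ambiguous markup.
TRIPLE_RE = re.compile(r"\*{3,}")


def find_ambiguous_runs(text):
    # Return (offset, run) pairs for each run of 3+ asterisks.
    return [(m.start(), m.group()) for m in TRIPLE_RE.finditer(text)]
```

This flags both '*Too***confusing**' and '*say **this***', while
'*say **this** *' passes.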

-Edward