[Doc-SIG] formalizing StructuredText
Edward D. Loper
edloper@gradient.cis.upenn.edu
Thu, 22 Mar 2001 11:54:49 EST
> > I guess that seems reasonable. Within paragraphs, do you collapse
> > multiple spaces into one space?
>
> No, within lines spaces are (very carefully) left untouched, just in
> case.
That seems inconsistent with stripping trailing whitespace. I want
to make sure that things look "the same" whether you view them in
text or in HTML. So it seems like if you print out a STpy string
in vt100 mode, with a 130-character-wide screen, it should collapse
spaces & re-word-wrap, just like HTML/LaTeX would. Of course, you
could say that that's just a presentation issue. But I think that
for consistency, we should either:
  1. Preserve spaces within lines *and* trailing whitespace.
     Specify that display tools (even plaintext ones) should treat
     spaces as soft, and should word-wrap.
  2. Remove leading/trailing whitespace, and collapse sequences
     of spaces to single spaces. Specify that display tools
     should word-wrap. (Of course, you don't collapse spaces in
     literal regions, inline regions, literal blocks, or Python
     test blocks.)
(unless, of course, you can give me a good reason why we *should*
preserve sequences of spaces.)
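To make option 2 concrete, here is a minimal sketch of what I have in mind. The function names are made up for illustration, and it deliberately ignores literal regions, inline regions, literal blocks, and Python test blocks, which would have to be split out and left alone before this is applied.

```python
import re
import textwrap


def normalize_paragraph(text):
    """Strip leading/trailing whitespace and collapse whitespace runs.

    Hypothetical helper: treats newlines inside a paragraph as soft
    spaces, so the paragraph can be re-wrapped by the display tool.
    """
    return re.sub(r"\s+", " ", text).strip()


def display(text, width=130):
    """Re-word-wrap a normalized paragraph for a plaintext display."""
    return textwrap.fill(normalize_paragraph(text), width=width)
```

With this, the stored text is the same no matter how the author spaced it, and the width is purely a display-time choice.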
> > > > > > "the following is not a url":<hi>
> > >
> > > That's right. In this instance.
> >
> > So does it get rendered as is (i.e., with two quote signs, one colon
> > sign, a less than sign, and a greater than sign)?
>
> That's up to the renderer. But seriously, it gets *stored* as a node of
> the DOM tree which has the text within quotes (i.e., the quotes are not
> preserved) as its text, and the URL as its 'url' attribute. Thus the ST
> markup (the double quotes and the colon) are not remembered.
But "<hi>" doesn't match the url pattern, so presumably it doesn't
even get detected by the href-finding-regexp? As I understand it,
you can say::

    "This is a test": of StructuredText

and it will be rendered (in HTML) as::

    "This is a test": of StructuredText

and not as::

    <a href="">This is a test</a> of StructuredText
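Here is a rough sketch of the behaviour I am assuming. The regexp is my own guess for illustration and is not the actual pattern used by any ST implementation: it only treats quoted text plus a colon as a link when what follows looks like a URL, and otherwise passes the text through unchanged (apart from HTML escaping).

```python
import re
from html import escape

# Hypothetical pattern: "text", a colon, then something with a URL scheme.
HREF = re.compile(r'"([^"]+)":((?:https?|ftp|mailto):[^\s<>"]+)')


def render_hrefs(text):
    """Render "text":url pairs as HTML links; leave everything else as is."""
    out = []
    pos = 0
    for m in HREF.finditer(text):
        # Text before the match is plain text: escape <, > and & only.
        out.append(escape(text[pos:m.start()], quote=False))
        out.append('<a href="%s">%s</a>'
                   % (escape(m.group(2)), escape(m.group(1), quote=False)))
        pos = m.end()
    out.append(escape(text[pos:], quote=False))
    return "".join(out)
```

Under this pattern neither `"This is a test": of StructuredText` nor `"the following is not a url":<hi>` is detected as a link, so the quotes and colon survive into the output.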
> > The markup-nesting problem doesn't actually seem that difficult to me,
> > in principle. I propose that we allow anything to nest
> > within anything,
> > with the restrictions:
> > 1. nothing can nest inside a literal, inline, or href url
>
> Agreed. But please don't call it an 'href url' - that's an HTML term!
>
> > 2. nothing can nest within itself (even with intervening levels)
>
> Pragmatically has to be true, with non-differentiated start and end
> quotes.
It doesn't *have* to be true. In principle we could allow::

    *This **is *no* good** for me*

But I don't think we should.
> These two seem to me to be the sane minimum, and thus sensible.
So we'll stick with that for now.
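For what it's worth, the two restrictions are easy to check once the markup is in tree form. The sketch below assumes a made-up representation in which a region is a `(kind, children)` tuple and plain text is a string. The kind names are mine, not anything from an existing parser.

```python
# Region kinds that may not contain any nested markup (restriction 1).
LEAF_KINDS = {"literal", "inline", "url"}


def check_nesting(node, ancestors=()):
    """Return True if the region tree obeys both nesting restrictions."""
    kind, children = node
    # Restriction 2: nothing nests within itself, even with
    # intervening levels.
    if kind in ancestors:
        return False
    for child in children:
        if isinstance(child, str):
            continue  # plain text is always allowed
        # Restriction 1: literal, inline, and url regions hold text only.
        if kind in LEAF_KINDS:
            return False
        if not check_nesting(child, ancestors + (kind,)):
            return False
    return True
```

For example, an emph region containing a strong region containing another emph region is rejected, while emph containing strong is accepted.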
> > Also, spaces must come between * and ** delimiters, so you
> > can't say ***this***.
>
> Ah, but there's no reason you shouldn't be able to *say **this***, for
> instance (it's quite unambiguous).
But I thought that regions had to be ended by valid punctuation or
space? Does '*' count as valid punctuation, then? (Of course,
I expect your regexps to change significantly when you try to do
nesting...)
But from a more abstract point of view, I think that '***' will end
up being too confusing. I don't think it's unreasonable to require
that people *say **this** *. At the very least, it seems much
easier to read (for those who aren't intimately familiar with ST,
i.e., our entire user base :) )
But I guess that if we are to allow it, I think '***' should only
be allowed to mean "close both strong and emph" or "open both
strong and emph". So you shouldn't be able to say::

    *Too***confusing**

to mean::

    *Too* **confusing**
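If we did allow it, the rule could be stated as a small function of which regions are currently open. This is only a sketch of the rule as I have described it; the function name and the region names are invented for illustration.

```python
def interpret_triple(open_regions):
    """Decide what a '***' delimiter means, given the open regions.

    Allowed meanings: open both strong and emph, or close both.
    Anything else (closing one and opening the other) is an error.
    """
    has_emph = "emph" in open_regions
    has_strong = "strong" in open_regions
    if has_emph and has_strong:
        return "close both"
    if not has_emph and not has_strong:
        return "open both"
    raise ValueError("'***' may not close one region and open another")
```

So in `*say **this***` the final `***` closes both regions, while in `*Too***confusing**` only emph is open at the `***`, and the function raises an error.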
But just to be clear, I don't think we should allow it at all. :)
-Edward