[Doc-SIG] New document - pytext-fat

Guido van Rossum guido@digicool.com
Fri, 30 Mar 2001 13:16:16 -0500


> > - Using --- for descriptive list is no better than --.  (Note that
> >   there's a typo in the example -- the second example uses '--'
> >   instead of '---'.)  If you *have* to have descriptive lists, try
> >   doing something creative with input that already looks the way you
> >   propose that descriptive lists be rendered.
> 
> No, you're missing the point again.
> 
> *In the docstring, the text should read like normal text*.

That's what I was proposing!  OK, concretely, I was proposing (but
only half seriously) to recognize the following:

    Breadcrumbs
        Those little things that fall out of the loaf when you cut it
        in half.

    Wastebasket
        Where you dump the bread that you can't eat this week, or that
        appears like it's been eaten by green fungus.

as a descriptive list.
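
A minimal Python sketch of how such a layout might be recognized,
assuming the rule is "a single term line followed by lines that are
all indented more deeply, with blank lines between items".  The
function name find_description_items is hypothetical, and the rule
itself is only one possible reading of the proposal:

```python
import re

def find_description_items(text):
    """Return (term, description) pairs for term-then-indented-body paragraphs."""
    items = []
    # Paragraphs are separated by blank lines.
    for para in re.split(r"\n\s*\n", text.strip("\n")):
        lines = para.splitlines()
        if len(lines) < 2:
            continue  # a lone line has no body, so it is not an item
        term, body = lines[0], lines[1:]
        term_indent = len(term) - len(term.lstrip())
        # Every body line must be indented deeper than the term line.
        if all(len(l) - len(l.lstrip()) > term_indent for l in body):
            description = " ".join(l.strip() for l in body)
            items.append((term.strip(), description))
    return items

sample = (
    "Breadcrumbs\n"
    "    Those little things that fall out of the loaf when you cut it\n"
    "    in half.\n"
    "\n"
    "Wastebasket\n"
    "    Where you dump the bread that you can't eat.\n"
)
items = find_description_items(sample)
```

Note that this rule would also match an ordinary paragraph whose
continuation lines happen to be indented, so a real processor would
need a stricter test.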

> That's point 1. So use of "--" or "---" or whatever is the natural way
> to do it. This point that the docstring should look like *normal* text
> is *so* important - it's the *really* important idea behind ST, not all
> the cruft about particular implementations.
> 
> *In the formatted result* I don't know *how* descriptive lists will
> look - that depends so much on the formatting mechanism (HTML, XHTML, an
> SGML thingy, TeX, LaTeX, texinfo, PDF, uncle tom cobbly and all) that
> there's no way I can (or should) mandate that.

If there's too much variation in what they will look like, I believe
that you're doing the author a disservice -- authors tend to spend some
time debugging the output for the processor they are familiar with,
but you can't expect them to attempt to preview it using all possible
processors.  Since authors care about the final looks, we shouldn't
give them a reason to worry about surprises with unfamiliar
processors.  (Taken to the extreme, this leads to WYSIWYG -- which
is not a bad principle, but unfortunately unobtainable in the given
context for now.)

> Internal paragraph indentation is a difficult one to decide about. For
> simplicity, ignoring it is the best approach, because it makes lists and
> things easier (I don't believe that *most* people want to write::
> 
> 	1. This is some
>       text - isn't that ugly
> 
> and if they have to indent there... (I don't, though, think we *should*
> require them to have to)).

Anything that makes the source more readable is fair game to me.  This
is one of the guiding principles!

> If you mean the::
> 
> 	1.
> 
>          and this is in the list
> 
> type of example, it is there (so far as I am concerned) because text
> needs formatting before it is finished, and thus allowing the "empty
> bits I've to write yet" is good practice.

So indicate it with XXX or something like that.  If it looks this
funny in the input, the author probably intended something specific,
so we should attempt to make it look the same in the output.
Principle of least surprise.

> > - Having to work around auto-detection of numbered lists is my #1 ST
> >   pet peeve.  I know that part of that's a ST bug -- but I still
> >   believe ordered lists are not sufficiently important to warrant the
> >   pain they occasionally cause.
> 
> Then I think we may have a fairly fundamental problem/disagreement.
> Although the *only* problem I can see is that if one has:

I'll hold off on this one until I see your final rules.  I'd like to
see what you recommend to work around this -- please don't tell me to
change my sentences!

> >   At the very least you should require that the rest of the input is
> >   neatly formatted the way one would format an ordered list in a plain
> >   text document.
> 
> That *who* would format it neatly? Using whose convention? I don't like
> trying to impose my own formatting conventions on people's use of text
> (and I think I would lose).

I think you've seen the light here given your latest post.  We can
enforce any convention we like as long as it reads well.

> Ah - sorry. The point is that the formatter needs to decide that a
> bullet item followed by a number item is actually two lists, not one.

But this would be silly input.  Why would you place a bulleted list
and a numbered list adjacent without some text in between?  I say a
list is a list and if you want to mix item styles (maybe for a joke)
that doesn't make it two lists.

> > - About dedented paragraphs after indented sections: you can't really
> >   express in regular text that a plain paragraph is not part of the
> >   previous section unless you insert a heading.  Maybe a better
> >   alternative (again using the rule that we should never ignore the
> >   whitespace clues in the source!) would be to simply indent indented
> >   headings and paragraphs, a la <blockquote>.
> 
> Are we misunderstanding each other?
> 
> One of the problems, for many people, with ST is the need to indent
> sub-sections. I agree that this can be a pain. I deliberately dropped it
> for fat.html, as a requirement, but it is still perfectly legitimate to
> indent the text if you want, and as (I thought it said) then dedenting
> will end an indented section.

That only has meaning in the DOM tree, not in the rendition on paper
or screen.  I see no use for it.  Since indented paragraphs *do* have
a purpose in text documents (e.g. some styles use it for quotations),
I say that an indented paragraph should be rendered as an indented
paragraph!  This would seem to fit all the other guidelines that we're
trying to apply.

> Remember that we are aiming at docstrings, where the number of headings
> (especially given label blocks) is likely to be low.

Yes.  Another issue is that the entire docstring (except for its first
line) is likely to be indented; this should be accounted for first.
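
As an illustration of accounting for that indentation first, here is a
small Python sketch using inspect.cleandoc from the standard library,
which strips the margin shared by all lines after the first.  The
sample text is made up, and whether a docstring processor should use
exactly this normalization is an assumption:

```python
import inspect

# A docstring as it might appear in source: the first line starts right
# after the opening quotes, the rest carries the code's indentation.
raw = (
    "Summary line.\n"
    "\n"
    "    Body paragraph at the docstring's base indentation.\n"
    "\n"
    "        An indented paragraph, e.g. a quotation.\n"
    "    "
)

# cleandoc removes the common margin of lines 2..n and trims blank
# lines at both ends; indentation relative to that margin survives.
cleaned = inspect.cleandoc(raw)
lines = cleaned.splitlines()
```

After this step the body paragraph starts in column 0, while the
quotation keeps its four extra spaces, so any meaning attached to
indentation is measured relative to the docstring itself.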

> Last time round the Doc-SIG loop, there were a couple of requests that
> tied together.
> 
> Several concepts like "Author" and "Arguments" came up, and there was a
> wish to generalise these, partly because we couldn't predict all of
> them. Some of them admit of having their information on one line, and
> it's nice to be able to control that. All of them contain information
> one might imagine a tool wanting to extract from the document,
> "standard" parts of text (cf the non-HTML in javadoc).
> 
> Some people wanted to be able to have extensibility built in - arbitrary
> additional tags. They were made happy because the label (tag as it was
> then) can map easily to an XML tag.

Hm.  All this flexibility seems to go against the idea of *simple*
markup for docstrings.  We can't be feature-happy!

> They also *look* like what people put in docstrings anyway - so we might
> as well gain leverage off that.

I've never put a [label] in a docstring.

> A literal block is a single block. It cannot have children (by
> definition - they would be part of it, since their indentation is less
> than that of their parent block).
> 
> I'll need to look at the explanation again - it's probably too close to
> implementation-speak.

I propose a different concept: there should be no concept of
paragraphs or blocks being children of other paragraphs.  Instead, a
(sub)section contains a sequence of paragraphs and blocks, some of
which are more indented than others.
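
A rough Python sketch of that flat model, assuming blocks are
separated by blank lines and that each block simply records its own
indentation.  Block and split_blocks are hypothetical names used only
for illustration:

```python
import re
from typing import List, NamedTuple

class Block(NamedTuple):
    indent: int  # columns of leading whitespace shared by the block's lines
    text: str    # block text with that shared indentation removed

def split_blocks(section: str) -> List[Block]:
    """Split a section into a flat sequence of blocks; no block has children."""
    blocks = []
    for chunk in re.split(r"\n[ \t]*\n", section.expandtabs().strip("\n")):
        lines = [l for l in chunk.splitlines() if l.strip()]
        if not lines:
            continue
        indent = min(len(l) - len(l.lstrip()) for l in lines)
        text = "\n".join(l[indent:].rstrip() for l in lines)
        blocks.append(Block(indent, text))
    return blocks

section = (
    "Intro paragraph.\n"
    "\n"
    "    A quoted paragraph,\n"
    "    two lines long.\n"
    "\n"
    "Back at the margin.\n"
)
blocks = split_blocks(section)
```

A renderer working from this list can indent each block according to
its indent value without ever asking which block is its parent.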

[On Python literals]
> There was a *long* argument on this last time round the Doc-SIG.
> Everyone agreed it was ugly, but that was not the main reason for
> adopting it. For this one, can I please ask that you look back in the
> list?

Then please give me a URL of a thread to start.  I can't very well go
searching the entire archive looking for "argument", can I? :-)

> > - URL recognition: you know my position. :-)
> 
> I am sort of happy with ad-hoc recognition, but it really does give
> problems with trailing punctuation (*not* just fullstops), and going for
> ad-hoc recognition seems to me to be at odds with the "purity" of your
> approach on some other items...

Read the code in the FAQ wizard (Tools/faqwiz/ in the Python
distribution).  It deals with much more than full stops, and works very
well in practice -- not just "well enough".
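
For readers without the source at hand, here is a small Python sketch
of the general idea of ad-hoc URL recognition with trailing
punctuation stripped.  This is not the FAQ wizard's code; the pattern,
the set of trailing characters, and the name find_urls are all
assumptions made for illustration:

```python
import re

# Hypothetical pattern: a scheme, "://", then a run of characters that
# are neither whitespace nor angle brackets nor a double quote.
URL_RE = re.compile(r"(?:https?|ftp)://[^\s<>\"]+")

# Punctuation that more likely belongs to the sentence than to the URL.
TRAILING = ".,;:!?)'\""

def find_urls(text):
    """Return URLs found in text, with trailing punctuation removed."""
    urls = []
    for match in URL_RE.finditer(text):
        urls.append(match.group(0).rstrip(TRAILING))
    return urls

found = find_urls(
    "See http://www.python.org/~guido/), or ftp://ftp.python.org/pub/."
)
```

The trade-off is that a URL which really does end in one of those
characters, such as a closing parenthesis, would be truncated.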

> > Hope this helps,
> 
> Despite my gruntles, opinions are useful. It would just be nice if they
> didn't all come at once, and didn't require *me* to stand up for debates
> that happened long ago and I only half remember.

The PEP process requires you to summarize past debates for this
reason.  Too bad we didn't have a PEP process way back when this was
first being discussed...

--Guido van Rossum (home page: http://www.python.org/~guido/)