[Doc-SIG] Re: Ho hum - back to work...

Edward D. Loper edloper@gradient.cis.upenn.edu
Thu, 19 Apr 2001 23:06:15 EDT


> Well, I recovered from my flu (eventually) and am now back to
> "normal".

That's good to hear.  I was beginning to worry that you didn't like us
anymore. :)

> Anyway, to the point. I'm taking tomorrow (and maybe a day next
> week) off to do *some* work for the effort. It's a bit short notice
> to ask this, but given all the work that Edward and David are doing
> (I don't necessarily *agree* with them, but that's another matter),
> I figured I'd seek an opinion on how my time might best be spent.

It does seem like it would be nice to have a parser with which we can
try a number of different rules.  And since you've already spent a
fair amount of time on that, that seems like a reasonable thing to
work on.

> 2. Work on the Doc-SIG archives, to try to produce summaries of the
> arguments from its lifetime. Note that (technically) we may need
> this for any PEPs we produce! (and it would clearly be useful to be
> able to *point* to who said what and why, given the history of the
> group).

I tried to do this a few weeks back (including copious pointers to
individual articles), but gave up because I don't have *that* much
free time. :) But it would be *really* useful to have, I think, and if
you're more familiar with the archives, then maybe it wouldn't take as
long.  At least getting a start on it would be nice.

Overall, I'd say to work on docutils/fat.py, but mainly because you've
already invested a fair amount of work in it.  Maybe we can convince
someone else to do the doc-sig summary stuff?  :)  

> I still don't understand why Edward (and Guido, although I think
> he's less likely to answer!) object to "simple" markup like ST and
> relatives use - why they consider it a Bad Thing to (a) use
> punctuation characters for markup, and (b) use them in a context
> dependent manner.

I actually don't object to either (a) or (b), strictly speaking.  What
I object to is markup that I think will be "unsafe."  For example, I
have no problem with using *one* *word* *emph*, or saying that
backticks around any valid Python identifier mark it as a Python
object.  My biggest pet peeve about ST-like markup is having markup
that is context-dependent with a basically unbounded context.  For
example, consider a rule where "*" starts an emph region only if
there's another "*" somewhere later in the string, and is otherwise a
plain asterisk.  This seems very dangerous to me.  I want to be able
to tell (under most circumstances), by looking at a character and its
immediate context, whether it's markup or not.  So, as long as we keep
our contexts relatively small, I don't object to context-dependent
markup.  (In
fact, both bullets and "::" are definitely context-sensitive markup,
and I think they're very intuitive.)
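
To make the "small context" idea concrete, here is a minimal sketch
of a one-word emph rule that can be decided by looking only at the
characters right next to each "*".  The function name and the exact
regular expression are illustrative assumptions, not part of any
proposal on this list.

```python
import re

# Assumed rule (illustrative only): "*" is markup when it directly wraps
# a single word, and the pair is not glued to another word or "*".
# Deciding this never requires scanning the rest of the string.
ONE_WORD_EMPH = re.compile(r"(?<![\w*])\*(\w+)\*(?![\w*])")

def find_emph(text):
    """Return the words marked as one-word emph in text."""
    return ONE_WORD_EMPH.findall(text)

# Each marked word is found independently of the others.
print(find_emph("use *one* *word* *emph* here"))

# A "*" surrounded by spaces or stuck between letters stays a literal
# asterisk, even though another "*" appears later in the string.
print(find_emph("compute 2 * 3 * 4 or a*b*c"))
```

Under this rule a stray asterisk can never change the meaning of text
far away from it, which is the safety property described above.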

As for using punctuation characters, that's fine (what else would you
use??), but if possible we should try to keep the need for escaping to
a minimum, because escaping will be ugly and non-intuitive, no matter
how we do it.  So we should try to keep the number of punctuation
characters we use to a minimum.

> (As a subpoint, I don't *quite* understand why Edward wants to
> separate structuring and colourising so much - this seems to me to
> be implementation detail (for this purpose, I consider the EBNF to
> be "implementation" as well) - real people don't have trouble with
> fuzzy distinctions about such things.)

There are really 3 reasons:

  1. A general divide-and-conquer approach to the problem of coming
     up with a markup language.  I'm more confident that we'll be able
     to come to consensus on smaller issues/domains than larger ones.
     This reason has nothing to do with the final markup language, and
     everything to do with how we get there.

  2. A side-effect of dividing structuring and colorizing is 
     eliminating a number of issues, such as how to tell whether
     a line in a paragraph starting with "1." is a bullet or a
     continuation of the previous line.

  3. I think that the markup language will be easier to understand
     if colorizing and structuring don't interact much.  

My original reasons were (1) and (3).  (2) was something that happily
fell out.
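
One possible reading of the separation can be sketched as two
independent passes.  This is only a sketch under stated assumptions:
blocks are separated by blank lines, a block is a list item only if
its *first* line starts with a number and a period, and the only
colorizing rule is one-word emph.  All function names are
hypothetical.

```python
import re

def split_blocks(text):
    """Structuring pass: find blocks without looking at inline markup.

    Assumption: blank lines separate blocks, and only the first line of
    a block decides whether it is a list item or a paragraph.
    """
    blocks = []
    for chunk in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in chunk.splitlines()]
        kind = "item" if re.match(r"\d+\.\s", lines[0]) else "para"
        blocks.append((kind, " ".join(lines)))
    return blocks

def colorize(body):
    """Colorizing pass: mark one-word *emph* inside a finished block."""
    return re.sub(r"(?<![\w*])\*(\w+)\*(?![\w*])", r"<em>\1</em>", body)

def render(text):
    """Run the two passes in order; neither needs to know about the other."""
    return [(kind, colorize(body)) for kind, body in split_blocks(text)]

sample = "The answer is\n1. That is *all* there is.\n\n1. A real item"
for kind, body in render(sample):
    print(kind, "|", body)
```

In the sample, the "1." that begins the second line of the paragraph
is treated as a continuation, because the structuring pass only looks
at the start of a block; the "1." after the blank line is a list item.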

> B. Reasons to be doing this
[Summarized:]
  - NOT: we don't need to invent a markup language
  - DOC: we want to be more expressive in our docstrings
  - REP: we want to be smarter about displaying docstrings
  - STRUC: we want to be able to do smart things with our
           docstrings (other than displaying them).

I would say that for me, your REP and DOC would be my most important
reasons for this work, probably in that order.  The reason that I put
REP above DOC is because I think that the need for standardization is
much less for DOC than it is for REP.

> I'm not sure I actually believe that we're going to get a lot from
> STRUC

One thing we get is the ability to check for certain completeness
criteria in our documentation, e.g., did I specify a return
value/type for everything that returns something?  Did I describe
every parameter?
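
A check of this kind could be sketched roughly as follows.  This
assumes a modern Python with inspect.signature, treats a parameter as
"described" if its name appears anywhere in the docstring, and only
expects a return description when the function carries a return
annotation.  These are simplifying assumptions for illustration; a
real tool would rely on whatever structured markup we settle on.

```python
import inspect
import re

def missing_docs(func):
    """Return names that the docstring of func never mentions.

    Parameters count as documented if their name appears as a whole
    word in the docstring.  "return" is reported when func has a return
    annotation but the docstring never says "return" or "returns".
    """
    doc = inspect.getdoc(func) or ""
    sig = inspect.signature(func)
    missing = [name for name in sig.parameters
               if not re.search(r"\b%s\b" % re.escape(name), doc)]
    has_return = sig.return_annotation not in (inspect.Signature.empty, None)
    if has_return and not re.search(r"\breturns?\b", doc, re.IGNORECASE):
        missing.append("return")
    return missing

# Hypothetical function with an incomplete docstring.
def scale(value, factor) -> float:
    """Multiply value by a constant."""
    return value * factor

print(missing_docs(scale))
```

With purely free-form docstrings this can only be a heuristic, which
is part of the argument for giving fields like parameters and return
values an explicit structure.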

-Edward