[Doc-SIG] Tokens for labels & endnotes

Tony J Ibbs (Tibs) tony@lsl.co.uk
Thu, 22 Mar 2001 10:54:00 -0000


> > I'm assuming we're talking about paragraph labels.
> Actually, I think we were talking about [endnotes].  But the same
> questions apply to labels..

Erm, maybe (sorry I lost the thread)

> > I think we should just go with the English definition of a word,
> > which means [-A-Za-z], and leave it at that. It is *meant* to look
> > like a word.
>
> Is that too anglo-centric?

Yes. And it will need to be fixed, but not in the first release.

(this is a general point about docutils, and at the moment STpy as well,
and I think it needs more input from other people at a later stage)

> It might be that underlines and digits are more applicable for
> endnotes.  Some people might like this [1] or this [noam_chomsky97].

For labels I want to exclude '-_', but yes, for endnotes I want to
include them.
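
As a rough sketch of that distinction (the pattern names and the exact
character sets are my assumptions for illustration, not what docutils or
STpy actually use), a label could be limited to ASCII letters while an
endnote name also allows digits, '-' and '_':

```python
import re

# Hypothetical patterns for illustration only; not taken from docutils or STpy.
LABEL_RE = re.compile(r"[A-Za-z]+\Z")               # a label: ASCII letters only
ENDNOTE_RE = re.compile(r"\[([-A-Za-z0-9_]+)\]")    # an endnote: [1], [noam_chomsky97]


def is_label(text):
    """Return True if the whole string looks like a plain-word label."""
    return LABEL_RE.match(text) is not None


def find_endnotes(text):
    """Return the names of all [endnote] references found in text."""
    return ENDNOTE_RE.findall(text)
```

With these patterns, "Arguments" is accepted as a label and
"noam_chomsky97" is not, while both [1] and [noam_chomsky97] are picked
up as endnotes.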

> If LOCALE and UNICODE flags aren't used when compiling a regexp,
> \w = [a-zA-Z0-9_] (at least according to "the python library
> reference manual for re":<http://www.python.org/doc/current/lib/re-syntax.html>).
> Furthermore, it will always match '_', regardless of LOCALE and
> UNICODE (again, according to the ref. manual).

My rather desperate hope (not having read the RE section in the new 2.1
manuals yet) is that using REs will give good leverage on the
anglo-centricity problem mentioned earlier in this email, at which point
it *does* become useful to use '\w'.
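
A minimal sketch of what '\w' buys you, written against a current
Python 3 `re` module rather than the 2.1 one discussed here. In Python 3
the defaults are reversed for str patterns: '\w' is Unicode-aware unless
`re.ASCII` is given, which restores [a-zA-Z0-9_]. The variable names and
the sample string are made up for illustration:

```python
import re

# ASCII-only \w, equivalent to [a-zA-Z0-9_]
ascii_word = re.compile(r"\w+", re.ASCII)

# Unicode-aware \w (the default for str patterns in Python 3)
unicode_word = re.compile(r"\w+")

# "cafe" with a precomposed e-acute, followed by an underscore and a digit
sample = "caf\u00e9_1"

ascii_matches = ascii_word.findall(sample)      # splits at the accented letter
unicode_matches = unicode_word.findall(sample)  # keeps the word whole

# '_' matches \w whichever flag is in force
underscore_ascii = re.fullmatch(r"\w", "_", re.ASCII) is not None
underscore_unicode = re.fullmatch(r"\w", "_") is not None
```

The ASCII pattern breaks the sample at the accented letter, which is the
anglo-centric behaviour in miniature; the Unicode-aware pattern treats
it as a single word.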

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)