[Doc-SIG] suggestions for a PEP
Tony J Ibbs (Tibs)
tony@lsl.co.uk
Wed, 21 Mar 2001 10:32:36 -0000
> >> Have we considered the classic spec for labels to appear left of a
> >> colon, namely RFC 822 (e-mail headers) and its kin? I think that
> >> basically comes down to r'\w+(-\w+)*' as regex, generally specified
> >> [...]
>
> Fine with me.
I'm assuming we're talking about paragraph labels.
I think we should just go with the English definition of a word, which
means [-A-Za-z], and leave it at that. It is *meant* to look like a
word.
Just because there is a colon there doesn't mean it is related to other
fields that happen to end with a colon.
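As a rough illustration of the "letters and hyphens only" rule, a validation check might look like the sketch below. The names `LABEL_RE` and `is_valid_label` are made up for this example and are not part of any existing tool; the only thing taken from the discussion is the character class `[-A-Za-z]`.

```python
import re

# Assumption: a label is one or more ASCII letters or hyphens and nothing else.
# LABEL_RE and is_valid_label are hypothetical names used only in this sketch.
LABEL_RE = re.compile(r"[-A-Za-z]+")


def is_valid_label(text):
    """Return True if text consists only of ASCII letters and hyphens."""
    return LABEL_RE.fullmatch(text) is not None


print(is_valid_label("Post-History"))  # letters and a hyphen
print(is_valid_label("my_label"))      # contains an underscore
print(is_valid_label("Version2"))      # contains a digit
```

Note that this pattern would also accept a lone hyphen; a stricter check could require a letter at each end, but that goes beyond what is proposed here.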
The current default labels are::
    label_dict = {"Arguments": "arguments",
                  "Author": "author",
                  "Authors": "author",
                  "Dedication": "dedication",
                  "History": "history",
                  "Raises": "raises",
                  "References": "references",
                  "Returns": "returns",
                  "Version": "version",
                  }
If one is translating PEPs (in a slightly modified format), then one
would instead use::
    builder.label_dict = {"PEP": "pep",
                          "Title": "title",
                          "Version": "version",
                          "Author": "author",
                          "Status": "status",
                          "Type": "type",
                          "Created": "created",
                          "Post-History": "post-history",
                          "Discussions-To": "discussions-to",
                          }
I think "keep it simple" is required here - these labels are meant to be
few and simple, so English words seem sensible to me. I would thus vote
against underlines and against digits.
Also, validation aside, I don't *use* a regular expression - I look for
the right "shape" of paragraph (1 line, colon in it) and check what is
to the left of the colon against the dictionary. From *my* point of view
the legitimate characters idea only comes in with a validation phase (of
course, it would be different for Edward).
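The "shape" check described above (one line, a colon in it, then a dictionary lookup on whatever is left of the colon) could be sketched roughly as follows. This is not the actual code; `find_label` and `SAMPLE_LABELS` are hypothetical names, and the dictionary is just a subset of the default labels listed earlier.

```python
# A subset of the default labels, copied here so the sketch is self-contained.
# SAMPLE_LABELS and find_label are hypothetical names for illustration only.
SAMPLE_LABELS = {"Author": "author",
                 "Authors": "author",
                 "Returns": "returns",
                 "Version": "version",
                 }


def find_label(paragraph, label_dict):
    """Return the canonical label for a one-line 'Label: text' paragraph.

    Returns None if the paragraph has the wrong shape or the text left of
    the first colon is not a known label.
    """
    lines = paragraph.strip().splitlines()
    # Wrong shape: not exactly one line, or no colon in that line.
    if len(lines) != 1 or ":" not in lines[0]:
        return None
    # Everything to the left of the first colon is the candidate label.
    candidate = lines[0].split(":", 1)[0].strip()
    return label_dict.get(candidate)


print(find_label("Authors: Tibs and others", SAMPLE_LABELS))
print(find_label("Note: this is an ordinary sentence", SAMPLE_LABELS))
```

With this approach no regular expression is involved at all: an unknown word before a colon simply fails the lookup, so a character-set rule is only needed if labels are validated separately.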
> Basically re defines '\w' = '[0-9a-zA-Z_]'
Erm - basically it doesn't - it invokes "locales" which makes life more
complex (and I have no idea what sre does about '\w').
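To make the point concrete: the sketch below shows how '\w' behaves in a current Python 3 `re` module, which is an assumption about the reader's environment and not a statement about what `re` or `sre` did at the time of writing. For `str` patterns, '\w' there follows Unicode rules unless the `re.ASCII` flag is given, so it is not simply '[0-9a-zA-Z_]'.

```python
import re

word = "caf\u00e9"  # "cafe" with an accented final letter

# Default for str patterns in Python 3: \w uses Unicode word characters.
unicode_match = re.fullmatch(r"\w+", word) is not None

# With re.ASCII, \w is restricted to [a-zA-Z0-9_].
ascii_match = re.fullmatch(r"\w+", word, re.ASCII) is not None

print(unicode_match)  # True
print(ascii_match)    # False
```

An explicit character class such as '[-A-Za-z]' avoids this dependence on flags or locale settings altogether.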