[Python-3000] string module trimming
Jim Jewett
jimjjewett at gmail.com
Wed Apr 18 00:55:46 CEST 2007
On 4/17/07, Christian Heimes <lists at cheimes.de> wrote:
> Neal Norwitz schrieb:
> > I don't have any plans, just considering options. Move them
> > somewhere? Perhaps, trim the ones that are unused. In a unicode
> > world, I'm not sure how much some of these make sense. letters stands
> > out more than others. I don't know enough about unicode to know if
> > digits or whitespace can be different.
Unicode adds several characters to both sets, and there are plenty of
reasons that a given program might want to use a restricted set.
(Probably the sets already in string, or else a letters grouping
determined by the locale.)
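
To make the "additional characters" point concrete, here is a minimal
sketch, assuming a current Python 3 interpreter where str is Unicode.
The two sample characters are chosen purely for illustration.

```python
import string

# ARABIC-INDIC DIGIT THREE: a decimal digit in Unicode,
# but not a member of the ASCII-only string.digits.
arabic_three = "\u0663"
# NO-BREAK SPACE: whitespace according to str.isspace(),
# but not a member of string.whitespace.
nbsp = "\u00a0"

in_ascii_digits = arabic_three in string.digits
is_unicode_digit = arabic_three.isdigit()
in_ascii_space = nbsp in string.whitespace
is_unicode_space = nbsp.isspace()

print(in_ascii_digits, is_unicode_digit)
print(in_ascii_space, is_unicode_space)
```

A program that checks membership in string.digits therefore gets the
restricted set, while one that calls str.isdigit() gets the wider one.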
> What do you think about replacing the definitions with information from
> the Unicode character properties database? The information is available
> somewhere in Python:
> http://docs.python.org/lib/re-syntax.html
> \w ... With LOCALE, it will match the set [0-9_] plus whatever
> characters are defined as alphanumeric for the current locale. If
> UNICODE is set, this will match the characters [0-9_] plus whatever is
> classified as alphanumeric in the Unicode character properties database.
There are reasons to want exactly ASCII.
There are also reasons to want only "local" letters. For example, in
a French interface, I might want to include the extra French letters,
but not the Greek.
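
A rough sketch of both cases, assuming a current Python 3 where str
patterns are Unicode-aware by default and re.ASCII narrows \w. The
FRENCH_EXTRAS list and is_french_word helper are hypothetical, hand-picked
for illustration (and deliberately incomplete), not anything in the
standard library.

```python
import re
import string

# "café αβγ abc_123", written with escapes to avoid encoding surprises.
text = "caf\u00e9 \u03b1\u03b2\u03b3 abc_123"

# Default for str patterns: \w also matches é and the Greek letters.
unicode_words = re.findall(r"\w+", text)
# re.ASCII restricts \w to [a-zA-Z0-9_].
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

# Hypothetical "local" set: à â ç é è ê ë î ï ô ù û, plus upper case.
FRENCH_EXTRAS = ("\u00e0\u00e2\u00e7\u00e9\u00e8\u00ea"
                 "\u00eb\u00ee\u00ef\u00f4\u00f9\u00fb")
FRENCH_LETTERS = frozenset(
    string.ascii_letters + FRENCH_EXTRAS + FRENCH_EXTRAS.upper()
)

def is_french_word(word):
    """True if every character is in the illustrative French set."""
    return bool(word) and all(ch in FRENCH_LETTERS for ch in word)

print(unicode_words)
print(ascii_words)
print(is_french_word("caf\u00e9"), is_french_word("\u03b1\u03b2\u03b3"))
```

Neither the ASCII-only nor the full-Unicode definition of \w gives the
"French but not Greek" behaviour; that needs an explicit set like the
one above, or one derived from the locale.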
Also note that regexes aren't the only use for those letter groupings.
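
For instance, plain membership tests need no regex at all. A small
sketch, assuming Python 3 (where the constant is string.ascii_letters);
keep_letters is a hypothetical helper name.

```python
import string

ASCII_LETTERS = frozenset(string.ascii_letters)

def keep_letters(text, allowed=ASCII_LETTERS):
    """Return text with every character outside `allowed` dropped."""
    return "".join(ch for ch in text if ch in allowed)

cleaned = keep_letters("R2-D2 & C-3PO")
print(cleaned)
```

Passing a different `allowed` set swaps in a locale-specific or
Unicode-derived grouping without touching the filtering code.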
-jJ