Wishlist: string attributes

Alex Martelli aleax at aleax.it
Wed Mar 26 11:34:42 CET 2003


<posted & mailed>

Raymond Hettinger wrote:

> 
> "Stephen Boulet" <stephen.boulet at motorola.com> wrote in message
> news:3E807948.5040401 at motorola.com...
>> Being a big fan of string methods, here's what I'd like to see for
>> python strings:
>>
>> I can access string methods like (str.lower) but I would like to be
>> able to access all string module attributes without importing the
>> string module (like str.ascii_lowercase, str.whitespace, etc.).
> 
> The plan is to use functions like str.isascii() instead of attributes.

That covers the typical use case of "if c in string.whatever:", and
indeed may substantially improve performance for that use, but it's
no big help for all other uses of having "all the letters" &c available,
such as building tables for the wonderfully fast translate method
of string objects.  Consider a typical case: I need to make a copy
of a string that removes all non-digits (more often the issue will be
"all non-printable characters" or the like, but I'm taking an example
which today offers me both a .isXXX method AND a string.XXX attr).

== rr.py:

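from string import digits, maketrans

# All 256 single-byte characters, plus an identity translation table.
allchars = ''.join(map(chr, range(256)))
notrans = maketrans('', '')
# Delete the digits from allchars, leaving every non-digit character.
nondigits = allchars.translate(notrans, digits)

# rem1: a single translate call that deletes every non-digit.
def rem1(s, notrans=notrans, nondigits=nondigits):
    return s.translate(notrans, nondigits)

# rem2: list comprehension testing each character with .isdigit().
def rem2(s):
    return ''.join([c for c in s if c.isdigit()])

# rem3: filter using the unbound str.isdigit method.
def rem3(s):
    return ''.join(filter(str.isdigit, s))


s = '23skidoo and 148!'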

== end of rr.py

[alex at lancelot alex]$ python timeit.py -s 'import rr' 'rr.rem1(rr.s)'
100000 loops, best of 3: 7.6 usec per loop
[alex at lancelot alex]$ python timeit.py -s 'import rr' 'rr.rem2(rr.s)'
10000 loops, best of 3: 42.6 usec per loop
[alex at lancelot alex]$ python timeit.py -s 'import rr' 'rr.rem3(rr.s)'
10000 loops, best of 3: 34.7 usec per loop
[alex at lancelot alex]$
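
For anyone trying this on Python 3, where string.maketrans is gone and str.translate takes a single mapping, a rough equivalent of rem1 might look like the sketch below. The names delete_nondigits and rem1_py3 are illustrative only, and the table is assumed to cover just the first 256 code points, so any character beyond that range passes through untouched.

```python
from string import digits

# Assumption: Python 3, and input limited to the first 256 code points.
allchars = ''.join(map(chr, range(256)))

# Every character in that range that is not an ASCII digit.
nondigits = ''.join(c for c in allchars if c not in digits)

# The three-argument form of str.maketrans maps each character of its
# third argument to None, which str.translate treats as "delete".
delete_nondigits = str.maketrans('', '', nondigits)


def rem1_py3(s, table=delete_nondigits):
    """Return a copy of s with the non-digits listed in the table removed."""
    return s.translate(table)


s = '23skidoo and 148!'
result = rem1_py3(s)
```

The table is built once at import time, as in rr.py, so each call does a single translate pass over the string. No timings are claimed for this variant.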


So, in practice, I *DO* still need a string of "all letters", "all
printable characters", and the like, for purposes such as this one --
even if I had a complete complement of .isXXX methods too.  I guess
I can always use ''.join(filter(str.isXXX, allchars)), but I'd much
rather have the needed strings ready in the str class, as I now have
them in the string module... is there any technical impediment to
supplying them?
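
The ''.join(filter(str.isXXX, allchars)) workaround can be sketched as follows. This is a minimal illustration, assuming Python 3 and an alphabet restricted to the 128 ASCII characters (with a wider range, methods such as str.isdigit also accept non-ASCII characters). The helper name chars_where is hypothetical, not an existing library function.

```python
import string

# Assumption: only the 128 ASCII characters are of interest.
ascii_chars = ''.join(map(chr, range(128)))


def chars_where(predicate, alphabet=ascii_chars):
    """Return the characters of alphabet for which predicate is true."""
    return ''.join(filter(predicate, alphabet))


# Rebuild some of the string-module constants from the .isXXX methods.
built_digits = chars_where(str.isdigit)
built_lower = chars_where(str.islower)
built_upper = chars_where(str.isupper)
built_space = chars_where(str.isspace)
```

Over this ASCII range the digit and letter strings come out equal to string.digits, string.ascii_lowercase and string.ascii_uppercase. The predicates and the constants need not agree in general, though: str.isspace, for instance, accepts some control characters that string.whitespace does not list, which is one more reason to prefer having the ready-made strings.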


Alex




