[Tutor] three numbers for one

eryksun eryksun at gmail.com
Mon Jun 10 05:40:53 CEST 2013


On Sun, Jun 9, 2013 at 11:22 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On 10/06/13 00:26, Oscar Benjamin wrote:
>
>> >>> def is_ascii_digit(string):
>> ...     return not (set(string) - set('0123456789'))
>
> That's buggy, because it claims that '' is an ascii digit.
>
> This is likely to be quicker, if not for small strings at least for large
> strings:
>
> def is_ascii_digit(string):
>     return string and all(c in set('0123456789') for c in string)

Or use a regex with the category \d and the re.ASCII flag:

http://docs.python.org/3/library/re#re.A

Its match() method should be several times faster than iterating the
string with a generator expression.

CPython implementation (lookup table, bit fields):
http://hg.python.org/cpython/file/3.3/Modules/_sre.c#l108
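
A minimal sketch of the regex approach, reusing the is_ascii_digit name from the quoted code. The module-level name _ASCII_DIGITS is my own choice, and I haven't timed it here. The pattern ends with \Z rather than $ because $ would also accept a trailing newline:

```python
import re

# Precompile once. With re.ASCII, \d matches only [0-9] in a str pattern.
# \Z anchors at the very end of the string ($ would allow a trailing '\n').
_ASCII_DIGITS = re.compile(r'\d+\Z', re.ASCII)

def is_ascii_digit(string):
    """Return True if string is non-empty and consists only of 0-9."""
    return _ASCII_DIGITS.match(string) is not None
```

The + quantifier means the empty string is rejected, and the function always returns a bool.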


If you have a bytes/bytearray object, use isdigit():

    >>> '1234\u06f0'.isdecimal()  # Unicode decimal
    True

    >>> '1234\u06f0'.encode().isdigit()  # ASCII digits
    False

CPython 3 (and 2.x bytearray) uses a lookup table for this, declared in
pyctype.h and defined in pyctype.c:

http://hg.python.org/cpython/file/3.3/Python/pyctype.c
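
To illustrate the lookup-table idea only, here is a rough pure-Python sketch. The names FLAG_DIGIT, CTYPE_TABLE and bytes_isdigit, and the flag value, are made up for this example. The real table is C code and this is not a transcription of it:

```python
# Hypothetical flag bit; the value is illustrative, not CPython's.
FLAG_DIGIT = 0x04

# One entry per possible byte value (0-255). Only b'0'..b'9' get the flag.
CTYPE_TABLE = [0] * 256
for code in b'0123456789':
    CTYPE_TABLE[code] |= FLAG_DIGIT

def bytes_isdigit(data):
    """Mimic bytes.isdigit(): non-empty and every byte is an ASCII digit."""
    # Iterating a bytes/bytearray object in Python 3 yields ints, which
    # index straight into the table; the test is a single bitwise AND.
    return len(data) > 0 and all(CTYPE_TABLE[b] & FLAG_DIGIT for b in data)
```

Each byte costs one index and one mask, with no Unicode database lookup, which is why the bytes method only ever recognises ASCII digits.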

