Question about metacharacter '*'

Ian Kelly ian.g.kelly at gmail.com
Mon Jul 7 19:05:44 CEST 2014


On Sun, Jul 6, 2014 at 4:49 PM, MRAB <python at mrabarnett.plus.com> wrote:
> \d also matches more than just [0-9] in Unicode.

I think that anything matched by \d will also be accepted by int().

>>> import re, unicodedata
>>> decimals = [c for c in (chr(i) for i in range(17 * 2**16)) if unicodedata.category(c) == 'Nd']
>>> len(decimals)
460
>>> re.match(r'\d*', ''.join(decimals)).span()
(0, 460)
>>> int(''.join(decimals))
123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
>>> nondecimals = [c for c in (chr(i) for i in range(17 * 2**16)) if unicodedata.category(c) in ('No', 'Nl')]
>>> len(nondecimals)
688
>>> re.findall(r'\d', ''.join(nondecimals))
[]
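
The session above checks whole Unicode categories at once, and the counts it prints depend on the Unicode database shipped with the interpreter. A minimal sketch of a more direct check follows: it collects every single code point that \d matches and tries int() on each one. The names digit_re, matched and rejected are illustrative only. It covers one-character strings, so it says nothing about the empty string, which \d* matches but int() rejects.

```python
import re
import sys

# Compile once; for str patterns \d is Unicode-aware by default in Python 3.
digit_re = re.compile(r'\d')

# Every code point that \d matches as a single character.
matched = [chr(i) for i in range(sys.maxunicode + 1)
           if digit_re.fullmatch(chr(i))]

# Any matched character that int() refuses would be a counterexample.
rejected = []
for c in matched:
    try:
        int(c)
    except ValueError:
        rejected.append(c)

print(len(matched), 'characters match \\d;', len(rejected), 'rejected by int()')
```

If rejected comes back empty, every character that \d matches is also a valid digit for int() on that interpreter.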


