[Python-3000] Regular expressions, py3k and unicode

Antoine Pitrou solipsis at pitrou.net
Sun Jun 29 13:36:04 CEST 2008


On Sunday 29 June 2008 at 12:05 +0100, Mark Dickinson wrote:
> Might this have some unintended consequences?  For example, one
> would then get the following undesirable behaviour from the decimal
> module, using inputs with Unicode fullwidth digits.
> 
> >>> Decimal('\uff11')
> Decimal('1')
> >>> Decimal('\uff11') == Decimal('1')
> False

Indeed. On the other hand, it already works properly for ints and floats,
so perhaps Decimal shouldn't reject Unicode digits the way it currently
does:

>>> int('\uff11')
1
>>> int('\uff11') == 1
True
>>> float('\uff11')
1.0
>>> float('\uff11') == 1.0
True
>>> decimal.Decimal('\uff11')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/py_setref/Lib/decimal.py", line 545, in __new__
    "Invalid literal for Decimal: %r" % value)
  File "/home/antoine/py3k/py_setref/Lib/decimal.py", line 3766, in _raise_error
    raise error(explanation)
decimal.InvalidOperation: Invalid literal for Decimal: '1'
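
For illustration, here is a minimal sketch of one way a caller could get
consistent results regardless of how Decimal treats such input: map every
Unicode decimal digit to its ASCII equivalent before constructing the
Decimal. The helper name to_ascii_digits is hypothetical and not part of
the decimal module, and whether Decimal itself accepts these digits may
differ between Python versions.

```python
import unicodedata
from decimal import Decimal


def to_ascii_digits(text):
    """Replace each Unicode decimal digit in text with its ASCII digit.

    Hypothetical helper for illustration; not part of the decimal module.
    """
    out = []
    for ch in text:
        # unicodedata.decimal() returns the digit's integer value, or the
        # supplied default (None here) when ch is not a decimal digit.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


fullwidth_one = "\uff11"  # FULLWIDTH DIGIT ONE

# int() and float() accept the fullwidth digit directly.
assert int(fullwidth_one) == 1
assert float(fullwidth_one) == 1.0

# Normalising first makes the Decimal result independent of how strict
# the Decimal constructor is about non-ASCII digits.
assert Decimal(to_ascii_digits(fullwidth_one)) == Decimal("1")
```

Non-digit characters such as the decimal point or a sign pass through
unchanged, so the helper can be applied to a whole numeric string.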


On a side note, it seems that int objects constructed from strings don't
use the interned small-int constants; I will file a bug for it:

>>> 1+1 is 2
True
>>> int('2') is 2
False
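
As a rough sketch of why this only matters as an optimisation: sharing of
small int objects is an implementation detail of the interpreter, so code
should compare values with == and treat the result of "is" as
informational only. The variable names below are illustrative.

```python
# Two ints with the same value, built in different ways.
from_string = int("2")
from_addition = 1 + 1

# Value equality is guaranteed by the language.
assert from_string == from_addition

# Object identity is not guaranteed: whether both names refer to the same
# cached object depends on the interpreter and its version.
same_object = from_string is from_addition
print("same object:", same_object)
```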




