[Python-Dev] Decimal(unicode)

Nick Coghlan ncoghlan at gmail.com
Tue Mar 25 16:29:51 CET 2008


Mark Dickinson wrote:
> On Tue, Mar 25, 2008 at 9:46 AM, Oleg Broytmann <phd at phd.pp.ru> wrote:
>>    In 2.5.2 it prints
>>
>>  <type 'str'>
>>  <type 'unicode'>
>>
>>    Why the change? Is it a bug or a feature? Shouldn't .to_eng_string()
>>  always return a str?
> 
> I'd call this a bug.  The change is an accident, a side-effect of the fact
> that in 2.5.1 the coefficient (mantissa) of a Decimal was stored as a
> tuple, and in 2.5.2 it's stored as a string (which greatly improves efficiency).
> Clearly in 2.5.2 the mantissa is being stored as a unicode instance in the
> second case;  it should be explicitly coerced to str in Decimal.__new__.
> 
> If others agree that it's a bug, I'll fix it.

I thought that might be what happened, but I couldn't remember whether
that optimisation was a 2.6-only change (I suspect it was included in
2.5 as a prerequisite for the spec compliance updates).

Anyway, +1 on coercing the mantissa to a str instance in 2.5.

This does raise an interesting point, though: Decimal in Py3k currently
stores the mantissa as a Unicode instance instead of a bytes instance.
The performance implications of that are horrendous, since
PyLong_FromUnicode does a malloc, encodes the string into the malloced
buffer, then invokes PyLong_FromString on the result.

To fix this, decimal probably needs to grow something like the following 
near the top of the module:

try:
    _bytes = bytes
except NameError:  # 2.5 or earlier, where the bytes name does not exist
    _bytes = str

and then use _bytes instead of str as appropriate throughout the rest of 
the module.
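
As a rough sketch of what "use _bytes as appropriate" might look like
(coerce_digits is a hypothetical helper name, not something that exists
in decimal.py, and this assumes the digit string is pure ASCII and a
recent Python 3 interpreter):

```python
# Alias for the narrow (8-bit) string type: bytes where the name exists,
# plain str on 2.5 or earlier.
try:
    _bytes = bytes
except NameError:
    _bytes = str


def coerce_digits(digits):
    """Hypothetical helper: return the coefficient digits as _bytes."""
    if isinstance(digits, _bytes):
        return digits
    # Decimal digits are ASCII, so this raises on anything unexpected.
    return digits.encode("ascii")


# int() accepts a bytes instance directly in Python 3.
print(int(coerce_digits("123")))
```

Whether this actually pays off would need measuring; the point is only
that the coefficient can be normalised to a single string type in one
place.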

The following is also a problem in Py3k:

 >>> from decimal import Decimal as d
 >>> d(1)
Decimal('1')
 >>> d('1')
Decimal('1')
 >>> d(b'1')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/ncoghlan/devel/py3k/Lib/decimal.py", line 659, in __new__
     raise TypeError("Cannot convert %r to Decimal" % value)
TypeError: Cannot convert b'1' to Decimal

The isinstance(value, str) check in Py3k is too restrictive: it needs
to accept bytes instances as well.
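
To illustrate the behaviour I have in mind, here is a minimal wrapper
written outside the module (to_decimal is a made-up name, and decoding
as ASCII is my assumption about how bytes input should be treated; the
real change would go in Decimal.__new__ itself):

```python
from decimal import Decimal


def to_decimal(value):
    """Hypothetical wrapper: accept bytes in addition to str and int."""
    if isinstance(value, (bytes, bytearray)):
        # Assumption: numeric literals are ASCII, so fail loudly otherwise.
        value = value.decode("ascii")
    return Decimal(value)


print(repr(to_decimal(b"1")))
```

With something along these lines, d(b'1') would give the same result as
d('1') instead of raising TypeError.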

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

