[Python-Dev] Decimal(unicode)

Nick Coghlan ncoghlan at gmail.com
Wed Mar 26 04:02:10 CET 2008


Greg Ewing wrote:
> Terry Reedy wrote:
>> The purpose of type constructors is to construct instances from reasonable 
>> inputs.  I think all number constructors should accept bytes
> 
> What should bytes as input to a number constructor
> mean, though?
> 
> People seem to be assuming it should be interpreted
> as ASCII-encoded characters.
> 
> But an equally plausible interpretation might be
> that it's some binary representation of a number.

The difference is that some hardware control protocols are most naturally 
treated as sequences of bytes, yet also contain numbers encoded as ASCII 
digits that need to be processed. It's also the case that the characters 
permitted when passing a *string* to a numeric constructor are themselves 
an ASCII subset.

For binary representations, we already have the struct module to handle 
the parsing, but for byte sequences with embedded ASCII digits it's 
reasonably common practice to use strings along with the respective type 
constructors.
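
To make the two interpretations concrete, here is a minimal sketch in 
Py3k syntax. The frame layout (a 2-byte big-endian binary field followed 
by a "TEMP=<digits>;" field) is invented purely for illustration and 
does not correspond to any real protocol.

```python
import struct

# Hypothetical frame: a binary 16-bit field, then a number as ASCII digits.
frame = struct.pack(">H", 1234) + b"TEMP=0725;"

# Binary representation: struct already handles the parsing.
(binary_value,) = struct.unpack_from(">H", frame, 0)

# Embedded ASCII digits: slice out the field, go via a string, and
# hand it to the ordinary type constructor.
ascii_field = frame[2:].split(b"=")[1].rstrip(b";")
ascii_value = int(ascii_field.decode("ascii"))
```

The explicit decode step is the detour through strings described above; 
a constructor that accepted bytes directly would make it unnecessary.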

However, Mark found another problem when he attempted to speed up the 
Py3k version of decimal by storing the mantissa as a bytes object 
instead of a unicode string: there is currently no efficient way to 
serialise a number into a byte sequence. So storing the mantissa as a 
bytes object is actually currently slower than storing it as a string, 
as you have to convert the number to a string before you can store it in 
a bytes object. That still leaves us with the problem that decimal is 
about 25% slower in 3.0 than it is in 2.6, due to the fact that the 
unicode->int conversion is much slower than the corresponding 2.x 
str->int conversion.
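
As a rough sketch of the round trip involved (the helper names are 
hypothetical and this is not the actual decimal code), getting an 
integer into a bytes object and back currently looks something like:

```python
def int_to_ascii_bytes(n):
    # No direct int -> ASCII bytes path: build a string first,
    # then encode it.
    return str(n).encode("ascii")


def ascii_bytes_to_int(data):
    # The reverse direction, again going via a string.
    return int(data.decode("ascii"))


coefficient = 31415926535
stored = int_to_ascii_bytes(coefficient)
restored = ascii_bytes_to_int(stored)
```

The intermediate string in int_to_ascii_bytes is the extra step that 
makes the bytes-based storage lose out to simply keeping the string.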

Ugly problem :P

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

