[Python-ideas] Python3.3 Decimal Library Released
Steven D'Aprano
steve at pearwood.info
Tue Mar 4 02:55:16 CET 2014
On Tue, Mar 04, 2014 at 10:42:57AM +1100, Chris Angelico wrote:
> You could probably make the same performance argument against making
> Unicode the default string datatype.
I don't think so -- for ASCII strings the performance cost of Unicode is
significantly less than the performance hit for Decimal:
[steve@ando ~]$ python3.3 -m timeit -s "s = 'abcdef'*1000" "s.upper()"
100000 loops, best of 3: 8.76 usec per loop
[steve@ando ~]$ python3.3 -m timeit -s "s = b'abcdef'*1000" "s.upper()"
100000 loops, best of 3: 7.05 usec per loop
[steve@ando ~]$ python3.3 -m timeit -s "x = 123.4567" "x**6"
1000000 loops, best of 3: 0.344 usec per loop
[steve@ando ~]$ python3.3 -m timeit -s "from decimal import Decimal" \
> -s "x = Decimal('123.4567')" "x**6"
1000000 loops, best of 3: 1.41 usec per loop
That's about 1.2 times slower for Unicode versus about 4.1 times slower
for Decimal. I think Decimal is *fast enough* for all but the heaviest
numeric needs, but the slowdown is not something we can ignore.
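For anyone who wants to check the arithmetic or repeat the comparison, here is a rough sketch. The ratios are computed from the timings quoted above; the helper names (slowdown, best_usec) are made up for illustration, and numbers measured on another machine or Python version will differ.

```python
import timeit

def slowdown(baseline_usec, candidate_usec):
    """Return how many times slower the candidate is than the baseline."""
    return candidate_usec / baseline_usec

# Timings copied from the transcript above (usec per loop).
unicode_ratio = slowdown(7.05, 8.76)    # bytes.upper() vs str.upper()
decimal_ratio = slowdown(0.344, 1.41)   # float x**6 vs Decimal x**6

def best_usec(stmt, setup, number=100000, repeat=3):
    """Best-of-N time per loop in microseconds, like `python -m timeit`."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6

if __name__ == "__main__":
    print("quoted Unicode slowdown: %.1f" % unicode_ratio)
    print("quoted Decimal slowdown: %.1f" % decimal_ratio)
    # Re-measure locally; results depend on hardware and Python version.
    f = best_usec("x**6", "x = 123.4567")
    d = best_usec("x**6", "from decimal import Decimal; x = Decimal('123.4567')")
    print("local Decimal slowdown: %.1f" % slowdown(f, d))
```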
> But a stronger argument is that
> the default string should be the one that does the right thing with
> text. As of Python 3, that's the case. And the default integer type
> handles arbitrary sized integers (although Py2 went most of the way
> there by having automatic promotion). It's reasonable to suggest that
> the default non-integer numeric type should also simply do the right
> thing.
Define "the right thing" for numbers.
> It's a trade-off, though, and for most people, float is sufficient.
That's a tricky one. People doing quote-unquote "serious" numeric
work will mostly want to stick to binary floats, even if that means
missing out on all the extra IEEE-754 goodies that the decimal module
has but floats don't. The momentum of 40+ years of almost entirely
binary floating point maths does not shift to decimal overnight.
But for everyone else, binary floats are sufficient except when they
aren't. Decimal, of course, won't solve all your floating point
difficulties -- it's easy to demonstrate that nearly all the common
pitfalls of FP maths also occur with Decimal, with the exception of
inexact conversion from decimal strings to numbers. But that one issue
alone is a major cause of confusion.
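A small sketch of what I mean, using only the standard decimal module. The precision is pinned to 28 digits (the module's default) so the results don't depend on the caller's context; the variable names are just for illustration.

```python
from decimal import Decimal, localcontext

# Binary float: the literal 0.1 is stored as the nearest binary fraction,
# so the familiar sum does not come out exactly equal to 0.3.
float_sum_exact = (0.1 + 0.2 == 0.3)

with localcontext() as ctx:
    ctx.prec = 28  # pin the precision to the module default

    # Decimal: conversion from a decimal string is exact, so this holds.
    decimal_sum_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

    # Rounding still happens: 1/3 has no finite decimal expansion,
    # so multiplying back by 3 does not recover 1.
    third = Decimal(1) / Decimal(3)
    third_roundtrips = (third * 3 == Decimal(1))

    # Absorption still happens: 1e30 + 1 needs 31 digits, which is more
    # than the 28 available, so the 1 is rounded away.
    big = Decimal("1e30")
    absorbed = ((big + Decimal(1)) - big == 0)

if __name__ == "__main__":
    print(float_sum_exact, decimal_sum_exact, third_roundtrips, absorbed)
```

Only the first pair differs between float and Decimal; the last two are the same kind of rounding trouble you get with binary floats, just in base 10.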
My personal feeling is that for Python 4000 I'd vote for the default
floating point format to be decimal, with binary floats available with a
b suffix.
But since that could be a decade away, it's quite premature to spend too
much time on this.
--
Steven