On 2014-03-04 01:55, Steven D'Aprano wrote:
On Tue, Mar 04, 2014 at 10:42:57AM +1100, Chris Angelico wrote:
You could probably make the same performance argument against making Unicode the default string datatype.
I don't think so -- for ASCII strings the performance cost of Unicode is significantly less than the performance hit for Decimal:
[steve@ando ~]$ python3.3 -m timeit -s "s = 'abcdef'*1000" "s.upper()" 100000 loops, best of 3: 8.76 usec per loop [steve@ando ~]$ python3.3 -m timeit -s "s = b'abcdef'*1000" "s.upper()" 100000 loops, best of 3: 7.05 usec per loop
[steve@ando ~]$ python3.3 -m timeit -s "x = 123.4567" "x**6" 1000000 loops, best of 3: 0.344 usec per loop [steve@ando ~]$ python3.3 -m timeit -s "from decimal import Decimal" \
-s "x = Decimal('123.4567')" "x**6" 1000000 loops, best of 3: 1.41 usec per loop
That's a factor of 1.2 slower for Unicode versus 4.1 for Decimal. I think that's *fast enough* for all but the heaviest numeric needs, but it's not something we can ignore.
But a stronger argument is that the default string should be the one that does the right thing with text. As of Python 3, that's the case. And the default integer type handles arbitrarily sized integers (although Py2 went most of the way there by having automatic promotion). It's reasonable to suggest that the default non-integer numeric type should also simply do the right thing.
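To make "the right thing" concrete for the non-integer case, here's a quick session at the interactive prompt (standard library only):

>>> 0.1 + 0.2 == 0.3
False
>>> 0.1 + 0.2
0.30000000000000004
>>> from decimal import Decimal
>>> Decimal('0.1') + Decimal('0.2') == Decimal('0.3')
True

Decimal gives the answer a human doing decimal arithmetic expects, which is the sense of "right" at issue here.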
Define "the right thing" for numbers.
It's a trade-off, though, and for most people, float is sufficient.
That's a tricky one. People doing quote-unquote "serious" numeric work will mostly want to stick to binary floats, even if that means missing out on all the extra IEEE-754 goodies that the decimal module has but floats don't. The momentum of 40+ years of almost entirely binary floating-point maths does not shift to decimal overnight.
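(To give a flavour of those IEEE-754 extras -- a minimal sketch at the interactive prompt, using only the documented decimal API; the exact traceback text varies between the C and pure-Python implementations of the module:)

>>> from decimal import Decimal, getcontext, Inexact
>>> getcontext().prec = 6                # per-context, user-adjustable precision
>>> Decimal(1) / Decimal(7)
Decimal('0.142857')
>>> getcontext().traps[Inexact] = True   # turn silent rounding into a signal
>>> Decimal(1) / Decimal(7)
Traceback (most recent call last):
  ...
decimal.Inexact

With binary floats you get essentially none of this from pure Python: no adjustable precision, no per-context trappable signals.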
[snip] Won't people doing quote-unquote "serious" numeric work be using numpy?