[Python-Dev] file.readinto performance regression in Python 3.2 vs. 2.7?

Fri Nov 25 12:37:49 CET 2011

On Fri, Nov 25, 2011 at 10:04 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Fri, 25 Nov 2011 20:34:21 +1100
> Matt Joiner <anacrolix at gmail.com> wrote:
>>
>> It's Python 3.2. I tried it for larger files and got some interesting results.
>>
>> readinto() for 10MB files, reading 10MB all at once:
>>
>> readinto/2.7 100 loops, best of 3: 8.6 msec per loop
>> readinto/3.2 10 loops, best of 3: 29.6 msec per loop
>> readinto/3.3 100 loops, best of 3: 19.5 msec per loop
>>
>> With 100KB chunks for the 10MB file (annotated with #):
>>
>> matt at stanley:~/Desktop$ for f in read bytearray_read readinto; do for
>> v in 2.7 3.2 3.3; do echo -n "$f/$v "; "python$v" -m timeit -s 'import
>> readinto' "readinto.$f()"; done; done
>> read/2.7 100 loops, best of 3: 7.86 msec per loop # this is actually
>> faster than the 10MB read
>> read/3.2 10 loops, best of 3: 253 msec per loop # wtf?
>> read/3.3 10 loops, best of 3: 747 msec per loop # wtf??
>
> No "wtf" here, the read() loop is quadratic since you're building a
> new, larger, bytes object every iteration.  Python 2 has a fragile
> optimization for concatenation of strings, which can avoid the
> quadratic behaviour on some systems (depends on realloc() being fast).

Is there any way to bring back that optimization? a 30 to 100x slow
down on probably one of the most common operations... string
contatenation, is very noticeable. In python3.3, this is representing
a 0.7s stall building a 10MB string. Python 2.7 did this in 0.007s.

>
>> readinto/2.7 100 loops, best of 3: 8.93 msec per loop
>> readinto/3.2 100 loops, best of 3: 10.3 msec per loop # suddenly 3.2
>> is performing well?
>> readinto/3.3 10 loops, best of 3: 20.4 msec per loop
>
> What if you allocate the bytearray outside of the timed function?

This change makes readinto() faster for 100K chunks than the other 2
methods and clears the differences between the versions.
readinto/2.7 100 loops, best of 3: 6.54 msec per loop
readinto/3.2 100 loops, best of 3: 7.64 msec per loop
readinto/3.3 100 loops, best of 3: 7.39 msec per loop

Updated test code: http://pastebin.com/8cEYG3BD

>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/anacrolix%40gmail.com
>

So as I think Eli suggested, the readinto() performance issue goes
away with large enough reads, I'd put down the differences to some
unrelated language changes.

However the performance drop on read(): Python 3.2 is 30x slower than
2.7, and 3.3 is 100x slower than 2.7.