[issue3873] Unpickling is really slow

STINNER Victor report at bugs.python.org
Thu Jul 29 00:58:04 CEST 2010


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

New version of my patch:
 - add "used" attribute to UnpicklerBuffer structure: disable the read buffer for non-seekable files and for protocol 0 (at the first call to unpickle_readline)
 - check if PyObject_GetAttrString(file, "seek") is NULL or not
 - unpickle_readline() also flushes the buffer
 - add a new test specific to the read buffer: ensure that the unpickler doesn't eat data at the end of the file

test_pickle passes without any error.

The read buffer is disabled at the first call to unpickle_readline() because unpickle_readline() has to flush the buffer. It will be very difficult to optimize protocol 0, but I hope that nobody uses it nowadays.
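
To illustrate the idea (this is not the C patch itself), here is a toy Python model of a read-ahead buffer with a "used" flag. The class name ReadBuffer, the chunk_size parameter and the flush() method are assumptions made for this sketch; only the "used" attribute and the check for a "seek" attribute come from the description above.

```python
import io


class ReadBuffer:
    """Toy model of a read-ahead buffer that is only used when the file has seek()."""

    def __init__(self, file, chunk_size=4096):
        self.file = file
        self.chunk_size = chunk_size  # hypothetical read-ahead size
        self.buf = b""
        self.pos = 0
        # Mirror the "seek" attribute check: no seek() means no read buffer.
        self.used = hasattr(file, "seek")

    def read(self, n):
        if not self.used:
            # Buffer disabled: read exactly what was asked for.
            return self.file.read(n)
        while len(self.buf) - self.pos < n:
            chunk = self.file.read(max(self.chunk_size, n))
            if not chunk:
                break
            # Drop the consumed part and append the new data.
            self.buf = self.buf[self.pos:] + chunk
            self.pos = 0
        data = self.buf[self.pos:self.pos + n]
        self.pos += len(data)
        return data

    def flush(self):
        """Give unread bytes back to the file and disable the buffer."""
        unread = len(self.buf) - self.pos
        if unread:
            self.file.seek(-unread, io.SEEK_CUR)
        self.buf = b""
        self.pos = 0
        self.used = False
```

After flush(), the file position is back at the first byte that was not consumed, so whatever follows in the file is not eaten by the buffer.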

===========

Benchmark with [0]*10**6, Python compiled with pydebug.

Without the patch
-----------------

Protocol 0:
- dump: 598.0 ms
- load (seekable=False): 3337.3 ms
- load (seekable=True): 3309.6 ms

Protocol 1:
- dump: 217.8 ms
- load (seekable=False): 864.2 ms
- load (seekable=True): 873.3 ms

Protocol 2:
- dump: 226.5 ms
- load (seekable=False): 867.8 ms
- load (seekable=True): 854.6 ms


With the patch
--------------

Protocol 0:
- dump: 615.5 ms
- load (seekable=False): 3201.3 ms
- load (seekable=True): 3223.4 ms

Protocol 1:
- dump: 219.8 ms
- load (seekable=False): 942.1 ms
- load (seekable=True): 175.2 ms

Protocol 2:
- dump: 221.1 ms
- load (seekable=False): 943.9 ms
- load (seekable=True): 175.5 ms
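
The benchmark script is not included in this message. A minimal sketch of how such timings could be measured is given below; the NonSeekable wrapper and the bench() helper are hypothetical names, the data is pickled in memory with pickle.dumps(), and the absolute numbers will differ from those above depending on the Python version and build.

```python
import io
import pickle
import time


class NonSeekable:
    """Hypothetical file wrapper exposing only read() and readline() (no seek)."""

    def __init__(self, data):
        self._f = io.BytesIO(data)

    def read(self, n=-1):
        return self._f.read(n)

    def readline(self):
        return self._f.readline()


def bench(obj, protocol):
    """Return dump and load timings in milliseconds for one protocol."""
    start = time.perf_counter()
    data = pickle.dumps(obj, protocol)
    results = {"dump": (time.perf_counter() - start) * 1000}

    files = (
        ("load (seekable=False)", NonSeekable),
        ("load (seekable=True)", io.BytesIO),
    )
    for label, make_file in files:
        f = make_file(data)
        start = time.perf_counter()
        loaded = pickle.load(f)
        results[label] = (time.perf_counter() - start) * 1000
        assert loaded == obj  # sanity check: the round trip must be lossless
    return results


if __name__ == "__main__":
    obj = [0] * 10**6
    for protocol in (0, 1, 2):
        print("Protocol %s:" % protocol)
        for label, ms in bench(obj, protocol).items():
            print("- %s: %.1f ms" % (label, ms))
        print()
```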

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3873>
_______________________________________

