
On Tue, Sep 28, 2010 at 7:32 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
[Guido] I wonder if it wouldn't be better to add an extra buffer to GzipFile so small seek() and read() calls can be made more efficient?
The problem is that, since the buffer of the unpickler and the buffer of the GzipFile are not aware of each other, the unpickler could easily ask to seek() backwards past the current GzipFile buffer, and fall back on the slow algorithm.
But AFAICT unpickle doesn't use seek()? [...]
But, if the stream had prefetch(), the unpickling would be simplified: I would only have to call prefetch() once when refilling the buffer, rather than two read()'s followed by a peek().
(I could try to coalesce the two reads, but it would complicate the code a bit more...)
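For readers following along, the difference between read() and peek() on a buffered stream can be sketched as below. This is only an illustration using io.BufferedReader over an in-memory stream (the sample bytes are made up); the prefetch() call discussed here is a proposal and does not exist, so it is not shown.

```python
import io

# Illustrative data only; any readable binary stream would do.
stream = io.BufferedReader(io.BytesIO(b"abcdefgh"))

head = stream.read(2)    # consumes two bytes and advances the position
ahead = stream.peek(1)   # returns buffered bytes WITHOUT advancing;
                         # it may return more bytes than requested
following = stream.read(2)  # continues right after `head`

print(head, ahead[:2], following)
```

Note that peek() makes no promise about how many bytes come back, which is part of why a caller ends up combining it with separate read() calls.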
Where exactly would the peek be used? (I must be confused because I can't find either peek or seek in _pickle.c.) It still seems to me that the "right" way to solve this would be to insert a transparent extra buffer somewhere, probably in the GzipFile code, and work on reducing the call overhead.
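A rough sketch of the "transparent extra buffer" idea, done from the outside rather than inside the GzipFile code: wrap the GzipFile in an io.BufferedReader so that many small reads from the unpickler are served from one larger buffer. The helper names dump_gzipped and load_gzipped and the 64 KiB buffer size are assumptions made for this example, not anything proposed in the thread.

```python
import gzip
import io
import pickle


def dump_gzipped(obj):
    """Hypothetical helper: pickle obj into gzip-compressed bytes."""
    raw = io.BytesIO()
    with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
        pickle.dump(obj, gz)
    return raw.getvalue()


def load_gzipped(data, buffer_size=64 * 1024):
    """Hypothetical helper: unpickle through an extra buffering layer."""
    gz = gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb")
    # The BufferedReader sits between the unpickler and the GzipFile, so
    # small read() calls do not each go down to the decompressor.
    with io.BufferedReader(gz, buffer_size=buffer_size) as buffered:
        return pickle.load(buffered)


sample = {"numbers": list(range(1000)), "name": "example"}
restored = load_gzipped(dump_gzipped(sample))
```

Whether this actually helps depends on how much buffering the GzipFile already does internally; measuring is the only way to know.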
I want to push back on this more, primarily because a new primitive I/O operation has high costs: it can never be removed, it has to be added to every stream implementation, developers need to learn to use the new operation, and so on.
I agree with this (except that most developers don't really need to learn to use it: common uses of readable files are content with read() and readline(), and need neither peek() nor prefetch()). I don't intend to push this for 3.2; I'm throwing the idea around with a hypothetical 3.3 landing if it seems useful.
So far it seems more awkward than useful.
Also, if you can believe the multi-core crowd, a very different possible future development might be to run the gunzip algorithm and the unpickle algorithm in parallel, on separate cores. Truly such a solution would require totally *different* new I/O primitives, which might have a higher chance of being reusable outside the context of pickle.
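To make the idea concrete, here is a very rough sketch of running decompression and unpickling side by side, using a thread and a queue rather than any new I/O primitive. QueueReader, gunzip_into, and parallel_load are hypothetical names invented for this example, and whether the two stages really run on separate cores depends on the interpreter and its locking.

```python
import gzip
import io
import pickle
import queue
import threading


class QueueReader(io.RawIOBase):
    """Hypothetical raw stream fed with chunks by another thread."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = chunks   # queue.Queue of bytes; None marks the end
        self._pending = b""
        self._eof = False

    def readable(self):
        return True

    def readinto(self, target):
        # Block until the producer delivers data or signals the end.
        while not self._pending and not self._eof:
            chunk = self._chunks.get()
            if chunk is None:
                self._eof = True
            else:
                self._pending = chunk
        n = min(len(target), len(self._pending))
        if n:
            target[:n] = self._pending[:n]
            self._pending = self._pending[n:]
        return n  # 0 signals end of file


def gunzip_into(compressed, chunks, chunk_size=64 * 1024):
    """Producer: decompress in this thread and queue the plain bytes."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as gz:
            while True:
                chunk = gz.read(chunk_size)
                if not chunk:
                    break
                chunks.put(chunk)
    finally:
        chunks.put(None)  # always tell the consumer that no more data is coming


def parallel_load(compressed):
    """Consumer: unpickle while the producer thread keeps decompressing."""
    chunks = queue.Queue()
    worker = threading.Thread(target=gunzip_into, args=(compressed, chunks))
    worker.start()
    try:
        return pickle.load(io.BufferedReader(QueueReader(chunks)))
    finally:
        worker.join()
```

The queue is unbounded here to keep the sketch simple; a real design would want to bound it so the decompressor cannot run arbitrarily far ahead of the unpickler.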
Well, it's a bit of a pie-in-the-sky perspective :) Furthermore, such a solution won't improve CPU efficiency, so if your workload is already able to utilize all CPU cores (which it can easily do if you are in a VM, or have multiple busy daemons), it doesn't bring anything.
Agreed it's pie in the sky... Though the interface between the two CPUs might actually be designed to be faster than the current buffered I/O. I have (mostly :-) fond memories of async I/O on a mainframe I used in the '70s which worked this way.

--
--Guido van Rossum (python.org/~guido)