Re: [Python-ideas] [Python-Dev] Prefetching on buffered IO files

On Mon, Sep 27, 2010 at 5:41 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
While trying to solve #3873 (poor performance of pickle on file objects, due to the overhead of calling read() with very small values),
After looking over the relevant code, it looks to me like the overhead of calling the read() method, compared to calling fread() in Python 2, is the cost of PyObject_Call plus the construction of the argument tuple and the deconstruction of the return value.

I don't think the extra interface would benefit code written in Python as much. Even if Python code gets the data into a buffer more easily, it's going to pay those same costs to manipulate the buffered data. It would mostly help modules written in C, such as pickle, which right now are heavily bottlenecked on getting the data into a buffer.

Comparing the C code for Python 2's cPickle and Python 3's pickle, I see that Python 2 has paths for unpickling from a FILE *, cStringIO, and "other". Python 3 effectively only has a code path for "other", so it's not surprising that it's slower. In the worst case, I am sure that if we re-added specialized code paths we could make it just as fast as Python 2, although that would make the code messy.

Some ideas:

- Use readinto() instead of read(), to avoid extra allocations/deallocations
- But first, fix bufferediobase_readinto() so it doesn't work by calling the read() method and/or follow up on the TODO in buffered_readinto()

If you want a new API, I think a new C API for I/O objects with C-friendly arguments would be better than a new Python-level API.

In a nutshell, if you feel the need to make a buffer around BufferedReader, then I agree there's a problem, but I don't think helping you make a buffer around BufferedReader is the right solution. ;-)

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com/>
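To make the readinto() idea above concrete, here is a minimal sketch at the Python level. The helper name read_exactly_into is hypothetical and is not part of the io module; the only real API used is the readinto() method of binary file objects. The point is that the caller allocates one buffer up front and the stream fills it in place, rather than read() returning a fresh bytes object on every call.

```python
import io

def read_exactly_into(f, buf):
    """Fill buf from f with readinto(); return the number of bytes stored.

    Hypothetical helper for illustration only.
    """
    view = memoryview(buf)
    filled = 0
    while filled < len(view):
        # readinto() writes directly into the caller's buffer.
        n = f.readinto(view[filled:])
        if not n:  # 0 at EOF, None on a non-blocking stream with no data
            break
        filled += n
    return filled

# One reusable 4-byte buffer serves every read.
stream = io.BufferedReader(io.BytesIO(b"abcdefghij"))
buf = bytearray(4)
first = read_exactly_into(stream, buf)
first_bytes = bytes(buf[:first])
second = read_exactly_into(stream, buf)
second_bytes = bytes(buf[:second])
third = read_exactly_into(stream, buf)
third_bytes = bytes(buf[:third])
```

This only removes the per-call allocation; the method-call overhead discussed above is still paid once per readinto() call.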

Right. It would, however, benefit /file objects/ written in Python (since the cost of calling a peek() written in pure Python is certainly significant compared to the cost of the actual peeking operation).
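For readers unfamiliar with peek(), a small sketch of what it does on the existing BufferedReader. The helper starts_with is a hypothetical name used only for this illustration; peek() and read() are the real methods.

```python
import io

def starts_with(f, prefix):
    """Look at the next bytes of a buffered stream without consuming them.

    Hypothetical helper. peek() may return more bytes than requested,
    or fewer (in which case this simply reports False), so the result
    is sliced before comparing.
    """
    return f.peek(len(prefix))[:len(prefix)] == prefix

f = io.BufferedReader(io.BytesIO(b"\x80\x02rest of a pickle stream"))
is_proto2 = starts_with(f, b"\x80\x02")
# The peeked bytes have not been consumed and can still be read.
first_two = f.read(2)
```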
Patches welcome :)
It would be very ugly, IMO. And it would still be slower than the clean solution, which is to have a buffer size big enough that the overhead of making a read() method call is dwarfed by the processing cost of the data (that's how TextIOWrapper works). (For the record, with the read()+peek() patch, unpickle is already faster than Python 2, but that's comparing apples to oranges because Python 3 got other unpickle optimizations in the meantime.)
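The buffer-size argument can be illustrated with a rough sketch that counts method calls instead of timing anything. CountingReader and checksum are hypothetical names made up for this example, and the 8192-byte chunk size is an arbitrary assumption, not a value taken from the patch.

```python
import io

class CountingReader:
    """Hypothetical wrapper that counts read() calls on a file object."""

    def __init__(self, f):
        self._f = f
        self.calls = 0

    def read(self, n=-1):
        self.calls += 1
        return self._f.read(n)

def checksum(f, chunk_size):
    """Sum all bytes of f, fetching chunk_size bytes per read() call."""
    total = 0
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        # The per-chunk processing happens here, on local data.
        total += sum(chunk)
    return total

data = bytes(range(256)) * 64  # 16384 bytes of sample data

tiny = CountingReader(io.BytesIO(data))
tiny_sum = checksum(tiny, 1)      # one read() call per byte

big = CountingReader(io.BytesIO(data))
big_sum = checksum(big, 8192)     # one read() call per 8 KiB
```

Both runs compute the same result, but the second makes a handful of read() calls instead of one per byte, so the fixed per-call cost is spread over far more processing.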
If you want a new API, I think a new C API for I/O objects with C-friendly arguments would be better than a new Python-level API.
I really think we should keep a unified API. A low-level C API would be difficult to get right, would make implementations more complicated, and consumers would have to keep fallback code for objects not implementing the C API, which would complicate things on their side too. Conversely, one purpose of my prefetch() proposal, besides optimizing some workloads, is to *simplify* the writing of buffered IO code.
In a layered approach, it's hard not to end up with multiple levels of buffering (think TextIOWrapper + BufferedReader + OS page-level caching) :) I agree that shared buffers sound more efficient but, again, I fear they would be a lot of work to get right. If you look at the BufferedReader code, it's already non-trivial, and bugs in this area can be really painful.

Regards

Antoine.
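The layering mentioned above can be seen with a small sketch, using an in-memory BytesIO as a stand-in for a real file (that substitution is an assumption made to keep the example self-contained). After a single readline() on the text layer, the bottom layer has already been read well past the end of the first line, because the layers above it fetch data in larger chunks and hold it in their own buffers.

```python
import io

data = b"line one\nline two\nline three\n"

raw = io.BytesIO(data)                  # stand-in for the raw file
buffered = io.BufferedReader(raw)       # byte-level buffering
text = io.TextIOWrapper(buffered, encoding="utf-8")  # decoding + its own buffer

first_line = text.readline()

# How far each layer has been asked to go:
consumed_by_caller = len(first_line.encode("utf-8"))
position_of_raw = raw.tell()
```

Here position_of_raw is larger than consumed_by_caller: the difference is data sitting in the intermediate buffers.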

On Tue, Sep 28, 2010 at 10:06 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I'm not likely to get to it soon, but I've opened Issue 9971 to at least keep track of it.

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com/>

participants (2)
- Antoine Pitrou
- Daniel Stutzbach