
I don't think the extra interface would benefit code written in Python as much. Even if Python code gets the data into a buffer more easily, it's going to pay those costs to manipulate the buffered data. It would mostly help modules written in C, such as pickle, which right now are heavily bottlenecked getting the data into a buffer.
Right. It would, however, benefit /file objects/ written in Python (since the cost of calling a peek() written in pure Python is certainly significant compared to the cost of the actual peeking operation).
- But first, fix bufferediobase_readinto() so it doesn't work by calling the read() method and/or follow up on the TODO in buffered_readinto()
Patches welcome :)
Comparing the C code for Python 2's cPickle and Python 3's pickle, I see that Python 2 has paths for unpickling from a FILE *, cStringIO, and "other". Python 3 effectively only has a code path for "other", so it's not surprising that it's slower. In the worst case, I am sure that if we re-added specialized code paths we could make it just as fast as Python 2, although that would make the code messy.
It would be very ugly, IMO. And it would still be slower than the clean solution, which is to have a buffer size big enough that the overhead of making a read() method call is dwarfed by the processing cost of the data (that's how TextIOWrapper works). (for the record, with the read()+peek() patch, unpickle is already faster than Python 2, but that's comparing apples to oranges because Python 3 got other unpickle optimizations in the meantime)
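To illustrate the buffer-size argument, here is a rough sketch (not the actual pickle or TextIOWrapper code). CountingReader, ChunkedReader and count_underlying_reads are hypothetical names made up for this example. It only counts how many read() calls reach the underlying file object when small requests are served from a larger chunk.

```python
import io


class CountingReader:
    """Hypothetical wrapper that counts read() calls on a file object."""

    def __init__(self, fileobj):
        self._f = fileobj
        self.calls = 0

    def read(self, n=-1):
        self.calls += 1
        return self._f.read(n)


class ChunkedReader:
    """Hypothetical reader serving small reads from a larger chunk."""

    def __init__(self, fileobj, chunk_size=64 * 1024):
        self._f = fileobj
        self._chunk_size = chunk_size
        self._buf = b""
        self._pos = 0

    def read(self, n):
        # Refill until the request can be satisfied or EOF is reached.
        while len(self._buf) - self._pos < n:
            chunk = self._f.read(self._chunk_size)
            if not chunk:
                break
            # Keep the unread tail and append the new chunk.
            self._buf = self._buf[self._pos:] + chunk
            self._pos = 0
        data = self._buf[self._pos:self._pos + n]
        self._pos += len(data)
        return data


def count_underlying_reads(payload, request_size, chunk_size):
    """Read payload in request_size pieces; return the number of raw reads."""
    source = CountingReader(io.BytesIO(payload))
    reader = ChunkedReader(source, chunk_size)
    out = bytearray()
    while True:
        piece = reader.read(request_size)
        if not piece:
            break
        out += piece
    assert bytes(out) == payload  # sanity check: nothing lost or reordered
    return source.calls


payload = bytes(range(256)) * 40
small_chunks = count_underlying_reads(payload, request_size=4, chunk_size=4)
large_chunks = count_underlying_reads(payload, request_size=4, chunk_size=4096)
print(small_chunks, large_chunks)
```

With the larger chunk size, the number of method calls on the underlying object drops by orders of magnitude, so the per-call overhead stops mattering next to the cost of processing the data.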
If you want a new API, I think a new C API for I/O objects with C-friendly arguments would be better than a new Python-level API.
I really think we should keep a unified API. A low-level C API would be difficult to get right, make implementations more complicated, and consumers would have to keep fallback code for objects not implementing the C API, which would complicate things on their side too. Conversely, one purpose of my prefetch() proposal, besides optimizing some workloads, is to *simplify* writing of buffered IO code.
In a nutshell, if you feel the need to make a buffer around BufferedReader, then I agree there's a problem, but I don't think helping you make a buffer around BufferedReader is the right solution. ;-)
In a layered approach, it's hard not to end up with multiple levels of buffering (think TextIOWrapper + BufferedReader + OS page-level caching) :) I agree that shared buffers sound more efficient but, again, I fear they would be a lot of work to get right. If you look at the BufferedReader code, it's already non-trivial, and bugs in this area can be really painful.

Regards

Antoine.