[issue4428] io.BufferedWriter does not observe buffer size limits
David M. Beazley
report at bugs.python.org
Tue Nov 25 13:35:00 CET 2008
New submission from David M. Beazley <beazley at users.sourceforge.net>:
The Buffered I/O interface in the io module has the user specify buffer
limits such as buffer_size and max_buffer_size. The first limit
(buffer_size) is easy to understand as the threshold at which buffered
data is flushed to the underlying stream. However, no apparent attempt
is made to strictly limit the internal buffer to max_buffer_size.
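The first limit is easy to observe against the current C implementation; a minimal sketch (note that max_buffer_size was removed from later Python versions, so only buffer_size appears here):

```python
import io

raw = io.BytesIO()
# buffer_size sets the flush threshold; the max_buffer_size parameter
# discussed in this report existed in Python 2.6/3.0 and was later removed.
w = io.BufferedWriter(raw, buffer_size=4)

w.write(b"ab")                  # fits under the threshold: held in the buffer
assert raw.getvalue() == b""    # nothing has reached the raw stream yet

w.write(b"cdef")                # pushes past the threshold
w.flush()
assert raw.getvalue() == b"abcdef"
```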
In BufferedWriter.write(), one of the first operations is

    self._write_buf.extend(b)

which simply extends the buffer by the full data being written. If b
happens to be a large string (e.g., megabytes or even the entire
contents of a big file), then the internal I/O buffer makes a complete
copy of the data, effectively doubling the memory requirements for
carrying out the write operation.
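The copy is easy to see in the pure-Python implementation (the _pyio module), whose BufferedWriter keeps its buffer in a bytearray. This sketch pokes at the private _write_buf attribute, which is an implementation detail and may change:

```python
import io
import _pyio  # pure-Python mirror of the io module

raw = io.BytesIO()
w = _pyio.BufferedWriter(raw, buffer_size=4)

w.write(b"abc")                       # below the 4-byte threshold
# The data was copied wholesale into the internal buffer: for the
# duration of the write, two complete copies of b exist in memory.
assert bytes(w._write_buf) == b"abc"
assert raw.getvalue() == b""          # nothing flushed yet
```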
I suppose most programmers might not notice given that everyone has
gigabytes of RAM these days, but you certainly don't see this kind of
buffering behavior in the operating system kernel or in the C library.
Some patch suggestions (details left to the maintainers of this module):
1. Don't extend self._write_buf by more than the max_buffer_size.
fragment = b[:self.max_buffer_size - len(self._write_buf)]
2. For large blocking writes, simply carry out the remaining I/O
operations in the write() method instead of in the _flush_unlocked()
method. Try to use the original input data b as the data
source instead of making copies of it. And if you have to copy
the data, don't do it all at once.
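Taken together, the two suggestions can be sketched in pure Python. Everything below is illustrative only; the class, its names, and its lock-free structure are an assumption, not the stdlib's implementation:

```python
import io

class BoundedBufferedWriter:
    """Illustrative sketch -- not the stdlib implementation.

    Combines both suggestions: the internal buffer never grows past
    max_buffer_size, and large writes are drained from write() itself,
    slicing the caller's data rather than copying it wholesale.
    Assumes buffer_size <= max_buffer_size and a blocking raw stream.
    """

    def __init__(self, raw, buffer_size, max_buffer_size):
        self.raw = raw
        self.buffer_size = buffer_size
        self.max_buffer_size = max_buffer_size
        self._write_buf = bytearray()

    def write(self, b):
        b = memoryview(b)               # slice without copying the input
        written = 0
        while written < len(b):
            # Suggestion 1: take only what fits under max_buffer_size.
            room = self.max_buffer_size - len(self._write_buf)
            fragment = b[written:written + room]
            self._write_buf += fragment
            written += len(fragment)
            # Suggestion 2: flush from write() once the threshold is
            # reached, so the buffer never balloons to hold all of b.
            if len(self._write_buf) >= self.buffer_size:
                self._flush()
        return written

    def _flush(self):
        # Drain the buffer to the raw stream, honoring partial writes.
        while self._write_buf:
            n = self.raw.write(self._write_buf)
            del self._write_buf[:n]

raw = io.BytesIO()
w = BoundedBufferedWriter(raw, buffer_size=4, max_buffer_size=8)
n = w.write(b"0123456789abcdef")    # 16 bytes, buffered at most 8 at a time
w._flush()
```

With these limits, a 16-byte write is carried out in two 8-byte installments instead of one 16-byte copy, and the peak buffer size is bounded regardless of how large b is.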
components: Library (Lib)
title: io.BufferedWriter does not observe buffer size limits
type: resource usage
versions: Python 2.6, Python 2.7, Python 3.0