[issue41452] Inefficient BufferedReader.read(-1)

Ma Lin report at bugs.python.org
Fri Jul 31 23:34:14 EDT 2020


New submission from Ma Lin <malincns at 163.com>:

BufferedReader's constructor has a `buffer_size` parameter, which sets the size of this buffer:

    When reading data from BufferedReader object, a larger
    amount of data may be requested from the underlying raw
    stream, and kept in an internal buffer.
    
    The doc of BufferedReader[1]


When the BufferedReader.read(size) function is called:

    1, When `size` is a positive number, it reads `buffer_size`
       bytes from the underlying stream. This is expected behavior.

    2, When `size` is -1, it tries to call the underlying stream's
       readall() function [2]. In this case `buffer_size` is not
       respected.
       
       The underlying stream may be a `RawIOBase`, whose readall()
       function reads `DEFAULT_BUFFER_SIZE` bytes in each read [3].
       
       `DEFAULT_BUFFER_SIZE` is currently only 8 KB, which makes
       BufferedReader.read(-1) very inefficient. Reading
       `buffer_size` bytes each time would give the expected
       performance.
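
The two cases can be observed with a small sketch like the one below. It is
not the attached demo.py. `CountingRaw` and `request_sizes` are hypothetical
names introduced here, and the 4 MiB stream with a 1 MiB `buffer_size` is an
arbitrary choice. The raw stream only implements readinto(), so it inherits
readall() from `RawIOBase`, and it records the size of every buffer it is
asked to fill.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream: serves `total` zero bytes, logs request sizes."""

    def __init__(self, total):
        super().__init__()
        self.remaining = total
        self.requests = []  # len() of each buffer passed to readinto()

    def readable(self):
        return True

    def readinto(self, b):
        # Record how many bytes the caller asked for in this raw read.
        self.requests.append(len(b))
        n = min(len(b), self.remaining)
        b[:n] = bytes(n)
        self.remaining -= n
        return n  # 0 signals EOF


def request_sizes(read_size, total=4 * 1024 * 1024, buffer_size=1024 * 1024):
    """Return (bytes returned, sizes requested from the raw stream)."""
    raw = CountingRaw(total)
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    data = reader.read(read_size)
    return len(data), raw.requests


n_pos, req_pos = request_sizes(1)
n_all, req_all = request_sizes(-1)

print("io.DEFAULT_BUFFER_SIZE:", io.DEFAULT_BUFFER_SIZE)
print("read(1):  returned", n_pos, "byte(s); first raw request:", req_pos[0])
print("read(-1): returned", n_all, "bytes in", len(req_all),
      "raw requests; largest request:", max(req_all))
```

If the behavior described above holds, the read(1) call should show a single
raw request of `buffer_size` bytes, while read(-1) should show many requests
of `io.DEFAULT_BUFFER_SIZE` bytes each. The exact numbers depend on the
Python version in use.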

Attached file demonstrates this problem.


[1] doc of BufferedReader:
https://docs.python.org/3/library/io.html#io.BufferedReader

[2] BufferedReader.read(-1) tries to call the underlying stream's readall() function:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/bufferedio.c#L1538-L1542

[3] RawIOBase.readall() reads DEFAULT_BUFFER_SIZE bytes each time:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/iobase.c#L968-L969

----------
components: IO
files: demo.py
messages: 374652
nosy: malin
priority: normal
severity: normal
status: open
title: Inefficient BufferedReader.read(-1)
type: performance
versions: Python 3.10
Added file: https://bugs.python.org/file49354/demo.py

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41452>
_______________________________________

