[New-bugs-announce] [issue41221] Output of print() might get truncated in unbuffered mode

Manuel Jacob report at bugs.python.org
Mon Jul 6 13:33:53 EDT 2020


New submission from Manuel Jacob <me at manueljacob.de>:

Without unbuffered mode, it works as expected:

% python -c "import sys; sys.stdout.write('x'*4294967296)" | wc -c        
4294967296

% python -c "import sys; print('x'*4294967296)" | wc -c 
4294967297

With unbuffered mode, writes get truncated to 2147479552 bytes on my Linux machine:

% python -u -c "import sys; sys.stdout.write('x'*4294967296)" | wc -c           
2147479552

% python -u -c "import sys; print('x'*4294967296)" | wc -c 
2147479553
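
The truncated size is not arbitrary. As far as I can tell from the Linux write(2) man page, a single write() call transfers at most 0x7ffff000 bytes, which matches the number above. The following is only a quick arithmetic check; the constant name is made up for illustration and is not taken from CPython:

```python
# Hypothetical name for the per-call limit documented in the Linux write(2) man page.
LINUX_MAX_WRITE = 0x7FFFF000

requested = 4294967296                  # 'x' * 2**32, as in the commands above
written = min(requested, LINUX_MAX_WRITE)
lost = requested - written

print(written)  # size reported by `wc -c` for sys.stdout.write() with -u
print(lost)     # bytes that never reach the pipe
```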

I didn’t try, but it’s probably an even bigger problem on Windows, where writes might be limited to 32767 bytes: https://github.com/python/cpython/blob/v3.9.0b4/Python/fileutils.c#L1585

Without unbuffered mode, `sys.stdout.buffer` is an `io.BufferedWriter` object.

% python -c 'import sys; print(sys.stdout.buffer)'
<_io.BufferedWriter name='<stdout>'>

With unbuffered mode, `sys.stdout.buffer` is an `io.FileIO` object.

% python -u -c 'import sys; print(sys.stdout.buffer)' 
<_io.FileIO name='<stdout>' mode='wb' closefd=False>

`io.BufferedWriter` implements the `io.BufferedIOBase` interface. `io.BufferedIOBase.write()` is documented to write all passed bytes. `io.FileIO` implements the `io.RawIOBase` interface. `io.RawIOBase.write()` is documented to be able to write fewer bytes than passed.
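
The difference between the two contracts can be reproduced without huge strings. This is a minimal sketch: `ShortWriteRaw` is a hypothetical stand-in for a raw stream whose `write()` accepts only a few bytes per call, not a class from the standard library.

```python
import io

class ShortWriteRaw(io.RawIOBase):
    """Hypothetical raw stream that accepts at most `limit` bytes per write()."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        # Accept only the first `limit` bytes and report how many were taken.
        chunk = bytes(b[:self.limit])
        self.data += chunk
        return len(chunk)

# Raw interface: a short write is allowed, the caller must check the result.
raw = ShortWriteRaw(limit=4)
n = raw.write(b"0123456789")

# Buffered interface: write() plus flush() keeps calling the raw stream
# until everything has been handed over.
raw2 = ShortWriteRaw(limit=4)
buffered = io.BufferedWriter(raw2)
m = buffered.write(b"0123456789")
buffered.flush()

print(n, bytes(raw.data))
print(m, bytes(raw2.data))
```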

`io.TextIOWrapper.write()` is not documented to write all characters it has been passed, but callers such as `print()` rely on that.

To fix the problem, it has to be ensured that either
* `sys.stdout.buffer` is an object that guarantees that all bytes passed to its `write()` method are written (e.g. deriving from `io.BufferedIOBase`), or
* `io.TextIOWrapper` calls the `write()` method of its underlying binary stream until all bytes have been written, or
* users of `io.TextIOWrapper` call `write()` until all characters have been written.

With either of the first two options, it probably makes sense to tighten the contract of `io.TextIOBase.write` to guarantee that all passed characters are written.
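
For illustration, the retry loop that the second option would need could look roughly like the sketch below. `write_all` and `ShortWriteRaw` are hypothetical names used only here; this is not how `io.TextIOWrapper` is implemented, and it assumes `data` is a bytes object.

```python
import errno
import io

def write_all(raw, data):
    """Call raw.write() repeatedly until every byte of `data` has been written."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        written = raw.write(view[total:])
        if written is None:
            # A non-blocking raw stream returns None when it cannot accept data.
            raise BlockingIOError(errno.EAGAIN, "raw stream would block", total)
        total += written
    return total

class ShortWriteRaw(io.RawIOBase):
    """Hypothetical raw stream that accepts at most `limit` bytes per write()."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.data = bytearray()
        self.calls = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        chunk = bytes(b[:self.limit])
        self.data += chunk
        return len(chunk)

raw = ShortWriteRaw(limit=3)
count = write_all(raw, b"hello world")
print(count, raw.calls, bytes(raw.data))
```

Until one of the fixes lands, a loop like this around `sys.stdout.buffer.write()` is also what the third option would ask of callers in unbuffered mode.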

----------
components: IO
messages: 373151
nosy: mjacob
priority: normal
severity: normal
status: open
title: Output of print() might get truncated in unbuffered mode
type: behavior
versions: Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41221>
_______________________________________

