[New-bugs-announce] [issue13442] Better support for pipe I/O encoding in subprocess
report at bugs.python.org
Mon Nov 21 02:22:48 CET 2011
New submission from Nick Coghlan <ncoghlan at gmail.com>:
Currently, pipes in the subprocess module work strictly with bytes I/O, *unless* you set "universal_newlines=True". In that case, it assumes an output encoding of UTF-8 for stdout and stderr and applies universal newlines processing.
When stdin/out/err are remapped to ordinary I/O streams then 'encoding' and 'errors' can be specified as usual, but it is currently challenging to do this for pipes. Since they're created internally by the subprocess module, user code doesn't get the opportunity to wrap them when using the convenience APIs. When using Popen objects, you have to create the object, then wrap each stream individually (rebinding the attributes as you go).
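For illustration, the manual wrapping described above might look roughly like the following. This is only a sketch: the helper name read_text_from_child and the child command are made up for the example, and it assumes the child writes UTF-8.

```python
import io
import subprocess
import sys


def read_text_from_child(cmd, encoding="utf-8", errors="strict"):
    """Run cmd and decode its combined stdout/stderr through a hand-wrapped pipe."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Rebind the attribute so the pipe is read as text instead of bytes.
    proc.stdout = io.TextIOWrapper(proc.stdout, encoding=encoding, errors=errors)
    with proc.stdout:
        data = proc.stdout.read()
    proc.wait()
    return data


# Hypothetical child process that emits UTF-8 encoded bytes.
child_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write('caf\\u00e9\\n'.encode('utf-8'))",
]
text = read_text_from_child(child_cmd)
```

Note that none of this is reachable from the convenience APIs, since they never hand the Popen object back to the caller.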
My suggestion is that we add a new option for the stdin/out/err arguments:
    class TextPipe:
        def __init__(self, encoding, errors='strict'):
            self.encoding = encoding
            self.errors = errors
So to read UTF-8 encoded data from a subprocess, you could just do:
    data = check_stdout(cmd, stdout=TextPipe('utf-8'), stderr=STDOUT)
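To make the intended semantics concrete, here is a rough emulation of that call written outside the subprocess module. It is a sketch under assumptions, not a proposed patch: check_text_output is a hypothetical helper, it decodes the output in one step after the process exits, and it does no universal newlines translation.

```python
import subprocess
import sys


class TextPipe:
    """Marker from the proposal: request a pipe decoded with these settings."""

    def __init__(self, encoding, errors='strict'):
        self.encoding = encoding
        self.errors = errors


def check_text_output(cmd, stdout, stderr=None):
    """Hypothetical helper that honours a TextPipe marker passed as stdout."""
    if not isinstance(stdout, TextPipe):
        raise TypeError("stdout must be a TextPipe instance")
    # Internally a plain binary pipe is still created; the marker only
    # says how the collected bytes should be decoded.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=stderr)
    raw, _ = proc.communicate()
    if proc.returncode:
        raise subprocess.CalledProcessError(proc.returncode, cmd, output=raw)
    return raw.decode(stdout.encoding, stdout.errors)


# Hypothetical child that writes one byte which is not valid UTF-8.
child_cmd = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'ok\\xff')"]
result = check_text_output(
    child_cmd,
    stdout=TextPipe('utf-8', errors='replace'),
    stderr=subprocess.STDOUT,
)
```

With errors='replace' the invalid byte is replaced rather than raising UnicodeDecodeError, which is the kind of control that is awkward to get from the convenience APIs today.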
There are at least a couple of other alternatives here:
- separate out the pipe creation logic from the Popen logic so it is possible to create and wrap the pipe objects explicitly and then pass the wrapped pipe object to the subprocess invocation APIs. 'TextPipe' would then actually be such a wrapped pipe, rather than merely instructions to tell Popen what kind of pipe to create.
- instead of adding 'TextPipe', just re-use the PIPE name (with the class itself still being used as a marker constant to request implicit creation of a binary PIPE)
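The first alternative can be approximated today by creating the pipe by hand with os.pipe() and passing the write end to Popen. This is only a sketch of the idea: the helper name run_with_explicit_text_pipe is made up, and what gets passed to Popen is a raw file descriptor rather than a wrapped pipe object of the kind that alternative describes.

```python
import os
import subprocess
import sys


def run_with_explicit_text_pipe(cmd, encoding="utf-8", errors="strict"):
    """Create the pipe ourselves, wrap the read end as text, then start cmd."""
    read_fd, write_fd = os.pipe()
    # Wrap the read end as a text stream before the child is even started.
    with open(read_fd, "r", encoding=encoding, errors=errors) as reader:
        try:
            proc = subprocess.Popen(cmd, stdout=write_fd, stderr=subprocess.STDOUT)
        finally:
            # Close the parent's copy of the write end so read() sees EOF
            # once the child exits.
            os.close(write_fd)
        data = reader.read()
    proc.wait()
    return data


# Hypothetical child command used only for this example.
output = run_with_explicit_text_pipe([sys.executable, "-c", "print('hello')"])
```

This works, but the caller has to manage both file descriptors correctly, which is the bookkeeping a dedicated pipe creation API would take over.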
assignee: docs at python
nosy: docs at python, ncoghlan
stage: needs patch
title: Better support for pipe I/O encoding in subprocess
versions: Python 3.3
Python tracker <report at bugs.python.org>