[New-bugs-announce] [issue20405] Add io.BinaryTransformWrapper and a "transform" parameter to open()

Nick Coghlan report at bugs.python.org
Mon Jan 27 06:24:39 CET 2014

New submission from Nick Coghlan:

Issue 20404 points out that io.TextIOWrapper can't be used with binary transform codecs like bz2 because the types are wrong.

By contrast, codecs.open() still defaults to working in binary mode, and just switches to returning a different type based on the specified encoding (exactly the kind of value-driven output type changes we're trying to eliminate from the core text model):

>>> import codecs
>>> print(codecs.open('hex.txt').read())
>>> print(codecs.open('hex.txt', encoding='hex').read())
>>> print(codecs.open('hex.txt', encoding='utf-8').read())

For 3.4, I plan to just extend the issue 19619 blacklist to also cover TextIOWrapper (and hence open()). Even so, it seems to me that there is a valid use case for bytes-to-bytes transform support directly in the IO stack.
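
As a rough illustration of the distinction involved (assuming Python 3.4 or later, where the issue 19619 blacklist is in place): the type-neutral codecs.encode() and codecs.decode() functions accept a bytes-to-bytes transform such as hex, while the text-model convenience method bytes.decode() rejects it. The variable names are only for illustration.

```python
import codecs

# The type-neutral functions accept binary transforms: bytes in, bytes out.
hex_bytes = codecs.encode(b"hello", "hex")
round_trip = codecs.decode(hex_bytes, "hex")

# The text-model method refuses non-text codecs (behaviour of 3.4+).
try:
    b"68656c6c6f".decode("hex")
except LookupError:
    rejected = True
else:
    rejected = False

print(hex_bytes, round_trip, rejected)
```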

A PEP for 3.5 could propose:

- providing a public API that allows codecs to be classified into at least the following groups ("binary" = memoryview compatible data exporters, including both bytes and bytearray):
  - text encodings (decodes binary to str, encodes str to bytes)
  - binary transforms (decodes *and* encodes binary to bytes)
  - text transforms (decodes and encodes str to str)
  - hybrid transforms (acts as both a binary transform *and* as a text transform)
  - hybrid encodings (decodes binary and potentially str to str, encodes binary and str to bytes)
  - arbitrary encodings (decodes and encodes object to object, without fitting any of the above categories)
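
No such public classification API exists today, so the following is only a sketch of what one might look like for the first three groups. CodecKind and classify_codec are hypothetical names. The sketch leans on the private CodecInfo._is_text_encoding flag (an implementation detail of the 3.4 blacklist, not a supported API) and then probes the codec with small sample inputs, which is an assumption of this example rather than anything the proposal specifies.

```python
import codecs
from enum import Enum


class CodecKind(Enum):
    # Hypothetical category names mirroring part of the list above.
    TEXT_ENCODING = "text encoding"
    BINARY_TRANSFORM = "binary transform"
    TEXT_TRANSFORM = "text transform"
    UNKNOWN = "unknown"


def classify_codec(name):
    """Best-effort classification of a registered codec (illustrative only)."""
    info = codecs.lookup(name)
    # Private flag; assumed True when absent, as for ordinary text encodings.
    if getattr(info, "_is_text_encoding", True):
        return CodecKind.TEXT_ENCODING
    # Probe: does encoding bytes produce bytes?
    try:
        if isinstance(info.encode(b"ab")[0], bytes):
            return CodecKind.BINARY_TRANSFORM
    except Exception:
        pass
    # Probe: does encoding str produce str?
    try:
        if isinstance(info.encode("ab")[0], str):
            return CodecKind.TEXT_TRANSFORM
    except Exception:
        pass
    return CodecKind.UNKNOWN


for codec_name in ("utf-8", "hex", "rot_13"):
    print(codec_name, classify_codec(codec_name))
```

Probing with sample data cannot distinguish the hybrid groups reliably, which is part of why an explicit, declared classification would be preferable.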

- adding io.BinaryTransformWrapper that applies binary transforms when reading and writing data (similar to the way TextIOWrapper applies text encodings)

- adding a "transform" parameter to open that inserts BinaryTransformWrapper into the stack at the appropriate place (the PEP process would need to decide between supporting just a single transform per stream or multiple). In text mode, TextIOWrapper would be added to the stack after any binary transforms.
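
To make the wrapper idea a little more concrete, here is a minimal sketch built on the existing incremental encoder/decoder machinery in the codecs module. It is not a proposed implementation: the class below only borrows the name from this issue, the finish() method is an invention of this example, and it does not implement the full io.BufferedIOBase interface (so TextIOWrapper could not be stacked on it as written).

```python
import codecs
import io


class BinaryTransformWrapper:
    """Illustrative sketch: apply a bytes-to-bytes codec around a binary stream."""

    def __init__(self, raw, transform):
        self._raw = raw
        info = codecs.lookup(transform)
        self._encoder = info.incrementalencoder()
        self._decoder = info.incrementaldecoder()

    def write(self, data):
        # Encode incrementally; the codec may buffer output internally.
        self._raw.write(self._encoder.encode(data))
        return len(data)

    def finish(self):
        # Hypothetical helper: flush any data the encoder is still holding.
        self._raw.write(self._encoder.encode(b"", final=True))

    def read(self):
        # Read everything that remains and decode it in one final step.
        return self._decoder.decode(self._raw.read(), final=True)


# Usage: compress on write, decompress on read, through the same codec name.
raw = io.BytesIO()
writer = BinaryTransformWrapper(raw, "zlib")
writer.write(b"hello ")
writer.write(b"world")
writer.finish()

stored = raw.getvalue()
raw.seek(0)
reader = BinaryTransformWrapper(raw, "zlib")
restored = reader.read()
print(restored)
```

A real wrapper would also need sized reads, seeking policy, and close semantics, which are exactly the kinds of details a PEP would have to settle.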

Optionally, the idea could also be extended to adding io.TextTransformWrapper and a "text_transform" parameter, but those seem somewhat less useful.

components: IO, Interpreter Core, Library (Lib)
messages: 209398
nosy: benjamin.peterson, ezio.melotti, haypo, hynek, lemburg, ncoghlan, pitrou, serhiy.storchaka, stutzbach
priority: normal
severity: normal
stage: needs patch
status: open
title: Add io.BinaryTransformWrapper and a "transform" parameter to open()
type: enhancement
versions: Python 3.5

Python tracker <report at bugs.python.org>
