[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.
Stephen J. Turnbull
stephen at xemacs.org
Tue May 31 07:51:47 CEST 2011
Greg Ewing writes:
> Stephen J. Turnbull wrote:
> > Greg Ewing writes:
> >
> > > How would ascii behave when mixed with unicode strings? Should it
> > > automatically coerce to unicode,
> >
> > Definitely not! Bytes are not text, and the programmer must say when
> > they want those bytes decoded.
>
> But the proposed 'ascii' type *is* text, though.

If it's intended that the 'ascii' type *be* text, I don't see the
point. It *is* Unicode (with a restricted range), and no coercion is
necessary between str and 'ascii', just a change of representation.
This can be done completely transparently[1], with no need for a new
type, except that some effort on the part of the implementer can be
saved by imposing ongoing annoyance on the application programmer.
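
A minimal sketch of the point above, using only today's built-in str (the
variable names are illustrative, not from any proposal): text restricted to
the ASCII range is already ordinary Unicode text, and encoding it only
changes the representation.

```python
text = "hello"  # ordinary str whose code points all fall in the ASCII range

# Every code point is below 128, so this is "ascii" text in the thread's sense.
is_ascii_range = all(ord(ch) < 128 for ch in text)

# Encoding changes the representation, not the content: the round trip is lossless.
encoded = text.encode("ascii")
round_tripped = encoded.decode("ascii")

# ASCII-range text mixes freely with any other str; no coercion step is involved.
mixed = text + " \u00e9"
```

Nothing here needs a separate type: the restriction to ASCII is a property
of the value, not of its class.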
But even as a separate type, 'ascii' still can't mix with bytes
safely, for the same reason that str can't mix with bytes: 'ascii' and
str have a known fixed encoding (Unicode), and bytes have an unknown,
variable encoding (possibly the non-encoding 'binary'). YAGNI...
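
As a rough illustration of why the two cannot mix safely (a sketch in
Python 3; the sample bytes and the choice of cp437 as the second codec are
arbitrary): the same bytes yield different text depending on which encoding
the programmer names, so the language refuses to guess.

```python
data = b"caf\xe9"  # raw bytes; nothing here records which encoding produced them

# Mixing str and bytes is refused rather than guessed at.
try:
    "abc" + data
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

# The programmer must name the encoding, and the result depends on that choice.
as_latin_1 = data.decode("latin_1")  # 0xE9 is U+00E9 (e with acute) in Latin-1
as_cp437 = data.decode("cp437")      # code page 437 maps 0xE9 to a different character
```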
Footnotes:
[1] For some use cases it might be useful to allow specifying the
representation in advance, as a micro-optimization.