[Python-ideas] RFC: bytestring as a str representation [was: a new bytestring type?]
Greg Ewing
greg.ewing at canterbury.ac.nz
Thu Jan 9 05:44:44 CET 2014
Stephen J. Turnbull wrote:
> No, it doesn't. It means 'abc' followed by something that cannot be
> encoded by any codec without the surrogateescape handler.
> 'ascii-compatible' merely defaults to that handler. I wouldn't
> actually be too upset if I were told, no, you have to specify
> explicitly.

If I understand correctly, your intention is that
61 62 63 FF in this representation would simply be
a more compact version of 0061 0062 0063 DCFF,
with exactly the same semantics.
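
For reference, the wide form of that example can be reproduced with the surrogateescape error handler that Python 3 already has. This is only a sketch of the code points involved; the compact 8-bit form under discussion does not exist, so nothing here shows how it would be stored.

```python
# The four bytes from the example: 61 62 63 FF.
raw = bytes([0x61, 0x62, 0x63, 0xFF])

# Decoding as ASCII with surrogateescape maps the undecodable byte 0xFF
# to the lone surrogate U+DCFF; the ASCII bytes decode as usual.
text = raw.decode("ascii", errors="surrogateescape")
code_points = [ord(ch) for ch in text]

# Encoding with the same handler restores the original bytes exactly.
round_trip = text.encode("ascii", errors="surrogateescape")

print([f"{cp:04X}" for cp in code_points])
print(round_trip == raw)
```

The compact form would carry the same information in one byte per code point.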

If that's right, then maybe something like "compressed
surrogateescape" or "8-bit surrogateescape" would be
a better name for it?

Also, it could be produced automatically where
possible by any decoding operation that specified
surrogateescape -- there wouldn't have to be a
dedicated encoding name for it (although there
could be for convenience).
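
A rough sketch of what "where possible" might mean for a decoder: the result qualifies when every code point is either ASCII or an escaped byte in U+DC80..U+DCFF. The helper names `fits_8bit_surrogateescape` and `pack_8bit` are made up for illustration, and the one-byte-per-code-point layout is an assumption about how such a representation could look, not anything CPython does.

```python
def fits_8bit_surrogateescape(s: str) -> bool:
    """Hypothetical check: every code point is ASCII or an escaped byte."""
    return all(ord(ch) < 0x80 or 0xDC80 <= ord(ch) <= 0xDCFF for ch in s)


def pack_8bit(s: str) -> bytes:
    """Hypothetical packing: one byte per code point, U+DCxx stored as xx."""
    if not fits_8bit_surrogateescape(s):
        raise ValueError("string needs a wider representation")
    # ASCII code points are unchanged; U+DC80..U+DCFF keep their low byte.
    return bytes(ord(ch) & 0xFF for ch in s)


# Any decode that specifies surrogateescape could yield a qualifying string.
escaped = b"abc\xff".decode("utf-8", errors="surrogateescape")
accented = b"caf\xc3\xa9".decode("utf-8", errors="surrogateescape")

print(fits_8bit_surrogateescape(escaped))   # ASCII plus one escaped byte
print(fits_8bit_surrogateescape(accented))  # contains U+00E9, so it does not fit
print(pack_8bit(escaped))
```

For qualifying strings, `pack_8bit` gives the same bytes as `s.encode("ascii", "surrogateescape")`, which is what makes the two forms interchangeable.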

It could also potentially be produced by any
slicing or other string operations that resulted
in characters within the appropriate ranges,
just like any of the other internal representations.
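
The slicing case can be illustrated the same way. In this sketch the full string contains a non-ASCII character and so would need a wider form, while one of its slices falls entirely within the ASCII and U+DC80..U+DCFF ranges. The `in_compact_range` helper is hypothetical and only tests the range condition; whether an implementation would actually narrow the slice is an assumption.

```python
def in_compact_range(s: str) -> bool:
    """Hypothetical test: only ASCII or escaped-byte surrogates present."""
    return all(ord(ch) < 0x80 or 0xDC80 <= ord(ch) <= 0xDCFF for ch in s)


# Valid UTF-8 for "café ", followed by two bytes that are never valid UTF-8.
mixed = b"caf\xc3\xa9 \xff\xfe".decode("utf-8", errors="surrogateescape")

# The slice after the accented character holds a space and two escaped bytes.
tail = mixed[4:]

print(in_compact_range(mixed))  # the U+00E9 character rules this out
print(in_compact_range(tail))   # the slice alone qualifies
print(tail.encode("ascii", errors="surrogateescape"))
```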
--
Greg