[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.

Stephen J. Turnbull stephen at xemacs.org
Tue May 31 11:08:06 CEST 2011


Greg Ewing writes:
 > Stephen J. Turnbull wrote:
 > 
 > > But even as a separate type, 'ascii' still can't mix with bytes
 > > safely,
 > 
 > Yes, it can, because it's also bytes. :-)

To the extent that's safe, you may as well just use str and force
encoding with the ascii codec and strict errors (as I suggested
earlier).  AFAICS, the argument that the visual signal of the special
literal syntax helps is bogus.  It doesn't help with variables;
variables aren't typed in Python.  It's still just as possible to type
a'äëïöü', although it might make the mistake a little more visible.
And in most cases, the use case for this feature will be very
stylized, with a very small vocabulary of ASCII puns, written as
literals at the point of combination with a bytes object.  Anything
else I can think of should be handled as text, via conversion to str.

I just don't see a use case for an 'ascii' type, vs. coercing str to
bytes and raising an error if the str is not all-ASCII.
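
A minimal sketch of the coercion I have in mind, using only str.encode
with the ascii codec and strict error handling.  The helper name
`ascii_bytes` is hypothetical, chosen for illustration; it is not an
existing or proposed API.

```python
def ascii_bytes(text):
    """Coerce a str to bytes; raise UnicodeEncodeError if it is not all-ASCII."""
    # 'strict' is the default error handler, spelled out here for emphasis.
    return text.encode("ascii", "strict")

# An ASCII "pun" written as a literal at the point of combination with bytes.
header = ascii_bytes("Content-Length: ") + b"42"

# Non-ASCII text is rejected instead of being silently mixed in.
try:
    ascii_bytes("äëïöü")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

The point of the sketch is that the check happens at the moment of
conversion, so it works equally well for literals and for variables.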

 > If you're using the special ascii type at all, rather
 > than an ordinary str, it's precisely because you want
 > to mix it with bytes. Making that part hard would
 > defeat the purpose,

Indeed.  Most alleged use cases for "mixing" *should* be hard to carry
out by operating on bytes directly.  Cf. the mixed-encoding log file
example.



