[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.
Stephen J. Turnbull
stephen at xemacs.org
Tue May 31 11:08:06 CEST 2011
Greg Ewing writes:
> Stephen J. Turnbull wrote:
>
> > But even as a separate type, 'ascii' still can't mix with bytes
> > safely,
>
> Yes, it can, because it's also bytes. :-)

To the extent that's safe, you may as well just use str and force
encoding with the ascii codec and strict errors (as I suggested
earlier). AFAICS, the argument that the visual signal of the special
literal syntax helps is bogus. It doesn't help with variables;
variables aren't typed in Python. And it's still just as possible to
type a'äëïöü', although the prefix might make the mistake a little
more visible.
And in most cases, the use case for this feature will be very
stylized, with a very small vocabulary of ASCII puns, written as
literals at the point of combination with a bytes object. Anything
else I can think of should be handled as text, via conversion to str.
I just don't see a use case for an 'ascii' type, vs. coercing str to
bytes and raising an error if the str is not all-ASCII.
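
A minimal sketch of that coercion, for concreteness. The helper name
ascii_bytes and the header text are made up for illustration; only
str.encode with the 'ascii' codec and strict errors is real API.

```python
def ascii_bytes(text):
    """Encode a str to bytes, refusing anything outside ASCII.

    Hypothetical helper, not an existing stdlib function.
    """
    # errors="strict" is already the default; it is spelled out here
    # only to make the intent visible.
    return text.encode("ascii", errors="strict")


# The stylized case: a short ASCII literal combined with a bytes object.
header = ascii_bytes("Content-Length: ") + b"42"

# A non-ASCII str is rejected at the point of conversion, whether it
# came from a literal or from a variable.
try:
    ascii_bytes("äëïöü")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```

The check happens on the value at conversion time, so it covers
variables as well as literals, which a special literal prefix cannot.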

> If you're using the special ascii type at all, rather
> than an ordinary str, it's precisely because you want
> to mix it with bytes. Making that part hard would
> defeat the purpose,

Indeed. Most alleged use cases for "mixing" *should* be made hard to
do by operating on bytes directly. Cf. the mixed-encoding log file
example.