[Python-Dev] bytes type discussion

Adam Olsen rhamph at gmail.com
Wed Feb 15 05:41:02 CET 2006


On 2/14/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Raymond Hettinger wrote:
> >>- bytes("abc") == bytes(map(ord, "abc"))
> >
> >
> > At first glance, this seems obvious and necessary, so if it's somewhat
> > controversial, then I'm missing something.  What's the issue?
>
> There is an "implicit Latin-1" assumption in that code. Suppose
> you do
>
> # -*- coding: koi-8r -*-
> print bytes("Гвидо ван Россум")
>
> in Python 2.x, then this means something (*). In Python 3, it gives
> you an exception, as the ordinals of this are suddenly above 256.
>
> Or, perhaps worse, the code
>
> # -*- coding: utf-8 -*-
> print bytes("Martin v. Löwis")
>
> will work in 2.x and 3.x, but produce different numbers (**).

My assumption is that both of these would become errors in 3.x.
bytes(str) is only needed so that you can write
bytes(u"abc".encode('utf-8')) and have it work in both 2.x and 3.x.
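
To illustrate the point, here is a minimal sketch. It assumes the Python 3 behaviour that eventually shipped, which had not been decided when this thread was written. The variable names are made up for the example. It shows bytes(str) without an encoding being rejected, the explicit encode() spelling working, and the codec choice changing the result.

```python
# Assumption: this runs on Python 3 as eventually released, where str is
# always Unicode. It is not a statement about the proposal under discussion.
text = "Martin v. Löwis"

# bytes(str) with no encoding is rejected rather than guessing a codec.
try:
    bytes(text)
    implicit_ok = True
except TypeError:
    implicit_ok = False

# The explicit spelling works, and the bytes depend on the codec chosen.
utf8_bytes = bytes(text.encode("utf-8"))
latin1_bytes = bytes(text.encode("latin-1"))

# The "implicit Latin-1" route fails once an ordinal exceeds 255.
try:
    bytes(map(ord, "Гвидо"))
    ordinals_ok = True
except ValueError:
    ordinals_ok = False

print(implicit_ok, ordinals_ok, len(utf8_bytes), len(latin1_bytes))
```

For this string, bytes(map(ord, text)) happens to equal the Latin-1 encoding, which is exactly the hidden assumption Martin describes. The UTF-8 result is one byte longer because "ö" takes two bytes there.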

(I wonder whether they should be errors in 2.x as well.  The source
encoding is meant for unicode literals, not str literals.)

--
Adam Olsen, aka Rhamphoryncus

