[Python-Dev] bytes type discussion
"Martin v. Löwis"
martin at v.loewis.de
Wed Feb 15 02:11:24 CET 2006
Raymond Hettinger wrote:
>>- bytes("abc") == bytes(map(ord, "abc"))
>
>
> At first glance, this seems obvious and necessary, so if it's somewhat
> controversial, then I'm missing something. What's the issue?
There is an "implicit Latin-1" assumption in that code. Suppose
you do
    # -*- coding: koi8-r -*-
    print bytes("Гвидо ван Россум")
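in Python 2.x, then this means something (*): the string literal holds
the KOI8-R bytes from the source file. In Python 3, it gives you an
exception, as the string is now Unicode and the ordinals of its
characters are suddenly above 255.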
Or, perhaps worse, the code
    # -*- coding: utf-8 -*-
    print bytes("Martin v. Löwis")
will work in 2.x and 3.x, but produce different numbers (**).
Regards,
Martin
(*) [231, 215, 201, 196, 207, 32, 215, 193, 206, 32, 242, 207, 211, 211,
213, 205]
(**) In 2.x, this will give
[77, 97, 114, 116, 105, 110, 32, 118, 46, 32, 76, 195, 182, 119, 105, 115]
whereas in 3.x, it will give
[77, 97, 114, 116, 105, 110, 32, 118, 46, 32, 76, 246, 119, 105, 115]
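
The two footnotes can be reproduced with a minimal sketch, assuming a
present-day Python 3 interpreter rather than the API under discussion.
There, bytes() on a str without an encoding raises a TypeError, so the
sketch uses bytes(map(ord, ...)) to model the proposed equivalence and
str.encode() to model what a 2.x byte string would hold. The variable
names are illustrative only.

```python
# Sketch for present-day Python 3 (str is Unicode; source file is UTF-8).
cyrillic = "Гвидо ван Россум"
latin = "Martin v. Löwis"

# What a 2.x string literal holds under a koi8-r coding declaration:
# the raw KOI8-R bytes of the source text.
koi8_values = list(cyrillic.encode("koi8-r"))

# The "implicit Latin-1" reading maps each character to its ordinal.
# Cyrillic code points are far above 255, so this cannot produce bytes.
try:
    bytes(map(ord, cyrillic))
    cyrillic_fits = True
except ValueError:
    cyrillic_fits = False

# What a 2.x string literal holds under a utf-8 coding declaration:
# "ö" becomes the two bytes 195, 182.
utf8_values = list(latin.encode("utf-8"))

# The ordinal mapping on the Unicode string: "ö" becomes the single
# value 246, which is the same as encoding to Latin-1.
ordinal_values = list(bytes(map(ord, latin)))

print(koi8_values)
print(cyrillic_fits)
print(utf8_values)
print(ordinal_values)
```

The last two lists differ in length as well as content, which is the
"works in both, but produces different numbers" case.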