cp936 uses gbk codec, doesn't decode `\x80` as U+20AC EURO SIGN
Ulrich Eckhardt
eckhardt at satorlaser.com
Mon Oct 11 04:54:05 EDT 2010
John Machin wrote:
> |>>> '\x80'.decode('cp936')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 0: incomplete multibyte sequence
[...]
> So Microsoft appears to think that cp936 includes the euro, and the ICU
> project seem to think that GBK and cp936 both include the euro.
>
> A couple of questions:
>
> Is this a bug or a shrug?
Bug, IMHO.
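
Until that is sorted out, something along these lines might serve as a
workaround. It is only a rough sketch, written for Python 3 (where the input
has to be a bytes object), and the handler name 'cp936_euro' and the helper
decode_cp936() are made up for illustration; neither is part of Python. It
uses an error handler rather than a plain byte replacement because 0x80 is
also a legal trail byte in GBK, so blindly replacing it would corrupt valid
two-byte characters.

```python
import codecs

# 'cp936' is only an alias; the lookup resolves to the gbk codec.
print(codecs.lookup('cp936').name)  # gbk


def euro_for_0x80(exc):
    """Error handler: treat an undecodable byte 0x80 as U+20AC EURO SIGN."""
    if isinstance(exc, UnicodeDecodeError) and exc.object[exc.start] == 0x80:
        # Substitute the euro sign for that single byte and resume
        # decoding at the byte immediately after it.
        return ('\u20ac', exc.start + 1)
    # Anything else is a genuine decoding error, so pass it on unchanged.
    raise exc


# Hypothetical handler name, chosen for this example only.
codecs.register_error('cp936_euro', euro_for_0x80)


def decode_cp936(data):
    """Decode bytes as cp936, mapping a stray 0x80 to the euro sign."""
    return data.decode('cp936', 'cp936_euro')
```

With that in place, decode_cp936(b'\x80') should give u'\u20ac' instead of
raising, while all other input is decoded exactly as the gbk codec would.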
Uli
--
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932