cp936 uses gbk codec, doesn't decode `\x80` as U+20AC EURO SIGN

Ulrich Eckhardt eckhardt at satorlaser.com
Mon Oct 11 10:54:05 CEST 2010

John Machin wrote:
> |>>> '\x80'.decode('cp936')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 0: incomplete multibyte sequence
> So Microsoft appears to think that cp936 includes the euro, and the
> ICU project seem to think that GBK and cp936 both include the euro.
> A couple of questions:
> Is this a bug or a shrug?

Bug, IMHO.
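
Until that is sorted out, something along these lines might serve as a
workaround. This is only a minimal sketch, written for Python 3 (the session
quoted above is Python 2), and the handler name 'cp936_euro' and the helper
decode_cp936() are made up for illustration; they are not part of the standard
library. It relies on codecs.register_error() to map a byte 0x80 that the codec
rejects to U+20AC. Note that 0x80 is also a legal trail byte in GBK, so simply
replacing every 0x80 in the input before decoding would corrupt valid
two-byte characters.

```python
import codecs


def _euro_fallback(exc):
    """Error handler: treat an undecodable byte 0x80 as U+20AC EURO SIGN."""
    if isinstance(exc, UnicodeDecodeError):
        # exc.start is the index of the first byte the codec could not decode.
        if exc.object[exc.start:exc.start + 1] == b'\x80':
            # Emit the euro sign and resume decoding just after that byte.
            return ('\u20ac', exc.start + 1)
    # Anything else is still an error.
    raise exc


# 'cp936_euro' is a hypothetical name chosen for this sketch.
codecs.register_error('cp936_euro', _euro_fallback)


def decode_cp936(data):
    """Decode bytes as cp936, accepting a lone 0x80 as the euro sign."""
    return data.decode('cp936', errors='cp936_euro')
```

The handler is only consulted when the codec itself gives up, so a 0x80 that
is the second byte of a valid character is left alone, and if the codec ever
learns to decode 0x80 on its own, the handler simply never fires.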


Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
