Unicode Newbie

Gerhard Häring gh at ghaering.de
Tue Sep 9 16:49:20 CEST 2003


Manuel Huesser wrote:
> The unicode function implies that you only can use 2 ** 16 chars
> (unichr supports only this range) but with a given encoding e.g.
> unicode(",,,", "utf-8") i should be able to encode
> up to 2** 31 chars.
> 
> "\xfc\x12\x12\x12\x12\x12\x12" is an example for a 7
> byte utf-8 string. But on encoding i get the following
> error:
> 
> UTF-8 decoding error: unsupported Unicode code range
> 
> Is there any possibility to do the job?

You can try compiling Python with --enable-unicode=ucs4.
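
To see which kind of build you are running, you can look at
sys.maxunicode. A minimal sketch (the name is_wide_build is only for
illustration, not anything Python defines):

```python
import sys

# sys.maxunicode is 0xFFFF on a narrow (UCS-2) build and 0x10FFFF on a
# UCS-4 build; Python 3.3 and later always report 0x10FFFF.
is_wide_build = sys.maxunicode > 0xFFFF
print(hex(sys.maxunicode), is_wide_build)
```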

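But just because all characters map to a 0 .. 2^31 interval doesn't mean
that there is a defined character for every number in the interval. So
you'll still get decoding errors when you throw random bytestrings at
the unicode function. Your example is also not well-formed UTF-8: the
lead byte \xfc announces a six-byte sequence, and every byte after it
would have to lie in \x80 .. \xbf, which \x12 does not.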
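
As a rough illustration of the point, here is a sketch that assumes a
current Python 3 interpreter, where bytes.decode takes the place of the
unicode function and str always covers the full range. The helper
try_decode is a made-up name, not a library function:

```python
def try_decode(raw):
    """Return the decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None

# The byte string from the question is rejected by the decoder.
bad = try_decode(b"\xfc\x12\x12\x12\x12\x12\x12")

# A well-formed 4-byte sequence for U+10FFFF, the highest code point.
good = try_decode(b"\xf4\x8f\xbf\xbf")

# One step past the Unicode range is refused, although the number
# would still fit comfortably in 31 bits.
try:
    chr(0x110000)
    beyond_range_ok = True
except ValueError:
    beyond_range_ok = False
```

Here bad comes out as None, good is a single character, and
beyond_range_ok is False.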

-- Gerhard




