Can't print Chinese to HTTP

Lie Ryan lie.1296 at gmail.com
Sat Dec 5 05:54:51 EST 2009


On 12/5/2009 2:57 PM, Gnarlodious wrote:
> On Dec 1, 3:06 pm, Terry Reedy wrote:
>> def print(s): return sys.stdout.buffer.write(s.encode('utf-8'))
>
> Here is a better solution that lets me send any string to the
> function:
>
> def print(html): return sys.stdout.buffer.write(("Content-type:text/plain;charset=utf-8\n\n"+html).encode('utf-8'))

No, that's wrong. You're serving HTML with Content-type: text/plain; it
should be text/html or application/xhtml+xml (the latter is technically
correct, but some older browsers have problems with it).
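
A minimal sketch of the corrected function, assuming the script runs as CGI and the body really is HTML. The names build_response and send are made up for illustration; the header is built separately only so that it can be checked without touching stdout.

```python
import sys


def build_response(html: str) -> bytes:
    """Return the CGI header plus the HTML body, encoded as UTF-8 bytes."""
    # text/html instead of text/plain, with the charset stated explicitly
    header = "Content-Type: text/html; charset=utf-8\n\n"
    return (header + html).encode("utf-8")


def send(html: str) -> int:
    """Write the encoded response to stdout's underlying binary buffer."""
    # Writing bytes to sys.stdout.buffer bypasses stdout's own text encoding.
    written = sys.stdout.buffer.write(build_response(html))
    sys.stdout.buffer.flush()
    return written
```

Calling send("<p>晉</p>") would then emit the header once, followed by the UTF-8 encoded body.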

> Why this changed in Python 3 I do not know, nor why it was nowhere to
> be found on the internet.
>
> Can anyone explain it?

Python 3's str() is what was Python 2's unicode().
Python 2's str() turned into Python 3's bytes().

Python 3's print() now takes a Unicode string, which is the regular str, 
and encodes it using the encoding of sys.stdout.
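
To make the str/bytes split concrete, here is a small illustration in Python 3 (the variable names are arbitrary):

```python
text = "晉"                      # str: a sequence of Unicode code points
data = text.encode("utf-8")      # bytes: the encoded form that goes over the wire

print(type(text).__name__, len(text))   # one character
print(type(data).__name__, len(data))   # three bytes in UTF-8
print(data.decode("utf-8") == text)     # decoding gives back the same str
```

The character count and the byte count differ because the encoding step is where one code point becomes several bytes.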

Because str is now Unicode, a simple print('晉') should work flawlessly 
if your terminal can accept the character; the problem is that your 
terminal does not.

The correct fix is to fix your terminal's encoding.
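
The failure can be reproduced without a real terminal. The sketch below uses an in-memory stream as a stand-in for a stdout with a given encoding; try_print is a made-up helper, and the ASCII stream is only an assumption about what a misconfigured environment looks like.

```python
import io


def try_print(text, encoding):
    """Print text to an in-memory stream that imitates stdout with `encoding`.

    Returns the bytes written, or None if the text cannot be encoded.
    """
    raw = io.BytesIO()
    # newline="\n" disables newline translation so the result is the same on every OS
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        return None
    return raw.getvalue()


print(try_print("晉", "ascii"))   # encoding fails, so None
print(try_print("晉", "utf-8"))   # the UTF-8 bytes plus a newline
```

To see what your own environment uses, check sys.stdout.encoding.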

In Windows, due to the prompt's poor support for Unicode, the only real 
solution is to switch to a better terminal.

Another workaround is to use a real file:

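import sys
# open a real file whose encoding we control
f = open('afile.html', 'w', encoding='utf-8')
print("晉", file=f)   # either pass the file to print() explicitly
sys.stdout = f         # or replace sys.stdout so every later print() uses it
print("晉")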

Slightly better is to rewrap the buffer with io.TextIOWrapper:

import sys, io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding="utf-8")
print("晉")
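
A hedged variation on the same idea: on Python 3.7 and later, text streams have a reconfigure() method that changes the encoding in place, so no new wrapper is needed. The helper name as_utf8 is made up for this sketch, and it assumes the stream is either a TextIOWrapper or at least exposes a .buffer attribute.

```python
import io
import sys


def as_utf8(stream):
    """Return a text stream that writes UTF-8 to the same underlying buffer."""
    if hasattr(stream, "reconfigure"):
        # Python 3.7+: switch the encoding of the existing wrapper in place
        stream.reconfigure(encoding="utf-8")
        return stream
    # Older versions: wrap the binary buffer again, as in the snippet above
    return io.TextIOWrapper(stream.buffer, encoding="utf-8")


if __name__ == "__main__":
    sys.stdout = as_utf8(sys.stdout)
    print("晉")
```

Either branch leaves print() untouched and changes only how the text is encoded on its way out.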


