How to pass Chinese characters as command-line arguments?

Sun Jan 31 20:35:51 CET 2010

>> I want to pass Chinese characters as command-line arguments to a
>> Python script.  My terminal has no problem displaying these
>> characters, and passing them to the script, but I can't get Python
>> to understand them properly.
>> E.g. if I pass one such character to the simple script
>> import sys
>> print sys.argv[1]
>> print type(sys.argv[1])
>> the first line of the output looks fine (identical to the input),
>> but the second line says "<type 'str'>".  If I add the line
>> arg = unicode(sys.argv[1])
>> I get the error
>> Traceback (most recent call last):
>>    File "", line 4, in <module>
>>      arg = unicode(sys.argv[1])
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 0: ordinal not in range(128)
>> What must I do to get Python to recognize command-line arguments
>> as utf-8 Unicode?

>The last sentence reveals your problem: UTF-8 is *not* Unicode. It's an
>encoding of Unicode, which is a crucial difference.

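>From the outside you get byte strings. Called without an encoding,
>unicode() falls back to the default ASCII codec, which is why byte 0xe8
>fails. If the bytes happen to be encoded in UTF-8, you can simply decode
>them: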

>arg = unicode(sys.argv[1], "utf-8")
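
A minimal sketch of the same idea, written so that it should run unchanged on Python 2 and Python 3. The helper name decode_arg is hypothetical, and UTF-8 is only an assumption about the terminal's encoding; if the terminal uses something else, pass that encoding instead. The sample bytes start with 0xe8, like the byte in the traceback above.

```python
# -*- coding: utf-8 -*-
import sys


def decode_arg(raw, encoding="utf-8"):
    """Return text for one argument (decode_arg is a hypothetical helper).

    On Python 2, sys.argv holds byte strings, so they are decoded here.
    On Python 3, sys.argv already holds text and is returned unchanged.
    The default encoding is an assumption about the terminal.
    """
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# Three bytes that encode a single Chinese character (U+8BED) in UTF-8.
raw = b"\xe8\xaf\xad"
text = decode_arg(raw)

# Three bytes before decoding, one character after.
byte_count = len(raw)
char_count = len(text)

if __name__ == "__main__":
    # Decode whatever was passed on the command line and report lengths.
    for arg in sys.argv[1:]:
        print(len(decode_arg(arg)))
```

The point of the length comparison is that the byte string and the decoded text are different objects: the first counts bytes, the second counts characters.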



