encoding of sys.argv ?
"Martin v. Löwis"
martin at v.loewis.de
Tue Oct 24 01:05:33 CEST 2006
> I use a Linux box, with French UTF-8 locales and a UTF-8 filesystem.
> sys.getdefaultencoding() is "ascii" and sys.getfilesystemencoding()
> is "utf-8". However, sys.argv is neither in ASCII (since I can pass
> French accented characters), nor in UTF-8. It seems to be encoded
> in "latin-1", but why?
Let me second Leo Kislov's analysis. The arguments should be encoded
in locale.getpreferredencoding(), which should be UTF-8. Are you
*sure* they aren't encoded in this way?
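
One way to check this is sketched below. It assumes a current Python 3 on a POSIX system, where sys.argv is already decoded and os.fsencode() recovers the bytes the shell passed in (in the Python 2 discussed in this thread, sys.argv holds those byte strings directly). The helper names raw_argv and decodes_as are made up for this illustration.

```python
import locale
import os
import sys


def raw_argv():
    """Return the command-line arguments as bytes (hypothetical helper)."""
    # Python 3 decodes argv with the filesystem encoding;
    # os.fsencode() reverses that decoding on POSIX systems.
    return [os.fsencode(arg) for arg in sys.argv]


def decodes_as(raw, encoding):
    """Return True if the byte string is valid in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    preferred = locale.getpreferredencoding()
    print("preferred encoding:", preferred)
    # Skip argv[0], the script name, and report on each real argument.
    for raw in raw_argv()[1:]:
        print(repr(raw), "valid in", preferred, "->", decodes_as(raw, preferred))
```

If an argument containing an accented character is reported as not valid in the preferred encoding, the bytes reaching Python are not in the encoding the locale announces.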
On my Debian system, I get this:
martin at mira:~/tmp$ echo $LANG
martin at mira:~/tmp$ cat a.py
martin at mira:~/tmp$ python a.py Martin v. Löwis
['a.py', 'Martin', 'v.', 'L\xc3\xb6wis']
So clearly, my terminal application and shell pass them as UTF-8,
as they should. The terminal application is KDE konsole; the shell
is bash. The shell *pretty likely* passes the arguments "through"
as read from the terminal, so if you are not seeing UTF-8, you
have managed to misconfigure your terminal.
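
For comparison, here is a small sketch of what the two encodings look like at the byte level, written for Python 3 (an assumption; the session above is Python 2). The variable names are illustrative only.

```python
# The same name encoded two ways; "\u00f6" is the letter o with diaeresis.
name = "L\u00f6wis"

utf8_bytes = name.encode("utf-8")
latin1_bytes = name.encode("latin-1")

print(utf8_bytes)    # b'L\xc3\xb6wis' (two bytes for the accented letter)
print(latin1_bytes)  # b'L\xf6wis' (one byte for the accented letter)
```

So a terminal sending UTF-8 produces the two-byte sequence \xc3\xb6 seen in the output above, while a terminal set to latin-1 would produce the single byte \xf6.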