[Pythonmac-SIG] Getting Terminal input and output encoding
Nir Soffer
nirs at freeshell.org
Thu Feb 16 14:00:56 CET 2006
I'm trying to find a way to get the encoding the user's environment
uses, for example for command line arguments:
# creating a Hebrew file name...
touch עברית
./foo.py *
From my experience with Mac OS X 10.0-3, I know that foo.py will always
get the Hebrew name encoded as UTF-8.
You can also see this when you type non-ASCII characters in the Terminal:
$ touch \327\242\327\221\327\250\327\231\327\252
This creates the file named "עברית".
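As a quick check, the octal escapes can be decoded in Python. This is only a sketch: the variable names are mine, and the byte string is copied from the touch command above.

```python
# -*- coding: utf-8 -*-
# The octal escapes from the touch command, written as a byte string.
raw = b"\327\242\327\221\327\250\327\231\327\252"

# Decoding as UTF-8 yields five code points, the Hebrew letters of the
# file name (ayin, bet, resh, yod, tav).
name = raw.decode("utf-8")
code_points = ["U+%04X" % ord(ch) for ch in name]
print(code_points)
```

Each pair of escapes (for example \327\242, the bytes 0xD7 0xA2) is one two-byte UTF-8 sequence, which is consistent with the Terminal sending UTF-8.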
I noticed that it does not matter what encoding you set in the Terminal
window settings: anything you type is encoded as UTF-8.
Anyway, I could not find any documentation about this issue, except
this:
"All BSD system functions expect their string parameters to be in
UTF-8 encoding
and nothing else. Code that calls BSD system routines should ensure
that the contents of all const *char parameters are in canonical UTF-8
encoding."
<http://developer.apple.com/documentation/MacOSX/Conceptual/BPInternational/Articles/FileEncodings.html#//apple_ref/doc/uid/20002137-DontLinkElementID_4>
On Linux, people get the encoding with:
import locale
locale.getpreferredencoding()
But on OS X getpreferredencoding() returns useless results, at least
for decoding command line arguments or printing readable output. For
example:
1. Choose "Window Settings..." in the Terminal and set the Character
Set Encoding to Unicode (UTF-8)
2. Try:
>>> import locale
>>> locale.getpreferredencoding()
'mac-roman'
I found this code, which tries to correct the behavior (from bzrlib):
# work around egregious python 2.4 bug
>>> import sys
>>> sys.platform = 'posix'
>>> import locale
>>> locale.getpreferredencoding()
'US-ASCII'
>>> sys.platform = 'darwin'
Obviously this workaround does not work around this problem :-)
So my conclusion is that Mac OS X always uses UTF-8 for input to the
shell. Or am I missing something?
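If that conclusion holds, decoding arguments could look roughly like the sketch below. The helper names guess_argument_encoding and decode_argument are hypothetical, and the 'darwin' branch simply hard-codes the UTF-8 assumption from my observations; it is not a documented guarantee.

```python
import locale
import sys


def guess_argument_encoding(platform=sys.platform):
    """Guess the encoding of command line arguments (hypothetical helper)."""
    if platform == "darwin":
        # Assumption: Mac OS X always passes arguments as UTF-8,
        # whatever getpreferredencoding() reports.
        return "utf-8"
    # Elsewhere, trust the locale, as people do on Linux.
    return locale.getpreferredencoding() or "ascii"


def decode_argument(raw, platform=sys.platform):
    """Decode one raw byte-string argument to a unicode string."""
    if not isinstance(raw, bytes):
        return raw  # already decoded, nothing to do
    # "replace" avoids an exception if the guess turns out to be wrong.
    return raw.decode(guess_argument_encoding(platform), "replace")
```

For example, decode_argument(b"\327\242\327\221\327\250\327\231\327\252", platform="darwin") should give back the Hebrew file name used above.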
Next, how can you get the Terminal output encoding? For example, what
if a user changed the Character Set Encoding to Western (Mac OS Roman)
- how can you detect this setting from Python?
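I do not know which value, if any, follows the Terminal setting. The sketch below only collects the candidates I would compare while changing the Character Set Encoding; output_encoding_candidates is a hypothetical helper, and none of these values is confirmed to reflect that setting.

```python
import locale
import os
import sys


def output_encoding_candidates():
    """Collect values that might hint at the terminal output encoding."""
    env = {}
    for key in ("LC_ALL", "LC_CTYPE", "LANG"):
        if key in os.environ:
            env[key] = os.environ[key]
    return {
        # Encoding Python attached to stdout; may be None when not a tty.
        "stdout": getattr(sys.stdout, "encoding", None),
        # What the locale module reports (the value that looked wrong above).
        "preferred": locale.getpreferredencoding(),
        # Locale-related environment variables, if any are set.
        "env": env,
    }


if __name__ == "__main__":
    for name, value in sorted(output_encoding_candidates().items()):
        print("%s: %r" % (name, value))
```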
Best Regards,
Nir Soffer