[Tutor] (no subject)
eryk sun
eryksun at gmail.com
Mon Jul 25 07:19:46 EDT 2016
On Fri, Jul 22, 2016 at 7:38 AM, DiliupG <diliupg at gmail.com> wrote:
> I am using Python 2.7.12 on Windows 10
>
> filename = u"මේක තියෙන්නේ සිංහලෙන්.txt"
> Unsupported characters in input
That error message is from IDLE. I'm not an expert with IDLE, so I
don't know what the following hack might break, but it does allow
entering the above Unicode filename in the interactive interpreter.
Edit "Python27\Lib\idlelib\IOBinding.py". Look for the following
section on line 34:
    if sys.platform == 'win32':
        # On Windows, we could use "mbcs". However, to give the user
        # a portable encoding name, we need to find the code page
        try:
            encoding = locale.getdefaultlocale()[1]
            codecs.lookup(encoding)
        except LookupError:
            pass
Replace the encoding value with "utf-8" as follows:
            # encoding = locale.getdefaultlocale()[1]
            encoding = "utf-8"
When you restart IDLE, you should see that sys.stdin.encoding is now "utf-8".
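One way to confirm the change took effect is to ask the stream what encoding it reports. This is only a small sketch: stream_encoding is a helper name I made up for illustration, and the value printed depends on where you run it (IDLE, a console, or a redirected stream).

```python
import sys

def stream_encoding(stream):
    """Return the encoding a stream reports, or None if it reports none."""
    # Hypothetical helper: not every stream object has an "encoding"
    # attribute, so fall back to None instead of raising.
    return getattr(stream, "encoding", None)

# In the patched IDLE shell this should show the new value; elsewhere
# it shows whatever that environment uses.
print(stream_encoding(sys.stdin))
```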
IOBinding.encoding is used by ModifiedInterpreter.runsource in
PyShell.py. When the encoding is UTF-8, it passes the Unicode source
string directly to InteractiveInterpreter.runsource, where it gets
compiled using the built-in compile() function.
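To see that last step in isolation, here is a rough sketch that hands a Unicode source string straight to code.InteractiveInterpreter.runsource outside of IDLE. It is not IDLE's own code path, just the standard library class that ModifiedInterpreter builds on; the namespace dict and variable names are mine.

```python
# -*- coding: utf-8 -*-
import code

# Use our own dict as the interpreter's namespace so we can inspect it.
namespace = {}
interp = code.InteractiveInterpreter(namespace)

# A Unicode source string, like the one IDLE would pass along.
source = u'filename = u"මේක තියෙන්නේ සිංහලෙන්.txt"'

# runsource() returns True only when the input is incomplete and more
# lines are needed; a complete statement is compiled and executed.
needs_more = interp.runsource(source)

print(needs_more)
print(repr(namespace.get("filename")))
```

If the source compiles and runs, the assigned name shows up in the namespace dict as a Unicode string.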
Note that IDLE uses the Tk GUI toolkit, which -- at least with Python
Tkinter on Windows -- is limited to the first 65,536 Unicode code
points, i.e. the Basic Multilingual Plane. The BMP includes the
Sinhala script, so your filename string is fine.
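If you want to check a string against that limit yourself, a quick sketch follows. The function name outside_bmp is made up for this example. It also flags surrogate code units, because on a narrow build (such as Python 2.7 on Windows) a non-BMP character is stored as a surrogate pair rather than as a single code point above 0xFFFF.

```python
# -*- coding: utf-8 -*-

def outside_bmp(text):
    """Return the code units of text that are not plain BMP characters."""
    found = []
    for ch in text:
        cp = ord(ch)
        # Above U+FFFF: a non-BMP character on a wide build or Python 3.
        # U+D800..U+DFFF: half of a surrogate pair on a narrow build.
        if cp > 0xFFFF or 0xD800 <= cp <= 0xDFFF:
            found.append(ch)
    return found

filename = u"මේක තියෙන්නේ සිංහලෙන්.txt"
print(len(outside_bmp(filename)))
```

An empty result means every character in the string is in the BMP, so Tk should be able to handle it.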