[Python-Dev] unicode imports
Nick Coghlan
ncoghlan at gmail.com
Mon Jun 19 15:46:13 CEST 2006
Kristján V. Jónsson wrote:
> Funny that no other platforms could benefit from a unicode import path.
> Does that mean that windows will reign supreme? Please explain.
As near as I can tell, other platforms use encoded strings with the normal
(byte-based) posix file API, so the Python interpreter and the file system
simply need to agree on the encoding (typically utf-8) in order for both
filesystem access and importing from non-ASCII paths to work.
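To illustrate the "agree on the encoding" point, here is a minimal sketch in modern Python 3 syntax (not the 2.x interpreter under discussion). The path is made up for the example, and latin-1 simply stands in for "some other encoding" that one side might wrongly assume:

```python
# Hypothetical non-ASCII directory name, used only for illustration.
path = "/tmp/\u814c"

# What a byte-based POSIX call would receive if the interpreter uses UTF-8.
as_bytes = path.encode("utf-8")
print(as_bytes)

# If both sides agree on UTF-8, the name survives the round trip.
agreed = as_bytes.decode("utf-8")
print(agreed == path)

# If one side assumes a different encoding, the same bytes name something else.
mismatched = as_bytes.decode("latin-1")
print(mismatched == path)
```

The bytes themselves are never wrong; only the interpretation differs, which is why a shared encoding is enough on those platforms.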
On Windows, though, most of the file system interaction code has had to be
updated to use the wide-character API where possible. import.c is one of the
few holdouts that relies entirely on the byte-based posix API.
If I had to put money on what's currently happening on your test machine, it's
that import.c is trying to do u'c:/tmp/\u814c'.encode('mbcs'), getting
'c:/tmp/?' and proceeding to do nothing useful with that path entry. Checking
the result of sys.getfilesystemencoding() should be able to confirm that.
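A rough way to see the kind of loss I mean, written in Python 3 syntax so it runs anywhere. The 'mbcs' codec only exists on Windows, so this sketch assumes cp1252 as a stand-in for the ANSI code page and uses errors="replace" to imitate the substitution; neither is necessarily what your machine does:

```python
import sys

# The path entry from the example above.
path = "c:/tmp/\u814c"

# cp1252 is an assumed stand-in for the Windows ANSI code page ('mbcs').
# errors="replace" imitates the substitution of unencodable characters.
lossy = path.encode("cp1252", errors="replace")
print(lossy)

# The original character cannot be recovered from the encoded form.
print(lossy.decode("cp1252") == path)

# The encoding the interpreter actually uses for file names on this machine.
print(sys.getfilesystemencoding())
```

Once the character has been replaced, the resulting path no longer names the directory, so the sys.path entry is effectively dead.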
So it looks like it ain't really gonna work properly on Windows unless
import.c is rewritten to use the Unicode-aware, platform-independent IO
implementation in posixmodule.c.
Until that happens (hopefully by Python 2.6), I like MvL's suggestion - look
at the 8.3 DOS name on the command prompt and put that into sys.path. ctypes
and/or pywin32 should let you get at that information programmatically.
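For the ctypes route, something along these lines should work. It is a sketch, not tested against your setup: short_path_name is a helper name I made up, it calls the Win32 GetShortPathNameW function, and on non-Windows platforms it just returns the path unchanged. Note that a volume can have 8.3 name generation disabled, in which case you will likely get the long name back.

```python
import ctypes
import os


def short_path_name(long_path):
    """Return the 8.3 short form of an existing path (Windows only).

    Hypothetical helper: on other platforms the path is returned as-is.
    """
    if os.name != "nt":
        return long_path

    get_short = ctypes.windll.kernel32.GetShortPathNameW
    get_short.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint]
    get_short.restype = ctypes.c_uint

    size = 260  # initial guess; grown below if the API asks for more
    while True:
        buf = ctypes.create_unicode_buffer(size)
        needed = get_short(long_path, buf, size)
        if needed == 0:
            # The call failed, e.g. because the path does not exist.
            raise ctypes.WinError()
        if needed < size:
            return buf.value
        # Buffer too small: the return value is the required size.
        size = needed + 1
```

Usage would be roughly sys.path.append(short_path_name(u'c:/tmp/\u814c')), assuming that directory exists and has a short name.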
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org