[Python-Dev] unicode imports
"Martin v. Löwis"
martin at v.loewis.de
Mon Jun 19 22:41:27 CEST 2006
Kristján V. Jónsson wrote:
> Wouldn't it be possible then to emulate the unix way? Simply encode
> any unicode paths to utf-8, process them as normal, and then decode
> them just prior to the actual windows io call?
That won't work. People also put path names encoded in the ANSI code
page onto sys.path and expect that to work: it has always worked, and
it is a nearly complete work-around for getting directories with funny
characters onto sys.path. Since sys.path is a plain list, we have
little control over what gets put onto it.
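
To illustrate the point, here is a minimal sketch (in present-day
Python 3 syntax, not the code under discussion) of why byte strings in
an ANSI code page cannot simply be treated as UTF-8. The directory name
and the choice of cp1252 as the ANSI code page are assumptions made up
for the example.

```python
# Hypothetical directory name containing a non-ASCII character.
name = "L\u00f6wis"

# What a user might put on sys.path as a byte string, assuming the
# ANSI code page is cp1252.
ansi_bytes = name.encode("cp1252")

# What the proposed "encode everything as UTF-8" scheme would store.
utf8_bytes = name.encode("utf-8")


def decode_as_utf8(raw):
    """Return the decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# The UTF-8 bytes round-trip, but the ANSI bytes are rejected.
roundtrip_ok = decode_as_utf8(utf8_bytes) == name
ansi_result = decode_as_utf8(ansi_bytes)
```

An importer that decoded every sys.path entry as UTF-8 just before the
Windows call would therefore fail on entries like ansi_bytes, which
work today.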
> Of course, once there, why not do it unicode all the way up to that
> last point? Unless there are platforms without wchar_t that would
> make sense.
Again, we can't really control that. Also, most platforms have no
wchar_t API for file IO. We would have to encode each sys.path
element for each stat() call, which would be quite expensive.
> At any rate, I am trying to find a coding path of least resistance
> here. Regardless of the timeline or acceptance in mainstream python
> for this feature, it is something I will have to patch in for our
The path of least resistance would be to use 8.3 directory names.
The approach to implement in future Python versions would be to
rewrite import.c so that it operates on PyObject* instead of char* and
converts to the native representation only just before calling the
native API.
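
As a rough analogue of that design, the sketch below keeps path entries
as ordinary objects (str or bytes) and converts them only at the last
step before the file system is touched. It is written in present-day
Python 3 and is not import.c; the names to_native and find_module_file
and the wide_api flag are hypothetical, introduced only for this
example. os.fsencode and os.fsdecode are the standard-library helpers
used for the conversion.

```python
import os


def to_native(path, wide_api):
    """Convert a str or bytes path to the form the native call expects.

    wide_api is a hypothetical flag: True stands for a wchar_t-style
    API (text), False for a char*-style API (bytes).
    """
    if wide_api:
        return os.fsdecode(path)
    return os.fsencode(path)


def find_module_file(search_path, filename, wide_api=True):
    """Return the first existing candidate file, or None.

    Entries of search_path are left untouched until just before the
    file system check, mirroring "convert only before the native call".
    """
    for entry in search_path:
        candidate = os.path.join(
            to_native(entry, wide_api),
            to_native(filename, wide_api),
        )
        if os.path.isfile(candidate):
            return candidate
    return None
```

With this shape, the search path may mix str and bytes entries, and the
decision about which native representation to use is made in exactly
one place.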