[Python-Dev] Unicode Imports
ncoghlan at gmail.com
Sat Sep 9 07:55:56 CEST 2006
Martin v. Löwis wrote:
> Steve Holden wrote:
>> Or simply that this inability isn't currently
>> described in a bug report on Sourceforge?
> No: sys.path is specified (originally) as containing a list of byte
> strings; it was extended to also support path importers (or whatever
> that PEP calls them). It was never extended to support Unicode strings.
That other PEP being PEP 302. That said, Unicode strings *are* permitted on
sys.path - the import system will automatically encode them to an 8-bit string
using the default filesystem encoding as part of the import process.
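As a rough illustration of that encoding step (a sketch only, not the actual import machinery; `encode_path_entry` is a hypothetical helper name, and the example path is made up):

```python
import sys


def encode_path_entry(entry, encoding=None):
    """Return the byte-string form of a sys.path entry (hypothetical helper)."""
    if encoding is None:
        # Assumption: fall back to whatever the interpreter reports as the
        # filesystem encoding on this platform.
        encoding = sys.getfilesystemencoding()
    if isinstance(entry, bytes):
        # Already a byte string, nothing to encode.
        return entry
    # Unicode entry: encode it; this raises UnicodeEncodeError if the
    # encoding cannot represent every character in the path.
    return entry.encode(encoding)


# A made-up directory name containing two Chinese characters.
encoded = encode_path_entry("/tmp/\u4e2d\u6587", "utf-8")
```

With UTF-8 every Unicode path has a byte-string form, which is why this step is harmless on most Unix systems.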
This works fine on Unix systems that use UTF-8 encoded strings to handle
Unicode paths at the C API level, but is screwed on Windows because the
default mbcs filesystem encoding can't handle the full range of possible
Unicode path names (such as the Chinese directories that originally gave
rise to this patch).
To get Unicode path names to work on Windows, you have to use the
Windows-specific wide character API instead of the normal C API, and the
import machinery doesn't do that.
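To make the difference concrete, here is a small sketch that checks whether a path survives a round trip through a given encoding. The `mbcs` codec only exists on Windows, so cp1252 is used here as a stand-in for a Western-locale code page; that substitution, the helper name `survives_encoding`, and the example paths are all assumptions for illustration.

```python
def survives_encoding(path, encoding):
    """Return True if `path` round-trips through `encoding` without loss."""
    try:
        return path.encode(encoding).decode(encoding) == path
    except UnicodeError:
        # The encoding has no representation for some character in the path.
        return False


# Made-up directory names: one with Chinese characters, one plain ASCII.
chinese_dir = "C:\\\u4e2d\u6587\\lib"
ascii_dir = "C:\\Python25\\lib"

utf8_ok = survives_encoding(chinese_dir, "utf-8")      # UTF-8 covers all of Unicode
cp1252_ok = survives_encoding(chinese_dir, "cp1252")   # stand-in for an mbcs code page
ascii_ok = survives_encoding(ascii_dir, "cp1252")      # plain ASCII paths are unaffected
```

Only the wide character API avoids the lossy encoding step entirely, since it takes the Unicode path name as-is.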
So this is taking something that *already works properly on POSIX systems* and
making it work on Windows as well.
>> I agree it's a relatively large patch for a release candidate but if
>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and
>> subsequent releases.
> I'm not so sure it should. It *is* a new feature: it makes applications
> possible which aren't possible today, and the documentation does not
> ever suggest that these applications should have been possible. In fact,
> it is common knowledge that this currently isn't supported.
It should already work fine on POSIX filesystems that use the default
filesystem encoding for path names. As far as I am aware, it is only Windows
where it doesn't work.
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia