[Python-Dev] unicode imports

Bob Ippolito bob at redivi.com
Fri Jun 16 19:04:41 CEST 2006


On Jun 16, 2006, at 9:02 AM, Phillip J. Eby wrote:

> At 01:29 AM 6/17/2006 +1000, Nick Coghlan wrote:
>> Kristján V. Jónsson wrote:
>>> A cursory glance at import.c shows that the import mechanism is
>>> fairly complicated, and riddled with "char *path" thingies, and
>>> manual string arithmetic.  Do you have any suggestions on a clean
>>> way to unicodify the import mechanism?
>>
>> Can you install a PEP 302 path hook and importer/loader that can
>> handle path entries that are Unicode strings? (I think this would
>> end up being the parallel implementation you were talking about,
>> though)
>>
>> If the code that traverses sys.path and sys.path_hooks is itself
>> unicode-unaware (I don't remember if it is or isn't), then you might
>> be able to trick it by poking a Unicode-savvy importer directly into
>> the path_importer_cache for affected Unicode paths.
>
> Actually, you would want to put it in sys.path_hooks, and then
> instances would be placed in path_importer_cache automatically.  If
> you are adding it to the path_hooks after the fact, you should simply
> clear the path_importer_cache.  Simply poking stuff into the
> path_importer_cache is not a recommended approach.
>
>
>> One issue is that the package and file names still have to be valid
>> Python identifiers, which means ASCII. Unicode would be, at best,
>> permitted only in the path entries.
>
> If I understand the problem correctly, the issue is that if you
> install Python itself to a Unicode directory, you'll be unable to
> import anything from the standard library.  This isn't about module
> names, it's about the places on the path where that stuff goes.
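For illustration, the path-hook approach described above might look
roughly like the sketch below. It is written against the Python 3
import machinery rather than the import.c under discussion, and the
names NonAsciiPathFinder and install_hook are hypothetical. The finder
is only a stub: it claims path entries containing non-ASCII characters
and does not actually locate any modules.

```python
import pkgutil
import sys


def _is_non_ascii(text):
    """Return True if the string contains any character outside ASCII."""
    return any(ord(ch) > 127 for ch in text)


class NonAsciiPathFinder:
    """Hypothetical path-entry finder, used here as a path hook.

    Calling the class with a path entry either returns a finder for
    that entry or raises ImportError so the next hook gets a chance.
    """

    def __init__(self, path_entry):
        if not isinstance(path_entry, str) or not _is_non_ascii(path_entry):
            raise ImportError("entry not handled by this hook")
        self.path_entry = path_entry

    def find_spec(self, fullname, target=None):
        # Stub: a real finder would search self.path_entry for the
        # module here. Returning None means "not found in this entry".
        return None


def install_hook(hook=NonAsciiPathFinder):
    """Register the hook, then clear the cache instead of poking into it."""
    if hook not in sys.path_hooks:
        sys.path_hooks.insert(0, hook)
    # Cached finders were created before the hook existed, so drop them.
    sys.path_importer_cache.clear()


install_hook()

# Looking up a finder runs the hooks and caches the result automatically.
finder = pkgutil.get_importer("/opt/p\u00fdthon/lib")
```

After this runs, sys.path_importer_cache holds the finder for that
entry without anything having been inserted into the cache by hand.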

There's a similar issue in that if sys.prefix contains a colon, Python
is also busted:
http://python.org/sf/1507224

Of course, that's not a Windows issue, but it is everywhere else. The
offending code in that case is Modules/getpath.c, which probably also
has to change in order to make unicode directories work on Win32
(though I think there may be a separate win32 implementation of
getpath).
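
A rough sketch of why a colon in the prefix is a problem, assuming the
failure comes from joining the search path entries with ":" and
splitting the result again later. The helper names and the
lib/pythonX.Y layout are made up for the example; this is not the
actual getpath.c logic.

```python
DELIM = ":"  # path-list delimiter used on non-Windows platforms


def build_search_path(prefix):
    """Join hypothetical library directories under prefix into one string."""
    entries = [
        prefix + "/lib/pythonX.Y",
        prefix + "/lib/pythonX.Y/lib-dynload",
    ]
    return DELIM.join(entries)


def split_search_path(joined):
    """Split the joined string back into individual path entries."""
    return joined.split(DELIM)


good = split_search_path(build_search_path("/opt/python"))
bad = split_search_path(build_search_path("/opt/my:python"))
```

With the ordinary prefix the two entries survive the round trip. With
the colon in the prefix, the split yields four fragments, none of which
is a usable directory.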

-bob


