[Python-Dev] unicode imports

Phillip J. Eby pje at telecommunity.com
Fri Jun 16 18:02:51 CEST 2006


At 01:29 AM 6/17/2006 +1000, Nick Coghlan wrote:
>Kristján V. Jónsson wrote:
> > A cursory glance at import.c shows that the import mechanism is fairly
> > complicated, and riddled with "char *path" thingies, and manual string
> > arithmetic.  Do you have any suggestions on a clean way to unicodify the
> > import mechanism?
>
>Can you install a PEP 302 path hook and importer/loader that can handle path
>entries that are Unicode strings? (I think this would end up being the
>parallel implementation you were talking about, though)
>
>If the code that traverses sys.path and sys.path_hooks is itself
>unicode-unaware (I don't remember if it is or isn't), then you might be able
>to trick it by poking a Unicode-savvy importer directly into the
>path_importer_cache for affected Unicode paths.

Actually, you would want to put it in sys.path_hooks, and then instances
would be placed in sys.path_importer_cache automatically.  If you are adding
it to sys.path_hooks after the fact, you should also clear
sys.path_importer_cache, so that entries cached before the hook existed are
looked up again.  Poking stuff directly into the path_importer_cache is not
a recommended approach.
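
A minimal sketch of that registration, written against the modern Python 3
spelling of the PEP 302 protocol (find_spec) so that it runs today; on the
2006-era interpreter the importer would provide find_module/load_module
instead.  The names UnicodePathImporter and install_hook are hypothetical,
the importer body is only a placeholder, and putting the hook first in the
list is my assumption rather than anything the protocol requires.

```python
import sys


class UnicodePathImporter(object):
    """Hypothetical importer that claims only non-ASCII path entries."""

    def __init__(self, path_entry):
        # A path hook declines an entry by raising ImportError.
        if not isinstance(path_entry, str):
            raise ImportError("not a text path entry")
        try:
            path_entry.encode("ascii")
        except UnicodeEncodeError:
            # Non-ASCII entry: this importer takes responsibility for it.
            self.path_entry = path_entry
        else:
            raise ImportError("plain ASCII entry, leave it to the defaults")

    def find_spec(self, fullname, target=None):
        # Placeholder: a real importer would look for `fullname` under
        # self.path_entry and return a module spec.  Returning None means
        # "not found here", so the search moves on to the next entry.
        return None


def install_hook(hook):
    """Register a path hook and drop importers cached before it existed."""
    sys.path_hooks.insert(0, hook)   # assumption: consult our hook first
    sys.path_importer_cache.clear()  # force every path entry to be re-probed
```

Calling install_hook(UnicodePathImporter) once at startup is enough; the
import machinery then repopulates sys.path_importer_cache on its own as
each path entry is next consulted.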


>One issue is that the package and file names still have to be valid Python
>identifiers, which means ASCII. Unicode would be, at best, permitted only in
>the path entries.

If I understand the problem correctly, the issue is that if you install
Python itself to a directory whose name contains non-ASCII characters,
you'll be unable to import anything from the standard library.  This isn't
about module names, it's about the places on the path where that stuff goes.

However, if the issue is that the program works, but it puts Unicode
entries on sys.path, I would suggest simply encoding them to byte strings
(str) using the platform-appropriate codec.
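
A sketch of that encoding step.  I am assuming the "platform-appropriate
codec" is whatever sys.getfilesystemencoding() reports, falling back to
sys.getdefaultencoding() when it reports nothing; encode_path_entries is a
hypothetical helper name.  The text_type shim only exists so the snippet
also runs on Python 3, where every str is already Unicode and sys.path
should be left alone.

```python
import sys

try:
    text_type = unicode   # Python 2: Unicode text is a separate type
except NameError:
    text_type = str       # Python 3: str is the Unicode text type


def encode_path_entries(entries, encoding=None):
    """Return a copy of `entries` with text entries encoded to bytes."""
    if encoding is None:
        # Assumption: the filesystem encoding is the right codec here.
        encoding = sys.getfilesystemencoding() or sys.getdefaultencoding()
    return [
        entry.encode(encoding) if isinstance(entry, text_type) else entry
        for entry in entries
    ]
```

On the Python 2 interpreters under discussion this would be applied in
place, as in sys.path[:] = encode_path_entries(sys.path), before any
imports that depend on the affected entries.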


