[Python-Dev] unicode imports

Kristján V. Jónsson kristjan at ccpgames.com
Mon Jun 19 13:12:34 CEST 2006

Ideally, I would like for python to "simply work." It seems to me that it is mostly a question of time when all modern platforms offer unicode filesystems and hence unicode APIs.  IMHO, stuff like the importer should really be written in native unicode and revert to ASCII only as a fallback for unsupporting platforms.  is WITH_UNICODE ever left undefined these days?

And sure, module names need to be python identifiers (thus ASCII), although I wouldn't be surprised if that restriction were lifted in a not too distant future :)  After all, we support the utf-8 encoding of source files, but I cannot write "kristján = 1".  But that's for a future PEP.


-----Original Message-----
From: Nick Coghlan [mailto:ncoghlan at gmail.com] 
Sent: 16. júní 2006 15:30
To: Kristján V. Jónsson
Cc: Python Dev
Subject: Re: [Python-Dev] unicode imports

Kristján V. Jónsson wrote:
> A cursory glance at import.c shows that the import mechanism is fairly 
> complicated, and riddled with "char *path" thingies, and manual string 
> arithmetic.  Do you have any suggestions on a clean way to unicodify 
> the import mechanism?

Can you install a PEP 302 path hook and importer/loader that can handle path entries that are Unicode strings? (I think this would end up being the parallel implementation you were talking about, though)

If the code that traverses sys.path and sys.path_hooks is itself unicode-unaware (I don't remember if it is or isn't), then you might be able to trick it by poking a Unicode-savvy importer directly into the path_importer_cache for affected Unicode paths.

One issue is that the package and file names still have to be valid Python identifiers, which means ASCII. Unicode would be, at best, permitted only in the path entries.


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

More information about the Python-Dev mailing list