[Python-Dev] Import and unicode: part two

Toshio Kuratomi a.badger at gmail.com
Thu Jan 20 02:30:31 CET 2011


On Wed, Jan 19, 2011 at 07:11:52PM -0500, James Y Knight wrote:
> On Jan 19, 2011, at 6:44 PM, Toshio Kuratomi wrote:
> > This problem of which encoding to use is a problem that can be
> > seen on UNIX systems even now.  Try this:
> > 
> >  echo 'print("hi")' > café.py
> >  convmv -f utf-8 -t latin1 café.py
> >  python3 -c 'import café'
> > 
> > ASCII seems very sensible to me when faced with these ambiguities.
> > 
> > Other options I can brainstorm that could be explored:
> > 
> > * Specify an encoding per platform and stick to that.  (So, for instance,
> >  all module names on posix platforms would have to be utf-8).  Force
> >  translation between encodings when installing packages (But that doesn't
> >  help for people that are creating their modules using their own build
> >  scripts rather than distutils, copying the files using raw tar, etc.)
> > * Change import semantics to allow specifying the encoding of the module on
> >  the filesystem (seems really icky).
> 
> None of this is unique to import -- the same exact issue occurs with open(u'café'). I don't see any reason why import café should be thought of as more of a problem, or treated any differently.
> 
It's unique in at least two ways:

1) With open, you can specify a byte string::
       open(b'caf\xe9.py').read()

   I don't know of any way to do that with import.
   A byte string is needed when the filename cannot be decoded using your
   current locale's encoding.

2) import assigns a name to the module that it imports, whereas open lets the
   programmer assign the name.  So even if you could specify a byte string
   for this filename on this particular filesystem, you'd still end up with
   some ugly pseudo-representation of bytes when attempting to access it in
   code::
       import caf\xe9

       caf\xe9.do_something()
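
To make the contrast between the two points concrete, here is a rough sketch,
not a proposal.  It assumes a POSIX filesystem that accepts arbitrary bytes in
file names and a current Python 3 (importlib.util.module_from_spec needs 3.5
or later).  The helper load_module_from_bytes_path and the module name "cafe"
are made up for illustration.  It opens a Latin-1 named file via a byte
string, then loads the same file with the programmer choosing an ASCII name,
which is exactly what a plain import statement does not let you do:

```python
import importlib.util
import os
import tempfile

# Hypothetical file name: "café.py" encoded as Latin-1 (not valid UTF-8).
RAW_NAME = b"caf\xe9.py"


def load_module_from_bytes_path(raw_path, module_name):
    """Load the source file at raw_path (bytes) under an ASCII module_name."""
    # os.fsdecode() turns undecodable bytes into lone surrogates
    # (surrogateescape), so the str path maps back to the same bytes
    # when the loader opens the file.
    path = os.fsdecode(raw_path)
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        raw_path = os.path.join(os.fsencode(tmp), RAW_NAME)

        # Point 1: open() accepts the byte-string path directly.
        with open(raw_path, "w", encoding="utf-8") as f:
            f.write("GREETING = 'hi'\n")
        with open(raw_path, "rb") as f:
            data = f.read()

        # Point 2: here the caller picks the module name ("cafe"),
        # instead of import deriving it from the file name.
        module = load_module_from_bytes_path(raw_path, "cafe")
        return data, module.__name__, module.GREETING


if __name__ == "__main__":
    print(demo())
```

Note that this sidesteps the import statement entirely; it doesn't change the
fact that "import café" itself has no way to say which bytes to look for.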

-Toshio