[Python-bugs-list] [ python-Bugs-476326 ] Unicode and imp.find_module
noreply@sourceforge.net
Mon, 07 Jan 2002 02:55:18 -0800
Bugs item #476326, was opened at 2001-10-30 03:25
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=476326&group_id=5470
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Paul Boddie (pboddie)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Unicode and imp.find_module
Initial Comment:
When a Unicode string is passed as the module name to
imp.find_module, the function fails to find the
named module even when it exists in the specified
path, and raises an ImportError with the message
"No module named ..." as a result.
The problem in Python 2.0 can be traced to line 922 of
Python/import.c which ensures that any strings
involved in the find_module function must be standard
Python strings and not Unicode strings, since it tests
the type of path components against &PyString_Type
explicitly.
Interestingly, the __import__ built-in function seems
to work with Unicode strings. Either way, it would be
great if this could be documented or even fixed, but I
don't know what the policy is on Unicode module names
(even when they only contain ASCII-compatible
characters).
----------------------------------------------------------------------
>Comment By: M.-A. Lemburg (lemburg)
Date: 2002-01-07 02:55
Message:
Logged In: YES
user_id=38388
The find_module() code doesn't seem to have changed between
the releases, so it should work in Python 2.0 as well.
The only parts I see in the source code which require strings
are the sys.path handling APIs. The optional second argument
to find_module() will also only accept strings. Perhaps that's where
your problem originated?
Python 2.0 (#1, Jan 19 2001, 17:54:27)
[GCC 2.95.2 19991024 (release)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import imp
>>> imp.find_module(u'platform')
(<open file '/home/lemburg/bin/platform.py', mode 'r' at 0x8191a78>, '/home/lemburg/bin/platform.py', ('.py', 'r', 1))
Can you give an example that demonstrates the problem?
----------------------------------------------------------------------
Comment By: Paul Boddie (pboddie)
Date: 2002-01-07 02:43
Message:
Logged In: YES
user_id=226443
It must have been fixed between Python 2.0 and Python 2.1,
then, but I can't find any obvious indication of this in
Python/import.c. The platform probably shouldn't matter in
this case, but I was using Red Hat Linux 6.1 on Intel.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-01-05 00:04
Message:
Logged In: YES
user_id=21627
I cannot reproduce the problem in Python 2.1:
>>> import imp
>>> imp.find_module(u"string")
(<open file '/usr/local/lib/python2.2/string.py', mode 'r'
at 0x816e070>, '/usr/local/lib/python2.2/string.py', ('.py',
'r', 1))
I don't think __import__ should accept non-ASCII names. It
may be reasonable to further restrict import to verify that
the argument is a NAME, in the sense of the Python lexis;
doing so is not important, either.
I cannot see any further problem in this report, so I
suggest closing it as fixed. The test in line 922 only
checks the path, not the module name.
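The NAME restriction mentioned above could be sketched roughly as
follows. This is only an illustration, not code from Python/import.c:
the helper name is_ascii_name is hypothetical, and the pattern assumes
the ASCII-only identifier rule (a letter or underscore followed by
letters, digits, or underscores). It does not reject keywords and it
treats a dotted name as invalid, since find_module takes a single
name component.

```python
import re

# Assumed ASCII-only identifier pattern: a letter or underscore,
# followed by any number of letters, digits, or underscores.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def is_ascii_name(name):
    """Return True if name looks like a single ASCII identifier.

    Hypothetical helper for illustration; keywords are not excluded.
    """
    return _NAME_RE.match(name) is not None


if __name__ == "__main__":
    for candidate in (u"string", u"2fast", u"m\xf6dule", u"os.path"):
        print(repr(candidate), is_ascii_name(candidate))
```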
----------------------------------------------------------------------
Comment By: Paul Boddie (pboddie)
Date: 2001-12-03 02:59
Message:
Logged In: YES
user_id=226443
For my purposes, I just wrapped the module name in a 'str'
function call. I had Unicode strings because I was using
text from an XML document and then attempting to use such
text with the import mechanism.
One issue is whether Python would ever support importing
from files which have non-ASCII filenames. I can imagine
that certain operating systems support Unicode filenames,
for example, but then the Python language probably doesn't
support such filenames as the basis for module names when
used with the 'import' statement and other related
statements.
So, there's a wider issue of text encodings in (C)Python
scripts as part of the "comprehensive" solution to this
problem; the easy solution is just to enforce ASCII-only
module names.
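As a rough sketch of the workaround and of the "ASCII-only" option
described above, the module name can be converted explicitly before it
reaches the import machinery. The helper name to_ascii_module_name is
hypothetical, and raising ImportError for non-ASCII input is an
assumption about the desired behaviour, not something the interpreter
is known to do.

```python
def to_ascii_module_name(name):
    """Return name as a plain (native) string if it is pure ASCII.

    Hypothetical helper: raises ImportError for non-ASCII names,
    mirroring the "enforce ASCII-only module names" option.
    """
    try:
        encoded = name.encode("ascii")
    except UnicodeError:
        raise ImportError(
            "No module named %r (non-ASCII module names are not supported)"
            % (name,)
        )
    if str is bytes:
        # Python 2: the native string type is the byte string.
        return encoded
    # Python 3: the native string type is already Unicode text.
    return encoded.decode("ascii")


if __name__ == "__main__":
    # Text taken from an XML document typically arrives as Unicode.
    print(to_ascii_module_name(u"platform"))
```

The returned value could then be passed to imp.find_module in place of
the original Unicode string, which is what wrapping the name in str()
achieves for ASCII-only names.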
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2001-12-01 15:01
Message:
Logged In: YES
user_id=38388
I guess Python should not accept non-ASCII module names, so
converting Unicode to ASCII should be appropriate.
Would it suffice to test this only in find_module(), or do you
think that I need to dig deeper into the import mechanism?
----------------------------------------------------------------------