[ python-Bugs-960874 ] codecs.lookup can raise exceptions other than LookupError
SourceForge.net
noreply at sourceforge.net
Wed May 26 15:17:58 EDT 2004
Bugs item #960874, was opened at 2004-05-26 16:37
Message generated for change (Comment added) made by lemburg
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=960874&group_id=5470
Category: Unicode
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: John Ehresman (jpe)
Assigned to: M.-A. Lemburg (lemburg)
Summary: codecs.lookup can raise exceptions other than LookupError
Initial Comment:
codecs.lookup raises ValueError when given an empty
string and UnicodeEncodeError when given a unicode
object that can't be converted to a str in the default
encoding. I'd expect it to raise LookupError when
passed any basestring instance.
For example:
Python 2.3.3 (#51, Dec 18 2003, 20:22:39) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.lookup('')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\python23\lib\encodings\__init__.py", line 84, in search_function
    globals(), locals(), _import_tail)
ValueError: Empty module name
>>> codecs.lookup(u'\uabcd')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\uabcd' in position 0: ordinal not in range(128)
>>>
----------------------------------------------------------------------
>Comment By: M.-A. Lemburg (lemburg)
Date: 2004-05-26 21:17
Message:
Logged In: YES
user_id=38388
I don't think we should change anything.
First of all, the lookup function interfaces to a codec
search function and these can raise all kinds of errors, so
it is not guaranteed that you will only see LookupErrors
(the same is true for most other Python APIs, e.g. most can
generate MemoryErrors). Possible other errors are
ValueErrors, NameErrors, ImportErrors, etc. etc. depending
on the search function that happens to process your request.
Second, the name you enter as argument usually maps to a
Python module and/or package name, so it *has* to be ASCII.
The fact that you can enter Unicode names for the codec name
is only by virtue of the automagical conversion of Unicode
to strings. Again, this happens in a lot of places in Python
and is not specific to lookup().
Closing this request.
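The first point above can be illustrated with a small sketch. It is
written for a current Python 3 interpreter rather than the 2.3.3 used
in the report, and the search function and both codec names are made
up for the illustration. It only shows that an exception raised inside
a registered search function reaches the caller of codecs.lookup()
unchanged.

```python
import codecs


def flaky_search(name):
    # Hypothetical search function: it fails with a ValueError for one
    # made-up name and declines every other name by returning None.
    if name == "hypothetical_broken_codec":
        raise ValueError("search function could not handle %r" % name)
    return None


def lookup_error_type(name):
    """Return the type of the exception codecs.lookup raises, or None."""
    try:
        codecs.lookup(name)
    except Exception as exc:
        return type(exc)
    return None


codecs.register(flaky_search)
try:
    # The ValueError from flaky_search propagates as is.
    broken = lookup_error_type("hypothetical_broken_codec")
    # A name that no search function recognizes yields LookupError.
    unknown = lookup_error_type("hypothetical_unknown_codec")
finally:
    # codecs.unregister only exists on newer interpreters (3.10+),
    # so clean up only when it is available.
    unregister = getattr(codecs, "unregister", None)
    if unregister is not None:
        unregister(flaky_search)
```

Here `broken` ends up as ValueError while `unknown` is LookupError, so
a caller that needs a yes/no answer has to catch more than LookupError.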
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh)
Date: 2004-05-26 20:53
Message:
Logged In: YES
user_id=6656
Well, *I* don't think that's a particularly good idea. I don't know if
Marc-André feels differently.
----------------------------------------------------------------------
Comment By: John Ehresman (jpe)
Date: 2004-05-26 20:47
Message:
Logged In: YES
user_id=22785
Yes, it does look like lookup('') is fixed in CVS. So the
question is whether lookup() of something that isn't
convertible in the current encoding to a char* should raise a
LookupError. I can live with it not, though if it did, it would
make it a bit easier to determine if an arbitrary unicode string
is a name of a supported encoding.
I'm willing to put together a patch to raise LookupError if
that's what the behavior should be.
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh)
Date: 2004-05-26 19:13
Message:
Logged In: YES
user_id=6656
This much seems to be fixed in CVS, actually :-)
----------------------------------------------------------------------
Comment By: John Ehresman (jpe)
Date: 2004-05-26 19:09
Message:
Logged In: YES
user_id=22785
The other exceptions occur when strings or unicode objects
are passed in as an argument. The string that it fails on is
the empty string (''). I can see disallowing non-ascii names,
but '' should raise a LookupError.
My use case is to see if a user-supplied unicode string is a
valid encoding, so any check that the lookup function does
not do, I will need to do before calling it.
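A minimal sketch of such a pre-check, written for a current Python 3
interpreter (where str plays the role of unicode) rather than 2.3.3.
The helper name is_supported_encoding is hypothetical and is not part
of the codecs module. It rejects empty and non-ASCII names up front and
treats the non-LookupError exceptions mentioned in this thread as "not
a supported encoding".

```python
import codecs


def is_supported_encoding(name):
    """Return True if name resolves to a codec, False otherwise.

    Hypothetical helper: it does the checks that codecs.lookup() is
    not guaranteed to do before calling it.
    """
    # Only non-empty text names are considered.
    if not isinstance(name, str) or not name:
        return False
    # Codec names map to module names, so require plain ASCII.
    try:
        name.encode("ascii")
    except UnicodeEncodeError:
        return False
    # Search functions may raise more than LookupError; ValueError and
    # ImportError are among the types named in this thread.
    try:
        codecs.lookup(name)
    except (LookupError, ValueError, ImportError):
        return False
    return True


print(is_supported_encoding("utf-8"))
print(is_supported_encoding(""))
print(is_supported_encoding("\uabcd"))
```

The tuple of caught exceptions is a judgment call. A third-party
search function could raise something else entirely, so widen it if
your application registers its own search functions.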
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh)
Date: 2004-05-26 18:32
Message:
Logged In: YES
user_id=6656
What exactly are you complaining about? I'd expect codecs.lookup
to raise TypeError if called with no arguments or an integer.
I believe it's documented somewhere that encoding names must
be ascii only, but I must admit I don't recall where.
----------------------------------------------------------------------