[ python-Bugs-960874 ] codecs.lookup can raise exceptions other than LookupError

SourceForge.net noreply at sourceforge.net
Wed May 26 15:17:58 EDT 2004


Bugs item #960874, was opened at 2004-05-26 16:37
Message generated for change (Comment added) made by lemburg
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=960874&group_id=5470

Category: Unicode
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: John Ehresman (jpe)
Assigned to: M.-A. Lemburg (lemburg)
Summary: codecs.lookup can raise exceptions other than LookupError

Initial Comment:
codecs.lookup raises ValueError when given an empty 
string and UnicodeEncodeError when given a unicode 
object that can't be converted to a str in the default 
encoding.  I'd expect it to raise LookupError when 
passed any basestring instance.

For example:
Python 2.3.3 (#51, Dec 18 2003, 20:22:39) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.lookup('')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\python23\lib\encodings\__init__.py", line 84, in search_function
    globals(), locals(), _import_tail)
ValueError: Empty module name
>>> codecs.lookup(u'\uabcd')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\uabcd' in position 0: ordinal not in range(128)
>>>

----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2004-05-26 21:17

Message:
Logged In: YES 
user_id=38388

I don't think we should change anything.

First of all, the lookup function interfaces to a codec
search function and these can raise all kinds of errors, so
it is not guaranteed that you will only see LookupErrors
(the same is true for most other Python APIs, e.g. most can
generate MemoryErrors). Possible other errors are
ValueErrors, NameErrors, ImportErrors, etc. etc. depending
on the search function that happens to process your request.
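
To illustrate the point about search functions, here is a minimal
sketch. It is written for Python 3, not the 2.3 interpreter discussed
in this report. The names broken_search and "demobrokencodec" are
hypothetical and exist only for this example. It registers a search
function that fails with ValueError for one made-up codec name and
reports which exception type codecs.lookup() lets through.

```python
import codecs

def broken_search(name):
    # Hypothetical search function: fails for one made-up name only
    # and returns None ("not my codec") for everything else.
    if name == "demobrokencodec":
        raise ValueError("search function failed for %r" % name)
    return None

# Note: registration is global for the running interpreter.
codecs.register(broken_search)

def lookup_outcome(name):
    # Report which exception type, if any, codecs.lookup() raises.
    try:
        codecs.lookup(name)
    except Exception as exc:
        return type(exc).__name__
    return "ok"

print(lookup_outcome("utf-8"))
print(lookup_outcome("demobrokencodec"))
print(lookup_outcome("nosuchcodec"))
```

Under these assumptions the error raised inside the search function
should reach the caller unchanged, while a name that no search
function recognizes should still produce a LookupError.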

Second, the name you enter as argument usually maps to a
Python module and/or package name, so it *has* to be ASCII.
The fact that you can enter Unicode names for the codec name
is only by virtue of the automagical conversion of Unicode
to strings. Again, this happens in a lot of places in Python
and is not specific to lookup().

Closing this request.


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-05-26 20:53

Message:
Logged In: YES 
user_id=6656

Well, *I* don't think that's a particularly good idea.  I don't know if 
Marc-André feels differently.

----------------------------------------------------------------------

Comment By: John Ehresman (jpe)
Date: 2004-05-26 20:47

Message:
Logged In: YES 
user_id=22785

Yes, it does look like lookup('') is fixed in CVS.  So the 
question is whether lookup() of something that isn't 
convertible in the current encoding to a char* should raise a 
LookupError.  I can live with it not, though if it did, it would 
make it a bit easier to determine if an arbitrary unicode string 
is a name of a supported encoding.  

I'm willing to put together a patch to raise LookupError if 
that's what the behavior should be.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-05-26 19:13

Message:
Logged In: YES 
user_id=6656

This much seems to be fixed in CVS, actually :-)

----------------------------------------------------------------------

Comment By: John Ehresman (jpe)
Date: 2004-05-26 19:09

Message:
Logged In: YES 
user_id=22785

The other exceptions occur when strings or unicode objects 
are passed in as an argument.  The string that it fails on is 
the empty string ('').  I can see disallowing non-ascii names, 
but '' should raise a LookupError.

My use case is to see if a user-supplied unicode string is a 
valid encoding, so any check that the lookup function does 
not do, I will need to do before calling it.
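
One way to cover this use case without changing lookup() is a small
wrapper that treats any failure to resolve the name as "not
supported". This is a sketch only, written for Python 3; the helper
name is_supported_encoding is hypothetical and not part of the codecs
module. It catches LookupError and ValueError (UnicodeError is a
subclass of ValueError), which covers the errors shown in the initial
comment.

```python
import codecs

def is_supported_encoding(name):
    # Hypothetical helper: True only if codecs.lookup() resolves the name.
    if not isinstance(name, str):
        return False
    try:
        codecs.lookup(name)
    except (LookupError, ValueError):
        # LookupError: unknown encoding.
        # ValueError: covers UnicodeError and other rejected names.
        return False
    return True

print(is_supported_encoding("utf-8"))
print(is_supported_encoding(""))
print(is_supported_encoding("\uabcd"))
```

A third-party search function can still raise other exception types,
so a stricter caller might catch Exception instead of this narrower
pair.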

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-05-26 18:32

Message:
Logged In: YES 
user_id=6656

What exactly are you complaining about?  I'd expect codecs.lookup 
to raise TypeError if called with no arguments or an integer.

I believe it's documented somewhere that encoding names must 
be ascii only, but I must admit I don't recall where.

----------------------------------------------------------------------
