[Patches] [ python-Patches-571603 ] Fix bug in encodings.search_function

noreply@sourceforge.net noreply@sourceforge.net
Tue, 30 Jul 2002 01:16:55 -0700


Patches item #571603, was opened at 2002-06-20 13:39
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=571603&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Geert Jansen (geertj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Fix bug in encodings.search_function

Initial Comment:
Hi,

there seems to be a bug in the default encoding search 
function (search_function in encodings/__init__.py). The 
function tries to load a module with the name of the 
encoding, but it doesn't require that this module is in the 
encodings/ directory. This leads to trouble when you try 
to use an encoding that has the name of a module in the 
search path.

To demonstrate, save the following line to test.py:

print 'Just testing'.encode('test')

and run it. This results in a CodecRegistryError 
exception: "module "test" (test.pyc) failed to register"

The bug is present in 2.2.1 and in HEAD. In HEAD there 
was actually a bugfix for this but it was incomplete.

Patches for 2.2.1 and HEAD attached.

Greetings,
Geert Jansen

----------------------------------------------------------------------

>Comment By: Geert Jansen (geertj)
Date: 2002-07-30 10:16

Message:
Logged In: YES 
user_id=537938

I meant by "leak" that the module namespace and the 
encoding namespace are different namespaces and should 
therefore be isolated from each other. Symbols from one 
namespace should not turn up in the other. This is all IMHO 
of course.

But thanks for fixing this problem. Next time I send in a patch 
I'll make sure I run the test suite too... Sorry for that.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-30 09:47

Message:
Logged In: YES 
user_id=21627

Not sure what you mean by "leak". It is certainly desirable
that modules carry the same name as encodings; in fact,
*every* encoding implemented so far has a module with the
same name.

People have been using u"text".encode("japanese.sjis"),
given that the JapaneseCodecs package installs itself into a
Python package "japanese". That must continue to work. In
particular, your patch broke test.test_charmapcodec; make
sure you test your patches before submitting them.

To solve the problem of .encode("test") giving a registry
error, I have now changed the search_function to ignore
modules that don't have a getregentry function.
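
The behaviour described above can be sketched roughly as follows. This is
a simplified illustration, not the code in encodings/__init__.py: the name
search_function_sketch is hypothetical, and alias handling and caching are
left out. It only shows the idea of importing a module named after the
encoding and ignoring it when it has no getregentry function.

```python
import importlib


def search_function_sketch(encoding):
    """Look for a codec in a module named like the encoding.

    Hypothetical, simplified stand-in for the real search function.
    Returns whatever the module's getregentry() returns, or None if no
    suitable module is found.
    """
    # Encoding names such as "utf-8" map to module names such as "utf_8".
    modname = encoding.replace("-", "_")
    try:
        mod = importlib.import_module(modname)
    except ImportError:
        # No module of that name anywhere on the search path.
        return None

    getregentry = getattr(mod, "getregentry", None)
    if getregentry is None:
        # A module with this name exists but is not a codec module
        # (for example a user's test.py), so ignore it instead of
        # raising a registry error.
        return None
    return getregentry()
```

With this approach an unrelated module that happens to share a name with
the requested encoding is skipped, while dotted names that point at real
codec modules still resolve.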



----------------------------------------------------------------------

Comment By: Geert Jansen (geertj)
Date: 2002-07-29 19:45

Message:
Logged In: YES 
user_id=537938

Hi Martin,

Isn't it wrong to let the module namespace "leak" into the 
encodings namespace? This leads to very unexpected 
behaviour. Why should it be forbidden to have a module with 
the same name as an encoding? This seems rather arbitrary 
and solely an implementation detail.

It is still very easy to add an encoding outside the encodings/ 
directory using the codecs.register() function. Or maybe there 
is another solution?
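
As an illustration of the codecs.register() route, here is a minimal
sketch. The encoding name "demo_alias" and the function demo_search are
made up for this example, and the codec simply reuses the built-in UTF-8
encoder and decoder. It is written for current Python versions, not 2.2.

```python
import codecs


def demo_search(name):
    """Hypothetical search function for one made-up encoding name."""
    if name != "demo_alias":
        # Returning None tells the registry to try the next search function.
        return None
    utf8 = codecs.lookup("utf-8")
    # Reuse the UTF-8 encoder/decoder under the new name.
    return codecs.CodecInfo(
        name="demo_alias",
        encode=utf8.encode,
        decode=utf8.decode,
    )


# Register the search function; no module inside encodings/ is required.
codecs.register(demo_search)

encoded = "Just testing".encode("demo_alias")
decoded = encoded.decode("demo_alias")
```

Registered this way, the encoding name lives only in the codec registry
and does not depend on a module of the same name being importable.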

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-29 15:31

Message:
Logged In: YES 
user_id=21627

It's actually not a bug to pass a module outside of
encodings/; the standard search function is supposed to find
other modules as well. So I have to revert this change.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-28 13:33

Message:
Logged In: YES 
user_id=21627

Thanks for the patch; applied as __init__.py 1.9 and 1.6.12.1.

----------------------------------------------------------------------
