[Python-Dev] registering unicode codecs

Neal Norwitz nnorwitz at gmail.com
Thu Nov 24 20:34:37 CET 2005

While running regrtest with -R to find reference leaks I found a usage
issue.  When a codec is registered it is stored in the interpreter
state and cannot be removed.  Since the search functions are stored in
a list, if you repeatedly add the same search function, you will get
duplicates in the list and they can't be removed.  This shows up as a
reference leak
(which it really isn't) in test_unicode with this code modified from

import codecs
def search_function(encoding):
    def encode1(input, errors="strict"):
        return 42
    return (encode1, None, None, None)

# Each call appends search_function to the list again.
codecs.register(search_function)

Should the search function be added to the search path if it is
already in there?  I don't see any benefit in having duplicate
search functions.
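
To make the first question concrete, here is a minimal sketch of a register
that skips duplicates.  It is a pure-Python model only: search_path and
register are hypothetical stand-ins for the list kept in the interpreter
state, not the actual implementation behind codecs.register().

```python
# Hypothetical model of the codec search path; the real list lives in the
# interpreter state and is not exposed to Python code.
search_path = []

def register(search_function):
    """Append search_function unless the same object is already present."""
    if not callable(search_function):
        raise TypeError("argument must be callable")
    # Compare by identity so that only the very same function is skipped.
    if any(f is search_function for f in search_path):
        return
    search_path.append(search_function)

def search_function(encoding):
    # Placeholder search function that knows no encodings.
    return None

# Registering repeatedly, as a test run with -R would, leaves one entry.
for _ in range(3):
    register(search_function)

print(len(search_path))
```

With this behaviour, repeated test runs would not grow the list, so the
apparent reference leak would go away.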

Should users have access to the search path (through a
codecs.unregister())?  If so, should it search from the end of the
list to the beginning to remove an item?  That way the last entry
would be removed rather than the first.
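
A sketch of the removal order described above, again against a hypothetical
pure-Python list rather than the real interpreter state.  The names
search_path, unregister, first and second are made up for illustration.

```python
# Hypothetical model of the codec search path (not the real interpreter state).
search_path = []

def unregister(search_function):
    """Remove the most recently added occurrence of search_function.

    Returns True if an entry was removed, False if none was found.
    """
    # Walk from the end of the list toward the beginning.
    for index in range(len(search_path) - 1, -1, -1):
        if search_path[index] is search_function:
            del search_path[index]
            return True
    return False

def first(encoding):
    return None

def second(encoding):
    return None

# Simulate a path that already contains a duplicate of first.
search_path.extend([first, second, first])
unregister(first)

# The trailing duplicate is gone; the original registration order is kept.
print([f.__name__ for f in search_path])
```

Removing from the end means that undoing a registration does not disturb
the lookup order of entries registered earlier.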

