It is not required by the unicodec.register() API to provide a subclass of these base classes; only the given methods must be present. This allows writing Codecs as extension types. All Codecs must provide the .encode()/.decode() methods. Codecs having the .read() and/or .write() methods are considered to be StreamCodecs.
The Unicode implementation will by itself only use the stateless .encode() and .decode() methods.
All other conversions have to be done by explicitly instantiating the appropriate [Stream]Codec.
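To make the duck-typing rule above concrete, here is a minimal sketch in present-day Python. The class names (Latin1Codec, Latin1StreamCodec) and the helper is_stream_codec() are hypothetical and chosen only for illustration; the proposed unicodec.register() call itself is not modelled, since its final signature is not given here.

```python
import io


class Latin1Codec:
    """Stateless codec: no base class, just the two required methods."""

    def encode(self, text):
        # Each code point must fit in one byte; bytes() raises ValueError otherwise.
        return bytes(ord(ch) for ch in text)

    def decode(self, data):
        return "".join(chr(b) for b in data)


class Latin1StreamCodec(Latin1Codec):
    """Stream codec: adds .read()/.write() on top of a wrapped byte stream."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, text):
        self.stream.write(self.encode(text))

    def read(self, size=-1):
        return self.decode(self.stream.read(size))


def is_stream_codec(codec):
    # Assumed reading of the rule: having .read() and/or .write() is enough.
    return hasattr(codec, "read") or hasattr(codec, "write")


# Stateless use, as the Unicode implementation itself would do.
plain = Latin1Codec()
encoded = plain.encode("caf\u00e9")

# Anything else goes through an explicitly instantiated stream codec.
buffer = io.BytesIO()
writer = Latin1StreamCodec(buffer)
writer.write("caf\u00e9")
```

Neither class inherits from a codec base class, which is the property that would let the same interface be provided by an extension type.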
Looks okay, although I'd like someone to implement a simple shift-state-based stream codec to check this out further.

I have some questions about the constructor. You seem to imply that instantiating the class without arguments creates a codec without state. That's fine. When given a stream argument, shouldn't the direction of the stream be given as an additional argument, so the proper state for encoding or decoding can be set up? I can see that for an implementation it might be more convenient to have separate classes for encoders and decoders -- certainly the state being kept is very different.

Also, I don't want to ignore the alternative interface that was suggested by /F. It uses feed(), similar to htmllib and friends. This has some advantages (although we might want to define some compatibility so it can also feed directly into a file).

Perhaps someone should go ahead and implement prototype codecs using either paradigm and then write some simple apps, so we can make a better decision.

In any case I think the specs for the codec registry API aren't on the critical path; integration of /F's basic unicode object is the first thing we need.

--Guido van Rossum (home page: http://www.python.org/~guido/)
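
As a rough illustration of the two paradigms discussed in this reply, the sketch below decodes a toy shift-state encoding both push-style (feed()) and pull-style (read() on a wrapped stream). The encoding is invented for this example: byte 0x0E switches to an "upper-case" mode and 0x0F switches back. All class names are hypothetical, and this is not an implementation of either proposal.

```python
import io

SHIFT_OUT, SHIFT_IN = 0x0E, 0x0F  # toy control bytes, chosen for this sketch


class ToyShiftDecoder:
    """Push-style decoder: call feed() with byte chunks, collect text."""

    def __init__(self):
        self.shifted = False  # decoding state carried between feed() calls
        self.output = []

    def feed(self, data):
        for byte in data:
            if byte == SHIFT_OUT:
                self.shifted = True
            elif byte == SHIFT_IN:
                self.shifted = False
            else:
                ch = chr(byte)
                self.output.append(ch.upper() if self.shifted else ch)

    def getvalue(self):
        return "".join(self.output)


class ToyShiftReader:
    """Pull-style wrapper: read() decodes from an underlying byte stream."""

    def __init__(self, stream):
        self.stream = stream
        self.decoder = ToyShiftDecoder()

    def read(self, size=-1):
        self.decoder.output = []  # keep the shift state, drop old text
        self.decoder.feed(self.stream.read(size))
        return self.decoder.getvalue()


data = b"ab\x0ecd\x0fef"

# Push style: the first chunk ends right after the shift byte.
pushed = ToyShiftDecoder()
pushed.feed(data[:3])
pushed.feed(data[3:])

# Pull style: the same split, driven by the consumer instead.
reader = ToyShiftReader(io.BytesIO(data))
pulled = reader.read(3) + reader.read()
```

Only decoding state is kept here; an encoder for the same toy format would need to track different state, which is one argument for separate encoder and decoder classes.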