[Python-Dev] Reintroduce or drop completely hex, bz2, rot13, ... codecs

Victor Stinner victor.stinner at haypocalc.com
Wed Jun 9 01:53:14 CEST 2010


There are two opposite issues in the bug tracker:

   #7475: codecs missing: base64 bz2 hex zlib ...
   -> reintroduce the codecs removed from Python3

   #8838: Remove codecs.readbuffer_encode()
   -> remove the last part of the removed codecs

If I understood correctly, the question is: should the codecs module contain 
only encoding codecs, or also other kinds of codecs?

The encoding codec API is now strict (encode: str->bytes, decode: bytes->str), 
so it's not possible to reuse str.encode() or bytes.decode() for the other 
codecs. Marc-Andre Lemburg proposed adding .transform() and .untransform() 
methods to the str, bytes and bytearray types. If I understood correctly, it 
would look like:

   >>> b'abc'.transform("hex")
   '616263'
   >>> '616263'.untransform("hex")
   b'abc'
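
For comparison, the same hex round trip can be written with calls that already 
exist in Python 3. This is only a sketch: hex_transform() and hex_untransform() 
are hypothetical helper names standing in for the proposed methods, not part of 
any proposal or of the standard library.

```python
import binascii


def hex_transform(data: bytes) -> str:
    # Hypothetical stand-in for b'abc'.transform("hex"): bytes -> str.
    # binascii.hexlify() returns bytes, so decode the ASCII digits to str.
    return binascii.hexlify(data).decode("ascii")


def hex_untransform(text: str) -> bytes:
    # Hypothetical stand-in for '616263'.untransform("hex"): str -> bytes.
    return bytes.fromhex(text)


if __name__ == "__main__":
    print(hex_transform(b"abc"))
    print(hex_untransform("616263"))
```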

I suppose that each codec will have a different list of accepted input and 
output types. Example:

   bz2: encode: bytes->bytes, decode: bytes->bytes
   rot13: encode: str->str, decode: str->str
   hex: encode: bytes->str, decode: str->bytes

And so "abc".encode("bz2") would raise a TypeError.
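
A minimal sketch of how such per-codec type checking could behave, written as 
free functions rather than methods. The transform() and untransform() functions 
and the _TRANSFORMS table are hypothetical names used only for illustration; 
the underlying calls (bz2.compress, binascii.hexlify, bytes.fromhex, and the 
rot_13 codec via codecs.encode) are existing standard library APIs.

```python
import binascii
import bz2
import codecs

# Hypothetical table: name -> (encode input type, encode function,
#                              decode input type, decode function)
_TRANSFORMS = {
    "bz2": (bytes, bz2.compress, bytes, bz2.decompress),
    "rot13": (
        str, lambda s: codecs.encode(s, "rot_13"),
        str, lambda s: codecs.decode(s, "rot_13"),
    ),
    "hex": (
        bytes, lambda b: binascii.hexlify(b).decode("ascii"),
        str, bytes.fromhex,
    ),
}


def _check(obj, expected, name):
    # Reject inputs of the wrong type, as the strict API would.
    if not isinstance(obj, expected):
        raise TypeError(
            "%s expects %s, got %s"
            % (name, expected.__name__, type(obj).__name__)
        )


def transform(obj, name):
    in_type, encode, _, _ = _TRANSFORMS[name]
    _check(obj, in_type, name)
    return encode(obj)


def untransform(obj, name):
    _, _, in_type, decode = _TRANSFORMS[name]
    _check(obj, in_type, name)
    return decode(obj)
```

With this sketch, transform("abc", "bz2") raises a TypeError because bz2 
accepts only bytes, while transform(b"abc", "bz2") succeeds.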

--

In my opinion, we should not mix codecs of different kinds (compression, 
cipher, etc.) because their input and output types are different. It would 
make more sense to create a standard API for each kind of codec. Existing 
examples of standard APIs in Python: hashlib, shutil.make_archive(), the 
database API, etc.
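
As a rough illustration of a per-kind API, here is a sketch for the compression 
kind only, loosely modelled on how hashlib.new() selects an algorithm by name. 
The compress() and decompress() wrappers and the COMPRESSORS table are 
hypothetical; they rely only on the fact that the bz2 and zlib modules both 
provide compress() and decompress() functions working on bytes.

```python
import bz2
import zlib

# Hypothetical registry: every entry is bytes -> bytes in both directions,
# so one signature fits the whole kind.
COMPRESSORS = {"bz2": bz2, "zlib": zlib}


def compress(data: bytes, algorithm: str) -> bytes:
    return COMPRESSORS[algorithm].compress(data)


def decompress(data: bytes, algorithm: str) -> bytes:
    return COMPRESSORS[algorithm].decompress(data)


if __name__ == "__main__":
    payload = b"abc" * 100
    for name in COMPRESSORS:
        assert decompress(compress(payload, name), name) == payload
```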

-- 
Victor Stinner
http://www.haypocalc.com/

