[I18n-sig] Codecs

Andy Robinson andy@reportlab.com
Sun, 4 Jun 2000 23:25:04 +0100


>
> Should codecs be returned to the user as objects instead of tuples?
> Today we have:
>
> (UTF8_encode, UTF8_decode,
>       UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')
>
> output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
>
> I think this would be a little simpler:
>
> output = codecs.lookup('UTF-8').stream_writer( open( '/tmp/output', 'wb') )
>
> The object solution is more extensible, requires fewer "bogus"
> assignments and does not require the user to remember the order of the
> return values.
>
I suggested this a while back, for a different reason.  Right now you get
four things back from lookup() relating to the given encoding.  But in many
cases there may be other encoding-specific routines of great use, and
returning an object would give us a place to hang them:  codec.repair(...)
and codec.validate(...), for example.  There are accepted and useful bits of
code around to repair Shift-JIS or EUC data in which one or two bytes are
corrupt.  We would also have a place to hang language-specific routines.
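
As a rough sketch of the kind of routine I mean: validate() below is a
hypothetical helper, not something the codecs module provides.  It assumes a
Python in which the decoder raises UnicodeDecodeError with a 'start'
attribute, and it only reports where the data stops decoding cleanly; a real
repair routine for Shift-JIS or EUC would need encoding-specific knowledge
on top of this.

```python
import codecs

def validate(data, encoding):
    """Hypothetical helper: return None if `data` decodes cleanly,
    otherwise the byte offset of the first undecodable sequence."""
    # The second item returned by lookup() is the stateless decoder.
    decode = codecs.lookup(encoding)[1]
    try:
        decode(data, 'strict')
    except UnicodeDecodeError as exc:
        return exc.start
    return None

clean = validate(b'abc', 'utf-8')         # decodes without error
broken = validate(b'ab\xffcd', 'utf-8')   # 0xFF is never valid in UTF-8
```

With a codec object, the same function could live on the object as
codec.validate(data) instead of taking the encoding name as an argument.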

So I would be very, very happy to see codecs.lookup return a 'codec object'
with the four attributes encode, decode, streamreader and streamwriter
rather than a tuple.
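
A minimal sketch of what such an object might look like, built on top of the
tuple that lookup() returns today.  The names Codec and lookup_object are
made up for illustration and are not a proposed API.

```python
import codecs
import io

class Codec:
    """Hypothetical codec object; name and layout are illustrative only."""

    def __init__(self, name):
        self.name = name
        # Unpack the four items lookup() returns, in their documented order.
        (self.encode, self.decode,
         self.streamreader, self.streamwriter) = codecs.lookup(name)

def lookup_object(name):
    """Hypothetical stand-in for a lookup() that returns an object."""
    return Codec(name)

# Usage: no need to remember the position of the stream writer.
buf = io.BytesIO()
writer = lookup_object('UTF-8').streamwriter(buf)
writer.write('caf\u00e9')
```

Extra routines such as repair() or validate() would then be ordinary methods
on the same class, and adding one would not break callers the way changing
the length of the tuple would.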

- Andy Robinson