[I18n-sig] Codecs
Andy Robinson
andy@reportlab.com
Sun, 4 Jun 2000 23:25:04 +0100
>
> Should codecs be returned to the user as objects instead of tuples?
> Today we have:
>
> (UTF8_encode, UTF8_decode,
> UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')
>
> output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
>
> I think this would be a little simpler:
>
> output = codecs.lookup('UTF-8').stream_writer( open( '/tmp/output', 'wb') )
>
> The object solution is more extensible, requires less "bogus"
> assignments and does not require the user to remember the order of the
> return values.
>
I suggested this a while back, for a different reason. Right now you get
four things back from lookup() relating to the given encoding. But in many
cases there may be other encoding-specific routines of great use, and
returning an object would give us a place to hang them; codec.repair(...)
and codec.validate(...), for example. Well-established, useful code already
exists to repair Shift-JIS or EUC data in which one or two bytes are
corrupt. We would also have a place to hang language-specific routines.
So I would be very, very happy to see codecs.lookup return a 'codec object'
with the four attributes encode, decode, streamreader() and streamwriter()
rather than a tuple.
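
To make the idea concrete, here is a minimal sketch of what such a codec object might look like. It is only an illustration, not a proposed implementation: the class name Codec and the validate() method are hypothetical, and it assumes a present-day Python in which whatever codecs.lookup() returns still unpacks positionally as (encode, decode, streamreader, streamwriter).

```python
import codecs
import io


class Codec:
    """Hypothetical wrapper exposing the lookup() results as attributes."""

    def __init__(self, name):
        entry = codecs.lookup(name)
        self.name = name
        # Positional order as in the tuple form quoted above:
        # encode, decode, streamreader, streamwriter.
        (self.encode, self.decode,
         self.streamreader, self.streamwriter) = entry[:4]

    def validate(self, data):
        """Hypothetical extra routine: True if data decodes cleanly."""
        try:
            self.decode(data)
        except UnicodeError:
            return False
        return True


utf8 = Codec('UTF-8')

# The stream writer is reached by name, not by tuple position.
buffer = io.BytesIO()
output = utf8.streamwriter(buffer)
output.write('caf\u00e9')

print(buffer.getvalue())
print(utf8.validate(b'abc'), utf8.validate(b'\xff'))
```

Encoding-specific extras such as a Shift-JIS repair() routine could then be added in subclasses without changing what callers unpack.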
- Andy Robinson