[I18n-sig] Codecs for Big Five and GB 2312

M.-A. Lemburg mal@lemburg.com
Tue, 31 Oct 2000 19:38:27 +0100


"Martin v. Loewis" wrote:
> 
> > u"abc".encode("mycodecs.my_utf_8")
> >
> > and the encodings search function will take care of the rest.
> >
> > Wouldn't this solve at least some of the problems ?
> 
> I don't think so. The class of applications that I think will need
> codecs first are "internet" applications: processors of HTML, XML,
> MIME. In all these cases, some well-established encoding name is used,
> which should be provided by Python "as-is". That is, if you receive
> data in "shift-jis", having to map this to "japanese.shift_jis" is
> just the same as requiring that "japanese" is imported up-front. In
> the applications that I see, the application does not want to know
> what "shift-jis" is - it just wants Python to convert that to Unicode.

Then have your application register a new search function which
does the necessary aliasing.
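
A minimal sketch of such an aliasing search function, written for
current Python 3 rather than the Python 2.0 of this thread. The alias
name "x-vendor-sjis" and the ALIASES table are made up for
illustration; only codecs.register() and codecs.lookup() are real
library calls.

```python
import codecs

# Hypothetical alias table: names seen on the wire -> codecs Python already has.
ALIASES = {
    "x_vendor_sjis": "shift_jis",
}

def alias_search(name):
    """Codec search function: resolve known aliases, return None otherwise."""
    # Normalize ourselves, since older Python versions pass hyphens through.
    key = name.lower().replace("-", "_").replace(" ", "_")
    target = ALIASES.get(key)
    if target is None:
        return None              # let the other search functions try
    return codecs.lookup(target)

codecs.register(alias_search)

# The application can now use the alias directly.
sample = "\u65e5\u672c\u8a9e"
data = sample.encode("x-vendor-sjis")
text = data.decode("x-vendor-sjis")
```

Search functions are tried in registration order, so this one is only
consulted for names the standard encodings package does not resolve.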

Another possibility would be dropping your shift_jis.py codec
into the sitecodecs package... at your own risk, though, since
it might overwrite some already installed codec.

Using the fully qualified name helps in case you want to use
different codec implementations for the same encoding.
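
As a rough illustration of the fully qualified form, the sketch below
is a hypothetical search function (not the one shipped in the
encodings package) that treats a dotted name as a module path. It
assumes the codec module exposes getregentry(), the convention used by
the standard encodings modules, and it is written for Python 3.
Importing modules named by untrusted input is risky, so a real
application would want to restrict the allowed packages.

```python
import codecs
import importlib

def qualified_search(name):
    """Resolve dotted names such as 'mycodecs.my_utf_8' by importing the module."""
    if "." not in name:
        return None              # plain names are left to other search functions
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Assumption: the module follows the getregentry() convention.
    getregentry = getattr(module, "getregentry", None)
    if getregentry is None:
        return None
    return getregentry()

codecs.register(qualified_search)
```

With this in place, two packages could each ship their own shift_jis
module and the caller would pick one by its package prefix.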

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/