[Python-Dev] transform() and untransform() methods, and the codec registry

Nick Coghlan ncoghlan at gmail.com
Tue Dec 7 06:06:13 CET 2010

On Tue, Dec 7, 2010 at 2:46 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> Having all encodings accessible in a str method only promotes a
> programming style where bytes objects can contain differently encoded
> strings in different parts of the program.  Instead, well-written
> programs should decode bytes on input, do all processing with str type
> and encode on output.  When strings need to be passed to char* C APIs,
> they should be encoded in UTF-8.  Many C APIs originally designed for
> ASCII actually produce meaningful results when given UTF-8 bytes.
> (Supporting such usage was one of the design goals of UTF-8.)

This world sounds nice, but it isn't the one that exists right now.
Practicality beats purity and all that :)
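
For reference, the style described in the quoted text (decode at the
input boundary, work on str, encode at the output boundary) might look
roughly like the sketch below. The function names and the sample data
are hypothetical, chosen only for illustration; they are not taken from
any proposal in this thread.

```python
def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    # Decode exactly once, where bytes enter the program.
    return raw.decode(encoding)


def process(text: str) -> str:
    # All processing operates on str, never on encoded bytes.
    return text.upper()


def write_text(text: str, encoding: str = "utf-8") -> bytes:
    # Encode exactly once, where data leaves the program.
    return text.encode(encoding)


# Hypothetical input: Latin-1 bytes arriving from some legacy source.
raw_in = "caf\u00e9".encode("latin-1")
raw_out = write_text(process(read_text(raw_in, "latin-1")))
```

The catch is that this only works when the input encoding is actually
known at the boundary, which is often not the case in existing code.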


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
