[Python-ideas] Python 3000 TIOBE -3%

Thu Feb 16 16:25:59 CET 2012

Paul Moore writes:

 > Add to this the fact that I *know* I've seen supposed text files with
 > mixed encoding content,

Heck, I've seen *file names* with mixed encoding content.

 > and no-one has *ever* explained how to handle that (it's basically
 > a damaged file, and so all the "right way to deal with Unicode"
 > discussions ignore it)

The right way to handle such a file is ad hoc: operate on the features
you can identify, and treat runs of bytes of unknown encoding as
atomic blobs.

In practice, there is one generic feature of this kind that supports
many applications: runs of ASCII text.  That is the intuition all the
pragmatists start with -- and it's correct.
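
To make that concrete, here is a minimal sketch in Python of the
"ASCII runs plus atomic blobs" approach.  The function names
(split_ascii_runs, join_runs) are made up for illustration, and the
sketch assumes that every byte below 0x80 really is ASCII.  That holds
for UTF-8, the ISO-8859 family, and EUC-JP, but not for UTF-16,
Shift JIS, or ISO-2022-JP, so treat it as one possible reading of the
idea rather than a general solution.

```python
import re

# Any run of bytes >= 0x80 is treated as an opaque blob of unknown encoding.
# Assumption: bytes < 0x80 always mean ASCII (not true for every encoding).
_NON_ASCII_RUN = re.compile(rb"[\x80-\xff]+")


def split_ascii_runs(data):
    """Split bytes into str items (ASCII runs) and bytes items (opaque blobs)."""
    parts = []
    pos = 0
    for match in _NON_ASCII_RUN.finditer(data):
        if match.start() > pos:
            # Text we can identify: decode it so it can be processed as str.
            parts.append(data[pos:match.start()].decode("ascii"))
        # Text we cannot identify: keep the raw bytes untouched.
        parts.append(match.group())
        pos = match.end()
    if pos < len(data):
        parts.append(data[pos:].decode("ascii"))
    return parts


def join_runs(parts):
    """Reassemble bytes from the parts; blobs pass through unchanged."""
    return b"".join(
        p.encode("ascii") if isinstance(p, str) else p for p in parts
    )


if __name__ == "__main__":
    # One line containing a Latin-1 byte and a UTF-8 sequence side by side.
    line = b"name: caf\xe9 / \xe3\x82\xab\n"
    pieces = split_ascii_runs(line)
    print(pieces)
    print(join_runs(pieces) == line)
```

The application can then parse or edit the str items (field names,
delimiters, newlines) and write the bytes items back exactly as they
came in, so nothing in the unidentified runs is damaged along the way.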

 > OK, so maybe I do feel somewhat insulted...

I'm sorry you feel that way.  (I've sided with the pragmatists in this
thread, but on this issue I'm a purist at heart.)