[Python-ideas] Python 3000 TIOBE -3%
Stephen J. Turnbull
stephen at xemacs.org
Thu Feb 16 16:25:59 CET 2012
Paul Moore writes:
> Add to this the fact that I *know* I've seen supposed text files with
> mixed encoding content,
Heck, I've seen *file names* with mixed encoding content.
> and no-one has *ever* explained how to handle that (it's basically
> a damaged file, and so all the "right way to deal with Unicode"
> discussions ignore it)
The right way to handle such a file is ad hoc: operate on the features
you can identify, and treat runs of bytes of unknown encoding as
atomic blobs.
In practice, one such feature is generic enough to support many
applications: runs of ASCII text. That is the intuition all the
pragmatists start with -- and it's correct.
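
As a rough illustration of that approach (a minimal sketch, not code from
this thread; the names split_ascii_runs and transform_ascii are
hypothetical), one could split a byte string into runs of ASCII and runs
of everything else, process only the ASCII runs, and pass the other runs
through untouched:

```python
import re

# Hypothetical helpers for illustration only.
# A run is either all ASCII bytes (0x00-0x7F) or all non-ASCII bytes.
_RUNS = re.compile(rb"[\x00-\x7f]+|[\x80-\xff]+")


def split_ascii_runs(data: bytes):
    """Return (is_ascii, chunk) pairs that cover `data` in order."""
    return [(m.group()[0] < 0x80, m.group()) for m in _RUNS.finditer(data)]


def transform_ascii(data: bytes, func) -> bytes:
    """Apply `func` (str -> str) to ASCII runs; keep other runs as opaque blobs."""
    out = []
    for is_ascii, chunk in split_ascii_runs(data):
        if is_ascii:
            # Safe to decode: the run contains only ASCII bytes.
            # Assumes `func` returns ASCII-only text.
            out.append(func(chunk.decode("ascii")).encode("ascii"))
        else:
            # Unknown encoding: do not guess, just copy the bytes.
            out.append(chunk)
    return b"".join(out)


# A name mixing a Latin-1 byte (0xE9) and a UTF-8 sequence (0xC3 0xAF).
mixed = b"caf\xe9 na\xc3\xafve.txt"
result = transform_ascii(mixed, str.upper)
```

Joining the chunks from split_ascii_runs always reproduces the input, so
nothing is lost even when the encoding of the non-ASCII runs is never
identified.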
> OK, so maybe I do feel somewhat insulted...
I'm sorry you feel that way. (I've sided with the pragmatists in this
thread, but on this issue I'm a purist at heart.)