[Python-ideas] Python 3000 TIOBE -3%

Mon Feb 13 06:54:24 CET 2012

On Mon, Feb 13, 2012 at 3:04 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> The ASCII speakers are a pretty clear-cut case.  Using 'latin-1' as
> the codec, almost all things they can do with a 100% ASCII program and
> a sanely-encoded text (which leaves out Shift JIS, Big 5, and maybe
> some obsolete Vietnamese encodings, but not much else AFAIK) will pass
> through the non-ASCII verbatim, or delete it.

I'd hazard a guess that the non-ASCII-compatible encoding most
likely to be encountered outside Asia is UTF-16. The choice is really
between "never give me UnicodeErrors, but feel free to silently
corrupt the data stream if I do the wrong thing with that data" (i.e.
"latin-1") and "correctly handle any ASCII compatible encoding, but
still throw UnicodeEncodeError if I'm about to emit corrupted data"
("ascii+surrogateescape").
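
To make the trade-off concrete, here is a minimal sketch of the two
options. It assumes the incoming bytes happen to be UTF-8 and that the
program later emits them as UTF-8 without knowing the original
encoding. The sample string and the variable names are purely
illustrative.

```python
# Bytes whose real encoding (UTF-8 here) is unknown to the program.
data = "caf\u00e9".encode("utf-8")  # b'caf\xc3\xa9'

# Option 1: latin-1. Decoding never fails, and the bytes round-trip
# as long as the same codec is used on the way out.
as_latin1 = data.decode("latin-1")
latin1_roundtrip = as_latin1.encode("latin-1") == data

# Emitting with a different codec succeeds silently, but the output
# is mojibake: each original byte is re-encoded as its own character.
corrupted = as_latin1.encode("utf-8")

# Option 2: ascii + surrogateescape. Non-ASCII bytes become lone
# surrogates, which round-trip with the same codec and error handler.
as_escaped = data.decode("ascii", errors="surrogateescape")
escaped_roundtrip = (
    as_escaped.encode("ascii", errors="surrogateescape") == data
)

# Emitting with a strict codec refuses to write the escaped bytes
# rather than corrupting the stream.
try:
    as_escaped.encode("utf-8")
    raised = False
except UnicodeEncodeError:
    raised = True

print(latin1_roundtrip, escaped_roundtrip, corrupted, raised)
```

In both cases the ASCII portion of the text can be processed as
ordinary str data; the difference only shows up at the point where
the non-ASCII part is written back out.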

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia