Python usage numbers
Roy Smith
roy at panix.com
Sun Feb 12 17:27:34 EST 2012
In article <mailman.5739.1329084873.27778.python-list at python.org>,
Chris Angelico <rosuav at gmail.com> wrote:
> On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> > The situation before ascii is like where we ended up *before* unicode.
> > Unicode aims to replace all those byte encoding and character sets with
> > *one* byte encoding for *one* character set, which will be a great
> > simplification. It is the idea of ascii applied on a global rather than
> > local basis.
>
> Unicode doesn't deal with byte encodings; UTF-8 is an encoding, but so
> are UTF-16, UTF-32, and as many more as you could hope for. But
> broadly yes, Unicode IS the solution.
I could hope for one and only one, but I know I'm just going to be
disappointed. The last project I worked on used UTF-8 in most places,
but also used some C and Java libraries which were only available for
UTF-16. So it was transcoding hell all over the place.
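To make the transcoding step concrete, here is a minimal sketch in Python 3 (an assumption; the project described above is not shown). The helper names and the sample text are hypothetical, and "utf-16-le" is picked arbitrarily as the byte order a UTF-16-only library might expect.

```python
# Hypothetical sample text; any Unicode string would do.
text = "naïve café ☃"

# The "most places" representation: UTF-8 bytes.
utf8_bytes = text.encode("utf-8")


def utf8_to_utf16(data: bytes) -> bytes:
    """Transcode UTF-8 bytes to UTF-16 (little-endian, no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data: bytes) -> bytes:
    """Transcode UTF-16 (little-endian, no BOM) bytes back to UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


# Every call into a UTF-16-only library needs a conversion on the way in...
utf16_bytes = utf8_to_utf16(utf8_bytes)
# ...and another one on the way back out.
round_trip = utf16_to_utf8(utf16_bytes)

print(round_trip == utf8_bytes)
```

Each boundary crossing costs a decode plus an encode, which is where the overhead and the opportunities for mistakes come from.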
Hopefully, we will eventually reach the point where storage is so cheap
that nobody minds how inefficient UTF-32 is and we all just start using
that. Life will be a lot simpler then. No more transcoding, a string
will be exactly four bytes per character, and everybody will be happy
again.
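As a rough illustration of that trade-off, the sketch below (Python 3 assumed; the function name and the sample string are hypothetical) compares the encoded size of one string in the three encodings. The "-le" codec variants are used so that no byte order mark is counted.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes of text for each encoding."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(text.encode(name)) for name in encodings}


# Hypothetical sample mixing ASCII, accented Latin letters, and one
# character outside the Basic Multilingual Plane.
sample = "héllo, wörld 😀"
sizes = encoded_sizes(sample)

print(len(sample), "code points")
for name, size in sizes.items():
    print(name, size, "bytes")
```

Only UTF-32 gives a fixed four bytes per code point. A code point is still not always what a reader would call a character (combining marks are separate code points), so even UTF-32 does not make that mapping entirely one-to-one.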