python 2.7 and unicode (one more time)
marko at pacujo.net
Mon Nov 24 07:57:33 CET 2014
Gregory Ewing <greg.ewing at canterbury.ac.nz>:
> Marko Rauhamaa wrote:
>> Unicode strings is not wrong but the technical emphasis on Unicode is as
>> strange as a "tire car" or "rectangular door" when "car" and "door" are
>> what you usually mean.
> The reason Unicode gets emphasised so much is that until relatively
> recently, it *wasn't* what "string" usually meant in Python.
> When Python 3 has been around for as long as Python 2 was, things may
Yes, people call strings "Unicode strings" because Python2 *did have*
unicode strings separate from regular strings:
   Python2             Python3
   string              bytes (byte string)
   unicode string      string
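
A minimal sketch of that mapping as it looks from the Python3 side. It assumes a Python3 interpreter, and the names text, data and describe are made up for the illustration:

```python
# Python3: str holds Unicode code points, bytes holds raw octets.
text = "na\u00efve"            # str; Python2 would have called this a unicode string
data = text.encode("utf-8")    # bytes; roughly what Python2 called a plain string


def describe(value):
    """Return the Python3 type name of value (hypothetical helper)."""
    return type(value).__name__


text_len = len(text)                 # counts code points
data_len = len(data)                 # counts octets; U+00EF takes two in UTF-8
round_trip = data.decode("utf-8")    # back to str

print(describe(text), text_len)
print(describe(data), data_len)
```

The two lengths differ because the str counts characters while the bytes object counts the octets of one particular encoding.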
In Python2 days, Unicode was a fancy, exotic datatype for the
connoisseurs. The rest used strings. Python3 supposedly elevates Unicode
to boring normalcy. Now it's bytes that have fallen into (unmerited)
But old habits die hard; you call cars "automobile cars" instead of
"cars" since, after all, "cars" were always pulled by horses...
PS Maybe interestingly, Guile went through an analogous transition. As
of Guile 2.0,
a character is anything in the Unicode Character Database.
Strings are fixed-length sequences of characters.
A bytevector is a raw bit string.
However, Guile 1.8 still had:
The Guile implementation of character sets currently deals only with
and there were no bytevectors.