Article on the future of Python
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Wed Sep 26 03:23:47 EDT 2012
On Tue, 25 Sep 2012 23:35:39 -0700, wxjmfauth wrote:
> Py 3.3 succeeded to somehow kill unicode and it has been transformed
> into an "American" product for "American" users.
For the first time in Python's history, every build of Python, including
those on platforms where earlier versions shipped as "narrow" builds,
handles strings containing characters outside the Basic Multilingual Plane
correctly, and it does so without doubling or quadrupling the amount of
memory every single string takes up.
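A minimal sketch of what "correctly" means here, assuming Python 3.3 or
later. The sample character (U+1D11E) and the variable names are my own
choices for illustration. Each non-BMP character counts as one element of
the string, whereas a UTF-16 view of the same text, which is what a narrow
build exposed, sees two code units for it:

```python
# U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP.
clef = "\U0001D11E"
text = "a" + clef + "b"

# On Python 3.3+ every code point is exactly one string element.
length = len(text)
middle = text[1]

# Count UTF-16 code units: the non-BMP character needs a surrogate pair,
# so it contributes two units instead of one.
utf16_units = len(text.encode("utf-16-le")) // 2

print(length, middle == clef, utf16_units)
```

With a surrogate-pair representation, len() and indexing operate on code
units, so text[1] would return half a character rather than the clef.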
Strings are ubiquitous in Python -- every module, every variable, every
function, every class is associated with at least one and often many
strings, and they are nearly all ASCII strings. The overhead of using
four bytes instead of one for every string is considerable.
Python finally has correct Unicode handling for characters beyond the BMP,
and it does so with more efficient strings that potentially use as little
as one quarter of the memory that they otherwise would use, at the cost
of a small slowdown in the artificial and unrealistic case that you
repeatedly create millions of strings and then just throw them away
immediately. Most realistic cases of string handling are essentially
unchanged in speed, at worst trivially slower and at best trivially
faster. The real saving is in
memory.
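The memory claim can be checked roughly as follows. This is a sketch that
assumes CPython 3.3 or later, since sys.getsizeof reports
implementation-specific sizes; bytes_per_char is a hypothetical helper
name, not part of any library. It compares two strings of different
lengths so that the fixed per-object header cancels out:

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character for a string made only of `ch`.

    Subtracting the size of n copies from the size of 2n copies
    cancels the fixed object header (CPython-specific assumption).
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

ascii_cost = bytes_per_char("a")            # ASCII character
bmp_cost = bytes_per_char("\u20ac")         # EURO SIGN, inside the BMP
astral_cost = bytes_per_char("\U0001D11E")  # G CLEF, outside the BMP

print(ascii_cost, bmp_cost, astral_cost)
```

On CPython 3.3+ a string is stored at the width of its widest character,
so a pure-ASCII string costs a quarter of what it would if every string
used four bytes per character.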
According to wxjmfauth, this has "killed" Unicode. Judge for yourself his
credibility. As best I can determine, he believes this because Americans
aren't made to suffer for using mostly ASCII strings.
--
Steven