Coding systems are political (was Extended ASCII and code pages)
rustompmody at gmail.com
Sat Jun 4 23:54:17 EDT 2016
On Monday, May 30, 2016 at 12:16:55 AM UTC+5:30, Terry Reedy wrote:
> On 5/29/2016 2:12 AM, Rustom Mody wrote:
> > In short that a € costs more than a $ is a combination of the factors
> > - a natural cause -- there are a million chars to encode (let's assume that the
> > million of Unicode is somehow God-given AS A SET)
> > - an artificial political one -- out of the million-factorial permutations of
> > that million, the one that the Unicode consortium chose is towards satisfying the
> > equation: Keep ASCII users undisturbed and happy
> From the Python developer viewpoint, Unicode might as well be a fact of
> nature. I also note that in English text, a (phoneme) char conveys
> about 6 bits of information, while in Chinese text, a (word) char
> conveys perhaps 15 bits of information. So I argue that Python 3.3+'s
> FSR is being fair in using 1 byte for the first and most often 2 bytes
> for the other.
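To put rough numbers on the "cost" being discussed, here is a small sketch.
It assumes CPython 3.3+ (the FSR of PEP 393); sys.getsizeof is
implementation-specific, and the helper names bytes_per_char and utf8_len
are mine, purely for illustration.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Marginal storage per character for a str made only of `ch`.

    CPython-specific: compares the size of a 1-char string with that of an
    (n + 1)-char string so that the fixed object header cancels out.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n

def utf8_len(ch):
    """Number of bytes `ch` occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

for ch in ("$", "€", "中"):
    print(repr(ch),
          "FSR bytes/char:", bytes_per_char(ch),
          "UTF-8 bytes:", utf8_len(ch))
```

On CPython this should report 1 byte per char for $ and 2 for both € and 中
inside a str, against 1 and 3 bytes respectively in UTF-8.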
Almost a fact of nature -- that's right.
I'm making no complaint against Python,
or Unicode for that matter.
Bismarck's well-known quote: "Politics is the art of the possible",
with its not so well-known additional clause: "... the art of the second best".
Unicode's relation to ASCII is analogous to C++'s relation to C.
Ask a typical C++ programmer about style/paradigm etc. and you'll hear something
unctuous about how C-style is terrible.
Then ask why the question of C arises at all when it's so unfit and obsolete,
i.e. why build C++ on a C base,
and you'll get vague, philosophical BS on pragmatism etc.
In short: when it suits, exploit C; when it suits, abuse it.
Unicode is likewise:
The whole point of Unicode is to go beyond ASCII.
And yet ASCII is allocated the prime real estate of the lowest 128
code points -- with all of ASCII's control-char wastage preserved intact.
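
For anyone who wants to check the size of that "wastage", a quick sketch
using the standard unicodedata module. It counts the code points below 128
whose general category is Cc (control); the function name
control_chars_in_ascii is just an illustrative choice of mine.

```python
import unicodedata

def control_chars_in_ascii():
    """Code points below 128 whose Unicode general category is Cc (control)."""
    return [cp for cp in range(128) if unicodedata.category(chr(cp)) == "Cc"]

controls = control_chars_in_ascii()
print(len(controls), "of the lowest 128 code points are control characters")

# Every code point below 128 gets a 1-byte UTF-8 encoding;
# the very next code point, U+0080, already needs 2 bytes.
one_byte = all(len(chr(cp).encode("utf-8")) == 1 for cp in range(128))
print("all of 0..127 encode to 1 byte:", one_byte)
print("U+0080 encodes to", len(chr(128).encode("utf-8")), "bytes")
```

That gives 33 control characters (U+0000 to U+001F plus U+007F), i.e. roughly
a quarter of the one-byte range in UTF-8.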