Coding systems are political (was Extended ASCII and code pages)

Rustom Mody rustompmody at gmail.com
Sat Jun 4 23:54:17 EDT 2016


On Monday, May 30, 2016 at 12:16:55 AM UTC+5:30, Terry Reedy wrote:
> On 5/29/2016 2:12 AM, Rustom Mody wrote:
> 
> > In short that a € costs more than a $ is a combination of the factors
> > - a natural cause -- there are a million chars to encode (let's assume that the
> > million of Unicode is somehow God-given AS A SET)
> > - an artificial political one -- out of the million-factorial permutations of
> > that million, the one that the Unicode consortium chose is towards satisfying the
> > equation: Keep ASCII users undisturbed and happy
> 
>  From the Python developer viewpoint, Unicode might as well be a fact of 
> nature.  I also note that in English text, a (phoneme) char conveys 
> about 6 bits of information, while in Chinese text, a (word) char 
> conveys perhaps 15 bits of information.  So I argue that Python 3.3+'s 
> FSR is being fair in using 1 byte for the first and most often 2 bytes 
> for the other.

Almost a fact of nature -- that's right.
I'm making no complaint against Python,
or Unicode for that matter.
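
For anyone who wants to see the per-character cost Terry describes, here is a
rough sketch. It assumes CPython 3.3+ (the flexible string representation of
PEP 393); bytes_per_char is a hypothetical helper name, and sys.getsizeof
figures are an implementation detail that other Pythons need not match.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character by comparing two string lengths.

    Hypothetical helper: subtracting the sizes cancels the fixed object
    header, leaving only the per-character cost. CPython-specific.
    """
    shorter = ch * 1
    longer = ch * (n + 1)
    return (sys.getsizeof(longer) - sys.getsizeof(shorter)) // n

# Dollar sign, euro sign, a CJK character, and one outside the BMP.
for ch in ("$", "\u20ac", "\u4e2d", "\U0001F600"):
    print(f"U+{ord(ch):04X}", bytes_per_char(ch))
```

So inside a CPython str the euro sign and a typical Chinese character cost the
same 2 bytes each, and only ASCII and Latin-1 text gets the 1-byte form.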

Bismarck's well-known quote: "Politics is the art of the possible."
The not-so-well-known additional clause: "... the art of the second best."

Unicode's relation to ASCII is analogous to C++'s relation to C.
Ask a typical C++ programmer about style, paradigm, etc. and you'll hear something
unctuous about how C style is terrible.
Then ask why the question of C arises at all when it's so unfit and obsolete,
i.e. why build C++ on a C base,
and you'll get vague, philosophical BS about pragmatism etc.

In short: when it suits, exploit C; when it suits, abuse it.

Unicode is likewise:
The whole point of Unicode is to go beyond ASCII.
And yet ASCII is allocated the prime real estate of the lowest 128
code points of Unicode -- all the control-char wastage preserved intact.
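
To put numbers on that, a small sketch using only the standard library. It
counts how many of the first 128 code points are control characters (general
category Cc), and compares the UTF-8 cost of $ and the euro sign. The variable
names are mine, purely for illustration.

```python
import unicodedata

# The first 128 Unicode code points coincide with ASCII.
low_128 = [chr(cp) for cp in range(128)]

# Control characters have general category "Cc":
# U+0000..U+001F plus U+007F (DEL).
controls = [c for c in low_128 if unicodedata.category(c) == "Cc"]
print(len(controls), "of", len(low_128), "are control characters")

# UTF-8 cost: one byte inside the ASCII range, more outside it.
utf8_cost = {ch: len(ch.encode("utf-8")) for ch in ("$", "\u20ac")}
for ch, nbytes in utf8_cost.items():
    print(f"U+{ord(ch):04X}", nbytes, "byte(s) in UTF-8")
```

That is, 33 of the 128 cheapest slots go to control characters, while the euro
sign at U+20AC takes 3 bytes in UTF-8 against 1 for the dollar sign.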

