How to waste computer memory?
Marko Rauhamaa
marko at pacujo.net
Sun Mar 20 11:36:09 EDT 2016
Ben Bacarisse <ben.usenet at bsb.me.uk>:
> It's 21. The reason being (or at least part of the reason being) that
> 21 bits can be UTF-8 encoded in 4 bytes: 11110xxx 10xxxxxx 10xxxxxx
> 10xxxxxx (3 + 3*6).
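
The bit count quoted above can be checked with a small sketch. The helper name utf8_four_byte is made up for illustration and is not a library function. It packs a value into the 4-byte pattern by hand and accepts the full 21-bit payload, including values above U+10FFFF that Python's own encoder refuses:

```python
PAYLOAD_BITS = 3 + 3 * 6  # 3 bits in the lead byte, 6 in each of 3 continuation bytes


def utf8_four_byte(cp: int) -> bytes:
    """Pack cp into the pattern 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.

    Hypothetical helper: it accepts anything that needs the 4-byte form and
    fits in 21 bits, even values beyond Unicode's actual limit of U+10FFFF.
    """
    if not 0x10000 <= cp <= 0x1FFFFF:
        raise ValueError("value does not need, or does not fit, the 4-byte form")
    return bytes([
        0xF0 | (cp >> 18),            # 11110xxx: top 3 bits
        0x80 | ((cp >> 12) & 0x3F),   # 10xxxxxx: next 6 bits
        0x80 | ((cp >> 6) & 0x3F),    # 10xxxxxx: next 6 bits
        0x80 | (cp & 0x3F),           # 10xxxxxx: low 6 bits
    ])


# Within the Unicode range, the hand-rolled packing agrees with the built-in encoder.
assert utf8_four_byte(0x1F600) == chr(0x1F600).encode("utf-8")
```

The pattern itself has room for values up to 0x1FFFFF, so the lower U+10FFFF limit has to come from somewhere other than UTF-8.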
I bet the reason is UTF-16. Microsoft and Sun/Oracle would have insisted
on a maximum of 4 bytes per character. UTF-16 can just barely squeeze 21
bits into the scheme (surrogate pairs reach only as far as U+10FFFF), and
only at the expense of creating an ugly hole inside Unicode: the surrogate
range U+D800..U+DFFF. Politics, politics.
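
For illustration, here is a rough sketch of the surrogate-pair arithmetic. The function name utf16_surrogates is hypothetical. It only shows where the U+10FFFF ceiling and the reserved range come from, and says nothing about who insisted on what:

```python
def utf16_surrogates(cp: int) -> tuple[int, int]:
    """Split a code point above the BMP into a UTF-16 surrogate pair.

    Hypothetical helper, written out by hand for illustration.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogate pairs")
    v = cp - 0x10000                 # 20-bit offset above the BMP
    high = 0xD800 | (v >> 10)        # top 10 bits    -> U+D800..U+DBFF
    low = 0xDC00 | (v & 0x3FF)       # bottom 10 bits -> U+DC00..U+DFFF
    return high, low


# Two 16-bit units carry 20 bits of offset, so the ceiling is
# 0x10000 + 0xFFFFF = 0x10FFFF, a value that needs 21 bits to write down.
assert utf16_surrogates(0x10FFFF) == (0xDBFF, 0xDFFF)
assert (0x10FFFF).bit_length() == 21

# The "hole": U+D800..U+DFFF is reserved for the pairs themselves,
# so Python refuses to encode a lone surrogate as UTF-8.
try:
    chr(0xD800).encode("utf-8")
    lone_surrogate_ok = True
except UnicodeEncodeError:
    lone_surrogate_ok = False
assert not lone_surrogate_ok
```

The pair mechanism covers 16 supplementary planes of 65536 code points each; with the BMP that makes 17 * 65536 = 1114112 code points in total.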
Marko