Flexible string representation, unicode, typography, ...
Ben Finney
ben+python at benfinney.id.au
Sat Aug 25 03:54:55 EDT 2012
wxjmfauth at gmail.com writes:
> Unicode design: a flat table of code points, where all code
> points are "equals".
Yes, Unicode's design entails a flat table of hundreds of thousands of
code points, with room to expand in future.
This is in direct conflict with the design of all significant computers
we need to write software for: data is stored and transported as 8-bit
bytes, each of which can only ever hold 256 different values, with no
room for expansion.
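
To make the mismatch concrete, here is a minimal Python 3 sketch. The
sample characters and the helper name `describe` are arbitrary choices
for illustration, and UTF-8 is only one of several possible encodings.
Any code point above 255 cannot fit in a single byte, so an encoding has
to spread it across several:

```python
# Sketch: code points versus the bytes needed to store them.
# The sample characters are arbitrary illustrations.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def describe(ch):
    """Return (code point, UTF-8 encoded bytes) for a single character."""
    return ord(ch), ch.encode("utf-8")


for ch in samples:
    code_point, encoded = describe(ch)
    # One code point may need anywhere from 1 to 4 bytes in UTF-8.
    print(f"U+{code_point:04X} -> {len(encoded)} byte(s): {encoded!r}")
```
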
> As soon as one attempts to escape from this rule, one has to
> "pay" for it.
Yes, in either direction: the conflict means that trade-offs need to be
made.
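
One way to see such a trade-off, as a rough sketch only (the sample text
and the helper name `encoded_sizes` are assumptions made for
illustration): the same text costs a different number of bytes depending
on the encoding chosen. A fixed-width encoding keeps every code point
the same size at the price of more storage, while a variable-width
encoding is more compact but makes code points unequal in size.

```python
# Sketch: byte cost of the same text under three standard encodings.
# The "-le" variants are used so that no byte-order mark is counted.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(text):
    """Map each encoding name to the number of bytes it needs for text."""
    return {name: len(text.encode(name)) for name in ENCODINGS}


sample = "na\u00efve caf\u00e9 \u20ac5"  # 13 code points, some non-ASCII
for name, size in encoded_sizes(sample).items():
    print(f"{name:10} {size:3} bytes for {len(sample)} code points")
```
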
See this presentation by Ned Batchelder, “Pragmatic Unicode”
<URL:http://nedbatchelder.com/text/unipain.html>, which lays out the
fundamental conflict of representing human text in computer data, along
with several practical approaches to dealing with it.
--
\ “I busted a mirror and got seven years bad luck, but my lawyer |
`\ thinks he can get me five.” —Steven Wright |
_o__) |
Ben Finney