Grapheme clusters, a.k.a. real characters
Rhodri James
rhodri at kynesim.co.uk
Thu Jul 20 12:46:46 EDT 2017
On 20/07/17 16:18, Rustom Mody wrote:
> So coming to the point:
> It's not whether Einstein or Mencken¹ is right but rather that Mencken applies to
> 1 whereas Einstein applies to 3
>
> And (IMHO) text should be squarely classed in 3 not 1
>
> The gmas of this world have made shopping lists, written (and taught to write)
> letters [my gpa wrote books] long before CS and before any of us existed.
>
> And if suddenly text has moved from being obvious to anyone to something arcane
> involving
> - codepoints (which are abstract and platonic)
> - (≠) glyphs
> - (that fit into) octets (whatever that may be except they are not bytes)
> - And all other manner of Unicode-gobbledygook
> Something somewhere is wrong
The something that is wrong is a failure to consider the necessary
_depth_ of knowledge. The shallow (read: obvious and intuitive)
definition of text works just fine in the context of grandma's shopping
list or granddad's book, localised environments with heavily
circumscribed usage patterns. It breaks down in the global environments
we've been talking about in much the same way that the obvious and
intuitive definition of numbers breaks down when you start considering
infinities, or Newtonian mechanics breaks down near the speed of light,
or pretty much everything intuitive breaks down at quantum scales.
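
As a small illustration of where the intuitive definition stops working, here
is a minimal sketch using only the standard library. The helper
approx_grapheme_count is hypothetical and deliberately crude: it just skips
combining marks, which is an assumption that falls well short of the full
grapheme cluster rules in Unicode Standard Annex #29 (emoji sequences, Hangul
jamo and so on are not handled).

```python
import unicodedata


def approx_grapheme_count(text: str) -> int:
    """Roughly count user-perceived characters.

    Hypothetical helper: counts every code point that is not a combining
    mark. This is an approximation, not the UAX #29 segmentation algorithm.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


# The same word as a reader sees it, spelled two different ways.
composed = "caf\u00e9"      # e-acute as one precomposed code point
decomposed = "cafe\u0301"   # plain e followed by COMBINING ACUTE ACCENT

# Code point counts differ even though the rendered text looks identical.
print(len(composed), len(decomposed))

# A naive comparison says the strings differ; normalising first does not.
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)

# The encoded byte (octet) counts are different again.
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))

# The crude grapheme count agrees with what grandma would say.
print(approx_grapheme_count(composed), approx_grapheme_count(decomposed))
```

Grandma's answer is "four letters" in both cases; the code point and octet
counts are where the extra depth shows up.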
--
Rhodri James *-* Kynesim Ltd