Glyphs and graphemes [was Re: Cult-like behaviour]
Richard at Damon-family.org
Mon Jul 16 21:48:42 EDT 2018
> On Jul 16, 2018, at 9:21 PM, Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
>> On Mon, 16 Jul 2018 19:02:36 -0400, Richard Damon wrote:
>> You are defining a variable/fixed width codepoint set. Many others want
>> to deal with CHARACTER sets.
> Good luck coming up with a universal, objective, language-neutral,
> consistent definition for a character.
Who says there needs to be one? A good engineer will use the definition that is most appropriate to the task at hand. Some things need very solid definitions, and some things don’t.
This goes back to my original point, where I said some people consider UTF-32 a variable-width encoding. For very many practical purposes, the ‘codepoint’ isn’t the important thing, so the fact that every UTF-32 code point takes the same number of bytes or code units isn’t that important. They are dealing with something that needs to be rendered, and preserving larger units, like the grapheme, is important.
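To make the distinction concrete, here is a minimal Python sketch. The helper name approx_grapheme_count is made up for illustration, and it only skips combining marks, so it is a rough approximation and not real grapheme cluster segmentation (it will miscount things like emoji joined with ZWJ).

```python
import unicodedata

def approx_grapheme_count(text):
    """Rough count of user-perceived characters.

    Assumption for illustration only: every code point with a non-zero
    canonical combining class attaches to the preceding base character.
    This is not full grapheme cluster segmentation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

composed = "\u00e9"      # e-acute as one precomposed code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

for s in (composed, decomposed):
    code_points = len(s)
    utf32_bytes = len(s.encode("utf-32-le"))  # always 4 bytes per code point
    print(code_points, utf32_bytes, approx_grapheme_count(s))
```

Both strings render as the same single character, but the second takes two code points (8 bytes in UTF-32), which is the sense in which UTF-32 looks variable width to someone who cares about graphemes.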
>> This doesn’t mean that UTF-32 is an awful system, just that it isn’t the
>> magical cure that some were hoping for.
> Nobody ever claimed it was, except for the people railing that since it
> isn't a magical system we ought to go back to the Good Old Days of code
> page hell, or even further back when everyone just used ASCII.
Sometimes ASCII is good enough, especially on a small machine with limited resources. Sometimes you do need to use a ‘Code Page’ because of limited resources (and that unit will only be able to talk a single language because of that too). Sometimes you have the luxury of being able to use a somewhat complete Unicode implementation. Sometimes you are never going to be displaying anything, and you can mostly just treat everything as a bag of bytes. You use the tool that is right for the job.
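As a rough illustration of that trade-off, here is a small Python sketch. The sample strings and the helper name fits are my own choices for the example, and cp1252 simply stands in for "a code page"; a real small device might use something quite different.

```python
text = "caf\u00e9"  # "café", one non-ASCII character

# Storage cost of the same text under three encodings.
sizes = {enc: len(text.encode(enc)) for enc in ("cp1252", "utf-8", "utf-32-le")}
print(sizes)

def fits(s, encoding):
    """Return True if every character of s can be represented in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

greek = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac"  # "Ελληνικά"

print(fits("cafe", "ascii"))   # plain ASCII is enough here
print(fits(text, "ascii"))     # the accented letter does not fit in ASCII
print(fits(text, "cp1252"))    # a Western European code page handles it
print(fits(greek, "cp1252"))   # but that same code page cannot do Greek
```

The code page is the most compact of the three for this text, at the price of being tied to one family of languages.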
> Steven D'Aprano
> "Ever since I learned about confirmation bias, I've been seeing
> it everywhere." -- Jon Ronson