Glyphs and graphemes [was Re: Cult-like behaviour]

Steven D'Aprano steve+comp.lang.python at
Tue Jul 17 03:50:57 EDT 2018

On Tue, 17 Jul 2018 08:26:45 +0300, Marko Rauhamaa wrote:

> Steven D'Aprano <steve+comp.lang.python at>:
>> On Mon, 16 Jul 2018 22:51:32 +0300, Marko Rauhamaa wrote:
>>> UTF-8 bytes can only represent the first 128 code points of Unicode.
>> This is DailyWTF material. Perhaps you want to rethink your wording and
>> maybe even learn a bit more about Unicode and the UTF encodings before
>> making such statements.
>> The idea that UTF-8 bytes cannot represent the whole of Unicode is not
>> even wrong. Of course a *single* byte cannot, but a single byte is not
>> "UTF-8 bytes".
> So I hope that by now you have understood my point and been able to
> decide if you agree with it or not.

If your point was not what you wrote, then no, I'm sorry, my crystal ball 
unexpectedly broke down (why it didn't foresee its own failure I'll never 
know...). I can't tell what you are thinking, only what you write. 
Sometimes I can guess (like my earlier guess that you meant grapheme, 
rather than glyph) but in this case, if you mean something other than 

"UTF-8 bytes can only represent the first 128 code points of Unicode"

I'm flummoxed.
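
For anyone following along, a quick sketch in Python makes the point about the quoted claim concrete. The helper name utf8_hex and the choice of sample characters are made up for illustration. It only shows that code points above 127 encode to multi-byte UTF-8 sequences and decode back unchanged.

```python
# Minimal sketch: code points beyond the first 128 are representable in
# UTF-8; they simply take more than one byte each.

def utf8_hex(ch):
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))

# Sample code points chosen to need 1, 2, 3 and 4 bytes respectively.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Decoding the bytes recovers the original code point.
    assert encoded.decode("utf-8") == ch
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {utf8_hex(ch)}")
```

Only the first sample is below U+0080 and fits in one byte. The other three are well past the first 128 code points and still round-trip through UTF-8 bytes.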

Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson