Glyphs and graphemes [was Re: Cult-like behaviour]

Marko Rauhamaa marko at
Mon Jul 16 16:54:35 EDT 2018

Chris Angelico <rosuav at>:
> Challenge: Reverse a string in UTF-8.

Counter-challenge: Reverse a Unicode string:

   >>> s = "a\u0304e"
   >>> s
   'āe'
   >>> L = list(s)
   >>> L.reverse()
   >>> "".join(L)
   'ēa'
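
The macron ends up on the wrong letter because list(s) splits the string
into code points, not user-perceived characters. A minimal sketch of a
reversal that keeps combining marks attached to their base character
follows. The helper names clusters and reverse_clusters are hypothetical,
and unicodedata.combining() is only an approximation of the grapheme
cluster rules in UAX #29 (it does not handle ZWJ emoji sequences or Hangul
jamo, for example).

```python
import unicodedata

def clusters(text):
    """Group each base character with the combining marks that follow it.

    Approximation only: a code point with a nonzero canonical combining
    class is glued to the previous group; everything else starts a new one.
    """
    out = []
    for ch in text:
        if out and unicodedata.combining(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out

def reverse_clusters(text):
    """Reverse the order of the groups, not of the individual code points."""
    return "".join(reversed(clusters(text)))
```

With s = "a\u0304e", reverse_clusters(s) gives "ea\u0304", so the macron
stays on the "a".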

> Challenge: Center text in UTF-8.

Counter-challenge: Center a Unicode string:

   >>> t = s * 3
   >>> t
   'āeāeāe'
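
Here t holds 9 code points but occupies 6 columns on a typical terminal,
so str.center(), which pads by len(), comes up short. A rough sketch of
padding by display width instead is given below. The names display_width
and center_display are hypothetical, and the sketch assumes one column per
non-combining character, which ignores wide East Asian characters and
other zero-width code points.

```python
import unicodedata

def display_width(text):
    """Count columns, assuming combining marks take no column of their own."""
    return sum(1 for ch in text if not unicodedata.combining(ch))

def center_display(text, width, fill=" "):
    """Center text in a field of `width` columns, padding by display width."""
    pad = max(width - display_width(text), 0)
    left = pad // 2
    return fill * left + text + fill * (pad - left)
```

For example, center_display(t, 10) adds two spaces on each side, whereas
t.center(10) adds only one space in total because len(t) is 9.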

> Challenge: Given a (non-initial) character in a buffer of UTF-8 bytes,
> find the immediately preceding character.

The counter-challenge is left as an exercise for the reader.
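
For comparison, one possible sketch of both versions follows. The function
names are hypothetical. The byte version assumes a valid UTF-8 buffer and
an offset that falls on a character boundary; the string version again
approximates "character" with unicodedata.combining() rather than the full
UAX #29 rules.

```python
import unicodedata

def prev_char_start(buf, i):
    """Byte offset of the code point preceding the one that starts at i.

    Assumes buf is valid UTF-8 and 0 < i <= len(buf). Continuation bytes
    have the bit pattern 10xxxxxx, so step back until a lead byte is found.
    """
    i -= 1
    while i > 0 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    return i

def prev_cluster_start(text, i):
    """Index of the base character preceding the one that starts at i.

    Assumes 0 < i <= len(text). Steps back over combining marks so that
    the result points at a base character, not at a mark.
    """
    i -= 1
    while i > 0 and unicodedata.combining(text[i]):
        i -= 1
    return i
```

Both loops have the same shape: the UTF-8 version skips continuation
bytes, the code point version skips combining marks.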

> All of these are fundamentally difficult by nature, but if you index
> by code points, you eliminate one level of difficulty; indexing by
> bytes retains all the existing difficulty and adds another layer.

Oh, sorry. I thought you were suggesting Unicode strings would make the
challenges somehow easy.

