Grapheme clusters, a.k.a.real characters
Steve D'Aprano
steve+python at pearwood.info
Thu Jul 13 20:35:41 EDT 2017
>From time to time, people discover that Python's string algorithms work on code
points rather than "real characters", which can lead to anomalies like the
following:
s = 'xäex'
s = unicodedata.normalize('NFD', s)
print(s)
print(s[::-1])
which results in:
xäex
xëax
If you're interested in this issue, there's an issue on the bug tracker about
it, which is seeing some activity.
http://bugs.python.org/issue30717
--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
More information about the Python-list
mailing list