[issue19819] reversing a Unicode ligature doesn't work
Christian Heimes
report at bugs.python.org
Thu Nov 28 01:07:37 CET 2013
Christian Heimes added the comment:
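There is no ligature for "lff", only for "ffl". A ligature is treated as a single character, so reversing the string moves it as one unit instead of reversing the letters inside it. I guess Python would have to grow a str.reverse() method to handle ligatures and combining chars correctly.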
At work I ran into the issue with ligatures and combining chars multiple times in medieval and early modern age scripts. Eventually I started to normalize all incoming data to NFKC. That solves most of the issues.
>>> import unicodedata
>>> s = b'ba\xef\xac\x84e'.decode('utf-8')
>>> print("".join(reversed(s)))
efflab
>>> print("".join(reversed(unicodedata.normalize("NFKC", s))))
elffab
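
A rough sketch of what such a reverse helper might look like, as an illustration only. The name reverse_text and its form parameter are hypothetical and not part of Python. It assumes that NFKC normalization is acceptable for the data and that keeping combining marks attached to the preceding base character is enough. It does not implement full grapheme cluster segmentation, so emoji sequences and similar cases are not covered.

```python
import unicodedata


def reverse_text(s, form="NFKC"):
    """Reverse s after normalization, keeping combining marks with their base.

    Hypothetical helper, not an existing str method.
    """
    # Compatibility normalization expands ligatures such as U+FB04 into
    # their individual letters ("ffl"), so they can be reversed letter by letter.
    normalized = unicodedata.normalize(form, s)

    # Group every combining mark (non-zero canonical combining class)
    # with the character that precedes it.
    clusters = []
    for ch in normalized:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)

    # Reverse the clusters, not the individual code points.
    return "".join(reversed(clusters))


s = b'ba\xef\xac\x84e'.decode('utf-8')
print(reverse_text(s))
```

For the string used in the session above, this prints the same "elffab" as the NFKC line. The difference from a plain reversed() only shows up when a combining mark survives normalization, for example "q" followed by U+0301, which has no precomposed form.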
----------
nosy: +christian.heimes
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19819>
_______________________________________