[issue19819] reversing a Unicode ligature doesn't work

Christian Heimes report at bugs.python.org
Thu Nov 28 01:07:37 CET 2013


Christian Heimes added the comment:

There is no ligature for "lff", only for "ffl" (U+FB04). A ligature is a single code point, so reversed() moves it as one char. I guess Python would have to grow a str.reverse() method to handle ligatures and combining chars correctly.
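
A rough sketch of what such a method might do for combining chars, assuming that "correctly" means every combining mark stays attached to the base char it follows. The name reverse_keep_marks is made up for illustration and is not a Python API. It only looks at the canonical combining class, so it is not full grapheme cluster segmentation.

```python
import unicodedata


def reverse_keep_marks(text):
    """Reverse text, keeping combining marks after their base character.

    Hypothetical helper for illustration only. A code point whose canonical
    combining class is non-zero is glued to the preceding cluster; then the
    clusters are reversed instead of the individual code points.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            # Combining mark: append it to the cluster it belongs to.
            clusters[-1] += ch
        else:
            # Base character (or a stray leading mark): start a new cluster.
            clusters.append(ch)
    return "".join(reversed(clusters))
```

With this helper, "a" + "e\u0301" + "b" reverses to "b" + "e\u0301" + "a", while a plain reversed() would move the accent in front of the "e". A ligature such as U+FB04 still moves as one unit, since it is a single code point.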

At work I ran into the issue with ligatures and combining chars multiple times in medieval and early modern age scripts. Eventually I started to normalize all incoming data to NFKC. That solves most of the issues.

>>> import unicodedata
>>> s = b'ba\xef\xac\x84e'.decode('utf-8')  # "ba" + U+FB04 (ffl ligature) + "e"
>>> print("".join(reversed(s)))
eﬄab
>>> print("".join(reversed(unicodedata.normalize("NFKC", s))))
elffab
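
To illustrate why NFKC solves most but not all of the issues, here is a small sketch. The helper name clean is hypothetical, and the sample strings are picked for illustration rather than taken from real data. NFKC expands the ligature and composes a mark when a precomposed char exists, but a mark without a precomposed form stays a separate code point.

```python
import unicodedata


def clean(text):
    # Hypothetical ingest helper: normalize incoming text to NFKC.
    return unicodedata.normalize("NFKC", text)


# U+FB04 LATIN SMALL LIGATURE FFL is expanded to the three letters "ffl".
ligature = clean("ba\ufb04e")

# "e" + U+0301 COMBINING ACUTE ACCENT composes to the single char U+00E9.
composed = clean("e\u0301")

# "u" + U+0364 COMBINING LATIN SMALL LETTER E has no precomposed form,
# so it is still two code points after normalization.
medieval = clean("u\u0364")
```

In the last case a code point-wise reversed() would still separate the mark from its base, so normalization alone does not cover it.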

----------
nosy: +christian.heimes

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19819>
_______________________________________

