[issue10254] unicodedata.normalize('NFC', s) regression
report at bugs.python.org
Tue Dec 21 20:24:03 CET 2010
Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:
In the new patch, issue10254b.diff, I've added a test that would crash unpatched code:
>>> unicodedata.normalize('NFC', 'C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸C̸Ç')
Martin, I still feel uneasy about the fixed size of the skipped buffer. It is not obvious that skipped combining characters always get removed from the buffer before the next starter is processed.
I would really like another pair of eyes to look at this code before it goes in, especially to 2.6.
IIRC, you did some stress testing on random data. I wonder if you could test this code after tightening the assert to cskipped < 4. (The current theory is that this should be enough.)
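For reference, a stress test along those lines could look roughly like the sketch below. This is only an illustration, not the test in the patch: the alphabet, the helper names (random_text, stress_nfc) and the two invariants checked are my own assumptions. It feeds random mixes of starters and combining marks through normalize() and checks that NFC is idempotent and canonically equivalent to its input. The tightened assert lives in the C code, so presumably it would only fire when this is run against a debug build.

```python
import random
import unicodedata

# Hypothetical alphabet: a few starters plus combining marks with
# different canonical combining classes (U+0338 is the mark used in
# the crashing example above).
STARTERS = "CcAaeo="
COMBINING = "\u0338\u0327\u0301\u0308\u0323\u0304"


def random_text(rng, length):
    """Build a random string; marks are weighted so long runs are common."""
    pool = STARTERS + COMBINING * 3
    return "".join(rng.choice(pool) for _ in range(length))


def stress_nfc(rounds=10000, max_len=40, seed=0):
    """Normalize random strings and check two NFC invariants."""
    rng = random.Random(seed)
    for _ in range(rounds):
        s = random_text(rng, rng.randint(0, max_len))
        nfc = unicodedata.normalize("NFC", s)
        # Normalizing an already normalized string must not change it.
        assert unicodedata.normalize("NFC", nfc) == nfc
        # NFC output must be canonically equivalent to the input,
        # i.e. both must decompose to the same NFD string.
        assert unicodedata.normalize("NFD", nfc) == unicodedata.normalize("NFD", s)
    return rounds


if __name__ == "__main__":
    print(stress_nfc(), "random strings normalized without error")
```

A fixed seed keeps any failure reproducible; varying the seed or raising max_len widens the search.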
Added file: http://bugs.python.org/file20131/issue10254b.diff
Python tracker <report at bugs.python.org>