[Python-Dev] Bug in PyLocale_strcoll
Andreas Degert
ad at papyrus-gmbh.de
Sat Nov 20 21:42:53 CET 2004
Hello,
I think I found a bug in PyLocale_strcoll() (Python 2.3.4). When called
with two unicode strings, it converts them to wchar_t strings and passes
them to wcscoll(). The bug is that the wchar_t strings are not 0-terminated.
Checking with
    assert(ws1[len1-1] == 0 && ws2[len2-1] == 0);
right before the line
    result = PyInt_FromLong(wcscoll(ws1, ws2));
confirms the bug. I'm not quite sure what the best fix is.
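To illustrate the problem outside of Python, here is a minimal sketch. It is not the CPython code: copy_chars_only() and copy_into_stale_buffer() are made-up helpers, and the buffer is pre-filled with '?' only to imitate whatever malloc might hand back. It shows that a buffer of n+1 slots filled by a copy of n characters has no terminator unless one is written explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for a copy routine that writes n characters
 * and no terminating 0. Returns the number of characters copied. */
static size_t copy_chars_only(wchar_t *dst, const wchar_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n * sizeof(wchar_t));
    return n;
}

/* Allocates wcslen(text) + 1 slots, fills them with a non-zero
 * sentinel (imitating stale heap contents), then copies the
 * characters of text without a terminator. The buffer length is
 * stored in *len_out. Returns NULL if allocation fails. */
static wchar_t *copy_into_stale_buffer(const wchar_t *text, size_t *len_out)
{
    size_t len = wcslen(text) + 1;
    wchar_t *buf = malloc(len * sizeof(wchar_t));
    if (buf == NULL)
        return NULL;
    wmemset(buf, L'?', len);
    copy_chars_only(buf, text, len - 1);
    *len_out = len;
    return buf;
}

/* Same check as the assert above: is the last slot a 0? */
static int is_terminated(const wchar_t *buf, size_t len)
{
    return len > 0 && buf[len - 1] == 0;
}
```

A buffer returned by copy_into_stale_buffer() fails the is_terminated() check until the caller stores 0 in its last slot; only after that is it safe to hand to wcscoll().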
PyUnicode_AsWideChar() copies the unicode chars, but not the
terminating 0-char of the unicode string (which is not used in Python,
but it's there anyhow, if I understand the implementation
correctly). So one fix would be to change PyUnicode_AsWideChar() to copy
the terminating 0-char if there's enough space in the output
buffer. Another fix would be to terminate the strings in
PyLocale_strcoll() before using them:
----------------------------------------------------------
--- _localemodule.c~ Sat Nov 20 21:33:17 2004
+++ _localemodule.c Sat Nov 20 21:35:04 2004
@@ -353,15 +353,19 @@
         PyErr_NoMemory();
         goto done;
     }
-    if (PyUnicode_AsWideChar((PyUnicodeObject*)os1, ws1, len1) == -1)
+    len1 = PyUnicode_AsWideChar((PyUnicodeObject*)os1, ws1, len1);
+    if (len1 == -1)
         goto done;
+    ws1[len1-1] = 0;
     ws2 = PyMem_MALLOC(len2 * sizeof(wchar_t));
     if (!ws2) {
         PyErr_NoMemory();
         goto done;
     }
-    if (PyUnicode_AsWideChar((PyUnicodeObject*)os2, ws2, len2) == -1)
+    len2 = PyUnicode_AsWideChar((PyUnicodeObject*)os2, ws2, len2);
+    if (len2 == -1)
         goto done;
+    ws2[len2-1] = 0;
     /* Collate the strings. */
     result = PyInt_FromLong(wcscoll(ws1, ws2));
   done:
----------------------------------------------------------
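For comparison, the first option (having the copy routine itself write the terminator when the output buffer has room) could look roughly like the sketch below. copy_with_terminator() is a hypothetical helper with an invented signature, not PyUnicode_AsWideChar() or any other existing function.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: copies at most `size` characters from src,
 * which holds src_len characters, into dst. If dst still has a free
 * slot after the copy, a terminating 0 is stored there. Returns the
 * number of characters copied, not counting the terminator. */
static size_t copy_with_terminator(wchar_t *dst, size_t size,
                                   const wchar_t *src, size_t src_len)
{
    size_t n = src_len < size ? src_len : size;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n * sizeof(wchar_t));
    if (n < size)
        dst[n] = 0;   /* room left: terminate the output */
    return n;
}
```

With a routine like this the terminator sits at the index equal to the return value. Whether len1-1 in the patch above is the right slot therefore depends on what PyUnicode_AsWideChar() returns in the version at hand, which I have not checked here.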
cheers
Andreas