[Python-Dev] Hash collision security issue (now public)
Victor Stinner
victor.stinner at haypocalc.com
Mon Jan 9 10:53:19 CET 2012
> That said, I don't think smallest-format is actually enforced with
> anything stronger than comments (such as in unicodeobject.h struct
> PyASCIIObject) and asserts (mostly calling
> _PyUnicode_CheckConsistency). I don't have any insight on how
> prevalent non-conforming strings will be in practice, or whether
> supporting their equality will be required as a bugfix.
If you only use Python, you cannot create a string in a non-canonical form.
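
One way to observe this from pure Python, as a rough sketch: it assumes CPython 3.3 or later, where strings are stored with 1, 2 or 4 bytes per character, and it uses sys.getsizeof() only as an indirect hint of the storage format. The variable names are illustrative.

```python
import sys

# Assumption: CPython 3.3+ (flexible string representation).
wide = "abc\u20ac"    # U+20AC forces 2 bytes per character
narrow = wide[:3]     # "abc": only ASCII characters remain

# The slice is re-narrowed to the 1-byte format: it takes the same
# space as a string that was ASCII from the start.
same_size = sys.getsizeof(narrow) == sys.getsizeof("abc")

# The source string, with its wider format, takes more space.
wide_is_larger = sys.getsizeof(wide) > sys.getsizeof(narrow)

print(same_size, wide_is_larger)
```

The slice does not inherit the 2-byte format of its source, which is what "canonical" means here: the smallest format that fits the characters.
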
If you use the C API, you can create a string in a non-canonical form
using PyUnicode_New() + PyUnicode_WRITE, or
PyUnicode_FromUnicode(NULL, length) (or
PyUnicode_FromStringAndSize(NULL, length)) + direct access to the
Py_UNICODE* string. If you create strings in a non-canonical form, it
is a bug in your application and Python doesn't help you. But how
could Python help you? Expose a function to check your newly created
string? There is already _PyUnicode_CheckConsistency(), but it is slow
(O(n)) because it checks each character, so it is only used in debug
mode.
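
To illustrate why such a check is O(n), here is a simplified model in Python. It is not the code of _PyUnicode_CheckConsistency(): the function names are hypothetical, a string is modelled as a list of code points plus a declared width in bytes, and the separate ASCII flag is ignored.

```python
def smallest_kind(code_points):
    """Return the bytes per character (1, 2 or 4) needed for the largest code point."""
    # O(n): every character has to be inspected to find the maximum.
    max_cp = max(code_points, default=0)
    if max_cp < 0x100:
        return 1
    if max_cp < 0x10000:
        return 2
    return 4


def is_canonical(code_points, declared_kind):
    """True if declared_kind is the smallest width that fits all code points."""
    return smallest_kind(code_points) == declared_kind


print(is_canonical([0x61, 0x62], 1))   # "ab" stored with 1 byte per character
print(is_canonical([0x61, 0x62], 2))   # "ab" stored with 2 bytes per character
```

The second call models the C API bug described above: a wider buffer was allocated but only narrow characters were written into it.
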
Victor