[New-bugs-announce] [issue2620] Multiple buffer overflows in unicode processing

Justin Ferguson report at bugs.python.org
Sat Apr 12 00:35:38 CEST 2008

New submission from Justin Ferguson <justin.ferguson at ioactive.com>:

static
int unicode_resize(register PyUnicodeObject *unicode,
                      Py_ssize_t length)
{
    /* ... */
    oldstr = unicode->str;
    PyMem_RESIZE(unicode->str, Py_UNICODE, length + 1);
    /* ... */
    unicode->str[length] = 0;
    unicode->length = length;
    /* ... */

#define PyMem_RESIZE(p, type, n) \
  ( assert((n) <= PY_SIZE_MAX / sizeof(type)) , \
        ( (p) = (type *) PyMem_REALLOC((p), (n) * sizeof(type)) ) )

The unicode_resize() function acts essentially as a wrapper around
realloc(). It accomplishes this via the PyMem_RESIZE() macro, which
multiplies the requested length by the size of the type; in this case it
multiplies by two, as Py_UNICODE is typedef'd to a wchar_t. The only
guard on that multiplication is an assert(), which is compiled out when
NDEBUG is defined. When resizing large strings, the multiplication can
therefore wrap around, which results in an allocation that is too small
and in turn leads to a buffer overflow.
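
The wraparound can be illustrated outside of Python. The following is
only a sketch under stated assumptions, not the interpreter's code and
not a proposed patch: demo_unicode_t is a hypothetical two-byte stand-in
for Py_UNICODE, and unchecked_bytes() and checked_resize() are
illustrative names that do not exist in CPython. The first helper
computes the byte count the way the macro does once the assert() is
gone; the second refuses the request when the count would overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for Py_UNICODE, assumed to be two bytes wide. */
typedef uint16_t demo_unicode_t;

/* Byte count as (n) * sizeof(type), with no overflow check.
 * size_t arithmetic is unsigned, so a too-large product silently
 * wraps modulo SIZE_MAX + 1 instead of failing. */
static size_t unchecked_bytes(size_t n)
{
    return n * sizeof(demo_unicode_t);
}

/* Resize *buf to hold n elements.
 * Returns 0 on success, or -1 if the byte count would overflow or
 * realloc() fails. On failure *buf is left unchanged. */
static int checked_resize(demo_unicode_t **buf, size_t n)
{
    demo_unicode_t *tmp;

    assert(buf != NULL);

    /* A runtime check, unlike assert(), survives builds with NDEBUG. */
    if (n > SIZE_MAX / sizeof(demo_unicode_t))
        return -1;

    tmp = realloc(*buf, n * sizeof(demo_unicode_t));
    if (tmp == NULL)
        return -1;

    *buf = tmp;
    return 0;
}
```

With an element count of SIZE_MAX / 2 + 2, unchecked_bytes() wraps to a
request for only 2 bytes, while checked_resize() returns -1 and leaves
the caller's buffer untouched.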

This is specific to the Unicode objects; however, I would not be
surprised to see that other types have this complication as well. Please
see the attached proofs of concept.

components: Interpreter Core
files: python-2.5.2-unicode_resize-utf7.py
messages: 65379
nosy: jnferguson
severity: normal
status: open
title: Multiple buffer overflows in unicode processing
type: security
versions: Python 2.5
Added file: http://bugs.python.org/file10011/python-2.5.2-unicode_resize-utf7.py

Tracker <report at bugs.python.org>
