Is there some reason why we can't incorporate a free snprintf implementation? There is a list available at http://www.ijs.si/software/snprintf/.
Looks like the time machine is at work again: the version we use *is* a free snprintf implementation. If you want to replace it with a different one, you should indicate specifically which one you'd like to use instead. I think Mark Martinec's implementation (the top one on the URL you give) is unacceptable, because the license is too restrictive: we must incoporate the package in its entirety, i.e. redistribution of portions seems not to be licensed by the Frontier Artistic License. I don't have the time to review 10 other implementations for their suitability both in terms of licensing and correctness. Instead, I'd rather review the three occurrences of PyOS_snprintf, to determine quickly that you will have a hard time to overflow that buffer; *it is not at all easy*. Even if it does overflow, you will get a fatal error, rather than silent memory corruption. That is good enough for me. Regards, Martin
Martin v. Loewis wrote:
Looks like the time machine is at work again: the version we use *is* a free snprintf implementation.
Are we looking at the same mysnprintf? The one I have starts off: static int myvsnprintf(char *str, size_t size, const char *format, va_list va) { char *buffer = PyMem_Malloc(size + 512); int len; if (buffer == NULL) return -1; len = vsprintf(buffer, format, va); ... That doesn't look safe to me. Is there another snprintf implementation in the Python source tree? I can't find it. If there is then why is mysnprintf around? Neil
"Martin v. Loewis" wrote:
Is there some reason why we can't incorporate a free snprintf implementation? There is a list available at http://www.ijs.si/software/snprintf/.
Looks like the time machine is at work again: the version we use *is* a free snprintf implementation.
Well, let's say it's a free snprintf emulation ;-)
If you want to replace it with a different one, you should indicate specifically which one you'd like to use instead. I think Mark Martinec's implementation (the top one on the URL you give) is unacceptable, because the license is too restrictive: we must incoporate the package in its entirety, i.e. redistribution of portions seems not to be licensed by the Frontier Artistic License.
I don't have the time to review 10 other implementations for their suitability both in terms of licensing and correctness.
Instead, I'd rather review the three occurrences of PyOS_snprintf, to determine quickly that you will have a hard time to overflow that buffer; *it is not at all easy*. Even if it does overflow, you will get a fatal error, rather than silent memory corruption. That is good enough for me.
Note that the version in Python does not result in *stack* overflows which are the type of buffer overflow usually used in exploits. PyOS_snprintf() allocates a buffer on the heap and then let's sprintf() write there -- it then checks for an overflow and causes a fatal error if it finds that sprintf() failed to manage with the size + 512 bytes it had for formatting the string. The only attack on this kind of emulation is a denial of service attack. In the 3 cases where this API is used in Python, an overflow is not possible (unless the native sprintf() implementation is broken). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/
M.-A. Lemburg wrote:
Note that the version in Python does not result in *stack* overflows which are the type of buffer overflow usually used in exploits. ... The only attack on this kind of emulation is a denial of service attack.
That is a bold statement to make. It is also not true. Heap overflows _can_ be exploited to execute arbitrary code. I believe there was a phrack article a few years ago on the subject.
In the 3 cases where this API is used in Python, an overflow is not possible (unless the native sprintf() implementation is broken).
That may be the case today but I'm sure that snprintf will start getting more use now that it is available. We really should have a better implementation than mysnprintf. Neil
Neil Schemenauer wrote:
M.-A. Lemburg wrote:
Note that the version in Python does not result in *stack* overflows which are the type of buffer overflow usually used in exploits. ... The only attack on this kind of emulation is a denial of service attack.
That is a bold statement to make. It is also not true. Heap overflows _can_ be exploited to execute arbitrary code. I believe there was a phrack article a few years ago on the subject.
I know that they can be exploited (should have phrased the reply more carefully), but I don't think that the exploits described in phrack apply to Python's use of the memory buffer. In case sprintf() overflows, Python will detect this and immediately dump core. I don't see how this could be used by an attacker, except for killing off processes (the DOS attack); the exploit described in Phrack 57 (http://www.phrack.org/) only works on systems which use Doug Lea's malloc implementation, don't define snprintf() in their C lib and have sudo installed. Should be a rather small share of installed OSes ;-)
In the 3 cases where this API is used in Python, an overflow is not possible (unless the native sprintf() implementation is broken).
That may be the case today but I'm sure that snprintf will start getting more use now that it is available. We really should have a better implementation than mysnprintf.
No objection at all -- I wrote the emulation simply to add at least some level of protection against buffer overflows for platforms which don't provide snprintf() in their own C lib. Before that Python used sprintf(). I suppose we could use the code from stringobject.c:PyString_FromFormatV() as starting point for our own little snprintf() implementation... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/
"M" == M
writes:
M> I suppose we could use the code from M> stringobject.c:PyString_FromFormatV() as starting point for our M> own little snprintf() implementation... Aside: PyString_FromFormatV() started life out as PyErr_Format(). mailman-has-its-own-vsnprintf()-but-its-GPL'd-ly y'rs, -Barry
Grepping through the Python source code there are 191 usages of sprintf() -- shouldn't these be modified to use PyOS_snprintf() instead ? Python/getargs.c would be a particularly important case to fix, since the sprintf()s in there are not protected against buffer overflows -- it seems that long function names could be used to exploit this, e.g. in multi-user environments like Zope to obtain admin priviledges. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/
Grepping through the Python source code there are 191 usages of sprintf() -- shouldn't these be modified to use PyOS_snprintf() instead ?
Not necessarily. sprintf is perfectly ok if used correctly (i.e. if you can guarantee an upper bound on the resulting string length, and compute this bound either statically or dynamically).
Python/getargs.c would be a particularly important case to fix, since the sprintf()s in there are not protected against buffer overflows -- it seems that long function names could be used to exploit this, e.g. in multi-user environments like Zope to obtain admin priviledges.
That indeed appears to be the case. However, given char buf[256]; sprintf(p, "%s() ", fname); I think the correct reformulation should be char buf[256]; sprintf(p, "%.100s() ", fname); In seterror, you add then a number of strings containing each a %d (adding 20 bytes worst-case each), where the loop should terminate if there are only, say, 140 bytes left; the final printf could then use %.100s. Alternatively, you could use "%.*s" through-out, operating with the lengths of the strings themselves. Regards, Martin
"Martin v. Loewis" wrote:
Grepping through the Python source code there are 191 usages of sprintf() -- shouldn't these be modified to use PyOS_snprintf() instead ?
Not necessarily. sprintf is perfectly ok if used correctly (i.e. if you can guarantee an upper bound on the resulting string length, and compute this bound either statically or dynamically).
This is done in most cases, indeed. Still I think we need some auditing here and having all audited sprintf() uses renamed to say PyOS_snprintf() would make auditing future Python releases a lot easier.
Python/getargs.c would be a particularly important case to fix, since the sprintf()s in there are not protected against buffer overflows -- it seems that long function names could be used to exploit this, e.g. in multi-user environments like Zope to obtain admin priviledges.
That indeed appears to be the case. However, given
char buf[256]; sprintf(p, "%s() ", fname);
I think the correct reformulation should be
char buf[256]; sprintf(p, "%.100s() ", fname);
Right.
In seterror, you add then a number of strings containing each a %d (adding 20 bytes worst-case each), where the loop should terminate if there are only, say, 140 bytes left; the final printf could then use %.100s.
Alternatively, you could use "%.*s" through-out, operating with the lengths of the strings themselves.
I think that would make the code much more complicated. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/
participants (4)
-
barry@zope.com
-
M.-A. Lemburg
-
Martin v. Loewis
-
Neil Schemenauer