[Patches] [ python-Patches-1454485 ] patch for SIGSEGV in arraymodule.c
SourceForge.net
noreply at sourceforge.net
Wed Mar 29 09:52:47 CEST 2006
Patches item #1454485, was opened at 2006-03-20 04:44
Message generated for change (Comment added) made by nnorwitz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1454485&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core (C code)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 7
Submitted By: Baris Metin (tbmetin)
Assigned to: Neal Norwitz (nnorwitz)
Summary: patch for SIGSEGV in arraymodule.c
Initial Comment:
Array module fails handling utf-8 strings giving a
SIGSEGV. Attached patch seems to do the trick...
gdb> run
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread -1480337216 (LWP 31303)]
Python 2.4.2 (#1, Mar 20 2006, 12:08:06)
[GCC 3.4.5] on linux2
Type "help", "copyright", "credits" or "license" for
more information.
>>> import array
>>> x = array.array("u")
>>> x.append(u"barıÅ")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: array item must be unicode character
>>> x.append("barıÅ")
>>> x
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1480337216 (LWP 31303)]
Error while running hook_stop:
Invalid type combination in ordering comparison.
0xa7ee0799 in PyUnicodeUCS4_FromUnicode ()
from /usr/lib/libpython2.4.so.1.0
----------------------------------------------------------------------
>Comment By: Neal Norwitz (nnorwitz)
Date: 2006-03-28 23:52
Message:
Logged In: YES
user_id=33168
Attached is an updated patch to only do the (unsigned) cast
in unicodeobject.c. The test included in the patch still
crashes the interpreter, this time in unicodectype.c.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2006-03-26 13:36
Message:
Logged In: YES
user_id=21627
The second part of the patch (checking that *u is not
negative) is definitely right.
The first part (requiring an even number of bytes for a u#
argument) probably requires discussion on python-dev (or
this patch should be assigned to MAL): I don't think it
should be allowed to pass a non-Unicode object to u# in the
first place.
In particular, if you pass a byte string, there would be an
implicit assumption that the byte encoding is the same
internal representation as a Py_UNICODE. This is bad -
Python normally assumes the encoding of a string is the
system encoding, which normally is ASCII.
Of course, changing the call to a type error for 2.4.3
probably won't work, either, because it might break existing
code.
Anyway, I believe the latter fix alone should fix the crash:
the current getargs implementation will round down to the
next multiple of sizeof(Py_UNICODE), thanks the integer
division. u_setitem will then refuse the call if the length
is not 1. IOW, it is possible to append between 4 and 7
bytes to a Unicode array.
I wonder why the patch fixes the problem: *u should be an
unsigned, and comparing an unsigned with a signed should
convert the signed to unsigned, no?
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2006-03-24 23:29
Message:
Logged In: YES
user_id=33168
Verified for 2.4 and head. The probably could exist w/ucs2
also if you use 'bar' (I think).
I agree with Nick, this patch doesn't really solve the
problem. The attached patch fixes the crash more generally,
but I'm think there is a better solution. I hope Martin has
time to review this and suggest a better fix.
Martin, the change in getargs ensures that we don't try to
convert an 8-bit string of length 5 to unicode. The change
in unicodeobject ensures that we don't reference the array
with a negative offset as happens if the buffer conversion
succeeds with an invalid unicode character.
----------------------------------------------------------------------
Comment By: Nick Coghlan (ncoghlan)
Date: 2006-03-24 06:43
Message:
Logged In: YES
user_id=1038590
To get the effect of the patch, it should be sufficient to
just change the format character to an uppercase U.
That doesn't seem like the right fix though - the actual
explosion isn't happening until later when the array
elements are being converted to Unicode for output.
----------------------------------------------------------------------
Comment By: Baris Metin (tbmetin)
Date: 2006-03-24 01:19
Message:
Logged In: YES
user_id=1045504
I'm able to reproduce the bug with 2.5a0 SVN (r43289).
Please try with --enable-unicode=ucs4
The patch is against svn too..
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2006-03-23 22:11
Message:
Logged In: YES
user_id=33168
With the stock 2.4.2 on my linux box I was able to reproduce
this. I couldn't reproduce with 2.4.3c1. Can you verify
this is fixed in 2.4.3?
Sagol.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1454485&group_id=5470
More information about the Patches
mailing list