[Trent]
1. There are a fairly large number of downcasting cases in the Python code (not necessarily tied to PyArg_ParseTuple results). I was wondering if you think a generalized check on each such downcast would be advisable. This would take the form of some macro that would do a bounds check before doing the cast. For example (a common one is the cast of strlen's size_t return value to int, because Python strings use int for their length, this is a downcast on 64-bit systems):
size_t len = strlen(s); obj = PyString_FromStringAndSize(s, len);
would become
size_t len = strlen(s); obj = PyString_FromStringAndSize(s, CAST_TO_INT(len));
CAST_TO_INT would ensure that 'len'did not overflow and would raise an exception otherwise.
[Guido]
How would the CAST_TO_INT macro signal an erro? C doesn't have exceptions. If we have to add checks, I'd prefer to write
size_t len = strlen(s); if (INT_OVERFLOW(len)) return NULL; /* Or whatever is appropriate in this context */ obj = PyString_FromStringAndSize(s, len);
Of course we have to add checks -- strlen doesn't return an int! It hasn't since about a year after Python was first written (ANSI C changed the rules, and Python is long overdue in catching up -- if you want people to stop passing multiple args to append, set a good example in our use of C <0.5 wink>). [Trent]
I would like other people's opinion on this kind of change. There are three possible answers:
Please don't change the rating scheme we've been using: -1 is a veto, +1 is a hurrah, -0 and +0 are obvious <ahem>.
+1 this is a bad change idea because...<reason> -1 this is a good idea, go for it
That one, except spelled +1.
+0 (mostly likely) This is probably a good idea for some case where the overflow *could* happen, however the strlen example that you gave is *not* such a situation. As Tim Peters said: 2GB limit on string lengths is a good assumption/limitation.
No, it's a defensible limitation, but it's *never* a valid assumption. The check isn't needed anywhere we can prove a priori that it could never fail (in which case we're not assuming anything), but it's always needed when we can't so prove (in which case skipping the check would be a bad asssuption). In the absence of any context, your strlen example above definitely needs the check. An alternative would be to promote the size member from int to size_t; that's no actual change on the 32-bit machines Guido generally assumes without realizing it, and removes an arbitrary (albeit defensible) limitation on some 64-bit machines at the cost of (just possibly, due to alignment vagaries) boosting var objects' header size on the latter. correctness-doesn't-happen-by-accident-ly y'rs - tim