python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
Update of /cvsroot/python/python/dist/src/Python In directory sc8-pr-cvs1:/tmp/cvs-serv2443 Modified Files: Tag: release23-maint bltinmodule.c Log Message: changed builtin_sum to use PyNumber_InPlaceAdd -- unchanged semantics but fixes performance bug with sum(lotsoflists, []). Index: bltinmodule.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v retrieving revision 2.292.10.1 retrieving revision 2.292.10.2 diff -C2 -d -r2.292.10.1 -r2.292.10.2 *** bltinmodule.c 18 Aug 2003 18:34:09 -0000 2.292.10.1 --- bltinmodule.c 25 Oct 2003 12:47:09 -0000 2.292.10.2 *************** *** 1841,1845 **** break; } ! temp = PyNumber_Add(result, item); Py_DECREF(result); Py_DECREF(item); --- 1841,1845 ---- break; } ! temp = PyNumber_InPlaceAdd(result, item); Py_DECREF(result); Py_DECREF(item);
Modified Files: Tag: release23-maint bltinmodule.c Log Message: changed builtin_sum to use PyNumber_InPlaceAdd -- unchanged semantics but fixes performance bug with sum(lotsoflists, []).
I think this ought to be reverted, both in 2.3 and 2.4. Consider this code: empty = [] for i in range(10): print sum([[x] for x in range(i)], empty) The output used to be: [] [0] [0, 1] [0, 1, 2] [0, 1, 2, 3] [0, 1, 2, 3, 4] [0, 1, 2, 3, 4, 5] [0, 1, 2, 3, 4, 5, 6] [0, 1, 2, 3, 4, 5, 6, 7] [0, 1, 2, 3, 4, 5, 6, 7, 8] But now it is: [] [0] [0, 0, 1] [0, 0, 1, 0, 1, 2] [0, 0, 1, 0, 1, 2, 0, 1, 2, 3] [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4] [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5] [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6] [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7] [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 8] --Guido van Rossum (home page: http://www.python.org/~guido/)
On Saturday 25 October 2003 11:18 pm, Guido van Rossum wrote:
Modified Files: Tag: release23-maint bltinmodule.c Log Message: changed builtin_sum to use PyNumber_InPlaceAdd -- unchanged semantics but fixes performance bug with sum(lotsoflists, []).
I think this ought to be reverted, both in 2.3 and 2.4. Consider this code:
I have reverted it; it's obviously true that, by causing side effects on the 2nd argument, the fix as I had commited it could change semantics. I apologize for not thinking of this (and adding the missing unit-tests to catch this, see next paragraph). If it was up to me I would still pursue the possibility of using PyNumber_InPlaceAdd, for example by only doing so if the second argument could first be successfully copy.copy'ed into the result and falling back to PyNumber_Add otherwise. The alternative of leaving sum as a performance trap for the unwary -- an "attractive nuisance" in legal terms -- would appear to me to be such a bad situation, as to warrant such effort (including adding unit-tests to ensure sum does not alter its second argument, works correctly with a non-copyable 2nd argument, etc). However, it's YOUR decision, and you have already made it clear in another mail that your objections to remedying this performance bug are such that no possible solution will satisfy them. If a type gives different results for "a = a + b" vs "a += b", there is no way sum can find this out; and while, were it my decision, I would not care to support such weird cases at such a huge performance price, it's not my decision. Similarly for types which let you do "a = copy.copy(b)" but do NOT return a valid copy of b, or return b itself even though it's mutable, and so on weirdly. I'm just very sad that I didn't think of this performance-trap back when the specs of sum were first being defined. Oh well:-(. Can I at least add a warning about this performance trap to the docs at http://www.python.org/doc/current/lib/built-in-funcs.html ? Alex
participants (3)
-
aleax@users.sourceforge.net
-
Alex Martelli
-
Guido van Rossum