[Patches] [ python-Patches-590843 ] list sort perf boost
noreply@sourceforge.net
noreply@sourceforge.net
Sun, 04 Aug 2002 14:43:07 -0700
Patches item #590843, was opened at 2002-08-04 15:22
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470
Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: list sort perf boost
Initial Comment:
I don't really love this patch--the names suck. There
are also 2 warnings. The warnings can be removed by
casting, but I wanted to see if anybody had better ideas.
However, I can't let Tim have all the fun optimizing
the hell out of list sort. This patch gets rid of
COMPARE == NULL in the ISLT macro. A structure is
created which holds both the cmp function and
PyObject/Py_LT.
I get a speed up of between ~1-3%. Using the largest
#s, 7.67 -> 7.59 for *sort.
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one)
Date: 2002-08-04 17:43
Message:
Logged In: YES
user_id=31435
Heh -- cute! It doesn't make a measurable difference on my
MSVC6 box, but I can believe it helps a bit elsewhere.
However, the warnings are trying to tell us something:
casting PyObject_RichCompareBool to a type that takes a
void* 3rd argument instead of an int is likely to work on
boxes where sizeof(void*) == sizeof(int) (like my box, and
presumably yours too), but it's lying in a potentially
dangerous way on other boxes. If sizeof(void*) > sizeof
(int), it's still likely to work on big-endian machines, but it
gets very iffy on little-endian boxes
(PyObject_RichCompareBool can pick up just the least-
significant bits then).
On the third hand, Py_LT is currently a #define for 0, so it
may well work (albeit by accident) even then.
I don't think there's enough gain here to justify that level of
obscure x-platform pain, so sorry to say I'm inclined to
reject this.
A different route you may want to pursue: checking for
failure after every function call is likely a significant expense
(test, branch, test, branch, test, branch, ...), which
judicious use of setjmp/longjmp could alleviate. The tricky
bit there is ensuring that failures in merge_lo and merge_hi
arrange to get the temp array copied back in to the merge
area (I expect they'd have to leave clues in the MergeState
struct, which the longjmp target could act on).
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470