A Friday 08 February 2008, Charles R Harris escrigué:
On Feb 8, 2008 8:58 AM, Francesc Altet <faltet@carabos.com> wrote:
A Friday 08 February 2008, Charles R Harris escrigué:
Also, in the context of my work in indexing, and because of the slowness of the current implementation in NumPy, I've ended with an implementation of the quicksort method for 1-D array strings. For moderately large arrays, it is about 2.5x-3x faster than the (supposedly) mergesort version in NumPy, not only due to the quicksort, but also because I've implemented a couple of macros for efficient string swapping and copy. If this is of interest for NumPy developers, tell me and I will provide the code.
I have some code for this too and was going to merge it. Send yours along and I'll get to it this weekend.
Ok, great. I'm attaching it. Tell me if you need some clarification on the code.
I ran a few timing tests. On my machine strncmp is about 100x faster than opt_strncmp, but sSWAP (with some fixes), is about 10x faster then using the memcpy in a recent compiler. Does this match with your experience.
Well, I've run some more exhaustive tests on my laptop (Pentium 4 @ 2 GHz, Ubuntu 7.10, gcc 4.1.3, using -O3 optlevel) with the next sa1 array: numpy.random.seed(1) nelem = 10000 a = numpy.random.rand(nelem) sa1 = a.astype('S16') And I've chosen the next benchmark: /* start: the start of data for sa1 array ss: the length of the string type (16) num: the number of elements in sa1 (10000) */ int sort_S(char *start, int ss, npy_intp num) { char *pl = start; int a = 0; npy_intp i, j; for (i=0; i<num; i++) for (j=0; j<num; j++) a += opt_strncmp(pl+i*ss, pl+j*ss, ss); return a; } So, when I choose the next implementation of opt_strncmp: static int inline opt_strncmp1(char *a, char *b, intp n) { intp i; for (i=0; i<n; i++) { if (a[i] > b[i]) return i+1; if (a[i] < b[i]) return -(i+1); } return 0; } I get a time of 1.70 s. When using the next implementation: static int inline opt_strncmp2(char *a, char *b, intp n) { intp i; for (i=0; i<n; i++) { if (a[i] != b[i]) return (((unsigned char *)a)[i] - ((unsigned char *)b)[i]); } return 0; } I get a time of 1.24 s. The original strncmp gives me 2.05 seconds with this setup. So, in short and using my laptop: original strncmp: 2.05 s opt_stncmp1: 1.70 s ; speed-up: 20% opt_stncmp2: 1.24 s ; speed-up: 65% Just in case, I've also run the benchmark in an Opteron server (2 GHz, SuSe 10.1, gcc 4.2.1, using -O3 optlevel). Here are the results: original strncmp: 1.25 s opt_stncmp1: 1.61 s ; speed-up: -30% opt_stncmp2: 1.03 s ; speed-up: 20% So, I was wrong: opt_stncmp2 is quite more efficient than opt_stncmp1, and definitely better than the original strncmp. So, I would say that using opt_stncmp2 is always the best bet. I'm not certain why are you seeing that original strncmp is 100x faster than opt_strncmpX :-/ Cheers, --
0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"