On Thursday 14 February 2008, you wrote:
In any case, if anybody has access to an Opteron machine and gcc 4.2.3, it would be great if they could run the benchmark and contribute their feedback.
Here it is with gcc 4.2.3 on an Opteron 246 (2.0 GHz):
uller:~$ ./sort423_O2 # with -O2
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.770000
C qsort with Python style compare: 0.740000
NumPy newqsort: 0.630000

uller:~$ ./sort423_O3 # with -O3
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.640000
C qsort with Python style compare: 0.660000
NumPy newqsort: 0.400000
And here are my timings with gcc 4.1.3 on an Opteron similar to yours (270 @ 2.0 GHz):

With -O2:
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.750000
C qsort with Python style compare: 0.700000
NumPy newqsort: 0.690000

With -O3:
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.670000
C qsort with Python style compare: 0.620000
NumPy newqsort: 0.380000

So, it seems clear that the GCC people have fixed in 4.2.3 the problem with the optimizer that was introduced in 4.2.1. Very good!

By the way, it's nice to see the wide range of platforms that this list allows us to test on :-)

Cheers,
--
0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"
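As a rough aid for reading the numbers above, the following sketch computes how much each benchmark variant gains from -O3 over -O2 on each compiler. The timings are copied from the two messages; the dictionary keys (`qsort_c`, `qsort_py`, `newqsort`) and the helper `o3_speedup` are labels made up for this illustration. The two machines are different Opteron models, so comparisons across compilers are only approximate.

```python
# Timings in seconds, copied from the benchmark output quoted above.
# Keys are (compiler, optimization flag); the inner keys are illustrative
# labels for the three benchmark lines, not names used by the benchmark.
timings = {
    ("gcc 4.2.3", "-O2"): {"qsort_c": 0.77, "qsort_py": 0.74, "newqsort": 0.63},
    ("gcc 4.2.3", "-O3"): {"qsort_c": 0.64, "qsort_py": 0.66, "newqsort": 0.40},
    ("gcc 4.1.3", "-O2"): {"qsort_c": 0.75, "qsort_py": 0.70, "newqsort": 0.69},
    ("gcc 4.1.3", "-O3"): {"qsort_c": 0.67, "qsort_py": 0.62, "newqsort": 0.38},
}


def o3_speedup(compiler, variant):
    """Return the -O2 time divided by the -O3 time for one variant."""
    slow = timings[(compiler, "-O2")][variant]
    fast = timings[(compiler, "-O3")][variant]
    return slow / fast


for compiler in ("gcc 4.2.3", "gcc 4.1.3"):
    for variant in ("qsort_c", "qsort_py", "newqsort"):
        ratio = o3_speedup(compiler, variant)
        print("%s %-9s -O3 speedup: %.2fx" % (compiler, variant, ratio))
```

On both compilers the newqsort variant gains far more from -O3 than the two qsort variants do, which is consistent with the conclusion that 4.2.3 behaves like 4.1.3 again.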
Hi,
I confirmed the gcc 4.2.3 performance for the Opteron:

Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.630000
C qsort with Python style compare: 0.630000
NumPy newqsort: 0.360000

I also installed the Intel icc 10.1 compiler on my Opteron system, but I did not use any flags:

$ /opt/intel/cc/10.1.008/bin/icc sort-string-bench.c -o icc_sort
$ ./icc_sort
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 1.030000
C qsort with Python style compare: 0.960000
NumPy newqsort: 0.530000

Just glad to be able to contribute something,
Bruce

On Thu, Feb 14, 2008 at 9:34 AM, Francesc Altet <faltet@carabos.com> wrote:
On Thursday 14 February 2008, you wrote:
In any case, if anybody has access to an Opteron machine and gcc 4.2.3, it would be great if they could run the benchmark and contribute their feedback.
Here it is with gcc 4.2.3 on an Opteron 246 (2.0 GHz):
uller:~$ ./sort423_O2 # with -O2
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.770000
C qsort with Python style compare: 0.740000
NumPy newqsort: 0.630000

uller:~$ ./sort423_O3 # with -O3
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.640000
C qsort with Python style compare: 0.660000
NumPy newqsort: 0.400000
And here are my timings with gcc 4.1.3 on an Opteron similar to yours (270 @ 2.0 GHz):

With -O2:
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.750000
C qsort with Python style compare: 0.700000
NumPy newqsort: 0.690000

With -O3:
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.670000
C qsort with Python style compare: 0.620000
NumPy newqsort: 0.380000

So, it seems clear that the GCC people have fixed in 4.2.3 the problem with the optimizer that was introduced in 4.2.1. Very good!

By the way, it's nice to see the wide range of platforms that this list allows us to test on :-)
Cheers,
--
0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"
On Thursday 14 February 2008, Bruce Southey wrote:
Hi, I confirmed the gcc 4.2.3 performance for the Opteron:
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.630000
C qsort with Python style compare: 0.630000
NumPy newqsort: 0.360000

I also installed the Intel icc 10.1 compiler on my Opteron system, but I did not use any flags:

$ /opt/intel/cc/10.1.008/bin/icc sort-string-bench.c -o icc_sort
$ ./icc_sort
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 1.030000
C qsort with Python style compare: 0.960000
NumPy newqsort: 0.530000
That's excellent, Bruce. It definitely looks like the problem with the optimizer in 4.2.1 has been fixed in 4.2.3. And why didn't you use optimization flags with ICC? Just curious...

Cheers,
--
0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"
I successfully compiled a shared library for use with ctypes and linked it to an external library (the GNU Scientific Library) on Mac OS X 10.4. I hope this helps Mac people and anyone else who wants to use ctypes to access their own C extensions and use other C libraries in the process.

I want to thank several people on this list who gave me many helpful suggestions and asked me good questions. I also want to thank the several people who kept nudging me to try ctypes even though I was reluctant. It is much easier than programming an extension entirely in C.

Below are 4 files that enable building a C shared library on Mac OS X (10.4) that can be used with ctypes to call a function from the GNU Scientific Library (the Bessel function gsl_sf_bessel_J0). You can see that the idea is pretty simple. The code requires that you have ctypes (in site-packages) and GSL (dylib version in /usr/local/lib), or your desired C library, installed.

I suspect that what will differ on other platforms is the makefile. I do not know enough to provide Linux or Windows versions. I'm sorry.

Note: This works best if the libraries are shared (e.g. the GSL library to use is the dylib version). That way only the code that's needed is loaded when the C functions are called from Python.

Comments welcome. Of course, I am responsible for any and all mistakes, so I make no guarantees or warranties. These are examples and should not be used where loss of property, life, or other dangers exist.
#==== Source code 'bess.c' ======================

#include <stdio.h>
#include "bess.h"
#include <gsl/gsl_sf_bessel.h>  /* Must include the header to declare the function for the compiler */

/* ---- test fcns ------------------------- */
#ifdef __cplusplus
extern "C" {
#endif

double J0_bess (double x)
{
    /* Call the GSL Bessel function order 0 of the first kind */
    double y = gsl_sf_bessel_J0 (x);
    /* Print the value right here */
    printf ("J0(%g) = %.18e\n", x, y);
    return y;
}

#ifdef __cplusplus
}
#endif

#==== Header file 'bess.h' =====================

/* ---- Prototypes -------------------- */
#ifdef __cplusplus
extern "C" {
#endif

double J0_bess(double x);

#ifdef __cplusplus
}
#endif

#==== Make file 'bess.mak' ===================

# ---- Link to existing library in this directory ------------
bess.so: bess.o bess.mak
	gcc -bundle -flat_namespace -undefined suppress -o bess.so bess.o -lgsl

# ---- gcc C compile ------------------
bess.o: bess.c bess.h bess.mak
	gcc -c bess.c -o bess.o

#==== Python file 'bess.py' =======================

#!/usr/local/bin/pythonw

import numpy as N
import ctypes as C

# Put the name of your library in place of 'bess.so' and the path to
# it in place of the path below in load_library
_bess = N.ctypeslib.load_library('bess.so', '/Users/loupecora/Code_py/test_folder/ctypes_tests/test3ctypes/simplelink-GSL/')

# Declare the C signature: double J0_bess(double)
_bess.J0_bess.restype = C.c_double
_bess.J0_bess.argtypes = [C.c_double]

def fcn_J0(x):
    return _bess.J0_bess(x)

x = 0.2
y = fcn_J0(x)
print("x, y: %e %.18e" % (x, y))

#==== Typical output ===============
# The first line is printed from the shared library function J0_bess
# The second line is from the python code that called the shared lib. function

J0(0.2) = 9.900249722395765284e-01
x, y: 2.000000e-01 9.900249722395765284e-01

-- Lou Pecora, my views are my own.
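For readers who want to try the restype/argtypes pattern before building anything, here is a minimal sketch that calls a Bessel function from the system C math library instead of GSL. It is not one of the four files above. It assumes a POSIX system whose math library exports `j0` (Bessel function of the first kind, order 0) and that `ctypes.util.find_library("m")` can locate it; the wrapper name `bessel_j0` is made up for this example.

```python
import ctypes
import ctypes.util

# Locate the C math library. The lookup is platform dependent; this is an
# assumption about the system, not something taken from the files above.
# If the lookup returns None, CDLL(None) falls back to the symbols already
# loaded into the current process on POSIX systems.
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path)

# Declare the C signature: double j0(double).
# Without these two lines ctypes would assume an int return value.
libm.j0.restype = ctypes.c_double
libm.j0.argtypes = [ctypes.c_double]


def bessel_j0(x):
    """Bessel function of the first kind, order 0, via the C math library."""
    return libm.j0(x)


x = 0.2
y = bessel_j0(x)
print("x, y: %e %.18e" % (x, y))
```

The result for x = 0.2 should agree with the GSL value shown in the typical output to within normal double-precision rounding, though the last digits may differ between math libraries.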
participants (3)
- Bruce Southey
- Francesc Altet
- Lou Pecora