Francesc Altet wrote:
http://oprofile.sourceforge.net
It is a nice way to do profiling at C level on Linux machines. Running the Paulo benchmarks through oprofile can surely bring some light.
I ran the following code snippet (timed under a Timeit instance) through the oprofile profiler for both NumPy and Numeric, to look at indexing speeds.

    op = "b = A[::2,::2]; d = A[1:80,:]"

This is what I found:

overall

    61242    53.9606  /usr/bin/python
    17647    15.5488  /usr/lib/python2.4/site-packages/numpy/core/multiarray.so
    15942    14.0466  /lib/tls/libc-2.3.3.so
     7158     6.3069  /no-vmlinux
     6995     6.1633  /usr/lib/python2.4/site-packages/Numeric/_numpy.so

Showing that more time is spent in NumPy than in Numeric doing indexing...

Here's the breakdown for NumPy

    samples  %        symbol name
    2353     13.3337  PyArray_PyIntAsIntp    # This is also slower --- called more often?
    2060     11.6734  PyArray_MapIterNew     # This calls fancy_indexing_check.
    1980     11.2200  slice_GetIndices
    1631      9.2424  parse_index
    1149      6.5110  arraymapiter_dealloc   # Interesting this is taking so long?
    1142      6.4714  array_subscript
    1121      6.3524  _IsAligned
    1069      6.0577  array_dealloc
     780      4.4200  fancy_indexing_check
     684      3.8760  PyArray_NewFromDescr
     627      3.5530  parse_subindex
     538      3.0487  PyArray_DescrFromType
     534      3.0260  array_subscript_nice
     455      2.5783  _IsContiguous
     370      2.0967  _IsFortranContiguous
     334      1.8927  slice_coerce_index
     294      1.6660  PyArray_UpdateFlags
     234      1.3260  anonymous symbol from section .plt
     161      0.9123  PyArray_Return
     120      0.6800  array_alloc
       2      0.0113  PyArray_Broadcast
       2      0.0113  PyArray_IterNew
       1      0.0057  LONG_setitem
       1      0.0057  PyArray_EquivTypes
       1      0.0057  PyArray_FromAny
       1      0.0057  PyArray_FromStructInterface
       1      0.0057  PyArray_IntpConverter
       1      0.0057  PyArray_SetNumericOps
       1      0.0057  initialize_numeric_types

Here's the breakdown for Numeric:

    1577     22.5447  slice_GetIndices
    1155     16.5118  parse_index
     912     13.0379  PyArray_FromDimsAndDataAndDescr
     792     11.3224  array_subscript
     675      9.6497  PyArray_IntegerAsInt
     517      7.3910  parse_subindex
     401      5.7327  array_dealloc
     379      5.4182  slice_coerce_index
     339      4.8463  array_subscript_nice
     161      2.3016  anonymous symbol from section .plt
      82      1.1723  PyArray_Return
       5      0.0715  do_sliced_copy

Anybody interested in optimization?

-Travis
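
P.S. For anyone who wants to reproduce the timing side of this, the op string above can be run under the standard timeit module roughly as sketched below. This is only an illustration, not the exact script that was profiled: the post does not give the shape or dtype of A, so the 100x100 float array is an assumption, the helper name time_indexing is hypothetical, and the sketch uses a current Python 3 and NumPy rather than the Python 2.4 setup shown in the paths above.

```python
import timeit

import numpy as np

# The statement from the post, unchanged.
op = "b = A[::2,::2]; d = A[1:80,:]"

# Assumption: the post does not state A's shape or dtype.
# 100x100 floats is a guess that is large enough for A[1:80,:].
A = np.zeros((100, 100))


def time_indexing(number=100000, repeat=3):
    """Return the best per-loop time in seconds for executing `op`.

    `time_indexing` is a hypothetical helper name used only for this sketch.
    """
    timer = timeit.Timer(op, globals={"A": A})
    # Take the minimum over the repeats to reduce noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    print("%.3g s per loop" % time_indexing())
```

Both statements in op are basic slices, so each one builds a new array object that views the data of A without copying it. That is why the per-call overhead in the index-parsing and array-construction routines dominates the profile.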