On Wed, 01 May 2013, Sebastian Berg wrote:
btw -- is there something like panda's vbench for numpy? i.e. where it would be possible to track/visualize such performance improvements/hits?
Sorry if it seemed harsh, but I only skimmed the mails and it seemed a bit like an obvious piece was missing... There are no benchmark tests I am aware of. You can try:
a = np.random.random((1000, 1000))
and then time a.sum(1) and a.sum(0). On 1.7, the fast axis (1) is only slightly faster than the sum over the slow axis. On earlier numpy versions you will probably see something like half the speed for the slow axis (I only have an ancient or 1.7 numpy right now, so I am reluctant to give exact timings).
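The suggested check can be sketched with timeit (a sketch only; absolute timings and the fast/slow ratio vary by machine and numpy version):

```python
import timeit

import numpy as np

# C-ordered array: axis=1 sums along contiguous memory (the "fast"
# axis); axis=0 strides across rows (the "slow" axis).
a = np.random.random((1000, 1000))

fast = min(timeit.repeat(lambda: a.sum(1), number=10, repeat=3)) / 10
slow = min(timeit.repeat(lambda: a.sum(0), number=10, repeat=3)) / 10
print("a.sum(1): %.2e s   a.sum(0): %.2e s" % (fast, slow))

# Both reductions must agree on the grand total regardless of axis.
assert np.allclose(a.sum(0).sum(), a.sum(1).sum())
```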
FWIW -- just as a crude first attempt, look at
http://www.onerussian.com/tmp/numpy-vbench-20130506/vb_vb_reduce.html
why float16 case is so special?
I have pushed this really coarse setup (based on some elderly copy of pandas' vbench) to https://github.com/yarikoptic/numpy-vbench
if you care to tune it up/extend it, then I could fire it up again on that box (which doesn't do anything else ATM AFAIK). Since the majority of the time is spent actually building numpy (I did use ccache), it would be neat if you came up with more benchmarks to run which you think could be interesting/important.
On Mon, May 6, 2013 at 10:32 AM, Yaroslav Halchenko lists@onerussian.com wrote:
nice results
Thanks Yaroslav,
(my default: axis=0)
Josef
-- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Mon, 2013-05-06 at 10:32 -0400, Yaroslav Halchenko wrote:
why float16 case is so special?
Float16 is special: it is CPU-bound -- not memory-bound like most reductions -- because it is not a native type. At first I thought it was weird, but it actually makes sense. If you have a and b as float16:
a + b
is actually more like (I believe...):
float16(float32(a) + float32(b))
This means there is type casting going on *inside* the ufunc! Normally casting is handled outside the ufunc (by the buffered iterator). Now I did not check, but when the iteration order is not optimized, the ufunc *can* simplify this to something like the following (along the reduction axis):
result = float32(a[0])
for x in a[1:]:
    result += float32(x)
return float16(result)
While for "optimized" iteration order, this cannot happen because the intermediate result is always written back.
This means that for optimized iteration order a single conversion to float is necessary (in the inner loop), while for unoptimized iteration order two conversions to float and one back are done. Since this conversion is costly, the memory throughput is actually not important (no gain from buffering). This leads to the visible slowdown. This is of course a bit annoying, but I am not sure how you would solve it (have the dtype signal that it doesn't even want iteration order optimization? Try to move those weird float16 conversions from the ufunc to the iterator somehow?).
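A quick way to see the effect described above (a sketch; the exact float16/float32 ratio depends on CPU and numpy version):

```python
import timeit

import numpy as np

# float16 has no native CPU arithmetic: each element is upcast to
# float32 for the addition, so the reduction is CPU-bound and gains
# little from memory-friendly iteration order.
a16 = np.ones((1000, 1000), dtype=np.float16)
a32 = a16.astype(np.float32)

t16 = min(timeit.repeat(lambda: a16.sum(0), number=5, repeat=3)) / 5
t32 = min(timeit.repeat(lambda: a32.sum(0), number=5, repeat=3)) / 5
print("float16 sum: %.2e s   float32 sum: %.2e s" % (t16, t32))

# The result dtype stays float16 even though accumulation upcasts.
assert a16.sum(0).dtype == np.float16
```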
I have pushed this really coarse setup (based on some elderly copy of pandas' vbench) to https://github.com/yarikoptic/numpy-vbench
I think this is pretty cool! It would probably be a while until there are many tests, but if you or someone could set such a thing up, it could slowly grow as larger code changes are made?
Regards,
Sebastian
On Mon, 06 May 2013, Sebastian Berg wrote:
I think this is pretty cool! Probably would be a while until there are many tests, but if you or someone could set such thing up it could slowly grow when larger code changes are done?
that is the idea, but it would be nice to gather such simple benchmark tests. If you could hint at the numpy functionality you think is especially worth benchmarking (I know -- there are a lot of things which could be benchmarked), that would be a nice starting point: just list the functionality/functions you consider of primary interest, and whether each is worth testing for different types or just as a gross estimate (e.g. over a selection of types in a loop).
As for myself -- I guess I will add fancy indexing and slicing tests.
Adding them is quite easy: have a look at https://github.com/yarikoptic/numpy-vbench/blob/master/vb_reduce.py which is actually a bit more cumbersome because it runs them for different types. This one is more obvious: https://github.com/yarikoptic/numpy-vbench/blob/master/vb_io.py
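For readers without the repo handy, what such a benchmark boils down to is roughly the following. This is a plain-timeit sketch, not the actual vbench API; `run_benchmark` is a hypothetical helper:

```python
import timeit

# Hypothetical stand-in for a vbench-style benchmark: a setup string
# executed once, and a statement whose best-of-N per-call time is
# recorded (per commit, in the real suite).
def run_benchmark(stmt, setup, number=100, repeat=3):
    timer = timeit.Timer(stmt, setup=setup)
    return min(timer.repeat(repeat=repeat, number=number)) / number

setup = "import numpy as np; a = np.random.random((1000, 1000))"
per_call = run_benchmark("a.sum(1)", setup)
print("a.sum(1): %.2e s/call" % per_call)
```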
On Mon, 2013-05-06 at 12:11 -0400, Yaroslav Halchenko wrote:
As for myself -- I guess I will add fancy indexing and slicing tests.
Indexing/assignment was the first thing I thought of too (also because fancy indexing/assignment really could use some speedups...). Other than that, maybe some timings for small arrays/scalar math, but that might be nice for that GSoC project.
Maybe array creation functions, just to see whether performance bugs sneak into something that central. But I can't think of anything else that isn't specific functionality.
- Sebastian
On 7 May 2013 13:47, Sebastian Berg sebastian@sipsolutions.net wrote:
Indexing/assignment was the first thing I thought of too (also because fancy indexing/assignment really could use some speedups...). Other then that maybe some timings for small arrays/scalar math, but that might be nice for that GSoC project.
Why not go bigger? Ufunc operations on big arrays, both CPU- and memory-bound.
Also, what about interfacing with other packages? It may increase the compilation overhead, but I would like to see Cython in action (say, only the latest version; maybe that can be fixed).
Hi Guys,
not quite the recommendations you expressed, but here is my ugly attempt to improve benchmarks coverage:
http://www.onerussian.com/tmp/numpy-vbench-20130701/index.html
initially I also ran those ufunc benchmarks for each dtype separately, but the resulting webpage got so long that firefox brought my laptop to its knees. So I commented those out for now and left only the "summary" ones across multiple datatypes.
There is a bug in sphinx which prevents embedding some figures for vb_random "as is", so pardon that for now...
I have not set the cpu affinity of the process (but ran it at nice -10), so maybe that also contributed to the variance of the benchmark estimates. There are probably more goodies (e.g. gc control etc.) to borrow from https://github.com/pydata/pandas/blob/master/vb_suite/test_perf.py which I have just discovered, to minimize variance.
nothing really interesting was pinpointed so far, besides:
- svd became a bit faster since a few months back ;-)
http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_linalg.html
- isnan (and isinf, isfinite) got improved
http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-i...
- right_shift got a minuscule slowdown from what it used to be?
http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-r...
As before -- the current code of the benchmark collection is available at http://github.com/yarikoptic/numpy-vbench/pull/new/master
if you have specific snippets you would like to benchmark -- just state them here or send a PR -- I will add them in.
Cheers,
FWIW -- updated plots with contribution from Julian Taylor http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_indexing.html#mmap... ;-)
Julian Taylor contributed some benchmarks he was "concerned" about, so now the collection is even better.
I will keep updating tests on the same url: http://www.onerussian.com/tmp/numpy-vbench/ [it is now running and later I will upload with more commits for higher temporal fidelity]
of particular interest for you might be: some minor consistent recent losses in http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-floa... http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-floa... http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-int1... http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-int8
memcpy on int8 seems to have lost more than 25% of performance over the timeline: http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#memcpy-int8
"fast" calls to all/any seem to have been hurt twice in their lifetime and are now running *3 times slower* than in 2011 -- the inflection points correspond to regressions and/or fixes in those functions to bring back performance on "slow" cases (when array traversal is needed, e.g. on arrays of zeros for any):
http://www.onerussian.com/tmp/numpy-vbench/vb_vb_reduce.html#numpy-all-fast http://www.onerussian.com/tmp/numpy-vbench/vb_vb_reduce.html#numpy-any-fast
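For context, the "fast" cases are the ones where any/all can bail out almost immediately (a sketch of the distinction; the arrays in the actual benchmarks may differ):

```python
import numpy as np

# "Fast" case: the very first element already decides the answer, so
# np.any can stop after the first buffer chunk instead of scanning
# the whole array.
fast_any = np.zeros(1000000, dtype=np.uint8)
fast_any[0] = 1

# "Slow" case: all zeros, so the full array must be traversed before
# np.any can return False.
slow_any = np.zeros(1000000, dtype=np.uint8)

print(np.any(fast_any), np.any(slow_any))  # True False
```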
Enjoy
and to put the findings reported so far into some kind of automated form, please welcome
http://www.onerussian.com/tmp/numpy-vbench/#benchmarks-performance-analysis
This is based on a simple 1-way ANOVA of the last 10 commits against the point in the past where a window of 10 commits had the smallest timing and was significantly different from the last 10 commits.
"Possible recent" is probably too noisy and I am not sure it is useful -- it should point to the diff closest in time (to the latest commits) where a significant excursion from current performance was detected. So per se it has nothing to do with the initially detected performance hit, but in some cases it still seems to reasonably locate commits hitting performance.
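The detection criterion above can be sketched in pure Python (a minimal 1-way ANOVA F statistic for two groups of timings; the group values here are toy data, not real benchmark numbers):

```python
# Minimal 1-way ANOVA over groups of per-commit timings: a large
# F statistic means the group means differ far more than the
# within-group noise, i.e. a likely performance change.
def f_oneway(groups):
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                     for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2
                    for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

recent = [10.1, 10.2, 10.15]   # e.g. timings (ms) of the last commits
past = [12.0, 12.1, 11.9]      # best past window of commits
F = f_oneway([recent, past])
print("F = %.1f" % F)
```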
Enjoy,
I have just added a few more benchmarks, and here they come: http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pi... It seems to be a very recent regression, so my only check, based on 10 commits, didn't pick it up yet, and thus it is not present in the summary table.
could well be related to 80% faster det()? ;)
norm was hit as well a bit earlier; it might well be within these commits: https://github.com/numpy/numpy/compare/24a0aa5...29dcc54 I will rerun benchmarking now for the rest of the commits (it was only running late in the day, iirc)
Cheers,
The biggest ~recent change in master's linalg was the switch to gufunc back ends - you might want to check for that event in your commit log. On 19 Jul 2013 23:08, "Yaroslav Halchenko" lists@onerussian.com wrote:
I have just added a few more benchmarks, and here they come
http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pi... it seems to be very recent so my only check based on 10 commits didn't pick it up yet so they are not present in the summary table.
On 20.07.2013 01:38, Nathaniel Smith wrote:
The biggest ~recent change in master's linalg was the switch to gufunc back ends - you might want to check for that event in your commit log.
That was in mid-April, which doesn't match with the location of the uptick in the graph.
Pauli
At some point I hope to tune up the report with an option of viewing the plots using e.g. the nvd3 JS library, so it would be easier to pinpoint/analyze interactively.
On Mon, Jul 22, 2013 at 10:55 AM, Yaroslav Halchenko lists@onerussian.comwrote:
At some point I hope to tune up the report with an option of viewing the plot using e.g. nvd3 JS so it could be easier to pin point/analyze interactively.
shameless plug... the soon-to-be-finalized matplotlib-1.3 has a WebAgg backend that allows for interactivity.
Cheers! Ben Root
On Mon, 22 Jul 2013, Benjamin Root wrote:
shameless plug... the soon-to-be-finalized matplotlib-1.3 has a WebAgg backend that allows for interactivity.
"that's just sick!"
do you know about any motion in the python-sphinx world on supporting it?
is there any demo page you would recommend for assessing what to expect to be supported in the upcoming webagg?
On Mon, Jul 22, 2013 at 1:28 PM, Yaroslav Halchenko lists@onerussian.comwrote:
Oldie but goodie: http://mdboom.github.io/blog/2012/10/11/matplotlib-in-the-browser-its-coming... Official Announcement: http://matplotlib.org/1.3.0/users/whats_new.html#webagg-backend
Note, this is different from what is now available in the IPython Notebook (it isn't really interactive there). As for what is supported: just about everything you can do normally can be done in WebAgg. I have no clue about sphinx-level support.
Now, back to your regularly scheduled program.
Cheers! Ben Root
On 7/19/13, Yaroslav Halchenko lists@onerussian.com wrote:
norm was hit as well a bit earlier,
Well, this is embarrassing: https://github.com/numpy/numpy/pull/3539
Thanks for benchmarks! I'm now an even bigger fan. :)
Warren
On Fri, 19 Jul 2013, Warren Weckesser wrote:
Well, this is embarrassing: https://github.com/numpy/numpy/pull/3539
Thanks for benchmarks! I'm now an even bigger fan. :)
Great to see that those came in handy! I thought to provide full details (benchmarking all recent commits) to give the exact point of regression, but embarrassingly I made that run outside of the benchmarking chroot, so consistency was not guaranteed. Anyways -- rerunning it correctly now (with recent commits included).
Added some basic constructor benchmarks: http://www.onerussian.com/tmp/numpy-vbench/vb_vb_core.html Quite a few fresh enhancements are present (cool), but also some freshly discovered elderly hits, e.g.
http://www.onerussian.com/tmp/numpy-vbench/vb_vb_core.html#numpy-identity-10... http://www.onerussian.com/tmp/numpy-vbench/vb_vb_core.html#numpy-ones-100
Cheers,
On Fri, 19 Jul 2013, Yaroslav Halchenko wrote:
I have just added a few more benchmarks, and here they come: http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pi... The regression seems very recent, so my check based on only 10 commits hasn't picked it up yet, and it is not present in the summary table.
could well be related to 80% faster det()? ;)
norm was hit as well a bit earlier; it might well be within these commits: https://github.com/numpy/numpy/compare/24a0aa5...29dcc54 I will now rerun benchmarking for the rest of the commits (it was running only on last-of-the-day commits, IIRC).
Cheers,
I am glad to announce that you can now see benchmark timing plots for multiple branches, making it possible to spot regressions in maintenance branches and to compare enhancements against previous releases.
e.g. * improving upon 1.7.x but still lagging behind 1.6.x http://www.onerussian.com/tmp/numpy-vbench/vb_vb_core.html#numpy-identity-10... http://www.onerussian.com/tmp/numpy-vbench/vb_vb_function_base.html#percenti... ...
* what seems to be a regression caught/fixed in 1.7.x: http://www.onerussian.com/tmp/numpy-vbench/vb_vb_indexing.html#a-indexes-flo...
* or not (yet) fixed in 1.7.x http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-comp...
summary table generation is not yet adjusted for these multi-branch changes, so there might be misleading results there: http://www.onerussian.com/tmp/numpy-vbench/index.html#benchmarks-performance...
Cheers,
FWIW -- updated runs of the benchmarks are available at http://yarikoptic.github.io/numpy-vbench which now also include the maintenance/1.8.x branch (no divergences detected yet). As far as I can see there are only recent improvements and no new performance regressions (though some old ones are still there, and some might be specific to my CPU here).
Cheers,
On 6 September 2013 21:21, Yaroslav Halchenko lists@onerussian.com wrote:
some old ones are still there, some might be specific to my CPU here
How long does one run take? Maybe I can run it in my machine (Intel i5) for comparison.
On Fri, 06 Sep 2013, Daπid wrote:
some old ones are still there, some might be specific to my CPU here
How long does one run take? Maybe I can run it in my machine (Intel i5) for comparison.
In the current configuration, where I "target" each benchmark run at around 200ms (thus possibly jumping up to 400ms), and thus 1-2 sec for the 3 actual runs to figure out the min among them -- on that elderly box it takes about a day to run the "end of the day" commits (IIRC around 400 of them) and then 3-4 days for a full run (all commits). I am not sure whether targeting 200ms is of any benefit over 100ms, which would run twice as fast.
you are welcome to give it a shout right away http://github.com/yarikoptic/numpy-vbench
it is still a bit ad hoc, and I also use an additional shell wrapper to set CPU affinity (taskset -cp 1) and renice the benchmarking process to -10.
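The timing methodology described above -- taking the minimum over a few repeats to suppress scheduling noise -- can be sketched with the standard library alone. The repeat count, iteration count, and example statement here are illustrative, not the actual vbench configuration:

```python
import timeit

def bench(stmt, setup="pass", repeats=3, number=1000):
    """Return the best wall time per execution over `repeats` runs.

    Taking the min (rather than the mean) discards runs that were
    slowed down by unrelated system activity, which is why repeated
    short runs are preferred over one long one.
    """
    times = timeit.repeat(stmt, setup=setup, repeat=repeats, number=number)
    return min(times) / number

# e.g. time summing a list of 1000 floats
best = bench("sum(xs)", setup="xs = [0.5] * 1000")
```

Pinning the process to one core and raising its priority (as the shell wrapper does with taskset/renice) further reduces the variance that the min-of-repeats strategy is guarding against.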
On Fri, Sep 6, 2013 at 3:21 PM, Yaroslav Halchenko lists@onerussian.com wrote:
FWIW -- updated runs of the benchmarks are available at http://yarikoptic.github.io/numpy-vbench which now include also maintenance/1.8.x branch (no divergences were detected yet). There are only recent improvements as I see and no new (but some old ones are still there, some might be specific to my CPU here) performance regressions.
You would have enough data to add some quality control bands to the charts (like cusum chart for example). Then it would be possible to send a congratulation note or ring an alarm bell without looking at all the plots.
Josef
Cheers,
On Fri, 06 Sep 2013, josef.pktd@gmail.com wrote:
You would have enough data to add some quality control bands to the charts (like cusum chart for example). Then it would be possible to send a congratulation note or ring an alarm bell without looking at all the plots.
well -- I did cook up some basic "detector" but I believe I haven't adjusted it for multiple branches yet: http://yarikoptic.github.io/numpy-vbench/#benchmarks-performance-analysis you are welcome to introduce additional (or replacement) detection goodness http://github.com/yarikoptic/vbench/blob/HEAD/vbench/analysis.py and plotting is done here I believe: https://github.com/yarikoptic/vbench/blob/HEAD/vbench/benchmark.py#L155
On Fri, Sep 6, 2013 at 1:21 PM, Yaroslav Halchenko lists@onerussian.comwrote:
FWIW -- updated runs of the benchmarks are available at http://yarikoptic.github.io/numpy-vbench which now include also maintenance/1.8.x branch (no divergences were detected yet). There are only recent improvements as I see and no new (but some old ones are still there, some might be specific to my CPU here) performance regressions.
This work is really nice. Thank you Yaroslav.
Chuck