[Numpy-discussion] Re : [Newbie] Fast plotting

Jean-Baptiste Rudant boogaloojb at yahoo.fr
Tue Jan 6 08:38:27 EST 2009


Hello,

I'm not an expert, but matplotlib has a function for this (matplotlib.mlab.rec_groupby). It's not very efficient, though.

import matplotlib.mlab
import numpy

N = 1000
X = numpy.random.randint(0, 10, N)  # parameter values (group keys)
Y = numpy.random.random(N)          # measured values
recXY = numpy.rec.fromarrays((X, Y), names='x, y')
# Group by 'x' and store the mean of 'y' in a new field 'y_avg'
summary = matplotlib.mlab.rec_groupby(recXY, ('x',), (('y', numpy.mean, 'y_avg'),))
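
For arrays with millions of elements, a NumPy-only approach may be worth trying. The following is only a sketch, not something from matplotlib: the helper name group_mean is made up for illustration, and it assumes X and Y are one-dimensional arrays of the same length. It relies on numpy.unique (with return_inverse=True) to label each point with the index of its distinct x value, and on numpy.bincount to sum and count the y values per label.

```python
import numpy as np

def group_mean(x, y):
    """Return (distinct x values, mean of y for each distinct x).

    Hypothetical helper for illustration; assumes 1-D inputs of equal length.
    """
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    # keys: sorted distinct x values
    # inverse: for each point, the index of its x value within keys
    keys, inverse = np.unique(x, return_inverse=True)
    inverse = inverse.ravel()
    # Sum of y per distinct x, and number of points per distinct x
    sums = np.bincount(inverse, weights=y, minlength=keys.size)
    counts = np.bincount(inverse, minlength=keys.size)
    return keys, sums / counts

# Small usage example with made-up data
new_x, new_y = group_mean([1, 2, 1, 3, 2], [1.0, 2.0, 3.0, 4.0, 6.0])
```

Here new_x holds the distinct x values in sorted order and new_y the matching means, so they can be passed directly to a plotting call.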

Jean-Baptiste Rudant




________________________________
From: Franck Pommereau <pommereau at univ-paris12.fr>
To: Discussion of Numerical Python <numpy-discussion at scipy.org>
Sent: Tuesday, 6 January 2009, 10:35:01
Subject: [Numpy-discussion] [Newbie] Fast plotting

Hi all, and happy new year!

I'm new to NumPy and looking for a way to compute, from a set of points
(x, y), the mean of the y values associated with each distinct x value.
Each point corresponds to a measurement in a benchmark (x = parameter, y =
computation time) and I'd like to plot the mean computation time against
the parameter values. (I know how to plot, but not how to compute the
mean values.)

My points are stored as two arrays X, Y (same size).
In pure Python, I'd do as follows:

from numpy import array

s = {}  # sum of y values for each distinct x (as keys)
n = {}  # number of summed values (same keys)
for x, y in zip(X, Y):
    s[x] = s.get(x, 0.0) + y
    n[x] = n.get(x, 0) + 1
new_x = array(sorted(s))
new_y = array([s[x] / n[x] for x in sorted(s)])

Unfortunately, this code is much too slow because my arrays have
millions of elements. But I'm pretty sure that NumPy offers a way to
handle this more elegantly and much faster.

As a bonus, I'd be happy if the solution also allowed me to compute the
standard deviation, min, max, etc.
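
One possible way to get these extra statistics with NumPy alone is sketched below. It is an illustration under assumptions, not a tested solution for the benchmark data: the helper name group_stats is hypothetical, the inputs are assumed to be one-dimensional arrays of equal length, and the standard deviation computed is the population one (dividing by the group size). The idea is to sort the points by x so that each group is contiguous, then reduce each contiguous block with ufunc.reduceat.

```python
import numpy as np

def group_stats(x, y):
    """Per-distinct-x mean, std, min and max of y (hypothetical helper)."""
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    # Sort by x so that all points sharing an x value are adjacent
    order = np.argsort(x, kind="stable")
    xs, ys = x[order], y[order]
    # starts[i]: index in xs where the i-th distinct x value begins
    # counts[i]: number of points with that x value
    keys, starts, counts = np.unique(xs, return_index=True, return_counts=True)
    mean = np.add.reduceat(ys, starts) / counts
    # Population standard deviation: root of the mean squared deviation
    dev = ys - np.repeat(mean, counts)
    std = np.sqrt(np.add.reduceat(dev * dev, starts) / counts)
    return {
        "x": keys,
        "mean": mean,
        "std": std,
        "min": np.minimum.reduceat(ys, starts),
        "max": np.maximum.reduceat(ys, starts),
    }

# Small usage example with made-up data
stats = group_stats([1, 2, 1, 3, 2], [1.0, 2.0, 3.0, 4.0, 6.0])
```

Each entry of the returned dictionary is an array aligned with stats["x"], so any of them can be plotted against the distinct parameter values.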

Thanks in advance for any help!
Franck
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion



      