[Numpy-discussion] Initial implementation of histogram_discrete()

Priit Laes plaes at plaes.org
Sat Nov 14 06:53:35 EST 2009


On one fine day, Fri, 2009-11-13 at 13:36, Ernest Adrogué wrote:
> 13/11/09 @ 09:41 (+0200), thus spake Priit Laes:
> > Does anyone have a scenario where one would actually have both negative
> > and positive numbers (integers) in the list?
> 
> Yes: when you have a random variable that is the difference
> of two (discrete) random variables. For example, if you measure
> the difference in number of days off per week because of sickness
> between two groups of people, you would end up with a discrete
> variable with both positive and negative integers.
> 
> > So, how about numpy.histogram_discrete() that returns data the way
> > histogram() does: a list containing histogram values (ie counts) and
> > list of sorted items from min(input)...max(input). ?
> 
> In my humble opinion, it would be nice.
\o/

I have pushed the preliminary version to:
http://github.com/plaes/numpy/commits/histogram_discrete

It can currently handle datasets containing negative items, and it
supports weights. I'm also planning to add an optional range argument
to the function, but I first need to figure out how to parse
range=(min, max) using the C API... ;)

numpy.histogram_discrete() returns a list containing the histogram
values (counts) and the bins (hopefully this is the right definition):

hist, bins = numpy.histogram_discrete(data)

Example:
In [1]: import numpy
In [2]: data = numpy.random.poisson(3, 300)
In [3]: numpy.histogram_discrete(data)
Out[3]: 
[array([15, 50, 72, 59, 52, 34,  8,  7,  3]),
 array([0, 1, 2, 3, 4, 5, 6, 7, 8])]
In [4]:
In [5]: data = [-1, 5]
In [6]: numpy.histogram_discrete(data, weights=[2, 0])
Out[6]: 
[array([ 2.,  0.,  0.,  0.,  0.,  0.,  0.]),
 array([-1,  0,  1,  2,  3,  4,  5])]
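
For anyone who wants to try the behaviour without building the branch,
here is a rough pure-Python sketch of the same idea. It is not the C
implementation from the branch; the name histogram_discrete_sketch is
made up for illustration, and it leans on numpy.bincount after shifting
the data so that the smallest value lands at index 0. It returns a
tuple rather than a list.

```python
import numpy as np


def histogram_discrete_sketch(data, weights=None):
    """Hypothetical stand-in for histogram_discrete(), not the branch code.

    Returns (counts, bins), where bins covers every integer from
    min(data) to max(data), including values that never occur.
    """
    data = np.asarray(data).ravel()
    if data.size == 0:
        raise ValueError("data must not be empty")
    if not np.issubdtype(data.dtype, np.integer):
        raise TypeError("data must contain integers")

    lo, hi = int(data.min()), int(data.max())

    # np.bincount only accepts non-negative integers, so shift the data
    # so that the smallest value maps to index 0.
    counts = np.bincount(data - lo, weights=weights, minlength=hi - lo + 1)
    bins = np.arange(lo, hi + 1)
    return counts, bins


# Same input as the weighted example above: all of the weight falls on -1.
hist, bins = histogram_discrete_sketch([-1, 5], weights=[2, 0])

# Difference of two discrete variables, which yields negative and
# positive integers in the same dataset.
rng = np.random.default_rng(0)
diff = rng.poisson(3, 300) - rng.poisson(3, 300)
diff_hist, diff_bins = histogram_discrete_sketch(diff)
```

Without weights the counts come back as integers; with weights
numpy.bincount returns floats, which matches the Out[6] output above.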


Priit :)


