[Numpy-discussion] Initial implementation of histogram_discrete()
Priit Laes
plaes at plaes.org
Sat Nov 14 06:53:35 EST 2009
One fine day, Fri, 2009-11-13 at 13:36, Ernest Adrogué wrote:
> 13/11/09 @ 09:41 (+0200), thus spake Priit Laes:
> > Does anyone have a scenario where one would actually have both negative
> > and positive numbers (integers) in the list?
>
> Yes: when you have a random variable that is the difference
> of two (discrete) random variables. For example, if you measure
> the difference in number of days off per week because of sickness
> between two groups of people, you would end up with a discrete
> variable with both positive and negative integers.
>
> > So, how about numpy.histogram_discrete() that returns data the way
> > histogram() does: a list containing histogram values (ie counts) and
> > list of sorted items from min(input)...max(input). ?
>
> In my humble opinion, it would be nice.
\o/
I have pushed the preliminary version to:
http://github.com/plaes/numpy/commits/histogram_discrete
It currently handles datasets with negative items, and it supports
weights. I'm also planning to add an optional range argument to the
function, but I first need to figure out how to parse range=(min, max)
using the C API... ;)
numpy.histogram_discrete() returns a list containing the histogram values
and the bins (hopefully this is the right definition):
hist, bins = numpy.histogram_discrete(data)
Example:
In [1]: import numpy
In [2]: data = numpy.random.poisson(3, 300)
In [3]: numpy.histogram_discrete(data)
Out[3]:
[array([15, 50, 72, 59, 52, 34, 8, 7, 3]),
array([0, 1, 2, 3, 4, 5, 6, 7, 8])]
In [4]:
In [5]: data = [-1, 5]
In [6]: numpy.histogram_discrete(data, weights=[2, 0])
Out[6]:
[array([ 2., 0., 0., 0., 0., 0., 0.]),
array([-1, 0, 1, 2, 3, 4, 5])]
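For anyone who wants to try the behaviour without building the branch, here is a rough pure-Python sketch that reproduces the two outputs above with the existing numpy.bincount(). It is not the C code from the branch; the name histogram_discrete_sketch is made up for illustration, and it assumes integer input and a numpy recent enough to have the minlength argument to bincount().

```python
import numpy as np

def histogram_discrete_sketch(data, weights=None):
    """Hypothetical stand-in for histogram_discrete(); integer data only."""
    data = np.asarray(data)
    if data.size == 0:
        raise ValueError("data must not be empty")
    lo = int(data.min())
    hi = int(data.max())
    # bincount() only accepts non-negative integers, so shift the data so
    # that the smallest item lands in slot 0.
    hist = np.bincount(data - lo, weights=weights, minlength=hi - lo + 1)
    # One bin per integer from min(data) to max(data), inclusive.
    bins = np.arange(lo, hi + 1)
    return [hist, bins]

hist, bins = histogram_discrete_sketch([-1, 5], weights=[2, 0])
```

As in the session above, the counts come back as floats when weights are given and as integers otherwise, because that is what bincount() does.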
Priit :)