Normalized histogram for data ranges 0 .. 1 returns PDF > 1

Hi all, I'm using numpy.histogram with normed=True with 1D data ranging 0 .. 1. The results return probabilities greater than 1. The trapezoidal integral returns 1, but I'm afraid this is due to the bin assigned values. Example follows:
from numpy import *
a = arange(0, 1, 0.1)
histogram(a, normed=True)
(array([ 1.11111111, 1.11111111, 1.11111111, 1.11111111, 1.11111111,
         1.11111111, 1.11111111, 1.11111111, 1.11111111, 1.11111111]),
 array([ 0.  , 0.09, 0.18, 0.27, 0.36, 0.45, 0.54, 0.63, 0.72, 0.81, 0.9 ]))
Is that normal? If not, has anyone encountered that before? Ideas welcome!
Thanks,
Manos

On Tue, Feb 2, 2010 at 8:05 AM, Manos Tsagias <tsagias@gmail.com> wrote:
> Hi all, I'm using numpy.histogram with normed=True with 1D data ranging 0 .. 1. The results return probabilities greater than 1. The trapezoidal integral returns 1, but I'm afraid this is due to the bin assigned values. Example follows:
>
> from numpy import *
> a = arange(0, 1, 0.1)
> histogram(a, normed=True)
> (array([ 1.11111111, 1.11111111, 1.11111111, 1.11111111, 1.11111111,
>          1.11111111, 1.11111111, 1.11111111, 1.11111111, 1.11111111]),
>  array([ 0.  , 0.09, 0.18, 0.27, 0.36, 0.45, 0.54, 0.63, 0.72, 0.81, 0.9 ]))
>
> Is that normal? If not, has anyone encountered that before? Ideas welcome!
> Thanks,
> Manos
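histogram with normed=True returns the values of a probability density function (pdf) of a continuous random variable, not the probabilities of a discrete one. The pdf of a continuous distribution can take any value greater than or equal to zero; only its integral has to equal 1. On [0, 1] the density must exceed 1 somewhere in order to integrate to 1, unless the distribution is uniform. In your example the bins span [0, 0.9] and each bin is 0.09 wide, so each bin holds a probability of 0.1 and the density is 0.1 / 0.09 = 1.11.
It's a sometimes-asked question; there are more explanations in the mailing list archives.
Josef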
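To make the density interpretation concrete, here is a minimal sketch (not part of the original reply) that checks the area under the histogram and recovers per-bin probabilities. It assumes a recent NumPy, where the `normed` argument has been replaced by `density`; the variable names are purely illustrative.

```python
import numpy as np

# Same data as in the question: 0.0, 0.1, ..., 0.9
a = np.arange(0, 1, 0.1)

# density=True is the current spelling of the old normed=True (assumption:
# a NumPy version that supports the `density` keyword).
density, edges = np.histogram(a, bins=10, density=True)

# The bins span [min(a), max(a)] = [0, 0.9], so each one is 0.09 wide.
widths = np.diff(edges)

# A density integrates to 1: sum of (height * width) over all bins.
total_area = np.sum(density * widths)

# If per-bin probabilities are wanted instead, normalize the raw counts.
counts, _ = np.histogram(a, bins=10)
probs = counts / counts.sum()

print(density)     # heights, each about 1.11
print(total_area)  # about 1.0
print(probs)       # each about 0.1
```

The heights exceed 1 only because the bins are narrower than 1; multiplying each height by its bin width gives back a probability that cannot exceed 1.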
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion