[Numpy-discussion] histogram2d bug?

David Huard david.huard at gmail.com
Thu Apr 19 14:18:33 EDT 2007


Hi Emanuele,

The bug comes from the part of the code that shifts the last bin's edge so
that the array's maximum value is counted in the last bin, and not as an
outlier. To do so, the code computes an approximate precision, which is used
to shift the bin edge by an amount that is small compared to the array's
values. In your example, since all values in x are identical, the bin widths
are zero, log10 of the smallest width diverges, and the precision is
``infinite''. So my question is: what behaviour would you expect in this
case for the automatic placement of bin edges?
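
To make the failure concrete, here is a minimal sketch of the precision
computation for an array of identical values. It is a reconstruction based on
the line `decimal = int(-log10(dedges[i].min())) +6` shown in the traceback
below, not the actual numpy source; the names `nbin`, `precision` and `failed`
are only for illustration. On a current Python the conversion to int itself
raises OverflowError, whereas in the report below the error only surfaces
later, inside around().

```python
import numpy as np

# Illustrative reconstruction, assuming edges are spread evenly over [smin, smax].
x = np.array([0.0, 0.0])
nbin = 2
smin, smax = x.min(), x.max()
edges = np.linspace(smin, smax, nbin + 1)   # all edges coincide at 0.0
dedges = np.diff(edges)                     # bin widths are all zero

with np.errstate(divide="ignore"):          # silence the log10(0) warning
    precision = -np.log10(dedges.min())     # -(-inf) = +inf

try:
    decimal = int(precision) + 6            # cannot convert infinity to int
    failed = False
except OverflowError:
    failed = True
```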

That is, given
x : array of identical values, e.g. [0, 0, 0, 0, 0, ..., 0]
smin, smax = x.min(), x.max()
how do you select the bin edges?

One solution is to use the same scheme used by histogram:
if smin == smax:
    edges[i] = linspace(smin-.5, smax+.5, nbin[i]+1)

Would that be OK?
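
As a rough sketch of what that would look like for a single dimension
(the helper name `auto_edges` is hypothetical and this is not the patch
itself, just the scheme above wrapped in a function):

```python
import numpy as np

def auto_edges(values, nbin):
    """Return nbin + 1 bin edges covering values (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    smin, smax = values.min(), values.max()
    if smin == smax:
        # Degenerate range: widen it by half a unit on each side so that
        # every bin has a non-zero width.
        smin, smax = smin - 0.5, smax + 0.5
    return np.linspace(smin, smax, nbin + 1)

# Identical values no longer collapse all edges onto a single point.
edges = auto_edges([0, 0, 0, 0, 0], 2)
```

With this fallback the bin widths stay strictly positive, so the log10 in
the precision computation remains finite.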

David


I'll submit a patch.

2007/4/19, Emanuele Olivetti <emanuele at relativita.com>:
>
> An even simpler example generating the same error:
>
> import numpy
> x = numpy.array([0,0])
> numpy.histogram2d(x,x)
>
>
> HTH,
>
> Emanuele
>
> Emanuele Olivetti wrote:
> > While using histogram2d on simple examples I got these errors:
> >
> > import numpy
> > x = numpy.array([0,0])
> > y = numpy.array([0,1])
> > numpy.histogram2d(x,y,bins=[2,2])
> > -----------------------------------------------------------------
> > Warning: divide by zero encountered in log10
> >
> > ---------------------------------------------------------------------------
> > exceptions.OverflowError                             Traceback (most recent call last)
> >
> > /home/ele/<ipython console>
> >
> > /usr/lib/python2.4/site-packages/numpy/lib/twodim_base.py in histogram2d(x, y, bins, range, normed, weights)
> >     180     if N != 1 and N != 2:
> >     181         xedges = yedges = asarray(bins, float)
> >     182         bins = [xedges, yedges]
> > --> 183     hist, edges = histogramdd([x,y], bins, range, normed, weights)
> >     184     return hist, edges[0], edges[1]
> >
> > /usr/lib/python2.4/site-packages/numpy/lib/function_base.py in histogramdd(sample, bins, range, normed, weights)
> >     206         decimal = int(-log10(dedges[i].min())) +6
> >     207         # Find which points are on the rightmost edge.
> > --> 208         on_edge = where(around(sample[:,i], decimal) == around(edges[i][-1], decimal))[0]
> >     209         # Shift these points one bin to the left.
> >     210         Ncount[i][on_edge] -= 1
> >
> > /usr/lib/python2.4/site-packages/numpy/core/fromnumeric.py in round_(a, decimals, out)
> >     687     except AttributeError:
> >     688         return _wrapit(a, 'round', decimals, out)
> > --> 689     return round(decimals, out)
> >     690
> >     691 around = round_
> >
> > OverflowError: long int too large to convert to int
> > -----------------
> >
> > numpy.__version__
> > '1.0.3.dev3719'
> >
> > Hope this report helps,
> >
> > Emanuele
> >
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
>