[Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)
Bruce Southey
bsouthey at gmail.com
Mon Aug 30 11:39:51 EDT 2010
On 08/30/2010 09:19 AM, Benjamin Root wrote:
> On Mon, Aug 30, 2010 at 8:29 AM, David Huard <david.huard at gmail.com>
> wrote:
>
> Thanks for the feedback,
>
> As far as I understand it, the proposal is to keep histogram as
> it is for 1.5, then in 2.0, deprecate normed=True but keep the
> buggy behavior, while adding a density keyword that fixes the bug.
> In a later release, we could then get rid of normed. While the bug
> is not present in histogramdd and histogram2d, the keyword
> change should be mirrored in those functions as well.
>
> I personally am not too keen on changing the keyword from normed to
> density. I feel we are trading clarity for a few new users against
> additional trouble for many existing users. We could mitigate this
> by first documenting the change in the docstring and living with
> both keywords for a few years before raising a DeprecationWarning.
>
> Since this has a direct impact on matplotlib's hist, I'd be keen to
> hear from the devs on this.
>
> David
>
>
> I am not a dev, but I would like to give a word of warning from
> matplotlib.
>
> In matplotlib, the bar/hist family of functions grew organically as
> the devs took on various requests to add keywords and such to modify
> the style and behavior of those graphing functions. It has now become
> an unmaintainable mess, prompting discussions on how to rip it out and
> replace it with a cleaner implementation. While everyone agrees that
> it needs to be done, none of us wants to break backwards compatibility.
>
> My personal feeling is that a function should do one thing, and do
> that one thing well. So, to me, that means that histogram() should
> return an array of counts and the bins for those counts. Anything
> more is merely window dressing to me. With this information, one can
> easily compute a cumulative distribution function, and/or normalize
> the result. The idea is that if there is nothing special that needs
> to be done within the histogram algorithm to accommodate these extra
> features, then they belong outside the function.
>
> My 2 cents,
> Ben Root
>
+1 for Ben's approach.
This is very similar to my view regarding the contingency table class
proposed for scipy (http://projects.scipy.org/scipy/ticket/1258). We
need to provide the core functionality that other approaches, such as
density estimation, can build on, without tying it to the specific
details of any one of them.
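
As a rough sketch of what Ben describes, both the normalization and the
cumulative distribution can be derived outside histogram() from nothing
more than the counts and the bin edges. The sample data and the unequal
bin edges below are made up purely for illustration:

```python
import numpy as np

# Hypothetical sample data and deliberately unequal bin edges.
data = np.array([0.5, 1.5, 1.7, 2.2, 2.8, 3.1, 6.0, 9.0])
edges = np.array([0.0, 1.0, 2.0, 4.0, 10.0])

# Core functionality: raw counts plus the bin edges, nothing else.
counts, bins = np.histogram(data, bins=edges)

# Normalize to a density: divide by the total count and by each bin's
# width, so that the area under the histogram is 1 even when the bins
# are not all the same width.
widths = np.diff(bins)
density = counts / (counts.sum() * widths)

# Cumulative distribution evaluated at the right edge of each bin.
cdf = np.cumsum(counts) / counts.sum()

print(counts)
print(density)
print(cdf)
```

Dividing by the bin widths is the step that matters for unequal bins;
with counts and edges in hand, the caller can choose whichever
normalization suits the application.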
Bruce