[Numpy-discussion] numpy.mean still broken for large float32 arrays

Eelco Hoogendoorn hoogendoorn.eelco at gmail.com
Mon Jul 28 10:31:41 EDT 2014


Sebastian:

Those are good points. Indeed, iteration order may already produce different
results, even though numpy's semantics suggest identical operations. Still, I
feel this divergent behavior, with no semantic clue in the code that triggers
it, is something to be minimized.
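To make the order-dependence concrete, here is a small sketch (my own example, not from this thread) comparing a naive left-to-right float32 accumulation against numpy's built-in reduction, which uses pairwise (divide-and-conquer) summation on contiguous data:

```python
import numpy as np

# one million copies of 0.1 in float32; the exact mean is ~0.1
a = np.full(10**6, 0.1, dtype=np.float32)

# naive sequential accumulation in float32: rounding error builds up
# once the running sum dwarfs the individual terms
naive = np.float32(0.0)
for x in a:
    naive += x

# numpy's reduction uses pairwise summation along the contiguous axis,
# which keeps the error near machine precision
pairwise = a.sum()

# the naive mean typically drifts visibly away from 0.1, while the
# pairwise mean stays close to it
print(naive / a.size, pairwise / a.size)
```

Same dtype, same data, same mathematical operation; only the order of the additions differs.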

Indeed copying might have large speed implications. But on second thought,
does it? Either the data is already aligned and no copy is required, or it
isn't aligned, and we need one pass of cache-inefficient access to the data
anyway. In fact, if we had one low-level function that performs
cache-intelligent transposition of numpy arrays (using some block strategy),
it might even be faster than the current simple reduction loops when those
are forced to work on awkwardly aligned data. Ideally, this intelligent
access and the reduction itself would happen in a single pass, of course;
but that wouldn't really fit within the numpy design, and merely having such
an intelligent transpose would provide most of the benefit, I think. Or is
the mechanism behind ascontiguousarray already intelligent in this sense?
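For what it's worth, the blocked transpose I have in mind could look roughly like this (a sketch under my own assumptions, not numpy internals; the function name and block size are made up):

```python
import numpy as np

def blocked_transpose(a, block=64):
    """Return a C-contiguous copy of a.T, built one small tile at a time,
    so that both the reads from `a` and the writes to the output stay
    within cache-friendly regions."""
    n, m = a.shape
    out = np.empty((m, n), dtype=a.dtype)
    for i in range(0, n, block):
        for j in range(0, m, block):
            # each tile is small enough to sit in cache while it is
            # read from `a` and written into `out`
            out[j:j + block, i:i + block] = a[i:i + block, j:j + block].T
    return out
```

As far as I know, ascontiguousarray just performs a plain strided copy without any such blocking, so something along these lines could plausibly help in the awkward-stride case.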


On Mon, Jul 28, 2014 at 4:06 PM, Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote:
> > On 28/07/14 15:21, alex wrote:
> >
> > > Are you sure they always give different results?  Notice that
> > > np.ones((N,2)).mean(0)
> > > np.ones((2,N)).mean(1)
> > > compute means of different axes on transposed arrays so these
> > > differences 'cancel out'.
> >
> > They will be if different algorithms are used. np.ones((N,2)).mean(0)
> > will have larger accumulated rounding error than np.ones((2,N)).mean(1),
> > if only the latter uses the divide-and-conquer summation.
> >
>
> What I wanted to point out is that to some extent the algorithm does not
> matter. You will not necessarily get identical results if you use a
> different iteration order, and we have been doing that for years for
> speed reasons. All libs, BLAS included, do the same.
> Yes, the new changes make this much more dramatic, but they only make
> some paths much better, never worse. It might be dangerous, but only in
> the sense that you test it with the good path and it works well enough,
> but later (also) use the other one in some lib. I am not even sure if I
>
> > I would suggest that in the first case we try to copy the array to a
> > temporary contiguous buffer and use the same divide-and-conquer
> > algorithm, unless some heuristics on memory usage fails.
> >
>
> Sure, but you have to make major changes to the buffered iterator to do
> that without larger speed implications. It might be a good idea, but it
> requires someone who knows this stuff to spend a lot of time and care in
> the depths of numpy.
>
> > Sturla
> >
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
