[SciPy-user] difference between different mean()s
Hello,

I was wondering what the difference is between numpy.mean() and the scipy.ndimage.mean() method? I get different answers. Also, is there a difference between the different std() and median() methods, etc.? I am applying the methods to 2-D arrays.

Cheers,
Magnus

-- View this message in context: http://www.nabble.com/difference-between-different-mean%28%29s-tp25383547p25... Sent from the Scipy-User mailing list archive at Nabble.com.
Can you show us a minimal example where you get different behavior? I'm getting the same result for simple 2-D arrays like

    x = np.arange(100)
    x.shape = (10, 10)
    scipy.ndimage.mean(x) -> 49.5
    np.mean(x) -> 49.5

Max

(in reply to n.l.o <magnusp@astro.su.se>, Thu, Sep 10, 2009 at 8:43 AM)
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
Ehum, then I must have done something wrong. I get the same results with your example.

But when I use my data, i.e. a cube of shape (30, 512, 512), and run the different mean() functions, I get different answers. Not with your example data, though (taking the shape to be (10, 2, 5) or something).

(data at http://magnusp.homeip.net/data0.fits)

code:

    a = pyfits.getdata('data0.fits')
    a.mean()
    90.328727213541669
    ndimage.mean(a)
    93.617742029825848

Weird, or is it just me again?
Thu, 10 Sep 2009 13:24:19 -0700, n.l.o wrote:
Ehum, then I must have done something wrong. I get the same results with your example.
But when I use my data, i.e. a cube of shape (30, 512, 512), and run the different mean() functions, I get different answers. Not with your example data, though (taking the shape to be (10, 2, 5) or something).
(data at http://magnusp.homeip.net/data0.fits) code:
    a = pyfits.getdata('data0.fits')
    a.mean()
    90.328727213541669
    ndimage.mean(a)
    93.617742029825848
weird, or is it just me again?
You have 32-bit single-precision float data, so numpy.mean uses a 32-bit float accumulator to compute the mean. If you use doubles (64-bit) for the accumulator, you get the same result as ndimage (which also uses doubles):

    a.mean(dtype=np.float64)
    93.617742029825848
I think a remark on this should be added to the documentation for mean() and other accumulator methods -- it's sort of a trap for the unwary. -- Pauli Virtanen
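As a rough illustration of the dtype argument discussed above, here is a minimal sketch. The array is a hypothetical stand-in for the FITS cube (same shape and dtype, random values), so the numbers will not match the ones in this thread. Also note that newer NumPy releases sum in a pairwise fashion in many cases, so the gap between the two results may be far smaller than the one reported here.

```python
import numpy as np

# Hypothetical stand-in for the data cube: same shape and dtype, random values.
rng = np.random.default_rng(0)
a = rng.random((30, 512, 512), dtype=np.float32)

# Default: for float32 input the accumulator (and the result) is float32.
mean_default = a.mean()

# Explicitly request a double-precision accumulator.
mean_double = a.mean(dtype=np.float64)

print(mean_default.dtype, mean_default)
print(mean_double.dtype, mean_double)
print("difference:", float(mean_double) - float(mean_default))
```

The result of the second call is the one to trust; the size of the printed difference depends on the data and on the NumPy version.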
(Please keep this discussion on the list, thanks!) Fri, 11 Sep 2009, n.l.o wrote:
Fri 11 Sep 2009, Pauli Virtanen wrote:
a.mean(dtype=np.float64) 93.617742029825848
Thanks for the very informative reply. So if I understand correctly, I should use the 'a.mean()' method, since it uses the same dtype; 'float32'?
No, you should specify a higher-accuracy accumulator, or use the ndimage routine. 93.6177 is the more correct answer.

This is a generic floating-point issue: if you do (in C)

    float item[N];
    float c = 0;
    for (k = 0; k < N; ++k) { c += item[k]; }
    c /= N;

you get a less accurate answer than with

    float item[N];
    double c = 0;
    for (k = 0; k < N; ++k) { c += item[k]; }
    c /= N;

because of the accumulated loss of precision in the + operations.

-- Pauli Virtanen
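For anyone who wants to try the two accumulator loops without a C compiler, here is a minimal pure-Python sketch. It emulates single precision by rounding through the struct module after every operation. The helper names, the test value 0.1 and the array length are assumptions chosen for illustration, not something taken from the thread.

```python
import struct

def to_float32(x):
    """Round a Python float (a C double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def mean_float32_accumulator(items):
    """Mean with every intermediate result rounded to float32."""
    total = 0.0
    for value in items:
        total = to_float32(total + value)
    return to_float32(total / len(items))

def mean_float64_accumulator(items):
    """Mean of the same items, accumulated in double precision."""
    total = 0.0
    for value in items:
        total += value
    return total / len(items)

# Hypothetical data: 100000 copies of 0.1 stored as float32.
N = 100_000
items = [to_float32(0.1)] * N

mean32 = mean_float32_accumulator(items)
mean64 = mean_float64_accumulator(items)

print("float32 accumulator:", mean32)
print("float64 accumulator:", mean64)
print("stored item value:  ", items[0])
```

With the double accumulator the mean comes back as the stored item value, while the float32 accumulator drifts away from it because each addition is rounded to the spacing of the growing running total.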
participants (3)
- Max Shron
- n.l.o
- Pauli Virtanen