zscore axis functionality is borked
axis=0 (the default) works fine. axis=1, etc, is clearly wrong. Am I misunderstanding how to use this, or is this a bug?
    In [16]: i = rand(4,4)
    In [17]: i
    Out[17]:
    array([[ 0.85367762, 0.25348857, 0.23572615, 0.50403358],
           [ 0.70199066, 0.81872151, 0.47357357, 0.20425537],
           [ 0.31042673, 0.25837984, 0.73550134, 0.57970176],
           [ 0.42828877, 0.60988596, 0.04059321, 0.73944219]])
    In [18]: zscore(i, axis=0)
    Out[18]:
    array([[ 1.30128758, -0.96195723, -0.52119142, -0.01453907],
           [ 0.59653471, 1.38544585, 0.39284654, -1.55756529],
           [-1.22271057, -0.94164388, 1.39942427, 0.37494213],
           [-0.67511172, 0.51815526, -1.27107939, 1.19716222]])
    In [19]: zscore(i[:,0])
    Out[19]: array([ 1.30128758, 0.59653471, -1.22271057, -0.67511172])
    In [20]: zscore(i[:,0])==zscore(i,axis=0)[:,0]
    Out[20]: array([ True, True, True, True], dtype=bool)
    In [21]: zscore(i, axis=1)
    Out[21]:
    array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906],
           [-1.6379836 , -1.52125275, -1.86640069, -2.13571889],
           [-2.09968257, -2.15172946, -1.67460796, -1.83040754],
           [-1.29796925, -1.11637205, -1.68566481, -0.98681582]])
    #The above is obviously wrong, as everything has a negative z score
    In [22]: zscore(i[0,:])
    Out[22]: array([ 1.56824016, -0.83321371, -0.90428403, 0.16925757])
    In [23]: zscore(i[0,:])==zscore(i,axis=1)[0,:]
    Out[23]: array([False, False, False, False], dtype=bool)
    #Using axis=1 produces different results from taking a row directly.
    In [24]: zscore(i, axis=-1)
    Out[24]:
    array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906],
           [-1.6379836 , -1.52125275, -1.86640069, -2.13571889],
           [-2.09968257, -2.15172946, -1.67460796, -1.83040754],
           [-1.29796925, -1.11637205, -1.68566481, -0.98681582]])
    #Getting rows by using axis=-1 is no better (this is the same result as axis=1)
On Wed, Nov 30, 2011 at 3:25 PM, Alacast <alacast@gmail.com> wrote:
> axis=0 (the default) works fine. axis=1, etc, is clearly wrong. Am I misunderstanding how to use this, or is this a bug?
This looks like a serious bug to me. I don't know what happened here.
The docstring example also has negative numbers only.
???
I'm looking into it
Thanks for reporting
Josef
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
On Wed, Nov 30, 2011 at 3:45 PM, <josef.pktd@gmail.com> wrote:
> I'm looking into it
a misplaced axis: if axis>0 then it calculates x - mean/std instead of (x - mean) / std
now, how did this go through the testing?
Josef
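To make the precedence issue concrete, here is a minimal sketch in plain Python (standard library only, not SciPy's code) that applies both expressions to the first row of the array from the original post. It assumes the population standard deviation (ddof=0), which is consistent with the numbers shown in the thread. Up to rounding, the first list should reproduce the first row of Out[21] and the second list should reproduce Out[22].

```python
from statistics import fmean, pstdev

# First row of the array posted at the top of the thread.
row = [0.85367762, 0.25348857, 0.23572615, 0.50403358]

mean = fmean(row)
std = pstdev(row)  # population standard deviation (ddof=0), an assumption

# Division binds tighter than subtraction, so this subtracts the single
# number mean/std from every element instead of standardizing.
misplaced = [x - mean / std for x in row]

# Parenthesizing the subtraction gives the usual z-score.
zscores = [(x - mean) / std for x in row]

print(misplaced)
print(zscores)
```

Because mean/std is larger than every entry of the row here, all values in the first list come out negative, which matches the symptom in the report.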
On Wed, Nov 30, 2011 at 2:54 PM, <josef.pktd@gmail.com> wrote:
> a misplaced axis: if axis>0 then it calculates x - mean/std instead of (x - mean) / std
> now, how did this go through the testing ?
There is only one test for zscore, on a 1-d sample without the axis keyword.
Warren
On Wed, Nov 30, 2011 at 4:02 PM, Warren Weckesser <warren.weckesser@enthought.com> wrote:
> There is only one test for zscore, on a 1-d sample without the axis keyword.
which just shows that we shouldn't trust changesets that say
"stats: rewrite of zscore functions, ticket:1083 regression tests pass, still need tests for enhancements"
http://projects.scipy.org/scipy/changeset/6169
my mistake (maybe January 2nd wasn't a good day.)
Josef
On Wed, Nov 30, 2011 at 3:05 PM, <josef.pktd@gmail.com> wrote:
> which just show that we shouldn't trust changesets that say
> "stats: rewrite of zscore functions, ticket:1083 regression tests pass, still need tests for enhancements"
> http://projects.scipy.org/scipy/changeset/6169
Thanks for the link. Looks like zmap has the same bug. :(
Warren
On Wed, Nov 30, 2011 at 4:10 PM, Warren Weckesser <warren.weckesser@enthought.com> wrote:
> Thanks for the link. Looks like zmap has the same bug. :(
copy paste errors?
I just don't know why I didn't do basic checks like this in the final version
    >>> assert_equal(zscore(x.T, axis=0).T, zscore(x, axis=1))
    >>> a = zscore(x, axis=1)
    >>> a.var(1)
    array([ 1., 1., 1., 1.])
    >>> a.mean(1)
    array([ 0.00000000e+00, -1.11022302e-16, 0.00000000e+00, 1.94289029e-16])
Josef
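The checks above can be bundled into a small reusable test. The following is only a sketch: `zscore_ref`, `zscore_misplaced`, and `check_axis_consistency` are hypothetical names written for this illustration with NumPy, not SciPy's implementation or its test suite. `zscore_misplaced` imitates the behaviour described earlier in the thread (the unparenthesized expression is used only when axis is nonzero), so it should fail the checks while the reference version passes.

```python
import numpy as np
from numpy.testing import assert_allclose


def zscore_ref(a, axis=0):
    """Hypothetical reference z-score (population std), not SciPy's code."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=axis, keepdims=True)
    std = a.std(axis=axis, keepdims=True)
    return (a - mean) / std


def zscore_misplaced(a, axis=0):
    """Imitation of the reported bug: wrong expression for nonzero axis."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=axis, keepdims=True)
    std = a.std(axis=axis, keepdims=True)
    if axis:
        return a - mean / std  # missing parentheses around (a - mean)
    return (a - mean) / std


def check_axis_consistency(zscore_func, x):
    """Properties a correct z-score along axis=1 of a 2-d array must have."""
    z = zscore_func(x, axis=1)
    # Transposing and standardizing along axis=0 must give the same numbers.
    assert_allclose(zscore_func(x.T, axis=0).T, z)
    # Each row should have zero mean and unit variance.
    assert_allclose(z.mean(axis=1), 0.0, atol=1e-12)
    assert_allclose(z.var(axis=1), 1.0)


rng = np.random.default_rng(0)
x = rng.random((4, 4))

check_axis_consistency(zscore_ref, x)  # passes silently

try:
    check_axis_consistency(zscore_misplaced, x)
except AssertionError:
    print("misplaced-parenthesis version fails the axis checks")
```

To test a real implementation, one would pass it in place of `zscore_ref`, for example `check_axis_consistency(scipy.stats.zscore, x)`. The sketch uses `assert_allclose` rather than exact equality because the two code paths may differ by floating-point rounding.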
On Wed, Nov 30, 2011 at 3:25 PM, <josef.pktd@gmail.com> wrote:
> I just don't know why I didn't do basic checks like this in the final version
> assert_equal(zscore(x.T, axis=0).T, zscore(x, axis=1))
Ticket: http://projects.scipy.org/scipy/ticket/1575
Pull request: https://github.com/scipy/scipy/pull/116
Warren
On Sat, Dec 17, 2011 at 11:20 AM, Warren Weckesser <warren.weckesser@enthought.com> wrote:
> Ticket: http://projects.scipy.org/scipy/ticket/1575
> Pull request: https://github.com/scipy/scipy/pull/116
Thanks Warren, good to see you and Ralf taking care of stats.
Josef
On Wed, Nov 30, 2011 at 2:45 PM, <josef.pktd@gmail.com> wrote:
> The docstring example also has negative numbers only.
> ???
> I'm looking into it
This is a bug in zscore. There is a misplaced parenthesis in the code. This
    return ((a - np.expand_dims(mns, axis=axis) / np.expand_dims(sstd,axis=axis)))
should be this
    return ((a - np.expand_dims(mns, axis=axis)) / np.expand_dims(sstd,axis=axis))
Warren
participants (3)
- Alacast
- josef.pktd@gmail.com
- Warren Weckesser