I have a question about histogram2d. Say I do something like:

    import numpy
    from numpy import random
    import pylab

    x=random.rand(1000)-0.5
    y=random.rand(1000)*10-5

    xbins=numpy.linspace(-10,10,100)
    ybins=numpy.linspace(-10,10,100)
    h,x,y=numpy.histogram2d(x,y,bins=[xbins,ybins])

    pylab.imshow(h,interpolation='nearest')
    pylab.show()

The output is attached. I think I would have expected the transpose of what numpy histogram2d returned, so the tight x distribution appears along the x axis in the image. Maybe I am thinking about this incorrectly, or there is a convention I am unfamiliar with. If the behavior is correct, could the docstring include a comment explaining the orientation of the histogram array?

Thanks,
Darren

--
Darren S. Dale, Ph.D.
Staff Scientist
Cornell High Energy Synchrotron Source
Cornell University
275 Wilson Lab
Rt. 366 & Pine Tree Road
Ithaca, NY 14853
darren.dale@cornell.edu
office: (607) 255-3819
fax: (607) 255-9001
http://www.chess.cornell.edu
I'm new to using numpy. Today I experimented a bit with indexing, motivated by the finding that although a[a>0.5] and a[where(a>0.5)] give the same expected result (elements of a greater than 0.5), a[argwhere(a>0.5)] results in something else (rows of a in different order).

I tried to figure out when indexing will yield rows and when it will give me an element, and I could not find a simple rule.

I systematically tried and got the following:
----------------------------------
    >>> from scipy import *
    >>> a = random.rand(10).reshape(2,5)
    >>> a
    array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
           [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]])

    >>> a[0,1] # shape([0,1]) = (2,)
    0.767957427399

    >>> a[[0],[1]] # shape([[0],[1]]) = (2, 1)
    array([ 0.76795743])

    >>> a[[0,1]] # shape([[0,1]]) = (1, 2)
    array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
           [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]])

    >>> a[[[0,1]]] # shape([[[0,1]]]) = (1, 1, 2)
    array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
           [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]])

    >>> a[[[0],[1]]] # shape([[[0],[1]]]) = (1, 2, 1)
    array([ 0.76795743])

    >>> a[[[0]],[[1]]] # shape([[[0]],[[1]]]) = (2, 1, 1)
    array([[ 0.76795743]])

    >>> a[[[[0,1]]]] # shape([[[[0,1]]]]) = (1, 1, 1, 2)
    array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
            [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]])

    >>> a[[[[0],[1]]]] # shape([[[[0],[1]]]]) = (1, 1, 2, 1)
    array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062]],
           [[ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]])

    >>> a[[[[0]],[[1]]]] # shape([[[[0]],[[1]]]]) = (1, 2, 1, 1)
    array([[ 0.76795743]])

    >>> a[[[[0]]],[[[1]]]] # shape([[[[0]]],[[[1]]]]) = (2, 1, 1, 1)
    array([[[ 0.76795743]]])
Can anyone explain this? Thank you very much, Raul
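The difference between the three forms in the question can be reproduced on a small fixed array. This is only a sketch: the 3x3 data below is made up for illustration (a square array is used so that the column indices returned by argwhere are still valid row indices), and it assumes a reasonably recent NumPy.

```python
import numpy as np

# Hypothetical example data, not the random array from the post.
a = np.array([[0.9, 0.1, 0.7],
              [0.2, 0.8, 0.3],
              [0.4, 0.6, 0.1]])

mask = a > 0.5

# Boolean mask: picks the matching elements, in row-major order.
by_mask = a[mask]

# where() returns a tuple of index arrays, one per axis, so the
# row and column indices are paired up element by element.
by_where = a[np.where(mask)]

# argwhere() returns ONE integer array of shape (N, 2). Used as an
# index it addresses axis 0 only, so every number in it selects a row.
by_argwhere = a[np.argwhere(mask)]

# Transposing and converting to a tuple recovers the where() form.
by_argwhere_fixed = a[tuple(np.argwhere(mask).T)]

print(by_mask)
print(by_argwhere.shape)  # (4, 2, 3): four index pairs, two rows each
```

So a[argwhere(...)] returns whole rows because a single index array only addresses the first axis.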
On Fri, May 30, 2008 at 12:36 AM, Raul Kompass <rkompass@gmx.de> wrote:
[...]
Hi,

I don't have time to give a comprehensive answer, but I think I can offer a simple rule. The thing you are indexing (a) is 2-dimensional, so if you provide 2 arguments to index with (i.e. a[something, something]) you will get single elements; if you only provide a single argument (i.e. a[something]) it will pull out rows corresponding to the indexing. If you want just a specific element you have to add a second argument.

Also, the outer [ ]'s in your indexing operations are just the syntax for indexing. So your shape comments are wrong:

    >>> a[0,1] # shape([0,1]) = (2,)
    0.767957427399

You are indexing here with two scalars, 0 and 1.

    >>> a[[0,1]] # shape([[0,1]]) = (1, 2)
    array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
           [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]])

You are indexing here with a 1d list [0,1]. Since you don't provide a column index you get rows 0 and 1. If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1].

Hope this helps,
Robin
On Fri, May 30, 2008 at 12:57 AM, Robin <robince@gmail.com> wrote:
You are indexing here with a 1d list [0,1]. Since you don't provide a column index you get rows 0 and 1. If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1].
Whoops - you get [0,0] and [1,1]. Robin
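A quick check of the corrected statement, as a sketch on a small integer array (the array and the variable names are illustrative, not from the thread). It also shows np.ix_, which is one way to get the full block of rows 0-1 and columns 0-1 when that is what was intended.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Two index lists are paired element by element:
# (0, 0) and (1, 1), not every combination of rows and columns.
paired = a[[0, 1], [0, 1]]

# np.ix_ builds an open mesh, so this selects the 2x2 block
# formed by rows [0, 1] and columns [0, 1].
block = a[np.ix_([0, 1], [0, 1])]

print(paired)  # [0 4]
print(block)   # [[0 1]
               #  [3 4]]
```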
On Thu, May 29, 2008 at 4:36 PM, Raul Kompass <rkompass@gmx.de> wrote:
[...]
Looks confusing to me too. I guess it's best to take it one step at a time.
    >>> import numpy as np
    >>> a = np.arange(6).reshape(2,3)
    >>> a[0,1]
    1

That's not surprising.

    >>> a[[0,1]]

That one looks odd. But it is just shorthand for:

    >>> a[[0,1],:]
    array([[0, 1, 2],
           [3, 4, 5]])

So rows 0 and 1 and all columns. This gives the same thing:

    >>> a[0:2,:]
    array([[0, 1, 2],
           [3, 4, 5]])

Only it's not quite the same thing. a[[0,1],:] returns a copy and a[0:2,:] returns a view:

    >>> a[[0,1],:].flags.owndata
    True
    >>> a[0:2,:].flags.owndata
    False
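The practical consequence of view versus copy shows up when the result is modified. The following is a minimal sketch using np.shares_memory; the variable names are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

view = a[0:2, :]      # slice indexing: a view onto a's data
copy = a[[0, 1], :]   # list (fancy) indexing: a new array

view[0, 0] = 100      # writes through to a
copy[0, 1] = -1       # leaves a untouched

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False
print(a[0, 0], a[0, 1])           # 100 1
```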
On Thu, 29 May 2008, Keith Goodman apparently wrote:
a[[0,1]] That one looks odd. But it is just shorthand for: a[[0,1],:]
Do you mean that ``a[[0,1],:]`` is a more primitive expression than ``a[[0,1]]``? In what sense, and does it ever matter? Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` and ``a[[0,1],:]``? Thanks, Alan Isaac
On Thu, May 29, 2008 at 6:32 PM, Alan G Isaac <aisaac@american.edu> wrote:
On Thu, 29 May 2008, Keith Goodman apparently wrote:
a[[0,1]] That one looks odd. But it is just shorthand for: a[[0,1],:]
Do you mean that ``a[[0,1],:]`` is a more primitive expression than ``a[[0,1]]``? In what sense, and does it ever matter?
Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` and ``a[[0,1],:]``?
I can see how the difference between a[0,1] and a[[0,1]] is not obvious at first, especially if you come from octave/matlab. The first example has an obvious i and j. But the second example doesn't. So I tried to point out that i=[0,1] and j=:.
On Thu, 29 May 2008, Keith Goodman apparently wrote:
a[[0,1]] That one looks odd. But it is just shorthand for: a[[0,1],:]
On Thu, May 29, 2008 at 6:32 PM, Alan G Isaac <aisaac@american.edu> wrote:
Do you mean that ``a[[0,1],:]`` is a more primitive expression than ``a[[0,1]]``? In what sense, and does it ever matter?
Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` and ``a[[0,1],:]``?
On Thu, 29 May 2008, Keith Goodman apparently wrote:
I can see how the difference between
a[0,1] a[[0,1]]
is not obvious at first, especially if you come from octave/matlab. The first example has an obvious i and j. But the second example doesn't. So I tried to point out that i=[0,1] and j=:.
My questions were real questions, not rhetorical. Anyway... I think the initial mind-bender is the difference between a[(0,1)] and a[[0,1]]. The latter might also be written a[[0,1],], which I think links to your point. Cheers, Alan
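The tuple-versus-list distinction can be checked directly. This is a sketch on a small illustrative array: a tuple index is unpacked into one index per axis, while a list is treated as a single index for the first axis.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# A tuple is the index itself: same as a[0, 1], one element.
elem = a[(0, 1)]

# A list is ONE index (for axis 0): rows 0 and 1.
rows = a[[0, 1]]

# The trailing comma makes the tuple explicit: ([0, 1],)
rows_comma = a[[0, 1],]

print(elem)        # 1
print(rows.shape)  # (2, 3)
```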
On Thu, May 29, 2008 at 8:50 PM, Alan G Isaac <aisaac@american.edu> wrote:
Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` and ``a[[0,1],:]``?
They look, smell, and taste the same. But I can't read array's __getitem__ since it is in C instead of python.
    >>> np.index_exp[[0,1]]
    ([0, 1],)
    >>> np.index_exp[[0,1],:]
    ([0, 1], slice(None, None, None))
    >>> np.index_exp[[0,1],...]
    ([0, 1], Ellipsis)

    >>> a[[0,1]].flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : False
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False

    >>> a[[0,1],:].flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : False
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False

    >>> a[[0,1],...].flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : False
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False

Now, pop quiz, what does this mean:

    >>> a[[[[[[[[[[[[[[[[0,1]]]]]]]]]]]]]]]]
    array([[[[[[[[[[[[[[[0, 1, 2], [3, 4, 5]]]]]]]]]]]]]]])

?
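On the equivalence question, a value-level comparison is easy to run even without reading the C source. This sketch only checks that the three spellings give equal results and independent copies for this 2-d case; it does not claim they take the same code path internally.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

r_plain = a[[0, 1]]
r_slice = a[[0, 1], :]
r_ellipsis = a[[0, 1], ...]

# The index expressions differ as Python objects...
print(np.index_exp[[0, 1]] == np.index_exp[[0, 1], :])   # False

# ...but the selected values agree,
same_values = (np.array_equal(r_plain, r_slice)
               and np.array_equal(r_plain, r_ellipsis))

# and none of the results shares memory with a.
any_shared = any(np.shares_memory(a, r)
                 for r in (r_plain, r_slice, r_ellipsis))

print(same_values, any_shared)  # True False
```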
Hi Raul,

There are some points that might help you with indexing:

1) a[obj] is (basically) equivalent to a.__getitem__(numpy.index_exp[obj])

2) obj is always converted to a tuple if it isn't one already:
   * numpy.index_exp[0,1] == (0,1)
   * numpy.index_exp[(0,1)] == (0,1)
   * numpy.index_exp[[0,1]] == ([0,1],)

3) There are two basic kinds of indexing:
   a) Simple or slice-based indexing, where the indexing tuple (obj) consists of just integers, slice objects, or Ellipses, and the returned array is a "view" of the original array (no memory is copied).
   b) Fancy or advanced indexing, which occurs when anything else (e.g. a list) is used in the indexing tuple, and the returned array is a "copy" of the original array for largely technical reasons.

4) If the length of the indexing tuple is smaller than the number of dimensions in the array, the remaining un-indexed dimensions are returned. It is equivalent to appending slice(None) to the indexing tuple.

5) For fancy indexing using lists (and nested lists) in the indexing tuple, the shape of the array is the shape of the indexing (nested) list plus the shape of the un-indexed dimensions.

Raul Kompass wrote:
I systematically tried and got the following:
----------------------------------

    >>> from scipy import *
    >>> a = random.rand(10).reshape(2,5)
    >>> a
    array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
           [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]])

    >>> a[0,1] # shape([0,1]) = (2,)
    0.767957427399

Equivalent to a[(0,1)], so the indexing tuple selects a single element of the 2d array.

    >>> a[[0],[1]] # shape([[0],[1]]) = (2, 1)
    array([ 0.76795743])

Equivalent to a[([0], [1])], so the indexing tuple selects the same single element of the 2d array as before, except now it is a 1-d array because fancy indexing is used ([0] and [1] are lists).

    >>> a[[0,1]] # shape([[0,1]]) = (1, 2)
    array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
           [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]])

Equivalent to a[([0,1],)], so the indexing tuple is of length 1 and the shape of the resulting array is 2-d (the indexing list is 1-d and the un-indexed portion is 1-d). Rows 0 and 1 are selected from a. Equivalent to stacking a[0] and a[1] on top of each other.
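The shape rule described so far (shape of the index plus shape of the un-indexed dimensions) can be sketched as follows. The integer array and the names are illustrative; an ndarray is used for the 2-d index rather than a nested list, so the result does not depend on how a given NumPy version interprets nested lists.

```python
import numpy as np

a = np.arange(10).reshape(2, 5)

# Scalars for every axis: a single element, zero dimensions.
elem = a[0, 1]

# Lists for every axis: the same element, but as a 1-d array.
one = a[[0], [1]]

# One list for axis 0: index shape (2,) + un-indexed shape (5,).
rows = a[[0, 1]]
stacked = np.stack([a[0], a[1]])

# A 2-d index array for axis 0: index shape (3, 2) + (5,).
idx = np.array([[0, 1], [1, 0], [0, 0]])
picked = a[idx]

print(np.ndim(elem), one.shape, rows.shape, picked.shape)
# 0 (1,) (2, 5) (3, 2, 5)
```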
    >>> a[[[0,1]]] # shape([[[0,1]]]) = (1, 1, 2)
    array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062],
           [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]])

The shape here I can't quite explain at the moment, especially because a[ [[0,1]],] is shaped differently and probably shouldn't be. It looks like a[ <nested_list> ] has one smaller dimension than a[ <nested_list>, ] (notice the comma...).

The rest of them follow from this pattern.

-Travis
Hi Darren,

If I remember correctly, the thinking behind the current behavior is that it preserves similarity of results with histogramdd, where the histogram is oriented in the numpy order (columns, rows). I thought that making histogram2d(x,y) return something different than histogramdd([x,y]) was probably worse than satisfying the cartesian convention.

Regards,
David

2008/5/29 Darren Dale <darren.dale@cornell.edu>:
[...]
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Hi David,

In that case, I suggest histogram2d could be improved with a brief comment in the docstring to indicate how the output is formatted.

Cheers,
Darren

On Thursday 29 May 2008 8:21:58 pm David Huard wrote:
[...]
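For the orientation question in the histogram2d exchange, the layout of the returned array can be checked numerically without plotting. This is a sketch that follows the shape of the original example, with assumptions of its own: a seeded generator and coarser bins (21 edges, so each bin is one unit wide) to make the counts easy to reason about.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1000) - 0.5       # narrow spread: [-0.5, 0.5)
y = rng.random(1000) * 10 - 5    # wide spread:   [-5, 5)

xedges = np.linspace(-10, 10, 21)
yedges = np.linspace(-10, 10, 21)

h, _, _ = np.histogram2d(x, y, bins=[xedges, yedges])

# Axis 0 of h runs over x bins, axis 1 over y bins.
occupied_x_bins = np.count_nonzero(h.sum(axis=1))
occupied_y_bins = np.count_nonzero(h.sum(axis=0))
print(occupied_x_bins, occupied_y_bins)  # 2 10

# Same layout as histogramdd called with the same data and bins.
hdd, _ = np.histogramdd([x, y], bins=[xedges, yedges])
print(np.array_equal(h, hdd))  # True
```

Since imshow draws the first array axis vertically, displaying h.T (for example with origin='lower') is one way to put x on the horizontal axis.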
participants (7)
- Alan G Isaac
- Darren Dale
- David Huard
- Keith Goodman
- Raul Kompass
- Robin
- Travis E. Oliphant