During my NumPy tutorial at the SciPy conference last month, somebody
asked a question about the memory requirements of index arrays, and my
answer gave the wrong impression. Here is the context and the correct
response, which should alleviate concerns about large cross-product index
arrays.
I was noting how copy-based (advanced) indexing using index arrays works
in multiple dimensions: it creates an array with the same shape as the
input index arrays, constructed by selecting the elements indicated by
the respective elements of the index arrays.
If a is 2-d, then
a[[10,12,14],[13, 15, 17]]
returns a 1-d array with elements
[a[10,13], a[12,15], a[14,17]].
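A minimal sketch of this pairing behaviour. The 20x20 test array built with arange is my own choice for illustration (it was not part of the tutorial); it is filled so that each value identifies its own position.

```python
import numpy as np

# Hypothetical test array: a[i, j] == 20*i + j, so each value encodes its position.
a = np.arange(400).reshape(20, 20)

rows = [10, 12, 14]
cols = [13, 15, 17]

# The two index arrays are paired element by element, giving a 1-d result.
paired = a[rows, cols]
expected = np.array([a[10, 13], a[12, 15], a[14, 17]])

print(paired.shape)                      # (3,)
print(np.array_equal(paired, expected))  # True
```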
This is *not* the cross-product that some would expect. The
cross-product can be generated using the ix_ function:
a[ix_([10,12,14],[13,15,17])]
which is equivalent to
a[[[10],[12],[14]], [[13,15,17]]]
and will return
[[a[10,13], a[10,15], a[10,17]],
[a[12,13], a[12,15], a[12,17]],
[a[14,13], a[14,15], a[14,17]]]
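The cross-product selection can be checked with a short sketch. As before, the 20x20 arange test array is an assumption made for illustration, and the expected result is built with an explicit double loop purely for comparison.

```python
import numpy as np

# Hypothetical test array: a[i, j] == 20*i + j.
a = np.arange(400).reshape(20, 20)

rows = [10, 12, 14]
cols = [13, 15, 17]

# np.ix_ turns the two lists into index arrays that broadcast against each other.
cross = a[np.ix_(rows, cols)]

# Build the same 3x3 block element by element for comparison.
expected = np.array([[a[r, c] for c in cols] for r in rows])

print(cross.shape)                      # (3, 3)
print(np.array_equal(cross, expected))  # True
```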
The concern mentioned at the conference was that the cross-product would
generate large intermediate index arrays for large input arrays to ix_.
At the time, I think I validated the concern. However, the concern is
unfounded. This is because ix_ does not actually create a large
intermediate index array. It returns small index arrays, and indexing
uses its broadcasting implementation to generate the 2-d indexing array
"on-the-fly" (much like ogrid and other tools in NumPy).
For example:
>>> ix_([10,12,14],[13,15,17])
(array([[10],
       [12],
       [14]]), array([[13, 15, 17]]))
The first indexing array is 3x1, while the second is 1x3. The result
array will be 3x3, but the 2-d indexing array is never actually stored.
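To give a rough sense of scale, here is a sketch that counts how many index entries ix_ actually hands back for a larger case. The size n = 1000 is an arbitrary value picked for illustration. The sketch only inspects what ix_ returns; it does not probe what the indexing machinery allocates internally.

```python
import numpy as np

n = 1000  # arbitrary size, chosen only for illustration
idx = np.arange(n)

# ix_ returns one n x 1 array and one 1 x n array.
row_ix, col_ix = np.ix_(idx, idx)
print(row_ix.shape, col_ix.shape)   # (1000, 1) (1, 1000)

# Entries actually stored in the two index arrays.
stored = row_ix.size + col_ix.size

# Entries per index array if each were expanded to the full result shape.
# np.broadcast reports the broadcast shape without building the arrays.
full_shape = np.broadcast(row_ix, col_ix).shape
expanded = 2 * full_shape[0] * full_shape[1]

print(stored)     # 2000
print(expanded)   # 2000000
```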
I am posting this just to set my mind at ease about possible
misinformation I spread during the tutorial, and to give a little
tutorial on advanced indexing.