[Numpy-discussion] how to efficiently build an array of x, y, z points

Bruce Southey bsouthey at gmail.com
Wed Mar 3 09:51:19 EST 2010

On 03/02/2010 09:47 PM, David Goldsmith wrote:
> On Tue, Mar 2, 2010 at 6:59 PM, Brennan Williams
> <brennan.williams at visualreservoir.com> wrote:
>     David Goldsmith wrote:
>     >
>     > On Tue, Mar 2, 2010 at 6:29 PM, Brennan Williams
>     > <brennan.williams at visualreservoir.com> wrote:
>     >
>     >     I'm reading a file which contains a grid definition. Each cell in the
>     >     grid, apart from having an i,j,k index also has 8 x,y,z coordinates.
>     >     I'm reading each set of coordinates into a numpy array. I then want to
>     >     add/append those coordinates to what will be my large "points" array.
>     >     Due to the orientation/order of the 8 corners of each hexahedral cell I
>     >     may have to reorder them before adding them to my large points array
>     >     (not sure about that yet).
>     >
>     >     Should I create a numpy array with nothing in it and then .append to it?
>     >     But this is probably expensive isn't it as it creates a new copy of the
>     >     array each time?
>     >
>     >     Or should I create a zero or empty array of sufficient size and then put
>     >     each set of 8 coordinates into the correct position in that big array?
>     >
>     >     I don't know exactly how big the array will be (some cells are inactive
>     >     and therefore don't have a geometry defined) but I do know what its
>     >     maximum size is (ni*nj*nk,3).
>     >
>     >
>     > Someone will correct me if I'm wrong, but this problem - the "best"
>     > way to build a large array whose size is not known beforehand - came
>     > up in one of the tutorials at SciPyCon '09 and IIRC the answer was,
>     > perhaps surprisingly, build the thing as a Python list (which is
>     > optimized for this kind of indeterminate sequence building) and
>     > convert to a numpy array when you're done.  Isn't that what was
>     > recommended, folks?
>     >
>     Build a list of floating point values, then convert to an array and
>     shape accordingly? Or build a list of small arrays and then somehow
>     convert that into a big numpy array?
> My guess is that either way will be better than iteratively 
> "appending" to an existing array.
Christopher Barker provided some code last year on appending to
ndarrays, e.g.:

A lot depends on how you will finally use the array; without knowing 
that, it is hard to make suitable suggestions. That is, do you only need 
to index the array using the i, j, k indices (this gives you an i by j 
by k array that contains the x, y, z coordinates), or do you also need 
to index by the x, y, z coordinates as well (giving you an i by j by k 
by x by y by z array)? If it is just plain storage, then perhaps a 
Python list, a dict or an SQLite database may be sufficient.
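
As a rough sketch of the first option, assuming a hypothetical grid of 
ni x nj x nk cells with 8 corners of 3 coordinates each, a preallocated 
array indexed by i, j, k could look like the following. The names 
corners, active and set_cell are only illustrative and do not come from 
the original question; inactive cells are simply left as NaN here.

```python
import numpy as np

ni, nj, nk = 4, 3, 2  # hypothetical grid dimensions

# One (8, 3) block of corner coordinates per cell; NaN marks "no geometry".
corners = np.full((ni, nj, nk, 8, 3), np.nan)
# Boolean mask recording which cells are active.
active = np.zeros((ni, nj, nk), dtype=bool)


def set_cell(i, j, k, xyz):
    """Store the 8 corner points of cell (i, j, k) and mark it active."""
    corners[i, j, k] = np.asarray(xyz, dtype=float).reshape(8, 3)
    active[i, j, k] = True


# Dummy coordinates for a single cell, just to exercise the code.
set_cell(1, 2, 0, np.arange(24.0))

# Flat (n_active * 8, 3) "points" array containing only the active cells.
points = corners[active].reshape(-1, 3)
```

Indexing with the boolean mask keeps the i, j, k lookup available while 
still giving a compact points array at the end.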

There are also time and memory constraints, as you can spend a lot of 
effort just getting the input into a suitable format within the 
available memory. If you use secondary storage such as a Python list, 
then you need memory to store the list, the ndarray and all the 
intermediate components and overheads.
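
To make the trade-off concrete, here is a minimal sketch of both 
approaches. It assumes a hypothetical reader, read_cells(), that yields 
one (8, 3) array per active cell and stands in for the real file parser; 
the function names are illustrative only. The preallocated version fills 
a buffer sized for the known maximum and trims it, while the list version 
temporarily holds both the list of small arrays and the final array.

```python
import numpy as np


def read_cells(n_active):
    """Hypothetical stand-in for the file reader: one (8, 3) array per active cell."""
    for c in range(n_active):
        yield np.full((8, 3), float(c))


def build_preallocated(max_cells, cells):
    # Allocate for the known maximum, fill in place, then trim.
    buf = np.empty((max_cells * 8, 3))
    n = 0
    for xyz in cells:
        buf[n:n + 8] = xyz
        n += 8
    # Copy the used part so the oversized buffer can be released.
    return buf[:n].copy()


def build_from_list(cells):
    # Collect the small arrays in a list and concatenate once at the end.
    chunks = list(cells)
    if not chunks:
        return np.empty((0, 3))
    return np.concatenate(chunks, axis=0)


a = build_preallocated(max_cells=10, cells=read_cells(6))
b = build_from_list(read_cells(6))
```

Either way the data is copied only once at the end, rather than on every 
append.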

If you use SciPy, then you could also look at its sparse matrices 
(scipy.sparse), where space is only allocated as you need it.
