[Tutor] Advice on multi-dimensional data storage

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Mar 17 11:27:37 EDT 2016


On 16 March 2016 at 13:21, Steven D'Aprano <steve at pearwood.info> wrote:
> On Wed, Mar 16, 2016 at 08:36:59AM +0000, Matt Williams wrote:
>> Dear Tutors,
>>
>> I am looking for some advice. I have some data that has three dimensions to
>> it. I would like to store it such that one could manipulate (query/ update/
>> etc.) by dimension - so it would be feasible to ask for all of the data
>> that shares a value in d1, or iterate over all of the values via d2.

Can you give a bit more information about this? Do you want persistent
(i.e. on the hard disk) storage like Alan's database idea? Or do you
want an in-memory data structure like Steve's list of lists idea?
What do you want to store in this data structure (numbers, strings,
other Python objects, ...)?

I'll assume, as Steve did, that you want to store numbers in an
in-memory data structure. Steve proposed this list of lists construct:

> arr3D = [
>          # Block 0, 3 rows by 4 columns.
>          [ [1, 2, 3, 4],
>            [5, 6, 7, 8],
>            [9, 10, 11, 12] ],
>          # Block 1.
>          [ [0, 0, 0, 0],
>            [1, 2, 4, 8],
>            [2, 4, 6, 8] ],
>          # Block 2.
>          [ [0, 1, 2, 3],
>            [1, 2, 4, 8],
>            [2, 6, 10, 14] ]
>          ]
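
For comparison, here is a rough sketch of what per-dimension selection
looks like with the plain list of lists. The names block0, first_rows
and last_columns are just ones I picked for illustration. Indexing the
first dimension is easy, but the other two need comprehensions:

```python
# Steve's nested-list layout: arr3D[block][row][column]
arr3D = [
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[0, 0, 0, 0], [1, 2, 4, 8], [2, 4, 6, 8]],
    [[0, 1, 2, 3], [1, 2, 4, 8], [2, 6, 10, 14]],
]

# Selecting along the first dimension is plain indexing.
block0 = arr3D[0]

# Selecting along the other dimensions means looping over the blocks.
first_rows = [block[0] for block in arr3D]
last_columns = [[row[3] for row in block] for block in arr3D]

print(first_rows)
print(last_columns)
```

This works, but every new kind of query needs its own loop.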

With numpy we can go one better and turn this into a true 3D array:

>>> import numpy as np
>>> nparr3D = np.array(arr3D)
>>> nparr3D
array([[[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]],

       [[ 0,  0,  0,  0],
        [ 1,  2,  4,  8],
        [ 2,  4,  6,  8]],

       [[ 0,  1,  2,  3],
        [ 1,  2,  4,  8],
        [ 2,  6, 10, 14]]])

Now that we have this we can do all sorts of things with it. Think of
your 3D array as pages where each page has rows and columns. We can
select a page:

>>> nparr3D[0]
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

Or we can select a single element (page 0, row 2, column 3):

>>> nparr3D[0,2,3]
12

We can select the first row of each page making a 2D array:

>>> nparr3D[:,0,:]
array([[1, 2, 3, 4],
       [0, 0, 0, 0],
       [0, 1, 2, 3]])

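We can select the last column of each page (one page per row of the
result), and transpose that if we would rather have one page per column: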

>>> nparr3D[:,:,3]
array([[ 4,  8, 12],
       [ 0,  8,  8],
       [ 3,  8, 14]])
>>> nparr3D[:,:,3].transpose()
array([[ 4,  0,  3],
       [ 8,  8,  8],
       [12,  8, 14]])

We can also modify the array. Let's add 100 to the first page:

>>> nparr3D[0, :, :] += 100
>>> nparr3D
array([[[101, 102, 103, 104],
        [105, 106, 107, 108],
        [109, 110, 111, 112]],

       [[  0,   0,   0,   0],
        [  1,   2,   4,   8],
        [  2,   4,   6,   8]],

       [[  0,   1,   2,   3],
        [  1,   2,   4,   8],
        [  2,   6,  10,  14]]])
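
One thing to be aware of when modifying: basic indexing and slicing
give you a view onto the same data rather than a copy, so changes made
through the slice show up in the original array. A small sketch (the
names page and snapshot are only for illustration):

```python
import numpy as np

nparr3D = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[0, 0, 0, 0], [1, 2, 4, 8], [2, 4, 6, 8]],
    [[0, 1, 2, 3], [1, 2, 4, 8], [2, 6, 10, 14]],
])

# page is a view: it shares memory with nparr3D.
page = nparr3D[0]
page += 100          # this also changes nparr3D[0]

# Ask for a copy explicitly when you want independent data.
snapshot = nparr3D[1].copy()
snapshot[:] = -1     # nparr3D[1] is left alone

print(nparr3D[0, 0, 0])
print(nparr3D[1])
```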

There are many more things that you can do here if you learn to use
numpy, such as sums, maximums and other calculations along any dimension.
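
As one example of what else is possible, here is a short sketch of
calculations along a chosen dimension using the axis argument, again
assuming the original (unmodified) array. The names page_totals and
row_maxima are mine:

```python
import numpy as np

nparr3D = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[0, 0, 0, 0], [1, 2, 4, 8], [2, 4, 6, 8]],
    [[0, 1, 2, 3], [1, 2, 4, 8], [2, 6, 10, 14]],
])

# The shape is (pages, rows, columns).
print(nparr3D.shape)

# Add the pages together element by element (collapses axis 0).
page_totals = nparr3D.sum(axis=0)

# Largest value in each row of each page (collapses axis 2).
row_maxima = nparr3D.max(axis=2)

print(page_totals)
print(row_maxima)
```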

--
Oscar

