[Tutor] a generic table data structure for table / 2d array / lookup table?
Kent Johnson
kent37 at tds.net
Sat May 28 13:07:53 CEST 2005
Marcus Goldfish wrote:
> Before I try to reinvent the wheel, can anyone point me to a data
> structure suitable for storing non-numeric, 2-d arrays. For instance,
> something that can store the following:
>
>        A        B      C        D
> 1      'cat'    3      object   9
> J      4        [1]    5        6
>
> where the column and row labels in this example are ['A','B','C','D']
> and [1,'J'], respectively. I need to access (set and get values) by
> cell, row, and column.
>
> I have a solution using 2-tuple keys and a dict, e.g., d[('A',1)], but
> it seems kludgy and doesn't handle the row/column access.
>
> Any pointers or code snippets would be appreciated!
You caught me in a good mood this morning. I woke to sunshine for the first time in many days;
that might have something to do with it :-)
Here is a dict subclass that extends __getitem__ and __setitem__ so that a bare row or column name
gets or sets an entire row or column. I included extensive doctests to show you what it does.
Note: you can write d['A',1] instead of d[('A',1)], which looks a little cleaner.
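The shorter spelling works because Python packs a comma-separated subscript into a tuple, so both
forms look up the same key. A minimal illustration with a plain dict (the name d and its contents
are made up for this example):

```python
# A plain dict keyed by 2-tuples; the contents are illustrative only.
d = {('A', 1): 'cat', ('B', 1): 3}

# A comma-separated subscript is packed into a tuple before the lookup,
# so both spellings reach the same entry.
assert d['A', 1] == d[('A', 1)] == 'cat'

# Assignment through the short form updates that same entry.
d['B', 1] = 'dog'
print(d[('B', 1)])  # prints: dog
```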
Kent
class Grid(dict):
    """
    A two-dimensional array that can be accessed by row, by column, or by cell.

    Create with lists of row and column names plus any valid dict() constructor args.

    >>> data = Grid( ['A', 'B'], [1, 2] )

    Row and column lists must not have any values in common.

    >>> data = Grid([1, 2], [2, 3])
    Traceback (most recent call last):
    ...
    ValueError: Row and column lists must not have any values in common

    Here is an example with data:

    >>> rowNames = ['A','B','C','D']
    >>> colNames = [1,'J']
    >>> rawData = [ 'cat', 3, object, 9, 4, [1], 5, 6 ]
    >>> indices = [ (row, col) for col in colNames for row in rowNames ]
    >>> data = Grid(rowNames, colNames, zip(indices, rawData))

    Data can be accessed by cell:

    >>> for i in indices:
    ...     print i, data[i]
    ('A', 1) cat
    ('B', 1) 3
    ('C', 1) <type 'object'>
    ('D', 1) 9
    ('A', 'J') 4
    ('B', 'J') [1]
    ('C', 'J') 5
    ('D', 'J') 6

    >>> data['B', 'J'] = 5

    Cell indices must contain valid row and column names:

    >>> data[3]
    Traceback (most recent call last):
    ...
    KeyError: 3

    >>> data['C', 2] = 5
    Traceback (most recent call last):
    ...
    ValueError: Invalid key or value: Grid[('C', 2)] = 5

    Data can be accessed by row or column index alone to set or retrieve
    an entire row or column:

    >>> print data['A']
    ['cat', 4]

    >>> print data[1]
    ['cat', 3, <type 'object'>, 9]

    >>> data['A'] = ['dog', 2]
    >>> print data['A']
    ['dog', 2]

    When setting a row or column, data must be the correct length.

    >>> data['A'] = ['dog']
    Traceback (most recent call last):
    ...
    ValueError: Invalid key or value: Grid['A'] = ['dog']
    """

    def __init__(self, rowNames, colNames, *args, **kwds):
        dict.__init__(self, *args, **kwds)
        self.rowNames = list(rowNames)
        self.colNames = list(colNames)

        # Check for no shared row and col names
        if set(rowNames).intersection(colNames):
            raise ValueError, 'Row and column lists must not have any values in common'

    def __getitem__(self, key):
        if self._isCellKey(key):
            return dict.__getitem__(self, key)
        elif key in self.rowNames:
            return [ dict.__getitem__(self, (key, col)) for col in self.colNames ]
        elif key in self.colNames:
            return [ dict.__getitem__(self, (row, key)) for row in self.rowNames ]
        else:
            raise KeyError, key

    def __setitem__(self, key, value):
        if self._isCellKey(key):
            return dict.__setitem__(self, key, value)
        elif key in self.rowNames and len(value) == len(self.colNames):
            for col, val in zip(self.colNames, value):
                dict.__setitem__(self, (key, col), val)
        elif key in self.colNames and len(value) == len(self.rowNames):
            for row, val in zip(self.rowNames, value):
                dict.__setitem__(self, (row, key), val)
        else:
            raise ValueError, 'Invalid key or value: Grid[%r] = %r' % (key, value)

    def _isCellKey(self, key):
        ''' Is key a valid cell index? '''
        return isinstance(key, tuple) \
            and len(key) == 2 \
            and key[0] in self.rowNames \
            and key[1] in self.colNames


if __name__ == '__main__':
    import doctest
    doctest.testmod()
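
The class above uses Python 2 syntax (the print statement and the "raise X, msg" form). For
readers on Python 3, here is a rough sketch of the same idea. It is an adaptation, not the code as
posted: the names Grid3, row_names, col_names and _is_cell_key are hypothetical, and the doctests
are replaced by a short usage example at the bottom.

```python
class Grid3(dict):
    """Sketch of the Grid idea in Python 3 syntax (hypothetical name)."""

    def __init__(self, row_names, col_names, *args, **kwds):
        dict.__init__(self, *args, **kwds)
        self.row_names = list(row_names)
        self.col_names = list(col_names)
        # A bare key must name a row or a column, never both.
        if set(self.row_names) & set(self.col_names):
            raise ValueError('Row and column lists must not have any values in common')

    def _is_cell_key(self, key):
        """Return True if key is a (row, col) tuple of known names."""
        return (isinstance(key, tuple) and len(key) == 2
                and key[0] in self.row_names and key[1] in self.col_names)

    def __getitem__(self, key):
        if self._is_cell_key(key):
            return dict.__getitem__(self, key)
        if key in self.row_names:
            return [dict.__getitem__(self, (key, col)) for col in self.col_names]
        if key in self.col_names:
            return [dict.__getitem__(self, (row, key)) for row in self.row_names]
        raise KeyError(key)

    def __setitem__(self, key, value):
        if self._is_cell_key(key):
            dict.__setitem__(self, key, value)
        elif key in self.row_names and len(value) == len(self.col_names):
            for col, val in zip(self.col_names, value):
                dict.__setitem__(self, (key, col), val)
        elif key in self.col_names and len(value) == len(self.row_names):
            for row, val in zip(self.row_names, value):
                dict.__setitem__(self, (row, key), val)
        else:
            raise ValueError('Invalid key or value: Grid3[%r] = %r' % (key, value))


# Same sample data as in the doctests above.
row_names = ['A', 'B', 'C', 'D']
col_names = [1, 'J']
raw_data = ['cat', 3, object, 9, 4, [1], 5, 6]
indices = [(row, col) for col in col_names for row in row_names]
data = Grid3(row_names, col_names, zip(indices, raw_data))

print(data['A'])          # whole row: ['cat', 4]
print(data['B', 'J'])     # single cell: [1]
data['J'] = [0, 0, 0, 0]  # overwrite a whole column
print(data['A'])          # ['cat', 0]
```

The sketch calls dict.__getitem__ explicitly, as the original does, instead of super(). Zero-argument
super() does not work inside a list comprehension on Python versions before 3.12, so the explicit
call is the safer choice here.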