What is the best data structure for a very simple spreadsheet?
vsoler
vicente.soler at gmail.com
Sun Jan 3 07:38:28 EST 2010
On Jan 3, 1:28 pm, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> On Sun, 03 Jan 2010 03:27:46 -0800, vsoler wrote:
> > My application would contain a limited set of "cells" represented by the
> > instances of a Cell class:
>
> > class Cell:
> > ...
>
> > A1=Cell(7)
> > A2=Cell(2*A1)
> > A3=Cell(3*A1+A2)
> > A4=Cell(A3*4)
>
> > Of course, A1 = 7, A2 = 14, A3 = 35 and A4 = 140
>
> > Now, I somehow want to be able to show a dependency tree
>
> > 1 level dependency trees
> > A1: None
> > A2: A1
> > A3: A1, A2
> > A4: A3
>
> > All levels dependency trees
>
> > A1: None
> > A2: A1
> > A3: A1, A2
> > A4: A3, A2, A1
>
> > Leaf + values dependency trees:
>
> > A1: 7
> > A2: A1=7, 2
> > A3: 3, A1=7, 2
> > A4: 3, A1=7, 2, 4
>
> > What I'd like to know is:
>
> > 1) what are, in your opinion, the basic elements of the Cell class?
>
> def Cell(object):
> def __init__(self, payload):
> self.payload = payload
> def __str__(self):
> return str(self.payload)
> def __float__(self):
> return float(self.payload)
> def dependency(self):
> try:
> return self.payload.dependency()
> except AttributeError:
> return ['None']
>
> Cells can contain one of three things: a number, a string, or a formula.
> The first two can be supported by providing a built-in Python object
> (float or str) as payload. You can support formulae two ways that I can
> think of:
>
> (1) Provide a formula as a string, with a leading '='. Then, whenever you
> want to evaluate such a cell, you fetch the string from the cell, parse
> it, generate an arithmetic expression, and calculate it.
>
> (2) Instead of parsing the formula on every single spreadsheet refresh,
> use a couple of helper classes:
>
> class Indirect(object):
> def __init__(self, ref, sheet=None):
> if sheet is None:
> self.sheet = default_sheet()
> else:
> self.sheet = sheet
> self.ref = ref
> def __str__(self):
> return str(self.sheet[self.ref])
> def float(self):
> return float(self.sheet[self.ref])
> def dependency(self):
> return [self.ref]
>
> class Formula(object):
> def __init__(self, x, y, op):
> self.x = x
> self.y = y
> self.op = op
> def eval(self):
> return self.op(float(x), float(y))
> def dependency(self):
> return self.x.dependency(level) + self.y.dependency(level)
>
> Then do something like this:
>
> sheet = {}
> sheet['A1'] = Cell(7)
> sheet['A2'] = Cell(Formula(2, Indirect('A1'), operator.mul))
> sheet['A3'] = Cell(
> Formula(
> Formula(3, Indirect('A1'), operator.mul),
> Indirect('A2'),
> operator.add
> ))
> sheet['A4'] = Cell(Formula(Indirect('A3'), 4, operator.mul))
>
> Then you only need to parse each human-readable formula like '=2*A1' once.
>
> > 2) Do I need a parser to evaluate the formulas like “3*A1+A2”?
>
> Yes.
>
> > Can you recommend one library that already contains one?
>
> Try PyParsing.
>
> > 3) Do I need a tree
> > data structure to represent my data? would the tree be an attribute of
> > the class instance?
>
> I suspect a dict will be faster.
>
> To get the dependencies of each cell:
>
> for key, value in sheet.items():
> print key, value.dependency()
>
> Keep in mind I haven't tested ANY of this -- it is entirely stream of
> consciousness. I've never actually done this, so I have no idea whether
> it is a good approach or not, but it seems to me that it might be.
>
> --
> Steven
WOW!!!
After lunch I am going to read your post thoroughly, but I can already
see that you've put into it a lot of time and expertise.
Thank you
More information about the Python-list
mailing list