What is the best data structure for a very simple spreadsheet?

Lie Ryan lie.1296 at gmail.com
Sun Jan 3 09:58:05 EST 2010


On 1/3/2010 10:27 PM, vsoler wrote:

> 1) what are, in your opinion, the basic elements of the Cell class?

The "user-entered formula" and "effective value". A Cell containing a 
formula "abc" has a value of "abc"; a cell containing the formula "=1+5" 
has a value of "6". You could use the 'property' decorator for the 
"effective value" attribute.

> 2) Do I need a parser to evaluate the formulas like “3*A1+A2”? Can you
> recommend one library that already contains one?

Yes you do. PyParsing, regex, write your own parser; choose your own 
poison depending on the expected complexity of the formulaes.

> 3) Do I need a tree data structure to represent my data?

There are two main options for your "primary data structure":
- table (2x2 list) -- if you expect most of your cells to be filled, 
this is simple, but uses lots of memory
- sparse table (list/dicts of cells) -- if you have a large table that 
is mostly empty cells

The Dependency Tree is the "side data structure"; a "Cell" is a "Node"; 
a Cell's "dependencies" is the Node's "children"; a Cell's "dependant" 
is the Node's "parent". Cells maintains a list of dependencies (or 
dynamically generates them when requested) by analyzing the "formula" 
attribute for references to other cells.

PS: Strictly speaking, this "side data structure" is not a "Tree" as 
there are multiple root nodes. I think it's called "Directed Acyclic 
Graph". However, it's true that, given a cell as a root, the cell's 
dependencies is a proper tree.

> would the
> tree be an attribute of the class instance?

you could use 'property' decorator and make the dependency tree a 
"computed attribute" of the cell; many people recommended against a 
'property' that is heavy to compute though and calculating dependencies 
may be quite heavy depending on the complexity of the spreadsheet.

> I imagine a lot can be said on these questions. What I am looking for
> is some hints that help me get out of where I am now.
>
> Any help is highly appreciated.
>
> Vicente Soler




More information about the Python-list mailing list