reverse engineering Excel spreadsheet

Laurent Pointal laurent.pointal at wanadoo.fr
Sun Apr 1 19:43:04 CEST 2007


Duncan Smith wrote:

> Hello,
>      I am currently implementing (mainly in Python) 'models' that come
> to me as Excel spreadsheets, with little additional information.  I am
> expected to use these models in a web application.  Some contain many
> worksheets and various macros.
> 
> What I'd like to do is extract the data and business logic so that I can
> figure out exactly what these models actually do and code it up.  An
> obvious (I think) idea is to generate an acyclic graph of the cell
> dependencies so that I can identify which cells contain only data (no
> parents) and those that depend on other cells.  If I could also extract
> the relationships (functions), then I could feasibly produce something
> in pure Python that would mirror the functionality of the original
> spreadsheet (using e.g. Matplotlib for plots and more reliable RNGs /
> statistical functions).
> 
> The final application will be running on a Linux server, but I can use a
> Windows box (i.e. win32all) for processing the spreadsheets (hopefully
> not manually).  Any advice on the feasibility of this, and how I might
> achieve it would be appreciated.
> 
> I assume there are plenty of people who have a better knowledge of e.g.
> COM than I do.  I suppose an alternative would be to convert to Open
> Office and use PyUNO, but I have no experience with PyUNO and am not
> sure how much more reliable the statistical functions of Open Office
> are.  At the end of the day, the business logic will not generally be
> complex, it's extracting it from the spreadsheet that's awkward.  Any
> advice appreciated.  TIA.  Cheers.
> 
> Duncan

As I remember, there is documentation about Excel documents in the xlrd
package. With that, you don't need to use Excel via COM to find data in
the document.
http://www.lexicon.net/sjmachin/xlrd.htm

You may also want to look at pyExcelerator:
http://sourceforge.net/projects/pyexcelerator/
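
For the dependency-graph idea in the original question, here is a minimal
sketch using only the standard library (graphlib needs Python 3.9 or later).
It assumes the formulas have already been extracted by some other means into
a plain dict mapping cell names to contents; that dict, the regular
expression, and the function names are all hypothetical, not part of xlrd or
pyExcelerator. It handles only single-cell references on one sheet: a range
such as A1:A3 would contribute just its two endpoints, and sheet-qualified
references are not handled.

```python
import re
from graphlib import TopologicalSorter

# Matches single-cell references such as A1 or $B$12.
# Names followed by "(" (e.g. LOG10) are treated as functions and skipped.
CELL_REF = re.compile(
    r"(?<![A-Za-z0-9_])\$?([A-Z]{1,3})\$?(\d+)(?![A-Za-z0-9_(])"
)


def cell_dependencies(cells):
    """Map each cell name to the set of cells its formula refers to."""
    deps = {}
    for name, content in cells.items():
        if isinstance(content, str) and content.startswith("="):
            deps[name] = {col + row for col, row in CELL_REF.findall(content)}
        else:
            deps[name] = set()  # plain data: no parents
    return deps


def evaluation_order(deps):
    """Return the cells with every parent listed before its dependents.

    Raises graphlib.CycleError if the references are circular.
    """
    return list(TopologicalSorter(deps).static_order())


# Hypothetical extracted sheet: values are data or formula strings.
cells = {
    "A1": 10,
    "A2": 32,
    "B1": "=A1+$A$2",
    "C1": "=LOG10(B1)*2",
}

deps = cell_dependencies(cells)
data_cells = sorted(name for name, parents in deps.items() if not parents)
order = evaluation_order(deps)

print("data cells:", data_cells)
print("evaluation order:", order)
```

Cells with an empty parent set are the pure data cells; the others can be
reimplemented in Python one at a time, following the evaluation order.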



