reverse engineering Excel spreadsheet
laurent.pointal at wanadoo.fr
Sun Apr 1 19:43:04 CEST 2007
Duncan Smith wrote:
> I am currently implementing (mainly in Python) 'models' that come
> to me as Excel spreadsheets, with little additional information. I am
> expected to use these models in a web application. Some contain many
> worksheets and various macros.
> What I'd like to do is extract the data and business logic so that I can
> figure out exactly what these models actually do and code it up. An
> obvious (I think) idea is to generate an acyclic graph of the cell
> dependencies so that I can identify which cells contain only data (no
> parents) and those that depend on other cells. If I could also extract
> the relationships (functions), then I could feasibly produce something
> in pure Python that would mirror the functionality of the original
> spreadsheet (using e.g. Matplotlib for plots and more reliable RNGs /
> statistical functions).
> The final application will be running on a Linux server, but I can use a
> Windows box (i.e. win32all) for processing the spreadsheets (hopefully
> not manually). Any advice on the feasibility of this, and how I might
> achieve it would be appreciated.
> I assume there are plenty of people who have a better knowledge of e.g.
> COM than I do. I suppose an alternative would be to convert to Open
> Office and use PyUNO, but I have no experience with PyUNO and am not
> sure how much more reliable the statistical functions of Open Office
> are. At the end of the day, the business logic will not generally be
> complex, it's extracting it from the spreadsheet that's awkward. Any
> advice appreciated. TIA. Cheers.
As I remember, the xlrd package includes documentation about the Excel
file format. And with xlrd, you don't need to use Excel via COM to get at
the data.
You may also want to look at pyExcelerator.
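
For the dependency-graph idea in the quoted message, here is a minimal sketch in plain Python. It assumes the cell contents have already been pulled out of the workbook into a dict mapping an address such as "B1" to either a value or a formula string starting with "="; how you obtain those strings is outside this sketch. The names build_dependencies and evaluation_order are made up for illustration, and the regular expression only handles simple A1-style references (no ranges, sheet-qualified references, or named ranges).

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Simple A1-style references, optionally absolute ($A$1).
# Assumption: no ranges (A1:B5), no sheet prefixes, no named ranges.
# The lookarounds keep function names such as LOG10( from matching.
CELL_REF = re.compile(r"(?<![A-Z0-9_])\$?([A-Z]{1,3})\$?([0-9]+)(?![A-Z0-9_(])")


def build_dependencies(cells):
    """Map each cell address to the set of addresses its formula refers to.

    Cells holding plain data (no leading '=') get an empty set, so they
    show up as the 'no parents' nodes of the graph.
    """
    deps = {}
    for addr, content in cells.items():
        if isinstance(content, str) and content.startswith("="):
            deps[addr] = {col + row for col, row in CELL_REF.findall(content)}
        else:
            deps[addr] = set()
    return deps


def evaluation_order(deps):
    """Return the cells ordered so that every cell follows its inputs.

    Raises graphlib.CycleError if the references are circular.
    """
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    # Hypothetical extracted contents, for illustration only.
    cells = {
        "A1": 10,
        "A2": 20,
        "B1": "=A1+A2",
        "C1": "=B1*$A$1",
        "D1": "=LOG10(C1)",
    }
    deps = build_dependencies(cells)
    data_cells = sorted(a for a, d in deps.items() if not d)
    print("data cells:", data_cells)
    print("evaluation order:", evaluation_order(deps))
```

Cells with an empty dependency set are the pure-data inputs; the ordering gives a sequence in which the formulas could be re-coded or evaluated by hand-written Python.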