Parsing Excel spreadsheets

brooklineTom BrooklineTom at gmail.com
Wed Dec 31 00:02:11 EST 2008


andyhume at gmail.com wrote:
> Hi,
>
> Can anybody recommend an approach for loading and parsing Excel
> spreadsheets in Python. Any well known/recommended libraries for this?
>
> The only thing I found in a brief search was http://www.lexicon.net/sjmachin/xlrd.htm,
> but I'd rather get some more input before going with something I don't
> know.
>
> Thanks,
> Andy.

I save the spreadsheets (in Excel) in XML format. I started with the
standard XML tools (xml.dom and xml.dom.minidom), built a pull parser
on top of them, and now just crack the files with that. The MS format
is tedious and overly complex (like all MS stuff), but straightforward.
Once I've cracked a file into its component parts (headers, rows,
cells, etc.), I walk through them doing whatever I want.
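
Here is a minimal sketch of that idea, not my actual parser. It assumes
the file was saved from Excel as "XML Spreadsheet 2003", where
Worksheet, Table, Row, Cell and Data elements live in the
urn:schemas-microsoft-com:office:spreadsheet namespace. The names
SAMPLE, cell_text and read_sheets are made up for illustration, and the
sketch ignores the ss:Index attribute that Excel uses to skip empty
cells.

```python
from xml.dom import minidom

# Namespace used by the Excel 2003 "XML Spreadsheet" format (assumed input format).
SS_NS = "urn:schemas-microsoft-com:office:spreadsheet"

# Tiny hand-written sample standing in for a file saved from Excel.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">name</Data></Cell>
    <Cell><Data ss:Type="String">qty</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">widget</Data></Cell>
    <Cell><Data ss:Type="Number">3</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>
"""


def cell_text(cell):
    """Return the text inside a Cell's Data element ('' if the cell is empty)."""
    parts = []
    for data in cell.getElementsByTagNameNS(SS_NS, "Data"):
        for node in data.childNodes:
            if node.nodeType == node.TEXT_NODE:
                parts.append(node.data)
    return "".join(parts)


def read_sheets(xml_text):
    """Map each worksheet name to a list of rows, each row a list of cell strings."""
    doc = minidom.parseString(xml_text)
    sheets = {}
    for sheet in doc.getElementsByTagNameNS(SS_NS, "Worksheet"):
        name = sheet.getAttributeNS(SS_NS, "Name")
        rows = []
        for row in sheet.getElementsByTagNameNS(SS_NS, "Row"):
            cells = row.getElementsByTagNameNS(SS_NS, "Cell")
            rows.append([cell_text(cell) for cell in cells])
        sheets[name] = rows
    return sheets


if __name__ == "__main__":
    for name, rows in read_sheets(SAMPLE).items():
        print(name)
        for row in rows:
            print("  ", row)
```

Every value comes back as a string; the ss:Type attribute on each Data
element says whether it was a String, Number, and so on, so you can
convert as needed while walking the rows.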

I found this material to be no worse than doing similar crud with
XHTML. I know there are various Python packages around that do it, but
I found the learning curve of those packages to be steeper than just
grokking the spreadsheet structure itself.

In spite of all the hair, the underlying MS structure really does have
everything you'll need. My suggestion is to just go for it; it isn't
all that hard.


