[Tutor] Parsing html tables and using numpy for subsequent processing

Wed Sep 16 00:24:27 CEST 2009

Hello all,

I've finally gotten around to my 'learn how to parse HTML' project. For
those of you looking for examples (like me!), I hope it shows you one
potentially thickheaded way to do it.

For those of you with powerful Python-fu, I would appreciate any feedback
on the direction I'm taking and on any obvious coding no-nos (I have no
formal training in computer science). Please note the project is unfinished,
so there isn't a nice, neat result quite yet.

Rather than spam the list with a long description, please visit the
following post, where I outline my approach and provide the necessary links:
http://financialpython.wordpress.com/2009/09/15/parsing-dtcc-part-1-pita/

The code can be found at pastebin:
http://financialpython.pastebin.com/f4efd8930
The original HTML can be found at
http://www.dtcc.com/products/derivserv/data/index.php (I am pulling and
parsing tables from all three sections).
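
For anyone who wants the general idea without following the links, here is a
minimal sketch of pulling rows out of an HTML table using only the standard
library. It is not the pastebin code, and everything in it is an assumption
made for illustration: the TableParser class, the to_number helper, and the
SAMPLE_HTML snippet (invented values, not real DTCC data). It also assumes
flat tables and does not handle tables nested inside cells.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Hypothetical helper: collect the cell text of every <table>."""

    def __init__(self):
        super().__init__()
        self.tables = []    # one entry per table; each table is a list of rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def to_number(text):
    """Turn '1,234' into 1234.0; leave non-numeric text unchanged."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return text


# Invented sample data, used only to exercise the parser.
SAMPLE_HTML = """
<table>
  <tr><th>Category</th><th>Gross Notional</th><th>Contracts</th></tr>
  <tr><td>Dealers</td><td>1,234</td><td>56</td></tr>
  <tr><td>Non-dealers</td><td>789</td><td>12</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *body = parser.tables[0]
rows = [[to_number(cell) for cell in row] for row in body]
print(header)
print(rows)
```

Once the numeric columns are separated from the text labels, the resulting
list of lists of floats can be passed to numpy.array for further processing.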

Many thanks!

-- DK