HTML DOM parser?
Peter Hansen
peter at engcorp.com
Thu Jul 18 18:13:16 EDT 2002
Paul Rubin wrote:
>
> Anyone know of a Python-callable HTML DOM parser? I mean a serious
> one that tries to understand the crappy malformed HTML out there in the
> real-world Web, the way a browser does. If it can interpret
> Javascript that's even better. This is for a consulting client, so a
> commercial library would be acceptable (though not preferred).
How about automating IE using Python?
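    import time
    from win32com.client import DispatchEx

    ie = DispatchEx('InternetExplorer.Application')
    ie.Visible = 1
    ie.Navigate('http://www.nightsong.com')
    # Navigate() returns immediately, so wait for the page to finish loading
    while ie.Busy or ie.ReadyState != 4:    # 4 == READYSTATE_COMPLETE
        time.sleep(0.1)
    dom = ie.Document
    # etc...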
Access to the DOM tree of the document might be too slow for your
needs, but if it's not, you definitely get a lot of bang for the buck...
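
One way to find out whether it is fast enough is to time a small walk over the tree. The following is only a rough sketch, not a tested recipe: collect_links and wait_until_loaded are hypothetical helper names, and it assumes the IE document object exposes a links collection with length and item(i), and that each anchor has an href attribute. The part that actually drives IE needs Windows and the pywin32 package.

```python
import sys
import time


def collect_links(document):
    """Return the href of every anchor in a DOM-like document.

    Assumes document.links is a collection offering `length` and
    `item(i)`, as the IE (MSHTML) document object is expected to.
    """
    links = document.links
    return [links.item(i).href for i in range(links.length)]


def wait_until_loaded(browser, timeout=30.0, poll=0.1):
    """Poll until the browser reports READYSTATE_COMPLETE (4), or time out."""
    deadline = time.time() + timeout
    while browser.Busy or browser.ReadyState != 4:
        if time.time() > deadline:
            raise TimeoutError("page did not finish loading")
        time.sleep(poll)


if __name__ == "__main__" and sys.platform == "win32":
    from win32com.client import DispatchEx  # requires pywin32

    ie = DispatchEx("InternetExplorer.Application")
    ie.Navigate("http://www.nightsong.com")
    wait_until_loaded(ie)

    # Time the DOM walk to judge whether COM access is fast enough.
    start = time.time()
    hrefs = collect_links(ie.Document)
    print(len(hrefs), "links in", time.time() - start, "seconds")
    ie.Quit()
```

Every attribute access here is a cross-process COM call, so the elapsed time for a walk like this should give a fair idea of what a heavier traversal would cost.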
-Peter