Python equivalent of "lynx -dump"?

ben at co.and.co
Mon Mar 27 16:45:06 EST 2000


lewst <lewst at yahoo.com> wrote:
> I'm looking for a functional equivalent of the "-dump" option to the
> lynx web-browser in Python.  "-dump" dumps the formatted output of an
> HTML document.

> Right now I have a python program that captures the output of a
> webpage and prints it like so:

>         lynxcmd = "lynx -dump %s" %url
>         data = os.popen(lynxcmd).read()
>         print data

An all-Python solution is a little more complicated:

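import htmllib, formatter

# DumbWriter writes plain text to sys.stdout by default; AbstractFormatter
# turns the parser's events into calls on that writer.
p = htmllib.HTMLParser(formatter.AbstractFormatter(formatter.DumbWriter()))
f = open('test.html')
p.feed(f.read())   # parse the document; text is written as it is formatted
p.close()          # flush anything still buffered in the parser
f.close()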
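
For anyone reading this on Python 3, where htmllib no longer exists (and formatter was removed in 3.10), here is a rough sketch of the same idea built on html.parser from the standard library. The names TextDumper and dump_html are made up for this example, and the set of tags treated as line breaks is my own assumption. It is nowhere near what lynx does with layout; it only pulls out the visible text.

```python
from html.parser import HTMLParser


class TextDumper(HTMLParser):
    """Collect the visible text of an HTML document (hypothetical helper)."""

    # Assumption: these tags start or end a line in the plain-text output.
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
    # Content of these tags is never shown to the reader.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

    def text(self):
        # Collapse runs of whitespace inside each line and drop empty lines.
        joined = "".join(self.parts)
        lines = (" ".join(line.split()) for line in joined.splitlines())
        return "\n".join(line for line in lines if line)


def dump_html(html):
    """Return the plain text of an HTML string."""
    parser = TextDumper()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Something like print(dump_html(open('test.html').read())) would then play the role of the snippet above.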

If you want a writer that knows how to render lists (<ol>), look for a message
called LessDumbWriter that I posted last Friday.
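
To give an idea of what "knowing how to write lists" involves, here is a small Python 3 sketch based on html.parser. It is not the LessDumbWriter from that message (which is not reproduced here); the class name ListDumper, the two-space indent, and the "* " bullet marker are all assumptions made for illustration. It keeps one counter per open list so that <ol> items get numbered and nested lists get indented.

```python
import re
from html.parser import HTMLParser


class ListDumper(HTMLParser):
    """Render <ol>/<ul> items as numbered or bulleted lines (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.out = []
        # One entry per open list: an int counter for <ol>, None for <ul>.
        self.lists = []

    def handle_starttag(self, tag, attrs):
        if tag == "ol":
            self.lists.append(0)
        elif tag == "ul":
            self.lists.append(None)
        elif tag == "li" and self.lists:
            indent = "  " * len(self.lists)
            if self.lists[-1] is None:
                marker = "* "
            else:
                self.lists[-1] += 1
                marker = "%d. " % self.lists[-1]
            self.out.append("\n" + indent + marker)

    def handle_endtag(self, tag):
        if tag in ("ol", "ul") and self.lists:
            self.lists.pop()
            self.out.append("\n")

    def handle_data(self, data):
        # Collapse whitespace runs, but keep single spaces between words.
        self.out.append(re.sub(r"\s+", " ", data))

    def text(self):
        # Trim trailing spaces and drop lines that ended up empty.
        lines = [line.rstrip() for line in "".join(self.out).split("\n")]
        return "\n".join(line for line in lines if line.strip())


def dump_lists(html):
    """Return the list-aware plain text of an HTML string."""
    parser = ListDumper()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling dump_lists on a document with a nested <ul> inside an <ol> gives the inner items a deeper indent and a bullet, while the outer numbering continues where it left off.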

Greetings,
-- 
ben . de . rydt at pandora . be ------------------ your comments
http://users.pandora.be/bdr/ ------- inl. IPv6, Linux en Pandora



