Browsers
Andrew M. Kuchling
akuchlin at mems-exchange.org
Tue Jun 1 16:13:12 EDT 1999
>Daniel Faulkner <m01ymu00 at cwcom.net> wrote:
>> Is there a basic browser some where that I can look at to see how it
>> works? (not grail)
>> As I can't understand much of the python internet software and don't
>> understand how to parse the HTML once I've got it.
G. David Kuhlman writes:
>Lynx is a text mode browser:
> http://lynx.browser.org/
>For fancier stuff, look at Mozilla:
> http://www.mozilla.org/
Note, however, that an HTML parser capable of coping with all
the invalid HTML on the Web is a complicated beast. For example, Lynx
currently has an SGMLish style parser that has been brain damaged in
various ways to cope with invalid HTML. I don't know how much error
correction Grail includes, but it might actually be a simpler parser
if it hasn't been complicated with various error recovery hacks.
Another good option might be to look at the test code in htmllib.py,
which does simple HTML-to-text formatting. (When trying to figure out
a module, always look in the module's code first, since authors will
often include simple examples or test scripts inside an
"if __name__ == '__main__':" block.)
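
To make the HTML-to-text idea concrete, here is a rough sketch of my
own; it is not the code from htmllib.py. It assumes a current Python,
where htmllib no longer exists (it was removed in Python 3), so it uses
html.parser.HTMLParser from the standard library as a stand-in. The
names TextExtractor and html_to_text are made up for illustration. It
also shows the self-test block idiom mentioned above.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of an HTML document and ignore the markup.

    Hypothetical helper for illustration; not part of any library.
    """

    # Tags treated as line breaks so adjacent paragraphs don't run together.
    BREAKING_TAGS = ("p", "br", "div")

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag, whether or not it is ever closed.
        if tag in self.BREAKING_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.chunks.append(data)

    def text(self):
        # Collapse all runs of whitespace into single spaces.
        return " ".join("".join(self.chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush anything the parser is still buffering
    return parser.text()


if __name__ == "__main__":
    # Simple self-test, in the style many stdlib modules use.
    # The input is deliberately sloppy: <p> and <b> are never closed.
    print(html_to_text("<p>Hello, <b>world<p>second paragraph"))
```

This only reports tags and text as they appear and makes no attempt to
repair the document structure, which is why it stays short. It also
treats the contents of <script> and <style> elements as ordinary text,
so a real formatter would need more care.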
--
A.M. Kuchling http://starship.python.net/crew/amk/
Time, place, and action may with pains be wrought, / But Genius must be born;
and can never be taught.
-- John Congreve