Parsing
Mike C. Fletcher
mcfletch at rogers.com
Wed Jul 3 20:42:54 EDT 2002
For your task, look at:

    re (low-level text processing)
    htmllib or sgmllib (actually parses HTML/SGML, lets you define
        callbacks to handle content)
    xml (same basic idea as previous, but more committed :) ).

in the standard Python library, or search Google's groups for "strip
HTML" in the Python newsgroup to find examples using those libraries
that get posted every few months or so :) . For example:
http://groups.google.com/groups?th=fbebe304ebf2c36e&rnum=1
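
A rough sketch of the callback approach, in case the link goes stale. htmllib
and sgmllib were removed in Python 3, so this assumes html.parser.HTMLParser
from the current standard library rather than the modules named above. The
names TextExtractor and strip_html and the sample markup are made up for
illustration, and this is only one way to do it:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keep text content only; tags vanish because we never record them."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Called for every run of text between tags.
        if not self._skip_depth:
            self.chunks.append(data)


def strip_html(markup):
    """Return the text of `markup` with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join("".join(parser.chunks).split())


# Hypothetical sample standing in for a downloaded quote page.
sample = "<p><b>Quote:</b> Simple is better than <i>complex</i>.</p>"
print(strip_html(sample))
```

The parser calls handle_data for each piece of text it finds, so stripping
tags comes down to collecting those pieces and ignoring everything else.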

For actual generalised parsing solutions (which are a pretty big hammer
for such a simple task), Google about for:

    PLY
    SPARK
    SimpleParse
    PyLR
    Yapps
    kwParsing
    Plex

or even just search for "Python parsing".

Have fun,
Mike
Thomas Berglund wrote:
> Hello all
>
> I'm new to this group and new to python =).
>
> I was thinking, as a first project, to have a program that would
> download a random quote from a homepage that gives such, parse all the
> html out of it and print it. Should be simple, no?
...
> but, that's not really working out. I was just wondering if there was
> some kind of standard parsing library for python to help me get rid of
> those nasty html tags. Any pointers would be much appreciated.
>
> Have a nice day, and thanks.
>
> /Thomas