getting text inside the HTML tag
kyosohma at gmail.com
Sat Jul 14 14:01:27 EDT 2007
On Jul 14, 12:47 pm, Nikola Skoric <nick-n... at net4u.hr> wrote:
> I'm using sgmllib.SGMLParser to parse HTML. I have successfully parsed start
> tags by implementing a start_something method. But now I have to fetch the
> string inside the start tag and end tag too. I have been reading through
> SGMLParser documentation, but just can't figure that out... can somebody
> help? :-)
>
> --
> "Now the storm has passed over me
> I'm left to drift on a dead calm sea
> And watch her forever through the cracks in the beams
> Nailed across the doorways of the bedrooms of my dreams"
Oi! Try Beautiful Soup instead. That seems to be the de facto HTML
parser for Python:
http://www.crummy.com/software/BeautifulSoup/
You might find the minidom or lxml modules to your liking as well.
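
If you would rather stay with a callback-style parser, the text between a
start tag and its end tag arrives through handle_data (sgmllib.SGMLParser
has that hook too). Here is a rough sketch of the idea, assuming Python 3,
where sgmllib no longer exists, so it uses html.parser.HTMLParser from the
standard library instead. The class name TagTextParser and the sample
markup are made up for illustration.

```python
from html.parser import HTMLParser


class TagTextParser(HTMLParser):
    """Collect the text between <tag> and </tag> (hypothetical helper)."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0     # greater than 0 while inside the tag of interest
        self.chunks = []   # text pieces for the element currently being read
        self.texts = []    # one finished string per matching element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_data(self, data):
        # Called for the raw text between tags; keep it only inside our tag.
        if self.depth:
            self.chunks.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.texts.append("".join(self.chunks))
                self.chunks = []


parser = TagTextParser("b")
parser.feed("<p>Hello <b>bold <i>text</i></b> and <b>more</b></p>")
parser.close()
print(parser.texts)  # ['bold text', 'more']
```

Text from nested tags (the <i> above) is included because handle_data
fires for it while the outer tag is still open. With sgmllib the same
pattern would presumably use start_b / end_b methods plus handle_data.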
Mike