Regular Expression help for parsing html tables

Paddy paddy3118 at netscape.net
Sun Oct 29 17:46:08 CET 2006


steve551979 at hotmail.com wrote:
> Hello,
>
> I am having some difficulty creating a regular expression for the
> following string situation in html. I want to find a table that has
> specific text in it and then extract the html just for that immediate
> table.
>
> the string would look something like this:
>
> ...stuff here...
> <table>
> ...stuff here...
> <table>
> ...stuff here...
> <table>
> ...
> text i'm searching for
> ...
> </table>
> ...stuff here...
> </table>
> ...stuff here...
> </table>
> ...stuff here...
>
>
> My question: is there a way in an RE to say: "when I find the text I'm
> looking for, search backwards to the nearest preceding "<table>" and
> then search forwards to the nearest following "</table>""?
>
> any help is appreciated.
>
> Steve.

Might searching the output of BeautifulSoup(html).prettify() make
things easier?

http://www.crummy.com/software/BeautifulSoup/documentation.html#Parsing%20HTML
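
If you do want to stay with the re module, a rough sketch follows. It
does not search backwards. Instead it only lets a match start at a
"<table>" whose content reaches the text without crossing another table
tag, which amounts to the same thing for the layout you posted. The
names innermost_table and no_table_tag are made up for this example,
and the sketch assumes the text sits directly in the innermost table,
with no nested table between the opening tag and the text. The match is
case-insensitive, including the search text.

```python
import re

def innermost_table(html, text):
    """Return the HTML of the innermost <table> containing text, or None."""
    # One character that is not the start of a <table ...> or </table> tag.
    no_table_tag = r"(?:(?!</?table\b).)"
    pattern = re.compile(
        r"<table\b[^>]*>"        # opening tag, attributes allowed
        + no_table_tag + r"*?"   # content that crosses no table tag
        + re.escape(text)        # the literal text being searched for
        + no_table_tag + r"*?"   # more content that crosses no table tag
        + r"</table\s*>",        # the closing tag of that same table
        re.DOTALL | re.IGNORECASE,
    )
    match = pattern.search(html)
    return match.group(0) if match else None

# Sample input modelled on the nesting shown in the question.
html = """
stuff here
<table>
outer
<table>
middle
<table>
before
text i'm searching for
after
</table>
middle again
</table>
outer again
</table>
stuff here
"""

print(innermost_table(html, "text i'm searching for"))
```

On the sample this prints only the innermost table. It will miss tables
hidden in comments or scripts and other untidy markup, which is where a
real parser earns its keep.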

- Paddy



