Regular Expression help for parsing html tables
Odalrick
odalrick at hotmail.com
Sun Oct 29 03:30:24 EST 2006
steve551979 at hotmail.com wrote:
> Hello,
>
> I am having some difficulty creating a regular expression for the
> following string situation in html. I want to find a table that has
> specific text in it and then extract the html just for that immediate
> table.
>
> the string would look something like this:
>
> ...stuff here...
> <table>
> ...stuff here...
> <table>
> ...stuff here...
> <table>
> ...
> text i'm searching for
> ...
> </table>
> ...stuff here...
> </table>
> ...stuff here...
> </table>
> ...stuff here...
>
>
> My question: is there a way in RE to say: "when I find this text I'm
> looking for, search backwards and find the immediate instance of the
> string "<table>" and then search forwards and find the immediate
> instance of the string "</table>". " ?
>
> any help is appreciated.
>
> Steve.
It would have been easier if you'd said what the text you are looking
for is, but I think:
import re

regex = re.compile(r'<table>(.*?text you are looking for.*?)</table>',
                   re.DOTALL)
match = regex.search(html_string)
# search() returns None when nothing matches, so check before using it
found_table = match.group(1) if match else None
would work.
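One caveat with nested tables: a non-greedy .*? still begins matching at the first <table> in the string, so the captured text can start at the outermost table rather than the one immediately around the text. A possible workaround, offered as a sketch and not a definitive fix, is to forbid the match from crossing any other table tag. It assumes the tags are written exactly as <table> and </table> (lowercase, no attributes) and that the text sits directly in the innermost table. The helper name innermost_table and the sample html_string are made up for illustration.

```python
import re

def innermost_table(html_string, text):
    """Return the contents of the innermost <table> holding text, or None."""
    # Matches any run of characters that does not cross an opening
    # or closing table tag (negative lookahead before each character).
    no_tag = r'(?:(?!</?table>).)*'
    regex = re.compile(
        r'<table>(' + no_tag + re.escape(text) + no_tag + r')</table>',
        re.DOTALL,
    )
    match = regex.search(html_string)
    return match.group(1) if match else None

# Hypothetical sample input with three levels of nesting.
html_string = (
    "a<table>b<table>c<table>x text i'm searching for y</table>"
    "d</table>e</table>f"
)
found_table = innermost_table(html_string, "text i'm searching for")
```

If the real markup has attributes, mixed case, or tables nested inside the one you want, an HTML parser is likely a safer choice than a regular expression.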
/Odalrick