converting an html table to a tree

Ian Lipsky NOSPAM at pacificnet.net
Thu Aug 24 16:22:23 EDT 2000


"Alex Martelli" <alex at magenta.com> wrote in message
news:8o3hjf0na9 at news1.newsguy.com...
> "Ian Lipsky" <NOSPAM at pacificnet.net> wrote in message
> news:Rbbp5.1370$3Q6.66906 at newsread2.prod.itd.earthlink.net...
>     [snip]
> >  i dont need to worry about the contents of each table cell. So unless i am
> > overlooking something, i'll really only need to worry about the TABLE, TR
> > and TD tags.
>
> A <TR> could contain <TH>.  What would you want to do with those?
>
> > I think i have to do this as though there could be an
> > unspecified number of tables, which shouldnt be much more complicated then
> > doing it if it were a specified number.
>
> Nope, just another level of nesting, I think.
>
>
> Alex

Hmm... true, I forgot about that. Actually, there could be a whole load of tags
inside the <TD> tags: font, bold, etc. Since I'm only concerned with the
data and not the formatting, I'll just have to make sure that once the code is
inside a <TD> cell, it ignores every tag (the opening <, the closing >, and
everything between them) unless the tag is </TD>.
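Here is a rough sketch of that idea, assuming a current Python where the standard library's html.parser module is available (not something mentioned in this thread). The names TableTextParser and parse_tables are made up for illustration. The parser only reacts to TABLE, TR, TD and TH, so any formatting tag inside a cell is skipped and only its text is kept. It assumes every cell has an explicit closing tag and that tables are not nested inside cells.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect cell text per table and row, ignoring formatting tags."""

    def __init__(self):
        super().__init__()
        self.tables = []   # list of tables; each table is a list of rows
        self._cell = None  # text fragments of the open cell, or None

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names already lowercased.
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._cell = []
        # Any other tag (font, b, i, ...) is deliberately ignored.

    def handle_endtag(self, tag):
        # Assumption: cells are closed explicitly with </td> or </th>.
        if tag in ("td", "th") and self._cell is not None:
            self.tables[-1][-1].append("".join(self._cell).strip())
            self._cell = None

    def handle_data(self, data):
        # Keep text only while a cell is open.
        if self._cell is not None:
            self._cell.append(data)


def parse_tables(html):
    """Return a tables -> rows -> cell-text tree as nested lists."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Calling parse_tables(page_source) gives one list per table, each holding one list per row. TH cells are treated the same as TD cells here, which is just one possible answer to Alex's question.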

I know I saw a bit of code that does something like that... I think it used
regular expressions? I'll have to dig it up.
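This is not the code I was thinking of, just a guess at what a regexp version might look like, using the standard re module. The names TAG_RE, CELL_RE and cell_texts are invented for this sketch. It assumes no ">" appears inside attribute values, no nested tables, and that every <td> has a matching </td>, so treat it as a quick hack rather than a real HTML parser.

```python
import re

# Matches any single tag: "<", then everything up to the next ">".
TAG_RE = re.compile(r"<[^>]*>")

# Captures whatever sits between <td ...> and </td>, in any letter case.
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td\s*>", re.IGNORECASE | re.DOTALL)


def cell_texts(html):
    """Return the text of each TD cell with all inner tags removed."""
    return [TAG_RE.sub("", inner).strip() for inner in CELL_RE.findall(html)]
```

Running cell_texts over the text of a single row gives back just the data for that row, with the font and bold tags dropped.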




