[Tutor] Remove certain tags in html files

Alan Gauld alan.gauld at btinternet.com
Fri Jul 27 21:48:09 CEST 2007


"Sebastien Noel" <sebastien at solutions-linux.org> wrote

> My question, since I'm quite new to python, is about what tool I 
> should
> use to remove the table, tr and td tags, but not what's enclosed in 
> it.
> I think BeautifulSoup isn't good for that because it removes what's
> enclosed as well.

BS can do what you want; you must be missing something. One of the
most basic examples of using BS is to print an HTML file as plain text,
i.e. stripping just the tags. So it must be possible.

Can you put together a short example of the code you are using?

You can use lower-level parsers, but BS is generally easier. Until
we know what you are doing, though, it's hard to guess what might be wrong.
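
For what it's worth, here is a minimal sketch of the lower-level route,
assuming a current Python 3 and only the standard library's html.parser
module. The names TagStripper and strip_tags are made up for illustration
and are not part of BS or the standard library. The idea is to echo
everything the parser sees back out, except the start and end tags of
table, tr and td, so whatever they enclose survives.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Re-emit HTML, dropping selected tags but keeping what they enclose.

    Hypothetical helper for illustration only.
    """

    def __init__(self, drop=("table", "tr", "td")):
        # convert_charrefs=False keeps entities such as &amp; as written
        super().__init__(convert_charrefs=False)
        self.drop = set(drop)   # tag names arrive lowercased from the parser
        self.out = []           # pieces of the rebuilt document

    def handle_starttag(self, tag, attrs):
        if tag not in self.drop:
            # get_starttag_text() returns the tag exactly as it was written
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # self-closing tags such as <br/>
        if tag not in self.drop:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.drop:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def strip_tags(html, drop=("table", "tr", "td")):
    """Return html with the tags named in drop removed, contents kept."""
    parser = TagStripper(drop)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(strip_tags("<table><tr><td><b>Hi</b> &amp; bye</td></tr></table>"))
# prints: <b>Hi</b> &amp; bye
```

This only shows the shape of the approach. It writes end tags back in
lower case and does not try to repair badly nested markup, which is one
reason BS is usually the more comfortable choice.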

Alan G. 



