how to get rid of html tags

koko kokohh at hotmail.com
Thu Oct 3 18:27:11 EDT 2002


Or remove the newlines first, so that each tag ends up on a single line:

import re
text = re.sub(r'\n[ \t]+', '', data)  # join indented continuation lines
text = re.sub(r'<.*?>', '', text)     # then strip the tags
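
Another option, as a rough sketch rather than a definitive fix: compile the
pattern with re.DOTALL so that "." also matches newlines, and a tag split
over several lines is removed in one pass. The names TAG_RE, strip_tags and
sample are made up for this example. It is still a naive regex, so a ">"
inside a comment or an attribute value will fool it.

```python
import re

# re.DOTALL makes "." match newlines too, so a tag that spans
# several lines is still a single non-greedy match.
TAG_RE = re.compile(r'<.*?>', re.DOTALL)

def strip_tags(data):
    """Remove anything that looks like <...>, even across line breaks."""
    return TAG_RE.sub('', data)

# Hypothetical input, modelled on the multi-line tag quoted below.
sample = "<p>Hello\n<!--abcd\n        eeeee\n            fff>world</p>"
print(strip_tags(sample))
```

Unlike removing the newlines first, this leaves line breaks in the ordinary
text alone and does not depend on the continuation lines being indented.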



"koko" <kokohh at hotmail.com> wrote in message
news:Gc3n9.1604$XX3.967270 at newssrv26.news.prodigy.com...
> Thanks a lot. It works well if each tag is on a single line.
> But if a tag is broken across several lines, it does not work.
> e.g. <!--abcd
>         eeeee
>             fff>
>
>
> "Cameron Laird" <claird at lairds.org> wrote in message
> news:anhj3t$mg9$1 at lairds.org...
> > In article <mailman.1033619587.32128.python-list at python.org>,
> > Ian Bicking  <ianb at colorstudy.com> wrote:
> > >The easy answer:
> > >
> > >page = re.sub(r'<.*?>', '', page)
>




