Strip HTML tags from downloaded files

Thomas Pham tdpham at email.com
Wed Dec 5 13:47:27 EST 2001


When I use urlretrieve to download a file from the web, the raw text file has HTML tags embedded at the beginning and the end of the file.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
 <HEAD>



</PRE>
</BODY></HTML>

Is there any way to strip all the HTML tags from the file?
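
For illustration, here is a minimal sketch of the kind of thing being asked for. It assumes a current Python 3 and its standard-library html.parser module; the names TagStripper and strip_tags are hypothetical, made up for this example. It keeps only the text found between tags and decodes entities such as &lt;, so it is one possible approach under those assumptions, not a definitive one.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: collects the text between tags, drops the tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &lt; into characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text that is not part of a tag.
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of an HTML string with all tags removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


# Usage sketch (the file name is an assumption):
#     with open("downloaded.txt") as f:
#         print(strip_tags(f.read()))
```

Declarations such as the DOCTYPE line are dropped along with the tags, while whitespace between tags is kept, so the result may need a final .strip().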

Thanks,

More information about the Python-list mailing list