HTML "sanitizer" in Python

Scott Stirling SSTirlin at holnam.com
Thu Apr 29 09:26:18 EDT 1999


Thanks, Mark!  That is a very cool tool.  It will make a nice HTML editor for me here at work.

The only feature I found lacking (though I may have missed it, since I only downloaded it this morning) is the ability to record macros.  For my Excel problem, I really need to batch process the HTML files, because there are 14 of them.

Anyway, this is a great reference.  Thank you again.

Scott

>>> "Mark Nottingham" <mnot at pobox.com> 04/28 6:17 PM >>>
There's a better (albeit non-Python) way.

Check out http://www.w3.org/People/Raggett/tidy/ 

Tidy will do wonderful things in terms of making HTML compliant with the
spec (closing tags, cleaning up the crud that Word makes, etc.). As a big
bonus, it will remove all <FONT> tags and the like, and replace them with
CSS1 style sheets. Wow.

It's C, and is also available with a Windows GUI (HTML-Kit) that makes a
pretty good HTML editor as well. On Unix, it's a command line utility, so
you can use it (clumsily) from a Python program.
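
For what it's worth, driving the command line tool from Python over a batch of files might look roughly like the sketch below. It assumes a tidy executable on the PATH and a Python that has the subprocess module. The function names are made up for illustration, and the -q, -c and -m options (quiet, replace FONT-style markup with CSS, modify the file in place) should be checked against the Tidy version at hand.

```python
import subprocess


def build_tidy_command(path, executable="tidy"):
    """Return the argument list for tidying one file in place.

    Assumed options: -q (quiet), -c (replace FONT-style markup with CSS),
    -m (modify the original file). Verify them against your Tidy build.
    """
    return [executable, "-q", "-c", "-m", str(path)]


def tidy_files(paths, run=subprocess.run):
    """Run tidy once per file and return a {path: exit status} mapping.

    The `run` parameter is a hypothetical hook so the loop can be
    exercised without Tidy installed; it defaults to subprocess.run.
    """
    results = {}
    for path in paths:
        completed = run(build_tidy_command(path))
        # Tidy is documented to exit with 0 (ok), 1 (warnings) or 2 (errors).
        results[str(path)] = completed.returncode
    return results
```

Calling tidy_files() with a list of file names would then process the whole batch in one go, and the returned exit statuses show which files still had warnings or errors.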

I suppose an extension could also be written; I will look into this (or if
anyone does it, please tell me!).

__________________________________________________________________
Scott M. Stirling
Visit the HOLNAM Year 2000 Web Site: http://web/y2k
Keane - Holnam Year 2000 Project
Office: 734/529-2411 ext. 2327  fax: 734/529-5066  email: sstirlin at holnam.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




