[Tutor] Search and Replace

Reed L. O'Brien reed at intersiege.com
Fri Jun 24 17:47:23 CEST 2005

I ma trying to write a script to search adn replace a sizable chink of
text in about 460 html pages.
It is an old form that usesa search engine no linger availabe.

I am wondering if anyone has any advice on the best way to go about that.
There are more than one layout place ment for the form, but I would be
happy to correct the last few by hand as more than 90% are the same.

So my ideas have been,
use regex to find <form>.*</form> and replace it with <form>newform</form>.
Unfortunately there is more than just search form.  So this would just
clobber all of them.  So I could <form>.*knownName of
SearchButton.*</form> --> <form>newform</form>

Or should I read each file in as a big string and break on the form
tags, test the strings as necessary ad operate on them if the conditions
are met.   Unfortunaltely  I think there are wide variances in white
space and lines breaks.  Even the order of the tags is inconsistent.  So
I think I am stuck with the first option...

Unless there is  some module or package I haven't found that works on
html in just the way that I want.  I found htmlXtract but it is for
Python 1.5 and not immediately intuitive.

Any help or advice on this is much appreciated.


-- 4.6692916090 'cmVlZG9icmllbkBnbWFpbC5jb20=\n'.decode('base64')


More information about the Tutor mailing list