[Tutor] Search and Replace
Reed L. O'Brien
reed at intersiege.com
Fri Jun 24 17:47:23 CEST 2005
I ma trying to write a script to search adn replace a sizable chink of
text in about 460 html pages.
It is an old form that usesa search engine no linger availabe.
I am wondering if anyone has any advice on the best way to go about that.
There are more than one layout place ment for the form, but I would be
happy to correct the last few by hand as more than 90% are the same.
So my ideas have been,
use regex to find <form>.*</form> and replace it with <form>newform</form>.
Unfortunately there is more than just search form. So this would just
clobber all of them. So I could <form>.*knownName of
SearchButton.*</form> --> <form>newform</form>
Or should I read each file in as a big string and break on the form
tags, test the strings as necessary ad operate on them if the conditions
are met. Unfortunaltely I think there are wide variances in white
space and lines breaks. Even the order of the tags is inconsistent. So
I think I am stuck with the first option...
Unless there is some module or package I haven't found that works on
html in just the way that I want. I found htmlXtract but it is for
Python 1.5 and not immediately intuitive.
Any help or advice on this is much appreciated.
-- 4.6692916090 'cmVlZG9icmllbkBnbWFpbC5jb20=\n'.decode('base64')
More information about the Tutor