Wildcard for string replacement?!?!

laotseu bdesth at removethis.free.fr
Tue Mar 11 02:36:55 CET 2003

Perverted Orc wrote:
> Hello everyone!
> I've been working on this script for over a week but I can't find my way out. The
> whole idea is to replace (or rather delete) anything that stands between
> the <td> and </td> tags of an HTML file.
> from string import *
> pname=raw_input("Give path\ :") #The path name where the original file is and the new to be saved
> fnameo=raw_input("Give file to change :")
> fnamen=raw_input("...Save As :")
> old_file=pname+fnameo #Joining path and file name
> new_file=pname+fnamen #Joining path and file name
> inp=open (old_file,'r')
> outp=open(new_file,'w')
> for line in inp.readlines():
>      nline=replace(line,"?????"," ")
>      outp.write(nline)
> print "1 file changed and saved..."
> inp.close()
> outp.close()
> Can anyone tell me what to fill in the ????? area? I tried every possible
> combination with "." and "*" but it didn't work.
> Please help!!!

What about using
a/ regular expressions or
b/ an HTML parser (there's one in the batteries - sorry, the standard
lib - and there is a good example of HTML parsing in Dive Into Python)?
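
For option a/, here is a minimal sketch, assuming Python 3 and the standard
re module (blank_td_cells and TD_CONTENT are names made up for this example).
string.replace() only matches literal text, so "." and "*" are not wildcards
there; a regular expression is what gives you that. Note that the pattern is
applied to the whole file content rather than line by line, since a cell may
span several lines, and that it stops at the first </td> it meets, so it is
only safe when the cells are not nested.

```python
import re

# Non-greedy match of everything between an opening <td ...> and the next </td>.
# DOTALL lets "." match newlines too; IGNORECASE also accepts <TD>.
TD_CONTENT = re.compile(r"(<td\b[^>]*>).*?(</td>)", re.DOTALL | re.IGNORECASE)


def blank_td_cells(html):
    """Return html with the content of every <td>...</td> replaced by a space."""
    return TD_CONTENT.sub(r"\1 \2", html)


sample = "<tr><td>one</td><td class='x'>two\nlines</td></tr>"
print(blank_td_cells(sample))
# prints: <tr><td> </td><td class='x'> </td></tr>
```

In the original script this would mean reading the input with inp.read()
instead of looping over inp.readlines().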


PS: dumb question: what should your program do if there are nested
tables (which implies nested <td>) in the file ?-)
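
One possible answer, as a sketch only: count how deep you are inside <td>
elements and drop everything until the outermost one closes. This assumes
Python 3's html.parser module, and it assumes that "delete the whole outer
cell, nested tables included" is the behaviour you want, which is a guess.
TdBlanker, emit and blank_td are made-up names.

```python
from html.parser import HTMLParser


class TdBlanker(HTMLParser):
    """Copy HTML through unchanged, except for what sits inside <td>...</td>."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.depth = 0  # how many <td> elements are currently open

    def emit(self, text):
        # Only text outside every <td> is copied to the output.
        if self.depth == 0:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        self.emit(self.get_starttag_text())
        if tag == "td":
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        self.emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "td" and self.depth > 0:
            self.depth -= 1
            self.emit(" </td>")  # written only when the outermost <td> closes
        else:
            self.emit("</%s>" % tag)

    def handle_data(self, data):
        self.emit(data)

    def handle_entityref(self, name):
        self.emit("&%s;" % name)

    def handle_charref(self, name):
        self.emit("&#%s;" % name)

    def handle_comment(self, data):
        self.emit("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.emit("<!%s>" % decl)


def blank_td(html):
    """Return html with every outermost <td> emptied down to a single space."""
    parser = TdBlanker()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With this, blank_td("<td>a<table><tr><td>b</td></tr></table></td>") gives
"<td> </td>": the inner table goes away together with the rest of the outer
cell.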
