replacing words in HTML file
fetchinson at googlemail.com
Thu Apr 29 11:38:53 CEST 2010
> | > Any idea how I can replace words in a html file? Meaning only the
> | > remain untouched.
> | I'm not sure what you tried and what you haven't but as a first trial
> | you might want to
> | <untested>
> | f = open( 'new.html', 'w' )
> | f.write( open( 'index.html' ).read( ).replace( 'replace-this', 'with-that' ) )
> | f.close( )
> | </untested>
> HTML tag name, it will get mangled. The OP didn't want that.
Correct, that is why I started with "I'm not sure what you tried and
what you haven't but as a first trial you might". For instance if the
css and he knows that these words are also not in html attribute
names/values, etc, etc, then the above approach would work, in which
case BeautifulSoup is a gigantic overkill. The OP needs to specify
more clearly what he wants, before really useful advice can be given.
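To make the trade-off concrete, here is a small sketch of both cases. The input strings are made up for illustration; only str.replace is taken from the snippet quoted above.

```python
# Hypothetical input: the word "title" occurs as a tag name and as visible text.
html = "<title>My title</title><p>A title here</p>"

# Plain string replacement cannot tell markup from text, so the tags change too.
mangled = html.replace("title", "heading")
print(mangled)

# When the word only ever occurs in the text, the same call is harmless.
safe = "<p>replace-this now</p>".replace("replace-this", "with-that")
print(safe)
```

The first call rewrites the tag names along with the text, which is the mangling described in the quoted objection; the second leaves the markup intact because the word never appears in it.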
> The only way to get this right is to parse the file, then walk the doc
> tree editing only the text parts.
> The BeautifulSoup module (3rd party, but a single .py file and trivial to
> fetch and use, with no extra dependencies) does a good job of this,
> coping even with typical not quite right HTML. It gives you a parse
> tree you can easily walk, and you can modify it in place and write it
> straight back out.
> Cameron Simpson <cs at zip.com.au> DoD#743
> The Web site you seek
> cannot be located but
> endless others exist
> - Haiku Error Messages
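For reference, the "parse it and edit only the text parts" idea quoted above can be sketched without any third-party module. This is not BeautifulSoup: it uses the standard library's html.parser instead, and the names TextReplacer and replace_in_text are hypothetical, made up for this example. It is a rough sketch that assumes reasonably well-formed input; it skips the bodies of script and style elements, writes end tags back in lowercase, and silently drops processing instructions and CDATA sections.

```python
import re
from html.parser import HTMLParser


class TextReplacer(HTMLParser):
    """Re-emit HTML as parsed, replacing a word only inside text nodes."""

    SKIP = {"script", "style"}  # leave JavaScript and CSS bodies alone

    def __init__(self, old, new):
        # convert_charrefs=False so entities reach handle_entityref/charref
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile(r"\b%s\b" % re.escape(old))
        self.new = new
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        # get_starttag_text() returns the tag exactly as written in the source
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            # a function replacement keeps backslashes in self.new literal
            data = self.pattern.sub(lambda m: self.new, data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def replace_in_text(html, old, new):
    parser = TextReplacer(old, new)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = '<p class="title">My title</p><script>var title = 1;</script>'
print(replace_in_text(page, "title", "heading"))
```

With the sample page, the attribute value and the script body keep the word "title" and only the visible text changes. For real-world, not quite right HTML, a forgiving parser such as BeautifulSoup is likely the more robust choice.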
Psss, psss, put it down! - http://www.cafepress.com/putitdown