[Tutor] string.replace

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Tue, 11 Sep 2001 17:55:44 -0700 (PDT)


On Tue, 11 Sep 2001, Jon Cosby wrote:

> Thanks for the help so far. Here's one more that has me stumped. I
> need a function to strip HTML tags from lines. What I have looks like
> it should work, but it stops at the first occurence of the tags. I've
> gone through it step by step, and it's finding all of the tags, but
> for some reason it's not replacing them. The routine is
> 
> 
> >>> from string import *
> >>> def striptabs(line):
> ... 	if "<" in line:
> ... 		l = find(line, "<", 0)
> ... 		m = find(line, ">", l)
> ... 		if l and m:
> ... 			print "Tag found"		# Test

This has a small bug in it: if string.find() is unsuccessful, it
doesn't return a false value, but instead returns -1:

###
>>> string.find('hello', 'z')
-1
###

So your test for tag finding should be:

    if l >= 0 and m >= 0:

instead.  It doesn't return a "false" value because 0 is a
perfectly good return value for find: it would mean a match at the very
beginning of the string.
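
To make the difference concrete, here is a small sketch.  It assumes a
newer Python where strings have a find() method; string.find(s, sub) from
the string module returns the same values.  The variable names are just
for illustration:

```python
# Sketch: str.find() returns an index, or -1 when nothing matches.
line = "<b>bold</b>"

start = line.find("<")      # match at the very beginning of the string
missing = line.find("z")    # no match anywhere

# A plain truth test gets both cases backwards:
start_is_truthy = bool(start)        # 0 counts as false, yet "<" WAS found
missing_is_truthy = bool(missing)    # -1 counts as true, yet "z" was NOT found

# Comparing against 0 gives the right answer in both cases.
found_start = start >= 0
found_missing = missing >= 0
```

So "if l and m:" can skip a tag that sits at position 0, and can also
fire when nothing was found at all.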



> ... 		line = replace(line, line[l:m+1], "")

You probably want to include the line above as part of the if-block.  We
should be replacing things only if we've found a matching pair of angle
brackets.



> ... 		striptabs(line)

As Allan has pointed out, you'll want to somehow save the rest of the
changes done to your line.  Either:

    return striptabs(line)

or

    line = striptabs(line)

would be good ways of doing this.  It's nice to see that you're taking a
recursive approach here to strip tags until there aren't any more; just be
careful that you send that result off to the user.
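
Putting the pieces together, one possible version of the whole function
might look like the sketch below.  It keeps your function name and the
recursive idea, but it is only my guess at how the rest of your routine
is laid out, and it assumes a newer Python with string methods
(line.find, line.replace) rather than "from string import *":

```python
def striptabs(line):
    """Return line with every <...> span removed (a sketch, not the original)."""
    l = line.find("<")
    if l < 0:
        return line              # no opening bracket: nothing left to strip
    m = line.find(">", l)
    if m < 0:
        return line              # unmatched "<": leave the rest alone
    # Replace only once we know both brackets were found.
    line = line.replace(line[l:m+1], "")
    # Recurse on what is left, and hand the result back to the caller.
    return striptabs(line)
```

For example, striptabs("<b>Hello</b>, <i>world</i>") should give back
"Hello, world", and a line with a stray "<" but no ">" comes back
unchanged.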



When you learn about regular expressions (the re module), you may find
them useful for stripping tags off a line.  If you're interested, you
can take a look at:

    http://www.python.org/doc/lib/module-re.html

and look for re.sub().
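
As a rough idea of what that could look like, here is a minimal sketch.
The pattern and the names TAG and strip_tags_re are my own choices for
illustration; the pattern treats anything between "<" and the next ">" as
a tag, which is fine for simple lines but is not a real HTML parser:

```python
import re

# "<", then any run of characters that are not ">", then ">".
TAG = re.compile(r"<[^>]*>")

def strip_tags_re(line):
    """Return line with every <...> span removed, using re.sub()."""
    return TAG.sub("", line)
```

With this, strip_tags_re("<b>Hello</b>, <i>world</i>") should come back
as "Hello, world" in a single pass, with no recursion needed.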

It looks like you may be using Python 1.5.2.  If so, be wary of a similar
module called "regex": regex is buggy and should be avoided.



I'm off to look at the news; things look grim, but let's do our best to
support each other.  Best of wishes to you.