RegSub problem?

piet at cs.uu.nl piet at cs.uu.nl
Mon Sep 25 05:02:40 EDT 2000


>>>>> rerwig at my-deja.com (R) writes:

R> I tried to remove brackets inside a string, using regsub as follows:
R>   import regsub

R>   a = "aaaa (bbbb) {cccc} <dddd> [eeee]"
R>   b = regsub.gsub("[(){}<>]","",a)

R> The correct result in b is:
R>   "aaaa bbbb cccc dddd [eeee]"

R> However, when I try to get rid of the "[" and "]", it won't work:

R>   b = regsub.gsub("[(){}<>\[\]]","",a)

R> The expected result in b is:
R>   "aaaa bbbb cccc dddd eeee"

R> But I get:
R>   "aaaa (bbbb) {cccc} <dddd> [eeee]"

R> Thus the previous substitutions don't work anymore either.
R> What am I missing?

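Your regexp is wrong. You seem to think that \ escapes the ] in a character
class, which isn't true in regsub. The first ] closes the [, so the second \
is the last character in the character class. The final ] is then an
additional literal character that must be matched right after a character
from the class. Your string never contains one of those characters directly
followed by ], which is why nothing is replaced at all. To include a ] in a
character class, it must be the first character of the class.

So b = regsub.gsub("[](){}<>[]","",a) would work, although it looks odd.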
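
To make the parse visible, here is a rough sketch that uses the re module to
mimic how regsub reads the original pattern. The `emulated` pattern and the
second test string are my own illustration, not something regsub produces: a
class holding ( ) { } < > \ [ followed by a literal ].

```python
import re

a = "aaaa (bbbb) {cccc} <dddd> [eeee]"

# Hypothetical re equivalent of regsub's reading of "[(){}<>\[\]]":
# one character out of ( ) { } < > \ [ , then a literal ].
emulated = r"[(){}<>\\\[]\]"

# No class character in a is directly followed by ], so a is unchanged.
unchanged = re.sub(emulated, "", a)
print(unchanged)

# A string that does contain such pairs: "(]", "[]" and "\]" are removed.
pairs_removed = re.sub(emulated, "", "x(] y[] z\\]")
print(pairs_removed)
```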

By the way, regsub is outdated; you should use the re module, which does
support the \ escape in character classes.
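
A minimal sketch of the same substitution with re, assuming you only want to
strip the eight bracket characters. Both spellings of the class should give
the same result; the variable names are just for illustration.

```python
import re

a = "aaaa (bbbb) {cccc} <dddd> [eeee]"

# With re, \[ and \] are valid escapes inside a character class.
# The raw string keeps the backslashes intact for the regex engine.
b = re.sub(r"[(){}<>\[\]]", "", a)

# Putting ] first in the class also works in re, without any escapes.
c = re.sub(r"[](){}<>[]", "", a)

print(b)
print(c)
```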
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP]
Private email: P.van.Oostrum at hccnet.nl


