Problems with regexps
Kirk Strauser
kirk at daycos.com
Fri Nov 7 16:20:08 EST 2003
I'm writing a program to scan through a bunch of VB source. An example of
one of the types of strings I'm trying to match is:
response.write Request.Cookies ("domain")("cname")
but I do not want to match variable assignments like this (they are picked
up later by a different pattern):
strXRSCust= Request.Cookies ("domain")("cname")
I'm differentiating between the two forms by looking for an equal sign
followed by zero or more spaces; if the '=' is there, then I don't want to
match. Here's where it gets weird. This pattern works perfectly (as long
as there are 1 or more spaces after the '='), in that it will not match
the assignment example above:
re.compile(r'(?<!=)\s+Request.Cookies\s*((\(\s*".*?"\s*\)\s*)+)')
I really want to use the pattern below, which allows zero or more spaces
(not one or more). Note that it's identical except that the first '\s+'
is replaced with a '\s*':
re.compile(r'(?<!=)\s*Request.Cookies\s*((\(\s*".*?"\s*\)\s*)+)')
I don't know why, but the second pattern does match the assignment example
above, although I don't think it should.
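A minimal reproduction of the behavior described above, as a sketch. It assumes the compiled patterns are applied with search() (which call is actually used isn't stated here), and the variable names are only for illustration:

```python
import re

# The two patterns from above, differing only in '\s+' versus '\s*'
# immediately after the negative lookbehind.
one_or_more = re.compile(r'(?<!=)\s+Request.Cookies\s*((\(\s*".*?"\s*\)\s*)+)')
zero_or_more = re.compile(r'(?<!=)\s*Request.Cookies\s*((\(\s*".*?"\s*\)\s*)+)')

# Hypothetical names for the two example lines.
write_line = 'response.write Request.Cookies ("domain")("cname")'
assign_line = 'strXRSCust= Request.Cookies ("domain")("cname")'

# Assumption: search() is used, so a match may begin anywhere in the line.
for label, pattern in (("\\s+", one_or_more), ("\\s*", zero_or_more)):
    for line in (write_line, assign_line):
        m = pattern.search(line)
        # Report where the match starts, or None if there is no match.
        print(label, repr(line), m.start() if m else None)
```

Printing the start offset shows where each match begins: with the '\s*' version, the match on the assignment line starts at the 'R' of Request (index 12), so the character just before the match is the space, not the '='.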
It seems like there's a problem with the negative lookbehind assertion, and
that the variable-length '\s*' pattern immediately following it is throwing
it off. Any thoughts?
--
Kirk Strauser
The Day Companies