jgilbert+python at uvm.edu
Tue Jun 1 17:09:56 CEST 2004
> I am a newbie to python and need some help.
> I am looking at doing some batch search/replace for some of my source
> code. Criteria is to find all literal strings and wrap them up with
> some macro, say MC. For ex., var = "somestring" would become var =
> MC("somestring"). Literal strings can contain escaped " & \.
> But there are 2 cases when this replace should not happen:
> 1.literal strings which have already been wrapped, like
> 2.directives like #include "header.h" and #extern "C".
> I tried to use negative look-behind assertion for this purpose. The
> expression I use for matching a literal string is
> "((\\")|[^"(\\")])+". This works fine. But as I start prepending
> look-behind patterns, things go wrong. The question I have is whether
> the pattern in negative look-behind part can contain alternation ? In
> other words can I make up a regexp which says "match this pattern x
> only if it not preceded by anyone of pattern a, pattern b and pattern
> c" ?
> I tried the following expression to take into account the two
> constraints mentioned above, (?<![(#include )(#extern
> )(MC\()])"((\\")|[^"(\\")])+". Can someone point out the mistakes in
> this ?
It would have been nice if you had simplified your example. Since you said
that your base pattern matched properly, you could, for example, have
replaced it with a literal. But no matter.
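I think that your problem is that you're trying to use grouping in a
character class (set). Inside brackets, parentheses are just literal
characters, so [(1 )(2 )] matches any single one of '1', '2', ' ', '('
or ')'. My proof:
>>> import re
>>> re.sub('[(1 )(2 )]','a','1 2 ( ) ')
'aaaaaaaa'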
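So you need to ditch the '[' and ']'. On its own that leaves three groups
that must all match in sequence; to say "any one of" you separate the
alternatives with '|'. Python's re module also only accepts fixed-width
look-behind patterns, so prefixes of different lengths are easiest to
write as one assertion each: (?<!#include )(?<!#extern )(?<!MC\()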
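To illustrate the look-behind part, here is a minimal sketch. It assumes
Python's re module and a simplified literal pattern with no escape
handling; the helper names compiles and first_guarded_literal are
hypothetical, not from the original post. It checks whether an alternation
of unequal-width prefixes inside a single look-behind is accepted, and
then uses one look-behind per prefix instead.

```python
import re

# Simplified literal (assumption): no escapes, must not span lines.
LITERAL = r'"[^"\n]*"'

# One fixed-width negative look-behind per forbidden prefix.
GUARDED = re.compile(r'(?<!#include )(?<!#extern )(?<!MC\()' + LITERAL)


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def first_guarded_literal(line):
    """Return the first literal not directly preceded by a forbidden prefix."""
    match = GUARDED.search(line)
    return match.group(0) if match else None


# The prefixes have different lengths, so a single look-behind holding all
# three as alternatives is not a fixed-width pattern.
print(compiles(r'(?<!#include |#extern |MC\()"'))
print(first_guarded_literal('var = "somestring"'))
print(first_guarded_literal('#include "header.h"'))
print(first_guarded_literal('x = MC("done")'))
```

One caveat: a look-behind only vetoes a match at one position, and the
scan then continues with the next character. The closing quote of a
skipped literal is not preceded by any forbidden prefix, so it can start a
false match if another quote follows on the same line.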
I think what you meant by the set was question marks, i.e.:
(#include )?(#extern )?(MC\()?
That makes each prefix optional: any of them may occur, but none is
required.
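The optional-prefix idea can be turned into a working substitution,
sketched below under some assumptions: the three prefixes are combined
into one optional group with '|', literals do not span lines, and the
function name wrap_literals and its macro parameter are hypothetical.
Because the prefix and the literal are consumed together, a callback can
leave guarded literals untouched and wrap the rest.

```python
import re

# Group 1: optional forbidden prefix. Group 2: a string literal that allows
# backslash escapes (such as \" and \\) and does not span lines.
PATTERN = re.compile(r'(#include |#extern |MC\()?("(?:\\.|[^"\\\n])*")')


def wrap_literals(source, macro="MC"):
    """Wrap each string literal in macro(...) unless a prefix precedes it."""
    def replace(match):
        if match.group(1):
            # Directive or already wrapped: return the text unchanged.
            return match.group(0)
        return f"{macro}({match.group(2)})"
    return PATTERN.sub(replace, source)


if __name__ == "__main__":
    sample = '#include "header.h"\nvar = "somestring"\nx = MC("done") + "b"\n'
    print(wrap_literals(sample))
```

This only recognises the prefixes when they sit directly before the
opening quote with exactly the spacing shown, so real source code with
extra whitespace or comments containing quotes would need a more careful
pattern.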
This is not a Python-specific question; it is just plain regular
expressions. You may wish to consult a good regex reference site
or the O'Reilly book http://www.oreilly.com/catalog/regex/ in the future.