using re module to find " but not " alone ... is this a BUG in re?
sjmachin at lexicon.net
Thu Jun 12 13:47:43 CEST 2008
On Jun 12, 7:11 pm, anton <anto... at gmx.de> wrote:
> I want to replace all occurrences of " by \" in a string.
> But I want to leave all occurrences of \" as they are.
> The following should happen:
> this I want " while I dont want this \"
> should be transformed to:
> this I want \" while I dont want this \"
> and NOT:
> this I want \" while I dont want this \\"
> I even tried the (?<=...) construction, but there I get an unbalanced parenthesis
Sounds like a deficit of backslashes causing re to regard \) as plain
text and not the magic closing parenthesis in (?<=...) -- and don't
you want (?<!...) ?
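To make the backslash deficit concrete, here is a small sketch. The helper name compiles() is made up for illustration, and the non-raw pattern is only a guess at the kind of thing that was tried, since the original attempt was not posted.

```python
import re

# Non-raw literal: Python collapses \\ to one backslash, so re receives
# (?<!\)" and reads \) as a literal parenthesis. The group is never closed.
non_raw = "(?<!\\)\""

# Raw literal: re receives (?<!\\)" which is a negative lookbehind for one
# literal backslash, followed by a double quote.
raw = r'(?<!\\)"'

def compiles(pattern):
    """Hypothetical helper: True if re accepts the pattern, else False."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(repr(non_raw), compiles(non_raw))
print(repr(raw), compiles(raw))
```

The first pattern is rejected and the second is accepted; the exact wording of the error message depends on the Python version.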
> It seems that re is not able to do the job due to parsing/compiling problems
> for this sort of strings.
Nothing is ever as it seems.
> Have you any idea??
For a start, *ALWAYS* use a raw string for an re pattern -- that halves the
number of backslashes you need.
> re.findall("[^\\]\"","this I want \" while I dont want this \\\" ")
And if you have " in the pattern, use '...' to enclose the pattern so
that you don't have to write \"
> Traceback (most recent call last):
> File "<interactive input>", line 1, in <module>
> File "C:\Python25\lib\re.py", line 175, in findall
> return _compile(pattern, flags).findall(string)
> File "C:\Python25\lib\re.py", line 241, in _compile
> raise error, v # invalid expression
> error: unexpected end of regular expression
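The traceback above comes from the same problem in a character class. A minimal sketch of what re actually receives, and of the same pattern written as a raw string in single quotes (the variable names are mine, and the sample text is the one from the question):

```python
import re

text = 'this I want " while I dont want this \\" '

# After Python processes the non-raw literal, re receives the five
# characters [^\]" so \] is an escaped bracket and the class never closes.
broken = "[^\\]\""

try:
    re.findall(broken, text)
    failed = False
except re.error:
    failed = True

# Raw string in single quotes: a class excluding the backslash, then a quote.
fixed = r'[^\\]"'
matches = re.findall(fixed, text)

print(failed)
print(matches)
```

Note that even the fixed class pattern consumes the character before the quote and cannot match a quote at the very start of the string, which is one reason a lookbehind is the better fit here.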
What you want is:
>>> import re
>>> text = r'frob this " avoid this \", OK?'
>>> text
'frob this " avoid this \\", OK?'
>>> re.sub(r'(?<!\\)"', r'\"', text)
'frob this \\" avoid this \\", OK?'
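If this is needed in more than one place, the substitution can be wrapped in a small function. This is only a sketch: the name escape_bare_quotes is hypothetical, and it assumes a quote counts as already escaped whenever the character before it is a backslash.

```python
import re

# Same pattern as in the session above, compiled once.
_BARE_QUOTE = re.compile(r'(?<!\\)"')

def escape_bare_quotes(s):
    """Hypothetical helper: put a backslash before each " that is not
    already preceded by one."""
    # \" is not a recognised escape in a replacement template, so re
    # leaves it alone and inserts a backslash followed by a quote.
    return _BARE_QUOTE.sub(r'\"', s)

samples = [
    'this I want " while I dont want this \\"',
    '"quoted" at both ends',
]
for s in samples:
    print(s, '->', escape_bare_quotes(s))
```

Running the function twice gives the same result as running it once. One limitation of the assumption: a quote that follows an escaped backslash (two backslashes, then a quote) is also left untouched.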