using re module to find " but not " alone ... is this a BUG in re?

John Machin sjmachin at lexicon.net
Thu Jun 12 13:47:43 CEST 2008


On Jun 12, 7:11 pm, anton <anto... at gmx.de> wrote:
> Hi,
>
> I want to replace all occurrences of " by \" in a string.
>
> But I want to leave all occurrences of \" as they are.
>
> The following should happen:
>
>   this I want " while I dont want this \"
>
> should be transformed to:
>
>   this I want \" while I dont want this \"
>
> and NOT:
>
>   this I want \" while I dont want this \\"
>
> I even tried the (?<=...) construction but here I get an unbalanced parenthesis
> error.

Sounds like a deficit of backslashes causing re to regard \) as plain
text and not the magic closing parenthesis in (?<=...) -- and don't
you want (?<!...) ?

>
> It seems that re is not able to do the job due to parsing/compiling problems
> for this sort of string.

Nothing is ever as it seems.

>
> Have you any idea??

For a start, *ALWAYS* use a raw string for an re pattern -- halves the
backslash pollution!
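
To make the raw-string point concrete, here is a small sketch that writes the same pattern both ways. The names plain, raw and sample are just for illustration, and the print call assumes Python 3.

```python
import re

# The same pattern written twice: "a double quote that is not preceded
# by a backslash".
plain = '(?<!\\\\)"'   # ordinary literal: four backslashes typed, two reach re
raw = r'(?<!\\)"'      # raw literal: what you type is what re receives

# Both literals produce the identical 8-character pattern string.
print(plain == raw)

# Only the first quote in the sample is "bare"; the second follows a backslash.
sample = r'one " two \" three'
print(re.findall(raw, sample))
```

Enclosing the raw pattern in '...' also means the " inside it needs no escaping of its own.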


>
>
> re.findall("[^\\]\"","this I want \" while I dont want this \\\" ")

and if you have " in the pattern, use '...' to enclose the pattern so
that you don't have to use \"

>
> Traceback (most recent call last):
>   File "<interactive input>", line 1, in <module>
>   File "C:\Python25\lib\re.py", line 175, in findall
>     return _compile(pattern, flags).findall(string)
>   File "C:\Python25\lib\re.py", line 241, in _compile
>     raise error, v # invalid expression
> error: unexpected end of regular expression

As expected.
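
The reason, as far as I can tell: once Python has processed the string literal, the pattern is only the five characters [^\]" and re reads \] inside a character class as a literal ], so the class is never closed. The sketch below reproduces that under Python 3 (where the exception is re.error and the message wording differs from the 2.5 traceback); the names pattern, fixed and text are made up for the example.

```python
import re

pattern = "[^\\]\""            # the pattern from the original post
print(len(pattern), pattern)   # what re actually receives

# Compiling it fails because the character class is never terminated.
error_message = None
try:
    re.compile(pattern)
except re.error as exc:
    error_message = str(exc)   # exact wording varies between Python versions
print("compile failed:", error_message)

# With a raw string and a doubled backslash the class closes properly.
fixed = r'[^\\]"'
text = r'this I want " while I dont want this \" '
print(re.findall(fixed, text))
```

Note that even the repaired class matches the character in front of the quote as well as the quote, and it cannot match a quote at the very start of the string, which is why a lookbehind is the better tool here.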

What you want is:

>>> import re
>>> text = r'frob this " avoid this \", OK?'
>>> text
'frob this " avoid this \\", OK?'
>>> re.sub(r'(?<!\\)"', r'\"', text)
'frob this \\" avoid this \\", OK?'
>>>
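
If you want this as a reusable helper, one possible sketch follows. The function name escape_bare_quotes and the module-level _BARE_QUOTE are my own inventions, and the replacement is supplied by a function so that re does no escape processing on it (an assumption about what is safest, not a requirement).

```python
import re

# A double quote that is NOT immediately preceded by a backslash.
_BARE_QUOTE = re.compile(r'(?<!\\)"')

def escape_bare_quotes(text):
    """Put a backslash in front of every " that does not already have one."""
    # A callable replacement is inserted verbatim, so the backslash
    # in the result needs no special handling by re.
    return _BARE_QUOTE.sub(lambda match: '\\"', text)

before = r'this I want " while I dont want this \"'
after = escape_bare_quotes(before)
print(after)

# Running it a second time changes nothing, since every quote is now escaped.
print(escape_bare_quotes(after) == after)
```

One caveat: a quote that follows an escaped backslash (two backslashes, then ") is also left alone by this pattern, which may or may not be what you want.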

HTH,
John


