[Tutor] RE module is working ?

Peter Otten __peter__ at web.de
Thu Feb 3 14:15:34 CET 2011


Karim wrote:

> I am trying to substitute a '""' pattern with '\"\"', namely escape 2
> consecutive double quotes:
> 
>     * *In Python interpreter:*
> 
> $ python
> Python 2.7.1rc1 (r271rc1:86455, Nov 16 2010, 21:53:40)
> [GCC 4.4.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> expression = *' "" '*
>  >>> re.subn(*r'([^\\])?"', r'\1\\"', expression*)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>    File "/home/karim/build/python/install/lib/python2.7/re.py", line
> 162, in subn
>      return _compile(pattern, flags).subn(repl, string, count)
>    File "/home/karim/build/python/install/lib/python2.7/re.py", line
> 278, in filter
>      return sre_parse.expand_template(template, match)
>    File "/home/karim/build/python/install/lib/python2.7/sre_parse.py",
> line 787, in expand_template
>      raise error, "unmatched group"
> sre_constants.error: unmatched group
> 
> But if I remove '?' I get the following:
> 
>  >>> re.subn(r'([^\\])"', r'\1\\"', expression)
> (' \\"" ', 1)
> 
> Only one substitution..._But this is not the same REGEX._ And the
> count=2 does nothing. By default all occurrences should be substituted.
> 
>     * *On linux using my good old sed command, it is working with my '?'
>       (0-1 match):*
> 
> *$* echo *' "" '* | sed *'s/\([^\\]\)\?"/\1\\"/g*'*
>   \"\"
> 
> *Indeed what's the matter with RE module!?*

You should really fix the problem with your email program first; afterwards 
it's probably a good idea to try and explain your goal clearly, in plain 
English.

Yes. What Steven said ;)

Now to your question as stated: if you want to escape two consecutive double 
quotes that can be done with

s = s.replace('""', r'\"\"')

but that's probably *not* what you want. Assuming you want to escape two 
consecutive double quotes and make sure that the first one isn't already 
escaped, this is my attempt:

>>> def sub(m):
...     s = m.group()
...     return r'\"\"' if s == '""' else s
...
>>> print re.compile(r'[\\].|""').sub(sub, r'\\\"" \\"" \"" "" \\\" \\" \"')
\\\"" \\\"\" \"" \"\" \\\" \\" \"

Compare that with

$ echo '\\\"" \\"" \"" "" \\\" \\" \"' | sed 's/\([^\\]\)\?"/\1\\"/g'
\\\"\" \\"\" \"\" \"\" \\\\" \\\" \\"
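
The difference between the two results comes from the first alternative in the Python pattern: it consumes every backslash together with the character after it, so the quote in \" can never start a "" match. Below is a minimal sketch of the same idea in Python 3 syntax; the names PATTERN and escape_double_quotes are made up for illustration.

```python
import re

# Either a backslash followed by any character, or two consecutive
# double quotes. The first alternative swallows escape pairs such as
# \\ and \" so an already escaped quote is never treated as the first
# half of a "" pair.
PATTERN = re.compile(r'\\.|""')


def escape_double_quotes(text):
    """Escape each unescaped "" pair and leave everything else as it is."""
    def replace(match):
        found = match.group()
        # Only the "" alternative is rewritten; escape pairs pass through.
        return r'\"\"' if found == '""' else found
    return PATTERN.sub(replace, text)


sample = r'\\\"" \\"" \"" "" \\\" \\" \"'
result = escape_double_quotes(sample)
print(result)
```

For the sample string this should print the same line as the interactive session above.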

Concerning the exception and the discrepancy between sed and Python's re 
module, I suggest that you ask again on comp.lang.python, aka the 
python-list mailing list, where at least one regex guru will read it.
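
In the meantime, here is a hedged sketch of one way to sidestep the "unmatched group" error. The exception is raised while expanding \1 in the replacement template when the optional group took no part in the match. Passing a function instead of a template avoids the template expansion, because the function can test for None itself. The name escape_quote is made up for illustration. As far as I know, Python 3.5 and later replace unmatched groups in a template with an empty string, so the original call no longer raises there.

```python
import re

expression = ' "" '


def escape_quote(match):
    # group(1) is None when the optional group did not participate
    # in the match, so fall back to an empty prefix.
    prefix = match.group(1) or ''
    return prefix + '\\"'


# Same pattern as in the question, with a function as the replacement.
result, count = re.subn(r'([^\\])?"', escape_quote, expression)
print(result, count)
```

This gives two substitutions, like the sed command in the question. It also shares that command's weakness: quotes that are already escaped get escaped again.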

Peter


