Regular expression and substitution, unexpected duplication

Laurent Pointal laurent.pointal at
Tue Aug 18 23:42:33 CEST 2015


I want to make a replacement in a string, to ensure that ellipses are 
surrounded by spaces (this is not a typographical problem, but a preparation 
for later text chunking).

I tried regular expressions and the SRE_Pattern.sub() method, but I 
get an unexpected duplication of the replacement string:

The code:

import re

ellipfind_re = re.compile(r"((?=\.\.\.)|…)", re.IGNORECASE|re.VERBOSE)
ellipfind_re.sub(' ... ', 
       "C'est un essai... avec différents caractères… pour voir.")

And I get:

"C'est un essai ... ... avec différents caractères ...  pour voir."

I tested with and without the capturing group, with the same result.
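
A possible explanation, offered as a sketch rather than a confirmed diagnosis: 
(?=\.\.\.) is a lookahead, so it matches the empty string just before the 
three dots and consumes nothing. sub() would then insert ' ... ' in front of 
the dots and leave the original dots in place, which is consistent with the 
output above. The names text, lookahead_re, consuming_re, duplicated and fixed 
below are illustrative only, the flags are dropped because they do not affect 
this pattern, and the \s* around the alternation is an assumption about the 
desired spacing (it absorbs spaces that are already present).

```python
import re

text = "C'est un essai... avec différents caractères… pour voir."

# Pattern as posted (flags omitted): the lookahead matches an empty string
# before "..." and consumes none of the dots, so they survive the substitution.
lookahead_re = re.compile(r"(?=\.\.\.)|…")
duplicated = lookahead_re.sub(" ... ", text)

# Candidate fix (an assumption, not a confirmed answer): a plain alternation
# that consumes the dots; \s* swallows any spaces already around the ellipsis.
consuming_re = re.compile(r"\s*(?:\.\.\.|…)\s*")
fixed = consuming_re.sub(" ... ", text)

print(duplicated)
print(fixed)
```

With this variant an ellipsis at the very start or end of the string also gets 
a leading or trailing space, which may or may not be wanted for the chunking 
step.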

My Python version:
Python 3.4.3 (default, Mar 26 2015, 22:03:40) 
[GCC 4.9.2] on linux

Any ideas?

