re.sub unexpected behaviour

Javier Collado javier.collado at gmail.com
Tue Jul 6 13:10:17 EDT 2010


Hello,

Let's imagine that we have a simple function that generates a
replacement for a regular expression:

import re

def process(match):
    return match.string

If we pass that function to re.sub together with a simple pattern and
a string that matches it exactly, we get the expected output:
re.sub('123', process, '123')
'123'

However, if the string passed to re.sub contains a trailing newline
character, the result unexpectedly contains an extra newline:
re.sub(r'123', process, '123\n')
'123\n\n'

If we use a replacement string instead of a function, the strange
behaviour cannot be reproduced:
re.sub(r'123', '123', '123')
'123'

re.sub('123', '123', '123\n')
'123\n'
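
For anyone who wants to reproduce this, the four cases above can be
collected into one small script. This is only a sketch: the variable
names are made up for the example, and the values in the comments are
the ones shown in the session above.

```python
import re

def process(match):
    # Same replacement function as above: returns match.string.
    return match.string

# Replacement function, without and with a trailing newline in the input.
function_plain = re.sub(r'123', process, '123')
function_newline = re.sub(r'123', process, '123\n')

# Replacement string, without and with a trailing newline in the input.
string_plain = re.sub(r'123', '123', '123')
string_newline = re.sub(r'123', '123', '123\n')

print(repr(function_plain))    # '123'
print(repr(function_newline))  # '123\n\n'  <- the extra newline
print(repr(string_plain))      # '123'
print(repr(string_newline))    # '123\n'
```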

Is there any explanation for this? If I'm missing something about
using a replacement function with re.sub, please let me know.

Best regards,
    Javier


