Problem with re module
Peter Otten
__peter__ at web.de
Tue Mar 22 14:35:57 EDT 2011
John Harrington wrote:
> I'm trying to use the following substitution,
>
> lineList[i]=re.sub(r'(\\begin{document})([^$])',r'\1\n\n
> \2',lineList[i])
>
> I intend this to match any string "\begin{document}" that doesn't end
> in a line ending. If there's no line ending, then, I want to place
> two carriage returns between the string and the non-line end
> character.
>
> However, this places carriage returns even when the string is followed
> directly after with a line ending. Can someone explain to me why this
> match is not behaving as I intend it to, especially the ([^$])?
Quoting http://docs.python.org/library/re.html:
"""
Special characters are not active inside sets. For example, [akm$] will
match any of the characters 'a', 'k', 'm', or '$';
"""
>
> Also, how can I write a regex that matches what I wish to match, as
> described above?
I think you want a "negative lookahead assertion", (?!...):
>>> import re
>>> print(re.compile("(xxx)(?!$)", re.MULTILINE).sub(r"\1**", "aaa bbb xxx\naaa xxx bbb\nxxx"))
aaa bbb xxx
aaa xxx** bbb
xxx
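Applied to your case, it might look like the sketch below. I am assuming you want the two newlines inserted only when something other than a line ending follows "\begin{document}"; the helper name split_after_begin is made up for illustration, and lines ending in "\r\n" are not handled.

```python
import re

# Assumption: add two newlines after \begin{document} only when other
# text follows it on the same line. (?!$) is a negative lookahead, so it
# consumes nothing and the replacement needs no second group.
pattern = re.compile(r"(\\begin\{document\})(?!$)", re.MULTILINE)

def split_after_begin(line):
    # Hypothetical helper name, not from the original thread.
    return pattern.sub(r"\1\n\n", line)

print(repr(split_after_begin("\\begin{document}\n")))
print(repr(split_after_begin("\\begin{document}Hello\n")))
```

The first line should come back unchanged, while the second should get two newlines between "\begin{document}" and "Hello". In your loop that would be lineList[i] = split_after_begin(lineList[i]).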