Does Python mess with CRLFs?

Irmen de Jong irmen.NOSPAM at
Wed Nov 12 19:52:13 CET 2008

Gilles Ganault wrote:
> Hello
> I'm stuck at understanding why Python can't extract some bit from an
> HTML file using regexes, although I can find it just fine with
> UltraEdit.
> #BAD    
> friends  = re.compile('</td></tr></table>\r\n</div>\r\n',re.IGNORECASE

If you keep running into trouble and you're sure it's related to the newlines,
it may help to use the whitespace class \s* instead of a literal \r\n in your expression:
  re.compile('</td></tr></table>\\s*</div>\\s*', .... )
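
A minimal sketch of that suggestion, using made-up sample strings (the real
"content.html" is not shown in this thread, so crlf_text and lf_text are
assumptions). It compares the literal \r\n pattern with the \s* version on
input that ends lines with CRLF and input that ends them with a bare LF:

```python
import re

# Hypothetical sample data; the real file contents are unknown.
crlf_text = "</td></tr></table>\r\n</div>\r\n"
lf_text = "</td></tr></table>\n</div>\n"

# Original pattern: requires literal CR LF pairs.
strict = re.compile(r"</td></tr></table>\r\n</div>\r\n", re.IGNORECASE)
# Suggested pattern: \s* accepts any run of whitespace, including none.
loose = re.compile(r"</td></tr></table>\s*</div>\s*", re.IGNORECASE)

strict_hits = [bool(strict.search(t)) for t in (crlf_text, lf_text)]
loose_hits = [bool(loose.search(t)) for t in (crlf_text, lf_text)]
print(strict_hits)  # [True, False]
print(loose_hits)   # [True, True]
```

If the string being searched has already had its \r\n pairs turned into \n,
the strict pattern can no longer match, while the \s* pattern matches either way.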

Other than that, it's hard to say what's not working as expected without knowing
the exact contents of the "content.html" file you're searching in.
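
One way to check the exact contents is to read the file in binary mode and
count the line endings, since text mode can translate \r\n to \n on read
depending on platform and Python version. This is only a sketch: the helper
name newline_counts is hypothetical, and a temporary file stands in for the
real "content.html":

```python
import os
import tempfile

def newline_counts(path):
    """Count CRLF and bare-LF line endings in the raw bytes of a file."""
    with open(path, "rb") as f:  # binary mode: no newline translation
        data = f.read()
    crlf = data.count(b"\r\n")
    bare_lf = data.count(b"\n") - crlf
    return crlf, bare_lf

# Stand-in for content.html (assumed contents, for illustration only).
with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as tmp:
    tmp.write(b"</td></tr></table>\r\n</div>\r\n")
    sample_path = tmp.name

counts = newline_counts(sample_path)
print(counts)  # (2, 0)
os.remove(sample_path)
```

If the counts show bare LFs where UltraEdit suggests CRLFs, or the other way
round, that would point at the newlines rather than at the rest of the pattern.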

