[2.5] Regex doesn't support MULTILINE?
gagsl-py2 at yahoo.com.ar
Sun Jul 22 10:34:17 CEST 2007
On Sun, 22 Jul 2007 01:56:32 -0300, Gilles Ganault <nospam at nospam.com> wrote:
> Incidentally, as far as using re alone is concerned, it appears that
> re.MULTILINE isn't enough to get re to include newlines: re.DOTALL
> must be added.
> Problem is, when I add re.DOTALL, the search takes less than a second
> for a 500KB file... and about 1 min 30 s for a file that's 1MB, with both
> files holding similar contents.
> Why such a huge difference in performance?
> pattern = "<span class=.?defaut.?>(\d+:\d+).*?</span>"
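For reference, the flag that lets "." match newlines is re.DOTALL; re.MULTILINE only changes where ^ and $ match. A small sketch of the difference, using a made-up two-line string (not the poster's data):

```python
import re

# Made-up sample text, for illustration only.
text = "first line\nsecond line"

# Without flags, "." stops at the newline, so the match cannot span lines.
no_flags = re.search(r"first.*second", text)

# re.MULTILINE only affects ^ and $; "." still does not match "\n".
multiline_only = re.search(r"first.*second", text, re.MULTILINE)

# re.DOTALL (alias re.S) lets "." match newlines as well.
dotall = re.search(r"first.*second", text, re.DOTALL)

# With re.MULTILINE, ^ matches at the start of every line.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
```

Here only the re.DOTALL search finds a match, and line_starts holds the first word of each line.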
Try to avoid using ".*" and ".+" (even the non-greedy forms); in this
case, I think you want the scan to stop when it reaches the ending </span>
or any other tag, so use: [^<]* instead.
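A minimal sketch of that substitution, applied to the pattern quoted above. The sample HTML is invented for illustration, since the real file contents are not shown in the thread. Note that [^<] also matches newlines, so re.DOTALL is not needed here:

```python
import re

# Hypothetical sample; the real file is not shown in the thread.
html = (
    '<span class="defaut">12:34 some text</span>\n'
    '<p>unrelated</p>\n'
    '<span class=defaut>56:78</span>\n'
    '<span class="defaut">09:10\ncontinued</span>\n'
)

# [^<]* cannot cross a "<", so the scan stops at the next tag
# instead of running ahead through the rest of the file.
pattern = re.compile(r"<span class=.?defaut.?>(\d+:\d+)[^<]*</span>")

times = pattern.findall(html)
print(times)
```

The third span contains a newline before </span> and is still matched without any flags, because a negated character class is not restricted the way "." is.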
BTW, better to use a raw string to represent the pattern: pattern =
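The pattern text itself is missing from this copy of the message, so what follows is only a generic illustration of why a raw string helps, not the author's exact line. In a normal string literal Python interprets some backslash sequences before re ever sees them; a raw string passes them through unchanged:

```python
import re

# In a normal literal, "\b" is a single backspace character (0x08).
plain = "\bword\b"

# In a raw literal the backslash is kept, so re sees the
# word-boundary escape \b.
raw = r"\bword\b"

plain_hit = re.search(plain, "a word here")  # looks for backspace chars
raw_hit = re.search(raw, "a word here")      # looks for the whole word

# A raw string is the same as doubling every backslash by hand.
same = r"(\d+:\d+)" == "(\\d+:\\d+)"
```

Only the raw-string version finds "word" in the sample text.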