[Spambayes-checkins] spambayes/spambayes message.py, 1.15, 1.16

Tim Peters tim.one at comcast.net
Mon Apr 21 20:44:06 EDT 2003


>> That's one I suggested.  Another was
>>
>>    \r\n|[\r\n]

[Tim Stone]
> Surprisingly, my cursory benchmark scores this one about 20%
> slower than the others that we've bandied about...

I don't want to bother learning exactly which data you benchmarked or how
you timed it.  Both topics have consumed megabytes on c.l.py <wink>.

>> (character classes are faster than alternation in Python regexps).

> Yes, and the first alternative is still a useless match... can't
> we get rid of that somehow?

The match prevents re.sub from seeing the sequence \r\n as individual
characters, which is important to stop just one of them from getting
replaced by \r\n.  A look*behind* assertion could prevent \n from getting
replaced when it's preceded by \r, but Skip was after readability here.
This stuff doesn't cost enough to worry about.
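
To make the point about the first alternative concrete, here is a minimal
sketch in Python. The function names and the sample string are made up for
illustration, and the lookaround pattern is only one possible spelling (it
adds a lookahead for a lone \r on top of the lookbehind mentioned above);
none of this is claimed to be what message.py actually does.

```python
import re

# Pattern from the thread: try the two-character sequence first, so that an
# existing \r\n is consumed as one match and replaced by itself.
CRLF_ALT = re.compile(r"\r\n|[\r\n]")

# Naive character class alone: \r and \n are each matched separately.
CRLF_NAIVE = re.compile(r"[\r\n]")

# Assumed lookaround spelling: a \r not followed by \n, or a \n not
# preceded by \r. An existing \r\n is never matched at all.
CRLF_LOOK = re.compile(r"\r(?!\n)|(?<!\r)\n")


def normalize_alt(text):
    """Normalize line endings to \\r\\n using the alternation pattern."""
    return CRLF_ALT.sub("\r\n", text)


def normalize_naive(text):
    """Show the failure mode: an existing \\r\\n gets doubled."""
    return CRLF_NAIVE.sub("\r\n", text)


def normalize_look(text):
    """Normalize line endings to \\r\\n using lookaround assertions."""
    return CRLF_LOOK.sub("\r\n", text)


sample = "a\r\nb\nc\rd"
print(repr(normalize_alt(sample)))    # 'a\r\nb\r\nc\r\nd'
print(repr(normalize_naive(sample)))  # 'a\r\n\r\nb\r\nc\r\nd'
print(repr(normalize_look(sample)))   # 'a\r\nb\r\nc\r\nd'
```

Without the \r\n alternative, the \r and the \n of an existing pair are each
replaced, which doubles the line break. The lookaround version gets the same
result as the alternation but is harder to read at a glance.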



