Regex speed
Reinhold Birkenfeld
reinhold-birkenfeld-nospam at wolke7.net
Sat Oct 30 11:14:24 EDT 2004
Peter Hansen wrote:
> Reinhold Birkenfeld wrote:
>> Well, commenting out the regex substitutions in both versions leads to
>> equal execution times, as I mentioned earlier, so it has to be my regexes.
>
> I'm sorry, I managed to miss the significance of that
> statement in your original post.
;)
> I wonder what the impact is of the fact that the re.sub
> operations are function calls in Python. The overhead
> of function calls is relatively high in Python, so
> perhaps an experiment would be revealing. Can you try
> with a dummy re.sub() call (create a dummy "re" object
> that is global to your module, with a dummy .sub()
> method that just does "return") and compare the
> speed of that with the version without the re.sub
> calls at all? Probably a waste of time, but perhaps
> the actual re operations are not so slow after all,
> but the calls themselves are.
>
> If that's true, you would at least get a tiny improvement
> by aliasing re.sub to a local name before the loop, to
> avoid the global lookup for "re" and then the attribute
> lookup for "sub" on each of the three calls, each time
> through the loop.
>
> If you can show the format of the input data, I would
> be happy to try a little profiling, if you haven't already
> done that to prove that the bulk of the time is actually
> in the re.sub operation itself.
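The dummy-call experiment suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the input data (LINES) and the pattern and replacement arguments are made up, since the real program and its data are not shown in this thread. The dummy sub() here returns its input string unchanged rather than doing a bare "return", so that both loops produce the same result.

```python
import timeit


class DummyRe:
    """Stand-in for the re module: sub() does no matching at all."""

    def sub(self, pattern, repl, string):
        # No regex work; only the cost of the call itself remains.
        return string


# Module-level global, so each call pays a global lookup plus an
# attribute lookup, just like a real re.sub(...) call would.
dummy_re = DummyRe()

# Hypothetical input data; the real input is not shown in the thread.
LINES = ["some sample line of text"] * 10000


def loop_without_calls(lines):
    out = []
    for line in lines:
        out.append(line)
    return out


def loop_with_dummy_calls(lines):
    out = []
    for line in lines:
        # Three calls per iteration, mirroring the three substitutions.
        line = dummy_re.sub("a", "b", line)
        line = dummy_re.sub("c", "d", line)
        line = dummy_re.sub("e", "f", line)
        out.append(line)
    return out


def measure(func, repeat=5):
    # Best of several runs, to reduce noise from other processes.
    return min(timeit.repeat(lambda: func(LINES), number=1, repeat=repeat))


if __name__ == "__main__":
    print("no calls:    %.4f s" % measure(loop_without_calls))
    print("dummy calls: %.4f s" % measure(loop_with_dummy_calls))
```

If the gap between the two timings is small compared with the gap seen with the real substitutions, the call overhead is probably not the main cost.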
Well, I did alias the sub methods in that way:
re1sub = re.compile("whatever").sub
There was a performance gain, but it was about 1/100th of the speed
difference.
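For anyone who wants to reproduce the comparison, a minimal sketch follows. The pattern, the replacement and the input lines are placeholders (the real regexes are not shown here), and the function names are made up for the example.

```python
import re
import timeit

# Hypothetical pattern and data; the real ones are not shown in the thread.
PATTERN = r"\s+"
LINES = ["foo  bar\tbaz  qux"] * 10000


def with_module_lookup(lines):
    out = []
    for line in lines:
        # Global lookup of "re", attribute lookup of "sub", and a lookup
        # in re's internal pattern cache on every iteration.
        out.append(re.sub(PATTERN, " ", line))
    return out


def with_bound_method(lines):
    # Compile once and alias the bound .sub method to a local name.
    ws_sub = re.compile(PATTERN).sub
    out = []
    for line in lines:
        out.append(ws_sub(" ", line))
    return out


def measure(func, repeat=5):
    # Best of several runs, to reduce noise from other processes.
    return min(timeit.repeat(lambda: func(LINES), number=1, repeat=repeat))


if __name__ == "__main__":
    print("re.sub each time:  %.4f s" % measure(with_module_lookup))
    print("aliased .sub:      %.4f s" % measure(with_bound_method))
```

Both functions return the same list, so any difference in the timings comes from the lookups and the pattern cache, not from the matching itself.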
Reinhold
--
[Windows is like] the railway: you don't have to take care of anything,
you pay a surcharge for every little thing, the service is lousy,
strangers can board at any time, it is inflexible and incompatible with
all other means of transport. -- Florian Diesch in dcoulm