why huge speed difference btwn 1.52 and 2.1?
Duncan Booth
duncan at NOSPAMrcp.co.uk
Tue Jun 5 04:39:45 EDT 2001
rsenior at hotmail.com (Robin Senior) wrote in
news:3b1bcf7a.12528354 at news.ccs.queensu.ca:
> On 3 Jun 2001 08:07:56 -0700, aahz at panix.com (Aahz Maruch) wrote:
>
>>In article <9f8fgs$ooo$1 at knot.queensu.ca>,
>>robin senior <rsenior at hotmail.com> wrote:
>>>
>>>I have a pretty simple script for processing a flat file db, running
>>>on Python 2.1; I tried running it under 1.52 for kicks, and to my
>>>surprise it ran almost 10 times as fast! Could someone let me know why
>>>2.1 would be so much slower?
Probably the regular expressions.
Why are you compiling your regular expressions every time round the loop?
Why are you using two regular expressions when one would do?
Why are you using regular expressions at all?
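On the first question, here is a minimal sketch of compiling once, outside the loop. It is written for a current Python rather than 1.5.2 or 2.1. The pattern and the function name matching_lines are made up for illustration, since I am only guessing at what your patterns look like.

```python
import re

# Hypothetical pattern (an assumption, not taken from the original script):
# a word ending in "ARIZONA" or "AZ". It is compiled once, at import time.
STATE_RE = re.compile(r"(?:ARIZONA|AZ)\b")


def matching_lines(lines):
    """Return the upper-cased lines that contain a match for STATE_RE."""
    hits = []
    for line in lines:
        line = line.upper()
        # The loop body only searches; it never calls re.compile().
        if STATE_RE.search(line):
            hits.append(line)
    return hits
```

The only point is that re.compile() is called once and the loop reuses the compiled object.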
As far as I can tell, your code reads a line in and then looks to see
whether the line contains a word that ends in a state name or a state
abbreviation. So if the line "Today waz blowy, tomorrow may be better"
is in the input, it will be copied to the output files for Arizona and
Wyoming. Is this correct?
I would be tempted to rewrite the code, either to not use regular
expressions at all, or to use a single regular expression for everything.
If you build one big regular expression that matches all states and state
abbreviations, then you can extract the match out of the line and use what
matched as a dictionary key to find the right filename (provided you first
build a dictionary with both state names and abbreviations as keys mapping
to the filenames).
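A rough sketch of that idea, again for a current Python and under assumptions of my own: the two-state table, the ".txt" filenames and the function name files_for are all hypothetical, and a real script would list every state.

```python
import re

# Hypothetical table (an assumption): full state name -> abbreviation.
STATES = {"ARIZONA": "AZ", "WYOMING": "WY"}

# Both the full name and the abbreviation are keys for the same filename.
FILENAMES = {}
for name, abbrev in STATES.items():
    FILENAMES[name] = FILENAMES[abbrev] = abbrev.lower() + ".txt"

# One alternation covering every key, longest keys first so a full name is
# tried before an abbreviation. The trailing \b keeps the "word ends in a
# state name or abbreviation" behaviour.
_keys = sorted(FILENAMES, key=len, reverse=True)
STATE_RE = re.compile("(" + "|".join(map(re.escape, _keys)) + r")\b")


def files_for(line):
    """Return the set of filenames for every state matched in the line."""
    # findall() returns the text of the single group for each match, and
    # that text is used directly as the dictionary key.
    return {FILENAMES[m] for m in STATE_RE.findall(line.upper())}
```

With this, each input line costs one pass of one compiled pattern plus a dictionary lookup per match, instead of a separate search for every state.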
Oh, and you upper cased the input, so you don't need a case insensitive
search.
--
Duncan Booth duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?