Generator Expressions and CSV

Emile van Sebille emile at
Fri Jul 17 23:56:12 CEST 2009

On 7/17/2009 1:08 PM Zaki said...
> Here is the final code that I have running. It's very much 'hack' type
> code and not at all efficient or optimized and any help in optimizing
> it would be greatly appreciated.

There are some things I'd approach differently, e.g. I might prefer glob 
to build iNuQ and queryQ [1], and although glob is generally fast, I'm 
not sure it'd be faster.  But overall it looks like most of the time is 
spent in your 'for row in' loops, and as you're reading each file only 
once, and would have to anyway, there's not much that'll improve overall 
timing.  I don't know what csvreader is doing under the covers, but if 
your files are reasonably sized for your system you might try timing 
something that reads in the full file and splits:

for each in filelist:
     # open the current file name, not the whole list
     for row in open(each).readlines():
         if row.split()[2] in ....
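
To make that comparison concrete, here is a rough sketch of how the two approaches could be timed against each other.  It is not taken from your code: the function names, the comma separator, the choice of the third column, and the 'wanted' set are all assumptions for illustration, and it is written for Python 3.  Note that the hand-split version ignores quoting, so it only gives the same answer on files without quoted or embedded separators.

```python
import csv
import timeit


def count_with_csv(path, wanted, column=2):
    """Count rows whose field at `column` is in `wanted`, parsed by csv.reader."""
    with open(path, newline="") as handle:
        return sum(1 for row in csv.reader(handle)
                   if len(row) > column and row[column] in wanted)


def count_with_split(path, wanted, column=2, sep=","):
    """Same count, but read the whole file at once and split each line by hand.

    Assumption: fields never contain quoted or embedded separators.
    """
    with open(path) as handle:
        lines = handle.readlines()
    count = 0
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) > column and fields[column] in wanted:
            count += 1
    return count


def compare(path, wanted, repeat=5):
    """Return the best-of-`repeat` wall-clock seconds for each approach."""
    timings = {}
    for func in (count_with_csv, count_with_split):
        timer = timeit.Timer(lambda: func(path, wanted))
        timings[func.__name__] = min(timer.repeat(repeat=repeat, number=1))
    return timings
```

Calling compare() on one of your larger input files with a realistic 'wanted' set should show whether the difference is worth chasing.  If both numbers come out close, the time is going into disk reads rather than parsing.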

[1]
import glob
import os

iNuQ = glob.glob(os.path.join(inputdir, "par1.install*"))
queryQ = glob.glob(os.path.join(inputdir, "par1.query*"))


