axel at axel.truedestiny.net
Thu Jun 3 17:04:19 CEST 2004
> First off you're using external programs here for decompression. This is a
> trade-off of making a system call vs internal implementation. Maybe
> implementation is slower? I don't know, just pointing out that it is a
> difference. Personally when programming tools like this I try to keep
> everything internal because I've had endless system calls kill the
> However with the few files you're iterating over the cost might be the other
> way 'round. :)
I'll be looping over these files only, but I thought using Python's gzip
module would be faster than spawning gzip itself the way I did in the Perl
> > for line in lf.readlines():
> > if string.count( line, "INFECTED" ):
> > vname = re.compile( "INFECTED \((.*)\)" ).search(
> If I read this correctly you're compiling this regex every time you're
> going through the for loop. So every line the regex is compiled again.
> You might want to compile the regex outside the loop and only use the compiled
> version inside the loop.
Well, only for lines containing 'INFECTED' then. Good point. (I suddenly
remember some C stuff in which it made a huge difference.) I've placed it
outside the loop now, but the times are still the same.
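
For illustration, hoisting the pattern out of the loop might look roughly like
the sketch below. It is written for current Python, and the names INFECTED_RE
and count_viruses are hypothetical, not taken from the original script. Note
that Python's re module keeps an internal cache of recently compiled patterns,
which may be part of why moving the compile call made no measurable difference.

```python
import re

# Compiled once, outside the loop (hypothetical name, same pattern as quoted).
INFECTED_RE = re.compile(r"INFECTED \((.*)\)")


def count_viruses(lines):
    """Count how often each virus name appears in an iterable of log lines."""
    counts = {}
    for line in lines:
        # Cheap substring test first, so the regex only runs on candidate lines.
        if "INFECTED" in line:
            match = INFECTED_RE.search(line)
            if match:
                name = match.group(1)
                counts[name] = counts.get(name, 0) + 1
    return counts
```

The substring check mirrors the string.count() test in the quoted code; the
regex is only there to pull the name out of the parentheses.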
Another difference might be while (<filehandle>) versus for line in
lf.readlines(). The latter reads the whole file into memory, if I'm not
mistaken, whereas the former will read the file line by line. Why that could make such a difference I
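
As a rough sketch of the line-by-line alternative, a gzip file object can be
iterated directly instead of calling readlines(), so only one line is held in
memory at a time. This assumes current Python and a text log; the function
name infected_lines and the path argument are hypothetical.

```python
import gzip


def infected_lines(path):
    """Yield lines containing 'INFECTED' from a gzip-compressed log file."""
    # "rt" opens the compressed file in text mode; decompression happens
    # inside the gzip module, without spawning an external gzip process.
    with gzip.open(path, "rt") as lf:
        # Iterating the file object reads one line at a time, unlike
        # lf.readlines(), which builds a list of every line first.
        for line in lf:
            if "INFECTED" in line:
                yield line
```

Whether this is actually faster than readlines() depends on the file sizes
involved, so it is worth timing both on the real logs.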
Thanks for your quick reply,