Help with script with performance problems

Ville Vainio ville.spammehardvainio at spamtut.fi
Sun Nov 23 04:32:04 EST 2003


googlegroups at spacerodent.org (Dennis Roberts) writes:

> is enough of a difference that unless I can figure out what I did
> wrong or a better way of doing it I might not be able to use python
> (since most of what I do is parsing various logs).  The main reason to

Isn't parsing logs a batch-oriented thing, where 20 minutes more
wouldn't matter all that much? Log parsing is Perl's home turf, so
Python probably can't match its performance there, but Python's other
advantages might still make you want to avoid going back to Perl. As
long as it's 'efficient enough', who cares?

> f = sys.stdin

Have you tried reading from a normal file instead of stdin? BTW, you
can iterate over a file easily with 'for line in open("mylog.log"):'.
ISTR that is also more efficient than repeated readline() calls,
because the file iterator buffers its reads instead of fetching the
lines one by one. You can also get the line numbers (counting from 0)
with 'for linenum, line in enumerate(open("mylog.log")):'
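
A minimal sketch of that loop, in case it helps. The temporary file and
its contents are made up so the snippet runs on its own, and start=1 is
my addition to get 1-based line numbers (it needs a newer Python than
the one current at the time of writing; leave it out to count from 0):

```python
import os
import tempfile

# Create a small throwaway log so the sketch is self-contained;
# the contents are invented for illustration.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("first line\nsecond line\nthird line\n")
    log_path = tmp.name

numbered = []
with open(log_path) as f:
    # The file object is its own line iterator; enumerate() adds a counter.
    for linenum, line in enumerate(f, start=1):
        numbered.append((linenum, line.rstrip("\n")))

os.remove(log_path)
print(numbered)
```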


>         splitline = string.split(line)

Do not use the 'string' module functions (they are deprecated); use
string methods instead: line.split()
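
For example, roughly like this. The log line and its field layout are
invented here, since I don't know what your logs look like:

```python
# A made-up log line; the field layout is an assumption for illustration.
line = "10.0.0.1 - - [23/Nov/2003:04:32:04] GET /index.html\n"

# With no arguments, split() breaks on any run of whitespace and
# drops the trailing newline along with it.
splitline = line.split()
source = splitline[0]
print(source, len(splitline))
```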

>             clients[source] = clients[source] + 1

clients[source] += 1

or, since += still raises a KeyError when the key is missing, another
way to handle the common 'add 1, might not exist' idiom:

clients[source] = 1 + clients.get(source, 0)

See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66516
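
Put together, the counting loop might look something like the sketch
below. The list of sources is made up and stands in for whatever field
you pull out of each log line:

```python
# Made-up source addresses standing in for the parsed log fields.
sources = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1"]

clients = {}
for source in sources:
    # get() returns 0 for a source not seen yet, so there is no KeyError
    # and no separate "is it already there?" test.
    clients[source] = 1 + clients.get(source, 0)

print(clients)
```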


-- 
Ville Vainio   http://www.students.tut.fi/~vainio24



