[Tutor] log parser speed optimization

Tom Jenkins tjenkins@devis.com
Fri May 16 15:20:13 2003


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alex wrote:
| Hello,

| this message. If you would be so kind to point me to the bottlenecks in
| it I would be very grateful.

To find the bottlenecks you need to run your script through the profiler:
http://www.python.org/doc/current/lib/profile-instant.html
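
For what it's worth, here is a minimal sketch of a profiler run.  It uses
cProfile and pstats from the standard library (the C version of the profile
module linked above).  parse_many and its input are made up for illustration
and are not taken from your script.

```python
import cProfile
import io
import pstats


def parse_many(lines):
    # Hypothetical workload standing in for the real log parser.
    return [line.strip().split() for line in lines]


lines = ["a b c\n"] * 10000

# Profile a single call and collect the timing data.
profiler = cProfile.Profile()
parsed = profiler.runcall(parse_many, lines)

# Write the five most expensive entries (by cumulative time) to a string.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

The functions at the top of the report are where the time goes, so those
are the ones worth rewriting first.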

There are 2 spots I identified as potential bottlenecks...

|
| def parseline(logline):
|     inlist = logline.strip().split()
|     datetime = parsedate(inlist.pop(3)[1:])
|     user = inlist.pop(2)
|     inlist.pop(1)
|     inlist.pop(1)
|     ip = inlist.pop(0)
|     action = inlist.pop(0)[1:]
|     fsize = inlist.pop(-1)
|     result = inlist.pop(-1)
|     fname = " ".join(inlist)[:-1]
|     return "%s,%s,%s,%s,%s,%s,%s" % (datetime, user, ip, action,
|                                      result, fname, fsize)
|

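You are popping values out of inlist.  Each pop is equivalent to:
x = inlist[i]; del inlist[i]

The del is unnecessary work, and deleting from the front or middle of a
list shifts every later element down.  So replace all your pops with
plain indexes, and use one slice for whatever is left over.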
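
A rough sketch of what that could look like.  The index numbers come from
tracing the pops against the original, unmodified list, so double-check them
against your real data.  parsedate here is only a stand-in because the real
one is not shown in this thread, and the sample line is an assumed
Apache-style entry, not one from your log.

```python
def parsedate(text):
    # Hypothetical stand-in: the real parsedate is not shown in the thread.
    return text


def parseline(logline):
    fields = logline.strip().split()
    # Every index refers to the untouched list, so nothing is deleted
    # and no elements have to be shifted.
    datetime = parsedate(fields[3][1:])    # was inlist.pop(3)[1:]
    user = fields[2]                       # was inlist.pop(2)
    ip = fields[0]                         # was inlist.pop(0)
    action = fields[5][1:]                 # sixth field, leading char dropped
    fsize = fields[-1]                     # last field
    result = fields[-2]                    # second to last field
    fname = " ".join(fields[6:-2])[:-1]    # the rest, trailing char dropped
    return "%s,%s,%s,%s,%s,%s,%s" % (datetime, user, ip, action,
                                     result, fname, fsize)


# Assumed sample line, for illustration only.
sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
csv_line = parseline(sample)
print(csv_line)
```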

|         while 1:
|             line = logfile.readline()
|             if not line:
|                 break

classic way to read files.  also slow.  look at xreadlines, or in
Python 2.2 and later just iterate over the file object itself
(for line in logfile:).
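
A small sketch of the iterator form, which works in Python 2.2 and later
because the file object hands out its own lines.  process_lines and the
in-memory sample are made up for illustration; with a real log you would
pass in the object returned by open().

```python
import io


def process_lines(logfile, handle_line):
    # The file object buffers and yields lines itself, so there is no
    # explicit readline() call and no "if not line: break" test.
    count = 0
    for line in logfile:
        handle_line(line)
        count += 1
    return count


# In-memory stand-in for an open log file.
sample_log = io.StringIO("first\nsecond\nthird\n")
seen = []
total = process_lines(sample_log, seen.append)
print(total)
```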


- --
Tom Jenkins
devIS - Development Infostructure
http://www.devis.com

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.0-nr2 (Windows XP)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE+xTmqV7Yk9/McDYURApuUAJ9xDmCkJhBfqJkbma5XqP/9/4bTCwCguKqb
4nlWuO5mSOoWNPoiMzC1MbA=
=VO/0
-----END PGP SIGNATURE-----