[Tutor] Logfile Manipulation

Alan Gauld alan.gauld at btinternet.com
Mon Nov 9 09:47:52 CET 2009


"Stephen Nelson-Smith" <sanelson at gmail.com> wrote

> * How does Python compare in performance to shell, awk etc in a big
> pipeline?  The shell script kills the CPU

Python should be significantly faster than the typical shell script
and it should consume fewer resources, although it will probably
still use a fair bit of CPU unless you nice it.
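
As a rough illustration of the "nice it" point: on Unix a script can be
started with nice from the shell, or it can lower its own priority with
os.nice(). The helper name lower_priority and the default increment of 10
below are illustrative choices only, and os.nice() is not available on
Windows.

```python
import os


def lower_priority(increment=10):
    """Add `increment` to this process's niceness and return the new value.

    A higher niceness means a lower scheduling priority. Unix only.
    The default of 10 is an arbitrary illustrative value.
    """
    return os.nice(increment)
```

Calling lower_priority() once near the top of the script, before the heavy
processing starts, should be enough; an unprivileged process cannot raise
its priority again afterwards.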

> * What's the best way to extract the data for a given time, eg 0000 -
> 2359 yesterday?

I'm not familiar with Apache log files so I'll let somebody else answer,
but I suspect you can either use string.split() or a re.findall(). You
might even be able to use csv. Or if they are in XML you could use
ElementTree. It all depends on the data!
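
A minimal sketch of the regular expression route, assuming the logs use
the Apache Common Log Format, where each line carries a timestamp such as
[08/Nov/2009:13:55:36 +0100]. That format is an assumption, not something
confirmed in this thread, and the names TIMESTAMP_RE, line_date and
lines_for_day are made up for illustration. The timezone offset is ignored,
so "a given day" means the day as the server recorded it.

```python
import re
from datetime import date, datetime, timedelta

# Assumed timestamp layout: [08/Nov/2009:13:55:36 +0100]
TIMESTAMP_RE = re.compile(
    r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]"
)


def line_date(line):
    """Return the date recorded in a log line, or None if no timestamp is found."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    # %b parses English month abbreviations under the default C locale.
    stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
    return stamp.date()


def lines_for_day(lines, day):
    """Yield the lines whose timestamp falls on `day` (00:00:00 to 23:59:59)."""
    for line in lines:
        if line_date(line) == day:
            yield line


def yesterday():
    """Return yesterday's date according to the local clock."""
    return date.today() - timedelta(days=1)
```

Because lines_for_day() is a generator it can be fed an open file object
directly, for example lines_for_day(open(path), yesterday()), so a large
log is read one line at a time rather than loaded into memory.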

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/ 



