Sorting Apache Log Files

David Kirtley dkirtley at panam.edu
Wed Jun 27 16:43:13 EDT 2001


> Lenny Self wrote:

>>> SNIP <<<
 
> import string
>
>   datestamp1 = line1[string.find(line1,"[") + 1:string.rfind(line1,"]")]
>   datestamp2 = line2[string.find(line2,"[") + 1:string.rfind(line2,"]")]
>   # Compare the date stamps and return appropriate value
>   if datestamp1 < datestamp2:
>           return -1
>   elif datestamp2 < datestamp1:
>           return 1
>   else:
>           return 0
>list.sort(compare)
># Writing sorted list to new file
>open("d:/work/newfile.txt","w").writelines(list)

That won't work like you think it will. It does a lexicographic
(alphabetical) comparison of the strings, which only partially works and
fails at boundaries such as month changes (for example: Apr < Jul comes
out right, but Jul < Aug doesn't, because "A" sorts before "J").
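
To see the problem concretely, here is a small sketch in current Python 3. The timestamp strings are made-up sample values, not taken from a real log:

```python
# Made-up sample timestamps in Apache's dd/Mon/yyyy:hh:mm:ss layout.
earlier = "02/Jan/2001:00:00:00"
later = "01/Feb/2001:00:00:00"

# String comparison looks at the day digits first, so the order is wrong.
print(later < earlier)   # True, although 1 Feb comes after 2 Jan

# Month abbreviations compare alphabetically, not by calendar order.
print("Jul" < "Aug")     # False, although July comes before August
print("Apr" < "Jul")     # True, but only by coincidence
```

Because the day of the month leads the string, even two dates in the same year can end up in the wrong order.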

The way to handle it is to convert it to a numeric time value:

For the default Apache log format, dd/Mon/yyyy:hh:mm:ss (note the colon
between the year and the hour):


from time import mktime, strptime

timestamp = "27/Jun/2001:16:43:13"  # <- sample value; insert a real timestamp here.
myTime = mktime(strptime(timestamp, "%d/%b/%Y:%H:%M:%S"))

Now you can use myTime to make the comparison.
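
A minimal sketch of how that comparison might be wired into the sort, written for current Python 3 (where `list.sort` takes a `key` function instead of a compare function). The helper name `log_time` and the sample log lines are made up for illustration. It also assumes that every line carries the same zone offset (which is simply dropped) and that month names are in English:

```python
from time import mktime, strptime

APACHE_FORMAT = "%d/%b/%Y:%H:%M:%S"

def log_time(line):
    """Return the line's bracketed timestamp as seconds since the epoch.

    Hypothetical helper: assumes one [..] timestamp per line.
    """
    start = line.index("[") + 1
    end = line.index("]", start)
    # Drop the trailing zone offset (e.g. " -0400"); this sketch assumes
    # all lines share the same offset, so it does not affect the order.
    stamp = line[start:end].split()[0]
    return mktime(strptime(stamp, APACHE_FORMAT))

# Made-up sample lines, deliberately out of order.
lines = [
    '127.0.0.1 - - [02/Aug/2001:09:15:00 -0400] "GET /b HTTP/1.0" 200 512\n',
    '127.0.0.1 - - [27/Jun/2001:16:43:13 -0400] "GET /a HTTP/1.0" 200 1024\n',
    '127.0.0.1 - - [15/Jul/2001:23:59:59 -0400] "GET /c HTTP/1.0" 200 2048\n',
]

# Sort by the numeric time value rather than by the raw string.
lines.sort(key=log_time)
```

After the sort the lines run June, July, August, and they can be written out with `writelines` as before. If the log mixes zone offsets, the offset would have to be parsed as well instead of being dropped.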

David Kirtley


