Parse ASCII log ; sort and keep most recent entries
David Fisher
fishboy at SPAMredSPAMpeanutSPAM.com
Wed Jun 16 21:11:59 EDT 2004
novastaylor at hotmail.com (Nova's Taylor) writes:
> Hi folks,
>
> I am a newbie to Python and am hoping that someone can get me started
> on a log parser that I am trying to write.
>
> The log is an ASCII file that contains a process identifier (PID),
> username, date, and time field like this:
>
> 1234 williamstim 01AUG03 7:44:31
> 2348 williamstim 02AUG03 14:11:20
> 23 jonesjimbo 07AUG03 15:25:00
> 2348 williamstim 17AUG03 9:13:55
> 748 jonesjimbo 13OCT03 14:10:05
> 23 jonesjimbo 14OCT03 23:01:23
> 748 jonesjimbo 14OCT03 23:59:59
>
> I want to read in and sort the file so the new list contains only
> the most recent entry for each PID (PIDs get reused often). In my
> example, the new list would be:
>
> 1234 williamstim 01AUG03 7:44:31
> 2348 williamstim 17AUG03 9:13:55
> 23 jonesjimbo 14OCT03 23:01:23
> 748 jonesjimbo 14OCT03 23:59:59
>
> So I need to sort by PID and date + time, then keep the most recent.
>
> Any help would be appreciated!
>
> Taylor
>
> NovasTaylor at hotmail.com
#!/usr/bin/env python
#
# I'm expecting the log file to be in chronological order,
# so later entries are later in time.
# Using the dict, later entries for a PID overwrite earlier ones.
# Make a script and use this like
# logparse.py mylogfile.log > newlogfile.log
#
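import fileinput

piddict = {}
# fileinput.input() reads the files named on the command line (or stdin)
for line in fileinput.input():
    fields = line.split()
    if len(fields) != 4:
        continue  # skip blank or malformed lines
    pid, username, date, time = fields
    piddict[pid] = (username, date, time)
#
# PIDs are strings here, so they sort as text ("1234" before "23")
for pid in sorted(piddict):
    username, date, time = piddict[pid]
    print("%s %s %s %s" % (pid, username, date, time))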
#tada!
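
If the log can't be trusted to be in chronological order, something along these lines might work instead. It is only a sketch: it compares parsed timestamps rather than relying on line order. It assumes every date looks like 01AUG03, every time is on a 24-hour clock, and the month abbreviations are English (strptime's %b depends on the locale). The names parse_stamp and latest_per_pid are made up for illustration, and the output is ordered by numeric PID, which is a choice of mine rather than anything the original post asks for.

```python
from datetime import datetime


def parse_stamp(date, time):
    # Assumption: dates look like 01AUG03 and times like 7:44:31.
    # strptime matches month abbreviations case-insensitively.
    return datetime.strptime(date + " " + time, "%d%b%y %H:%M:%S")


def latest_per_pid(lines):
    # Maps pid -> (timestamp, original line) for the newest entry seen so far.
    latest = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        pid, username, date, time = fields
        stamp = parse_stamp(date, time)
        # On equal timestamps the later line in the input wins.
        if pid not in latest or stamp >= latest[pid][0]:
            latest[pid] = (stamp, line.rstrip("\n"))
    # Order the result by numeric PID.
    return [latest[pid][1] for pid in sorted(latest, key=int)]
```

To use it on a file, pass the open file object (or fileinput.input()) to latest_per_pid and print each returned line.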