How to read such a file and summarize the data?
huisky at gmail.com
Thu Nov 18 09:54:06 CET 2010
Thank you, Martin. You are right.
But the elapsed time is also okay for me, and I would like to assume
that the total CPU time equals the number of CPUs multiplied by the
elapsed time. As for the number you mentioned, it is the process ID,
so there will be no problem identifying each job.
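
To make that assumption concrete, here is a minimal sketch of the estimate. The function name and the sample times are made up for illustration, and as Martin points out below, the result is only an upper bound, since it counts time spent waiting for I/O as CPU time.

```python
from datetime import datetime

def estimated_cpu_seconds(start, stop, ncpus):
    """Estimate CPU time as elapsed wall-clock seconds times the CPU count.

    This is the simplifying assumption from the message above, not a
    measurement: idle or I/O-wait time is counted as if it were CPU time.
    """
    elapsed = (stop - start).total_seconds()
    return elapsed * ncpus

# Hypothetical job: 30 minutes of elapsed time on 4 CPUs.
start = datetime(2010, 11, 17, 13, 0, 0)
stop = datetime(2010, 11, 17, 13, 30, 0)
print(estimated_cpu_seconds(start, stop, 4))  # 7200.0
```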
On Nov 18, 12:38 am, Martin Gregorie <mar... at address-in-sig.invalid>
> On Wed, 17 Nov 2010 13:45:58 -0800, huisky wrote:
> > Say I have the following log file, which records the code usage. I want to
> > read this file and summarize how much total CPU time was consumed by
> > each user.
> Two points you should think about:
> - I don't think you can extract CPU time from this log: you can get
> the process elapsed time and the number of CPUs each run has used,
> but you can't calculate CPU time from those values since you don't
> know how long the process spent waiting for I/O etc.
> - is the first (numeric) part of the first field on the line a process id?
> If it is, you can match start and stop messages on the value of the
> first field provided that this value can never be shared by two
> processes that are both running. If you can get simultaneous
> duplicates, then you're out of luck because you'll never be able to
> match up start and stop lines.
> > Is Python able to do this, and is it easy to achieve? Can anybody give
> > me some hints? I would appreciate it very much!
> Sure. There are two approaches possible:
> - sort the log on the first two fields and then process it with Python
> knowing that start and stop lines will be adjacent
> - use the first field as the key to an array and put the start time
> and CPU count in that element. When a matching stop line is found,
> you retrieve the array element, calculate and output or total the
> usage figure for that run and delete the array element.
> martin@ | Martin Gregorie
> gregorie. | Essex, UK
> org |
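
The second approach Martin describes could be sketched roughly as follows, using a Python dict as the "array" keyed on the process ID. The log layout here is purely an assumption for illustration (whitespace-separated fields: process ID, user, START or STOP, an ISO-style timestamp, and a CPU count on START lines); the real log will need its own parsing. The name summarize and the sample lines are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def summarize(lines):
    """Total estimated CPU seconds per user (elapsed time * CPU count)."""
    running = {}                 # pid -> (user, start time, CPU count)
    totals = defaultdict(float)  # user -> estimated CPU seconds
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        pid, user, event, stamp = fields[:4]
        # Assumed timestamp format; adjust to match the real log.
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
        if event == "START" and len(fields) >= 5:
            running[pid] = (user, when, int(fields[4]))
        elif event == "STOP" and pid in running:
            # Retrieve and delete the entry so the pid can be reused later.
            owner, started, ncpus = running.pop(pid)
            totals[owner] += (when - started).total_seconds() * ncpus
    return dict(totals)

# Made-up sample data in the assumed format.
sample = [
    "101 alice START 2010-11-17T10:00:00 4",
    "102 bob START 2010-11-17T10:05:00 2",
    "101 alice STOP 2010-11-17T10:10:00",
    "102 bob STOP 2010-11-17T10:06:00",
]
print(summarize(sample))  # {'alice': 2400.0, 'bob': 120.0}
```

Stop lines with no matching start are silently ignored in this sketch, and jobs still running at the end of the log are not counted.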
More information about the Python-list mailing list