[Tutor] Compute data usage from log

Christian Witts cwitts at compuscan.co.za
Tue Oct 27 13:41:36 CET 2009


bibi midi wrote:
> Hey Christian,
>
> There seems to be a missing parenthesis in your join function below. 
> Correct me if I'm wrong.
>
> I can live with ppp's time format for now. My script is not 
> world-changing anyway :-). How do i know I'm on the last line of the 
> log file per the code below? Just asking as I'm away from my linux box 
> atm.
>
> for line in open(log_file):
>     last_log_date = ' '.join(line.split(' ')[:3])
>
> Thanks.
>
>
> On Tue, Oct 27, 2009 at 2:56 PM, Christian Witts 
> <cwitts at compuscan.co.za> wrote:
>
>     bibi midi wrote:
>
>         #!/usr/bin/env python
>         # -*- coding: utf-8 -*-
>
>         '''
>         Calculate internet data consumption
>         of service provider
>         Code as per help of python tutor mailing list
>         Created: 26-Oct-2009
>
>         '''
>
>         intro = raw_input('press enter to view your internet data consumption: ')
>         log_file = '/home/bboymen/mobily.data.plan'
>         total_consumed = 0
>         for line in open(log_file):
>            total_consumed += int(line.split(' ')[9])
>
>
>         total_consumed = total_consumed/1000000.0
>         print "total data consumed is: %0.3f MB\n" % total_consumed
>
>
>         #TODO:
>         #1) show the list comprehension alternative method
>         #2) add exception handling e.g. when log_file can't be found,
>         #   when a keyboard interrupt is pressed,
>         #   or when index[9] is not a number, etc
>         #3) print latest date of calculated data
>          
>         I'm working on TODO no. 3 e.g. I want to show the latest date
>         when wvdial generated the ppp data. This is normally the date
>         of the last line of the pppd log:
>
>         Oct 14 11:03:45 cc000002695 pppd[3092]: Sent 3489538 bytes,
>         received 43317854 bytes.
>         ^^^^^^^^^
>
>         For the exception handling I *think* I just use the general
>         exception method, e.g. one that will catch all kinds of errors.
>         I really don't know what other errors will show up aside from
>         the ones I listed in the TODO. Advice is appreciated.
>
>
>
>
>
>         On Mon, Oct 26, 2009 at 2:12 PM, Luke Paireepinart
>         <rabidpoobear at gmail.com> wrote:
>
>
>
>            On Mon, Oct 26, 2009 at 3:20 AM, Christian Witts
>            <cwitts at compuscan.co.za> wrote:
>
>                fInput = open('/path/to/log.file', 'rb')
>                total_usage = 0
>                for line in fInput:
>                  total_usage += int(line.split(' ')[9].strip())
>                print total_usage
>
>
>            It's actually bad to assign a variable to the file object in
>            this case (fInput = ....) because Python will automatically
>            close a file after you're done with it if you iterate over it
>            directly, but if you keep a reference it will stay open until
>            the Python program ends or you explicitly call fInput.close().
>            It doesn't matter much in this example but in general it is
>            good practice to either
>            1) call foo.close() immediately after you're done using a file
>            object, or
>            2) don't alias the file object and just iterate over it
>            directly so Python will auto-close it.
>
>            Therefore a better (and simpler) way to do the above would be:
>
>            total_usage = 0
>            for line in open('/path/to/log.file'):
>                total_usage += int(line.split(' ')[9])
>
>            Also note you don't need to strip the input because int()
>            coercion ignores leading and trailing whitespace anyway. And
>            additionally you shouldn't be opening this in binary mode
>            unless you're sure you want to, and I'm guessing the log file
>            is ascii so there's no need for the 'rb'.  (reading is default
>            so we don't specify an 'r'.)
>
>
>            And since I like list comprehensions a lot, I'd probably do it
>            like this instead:
>
>            total_usage = sum([int(line.split(' ')[9]) for line in
>            open('/path/to/log.file')])
>
>            Which incidentally is even shorter, but may be less readable if
>            you don't use list comprehensions often.
>
>            Also, the list comprehension version is likely to be more
>            efficient, both because of the use of sum rather than repeated
>            addition (sum is implemented in C) and because list
>            comprehensions in general are a tad faster than explicit
>            iteration, if I recall correctly (don't hold me to that though,
>            I may be wrong.)
>
>
>                Of course this has no error checking or niceties, but I
>                will leave that up to you.
>
>            The same applies to my modifications.
>
>            Good luck, and let us know if you need anything else!
>
>            -Luke
>
>
>
>
>         -- 
>         Best Regards,
>         bibimidi
>
>
>
>     Exceptions:
>     * Not finding the log file would be IOError.
>     * Casting an alphanumeric or alpha string to integer would be a
>     ValueError. In this context you won't have a None, so you shouldn't
>     need to worry about a TypeError.
>     * Selecting the 10th element in your list can raise an IndexError
>     if your line did not contain enough delimiters to create a large
>     enough list.
>
>     Pedantic:
>     1MB = 1,024KB = 1,024*1,024B
>     So your total consumed should be div (1024*1024.0) or div 1048576.0
>
>     For the date you can look at the time module to get a nice string
>     representation of the date/time.  Or as you said you want the last
>     date listed in the log file then you could add something like
>
>
>     for line in open(log_file):
>       last_log_date = ' '.join(line.split(' ')[:3]
>
>     which would take the first 3 elements in your list and combine
>     them again.  Of course this is again just a simple representation
>     of what to do.
>
>     -- 
>     Kind Regards,
>     Christian Witts
>
>
>
>
>
> -- 
> Best Regards,
> bibimidi
>
> Sent from Riyadh, 01, Saudi Arabia

Hi Bibi,

Yes, there was a missing closing parenthesis at the end of the join 
line; I was just typing away.

As for knowing whether you're on the last line of your log file: you 
don't need to. For every line the loop iterates over, it parses the 
first 3 elements of the line and assigns them to last_log_date, so each 
line overwrites the previous value. Once the loop finishes, 
last_log_date holds the value taken from the last line of the file.
If you know for certain that every line starts with the time stamp then 
it will work fine; if that is not the case then you will need to build 
in some checking to ensure you get the correct information.
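
As a rough illustration of that kind of checking, here is a minimal 
sketch. It assumes the time stamp always looks like the sample line in 
this thread ("Oct 14 11:03:45"); the names TIMESTAMP_RE and 
find_last_log_date and the sample lines other than the Oct 14 one are 
made up for the example. It uses a regular expression instead of 
split(' ') because syslog-style logs usually pad single-digit days with 
an extra space ("Oct  4"), which would put an empty string into the 
split result. The single-argument print() call runs on both Python 2 
and Python 3.

```python
import re

# Assumed time stamp layout: month abbreviation, day, HH:MM:SS.
# \s+ tolerates the extra space used for single-digit days.
TIMESTAMP_RE = re.compile(r'^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\b')


def find_last_log_date(lines):
    """Return the time stamp of the last line that starts with one, or None."""
    last_log_date = None
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match:
            # Overwritten on every matching line, so after the loop this
            # holds the time stamp of the last matching line.
            last_log_date = match.group(0)
    return last_log_date


# Hypothetical sample data; only the second line comes from the thread.
sample = [
    "Oct 13 09:15:02 cc000002695 pppd[2981]: Sent 1200 bytes, received 3400 bytes.\n",
    "Oct 14 11:03:45 cc000002695 pppd[3092]: Sent 3489538 bytes, received 43317854 bytes.\n",
    "a trailing line without a time stamp\n",
]
print(find_last_log_date(sample))
```

With a real log you could pass the open file straight in, e.g. 
find_last_log_date(open(log_file)), since the function only needs 
something it can iterate over line by line. Lines that do not start 
with a time stamp are simply skipped rather than overwriting the result.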

-- 
Kind Regards,
Christian Witts



