[Tutor] iteration help

Joel Goldstick joel.goldstick at gmail.com
Thu Aug 20 15:48:37 CEST 2015


On Thu, Aug 20, 2015 at 9:27 AM, richard kappler <richkappler at gmail.com> wrote:
> Running python 2.7 on Linux
>
> While for and if loops always seem to give me trouble. They seem obvious
> but I often don't get the result I expect and I struggle to figure out why.
> Appended below is a partial script. Ultimately, this script will read a
> log, parse out two times from each line of the log, a time the line was
> written to the log (called serverTime in the script) and an action time from
> elsewhere in the line, then get the difference between the two. I don't
> want every difference, but rather the average per hour, so I have a line
> count. The script will output the average time difference for each hour.
> I've got most of the pieces working in test scripts, but I'm stymied with
> the single output bit.
>
> The idea is that the script takes the hour from the server time of the
> first line of the log and sets that as the initial serverHr. That works,
> has been tested. Next the script is supposed to iterate through each line
> of the log (for line in f1) and then check that there is a time in the line
> (try), and if not skip to the next line. That works, has been tested.
>
> As each line is iterated over, my intent was that the variable newServerHr
> (read from the current line) is compared to serverHr and if they are the
> same, the script will increase the count by one and add the difference to a
> cumulative total then go to the next line. If the newServerHr and serverHr
> are not the same, then we have entered a new clock hour, and the script
> should calculate averages and output those, zero all counts and cumulative
> totals, then carry on. The idea being that out of 117,000 ish lines of log
> (the test file) that have inputs from 0200 to 0700, I would get 6 lines of
> output.
>
> I've got everything working properly in a different script except I get 25
> lines of output instead of 6, writing something like 16 different hours
> instead of 02 - 07.
>
> In trying to chase down my bug, I wrote the appended script, but it outputs
> 117,000 ish lines (times 02-07, so that bit is better), not 6. Can someone
> tell me what I'm misunderstanding?
>
> #!/usr/bin/env python
>
> import re
>
> f1 = open('ATLA_PS4_red5.log', 'r')
> f2 = open('recurseOut.log', 'a')
>
> # read server time of first line to get hour
> first_line = f1.readline()
> q = re.search(r'\d\d:\d\d:\d\d', first_line)
> q2 = q.start()
> serverHr = (first_line[q2:q2+2])
>
>
> for line in f1:
>     try:
>         s = line
>         #    read server time
>         a = re.search(r'\d\d:\d\d:\d\d', s)  # find server time in line
>         b = a.start()                        # find 1st position of srvTime
>         newServerHr = (s[b:b+2])          # what hour is it now?
>         if newServerHr != serverHr:
>             f2.write('hour ' + newServerHr + '\n')
>         else:
>             serverHr == newServerHr
>
>     except:
>         pass
>
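1. You don't need s; you can use line directly.
2. In your else: branch you want = rather than ==, since you want to
assign the new value to serverHr.  As written, that line does nothing:
it compares the two values and throws the result away.  Note also that
in the else branch the two values are already equal, so the assignment
needs to go in the if branch, where the hour has changed.  Otherwise
serverHr never updates, and every line after the first hour is written
out.
3. I'm guessing you are coming from another language.  In Python,
people generally use lowercase names with underscores between words
(server_hr rather than serverHr).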
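
Here is a minimal sketch of the loop with those points applied. It is not
your full script, just the hour-change detection. The function name
hour_changes and the sample lines are made up for illustration, and I've
assumed the first timestamp in each line is the server time, as in your
regex. It updates server_hr with = at the moment the hour changes, and it
tests the match for None instead of relying on a bare except:

```python
import re

# Same pattern as the original script: first HH:MM:SS found in the line.
TIME_RE = re.compile(r'\d\d:\d\d:\d\d')


def hour_changes(lines):
    """Return the hour of each line whose hour differs from the previous one.

    Hypothetical helper for illustration; lines can be any iterable of
    strings, including an open file.
    """
    changes = []
    server_hr = None
    for line in lines:
        match = TIME_RE.search(line)
        if match is None:
            continue                      # no time in this line, skip it
        new_server_hr = match.group()[:2]
        if server_hr is None:
            server_hr = new_server_hr     # first timestamped line sets the hour
        elif new_server_hr != server_hr:
            changes.append(new_server_hr)
            server_hr = new_server_hr     # assignment (=), done when the hour changes
    return changes


# Made-up sample data, not taken from the real log.
sample = [
    "02:15:01 event, action at 02:14:59",
    "02:59:59 event",
    "no timestamp here",
    "03:00:02 event",
    "03:30:00 event",
    "04:00:00 event",
]
print(hour_changes(sample))   # ['03', '04']
```

With this shape you get one entry per new clock hour rather than one per
line, which is the point where the averages would be computed and the
counters reset.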

-- 
Joel Goldstick
http://joelgoldstick.com

