[Tutor] Newbie Here -- Averaging & Adding Madness Over a Given (x) Range?!?!

Thu Feb 14 23:01:05 CET 2013

On 02/14/2013 03:55 PM, Michael McConachie wrote:
> Hello all,
>
> This is my first post here.  I have tried to get answers from StackOverflow, but I realized quickly that I am too "green" for that environment.  As such, I have purchased Beginning Python (2nd edition, Hetland) and also the $29.00 course available from learnpythonthehardway(dot)com.  I have been reading fervently, and have enjoyed Python -- very much.  I can do all the basic printing, math, substitutions, etc.  Although, I am stuck when trying to combine all the new skills I have been learning over the past few weeks.  Anyway, I was hoping to get some help with something NON-HOMEWORK related. (I swear.)
>
> I have a task that I have generalized due to the nature of what I am trying to do -- and its need to remain confidential.
>
> My end goal as described on SO was: "Calculating and Plotting the Average of every (X) items in a list of (Y) total", but for now I am only stuck on the actual addition, and/or averaging items -- in a serial sense, based on the relation to the previous number, average of numbers, etc being acted on.  Not the actual plotting. (Plotting is pretty EZ.)
>

If you're stuck on the addition, why give us all the other parts?  Your
problem statement is very confused, and you don't show much actual code.

> Essentially:
>
> 1.  I have a list of numbers that already exist in a file.  I generate this file by parsing info from logs.
> 2.  Each line contains an integer on it (corresponding to the number of milliseconds that it takes to complete a certain repeated task).
> 3.  There are over a million entries in this file, one per line; at any given time it can be just a few thousand, or more than a million.
>
>     Example:
>     -------
>     173
>     1685
>     1152
>     253
>     1623

So write a loop that reads this file into a list of ints, converting
each line.  Then you can tell us you've got a list of about a million ints.
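A minimal sketch of such a loop follows.  The function name read_ints and
the file name in the usage comment are made up for illustration, and it
assumes every non-blank line holds exactly one integer:

```python
def read_ints(path):
    """Return a list of ints, one per non-blank line of the file at `path`."""
    values = []
    with open(path) as infile:
        for line in infile:
            line = line.strip()      # drop the trailing newline and stray spaces
            if line:                 # skip blank lines
                values.append(int(line))
    return values

# Hypothetical usage, assuming the parsed log output is saved as timings.txt:
# allvalues = read_ints("timings.txt")
```

int() raises ValueError on a line that isn't a whole number, which is
probably what you want while you're still checking the log parser.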

>
> Eventually what I'll need to do is:
>
> 1.  Index the file and/or count the lines, as to identify each line's positional relevance so that it can average any range of numbers that are sequential; one to one another.
> 2.  Calculate the difference between any given (x) range.  In order to be able to ask the program to average every 5, 10, 100, 1,000, or 10,000 etc. --> until completion.  This includes the need to deal with stray remainders at the end of the file that aren't divisible by that initial requested range.
>
> (ie: average some file with 3,245 entries by 100 --> not excluding the remaining 45 entries, in order to represent the remainder.)
>
> So, looking above, transaction #1 took "173" milliseconds, while transaction #2 took 1685 milliseconds.
>
> Based on this, I need to figure out how to do two things:
>
> 1.  Calculate the difference of each transaction, related to the one before it AND record/capture the difference. (An array, list, dictionary -- I don't care.)

What difference, what transaction, related how?

> 2.  Starting with the very first line/entry, count the first (x number) and average (x).  I can obtain a "Happy medium" for what the gradient/delta is between sets of 100 over the course of the aggregate.

What's an x-number?  What, what, which, who ?

>
>     ie:
>     ---
>     Entries 1-100 = (eventualPlottedAvgTotalA)
>     Entries 101-200 = (eventualPlottedAvgTotalB)
>     Entries 201-300 = (eventualPlottedAvgTotalC)
>     Entries 301-400 = (eventualPlottedAvgTotalD)
>
> From what I can tell, I don't need to indefinitely store the values, only pass them as they are processed (in order) to the plotter. I have tried the following example to sum a range of 5 entries from the above list of 5 (which works), but I don't know how to dynamically pass the 5 at a time until completion, all the while retaining the calculated averages which will ultimately be passed to pyplot at a later time/date.
>
> What I have been able to figure out thus far is below.
>
> ex:
>
>     Python 2.7.3 (default, Jul 24 2012, 10:05:38)
>     [GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2
>     >>> plottedTotalA = ['173', '1685', '1152', '253', '1623']
>     >>> sum(float(t) for t in plottedTotalA)
>     4886.0
>
> I received 2 answers from SO, but was unable to fully capture what they were trying to tell me.  Unfortunately, I might need a "baby-step" / "Barney-style" mentor who is willing to guide me on this.  I hope this makes sense to someone out there, and thank you in advance for any help that you can provide.  I apologize in advance for being so thick if it's uber-EZ.
>
>

If you want to make a sublist out of the first 2 items in a list, you
can use a slice  (notice the colon):

allvalues = [ 173, 1685, 1152, 263, 1623, 19 ]
firsttwo = allvalues[0:2]

To get the 3rd such sublist, use
othertwo = allvalues[4:6]

If you've made such a list, you can readily use sum directly on it:
mysum = sum(othertwo)
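
Putting slices and sum together, here is one possible sketch of averaging
every `size` items.  The function name chunk_averages is invented for this
example, and it assumes the values are already a list of numbers.  A slice
that runs past the end of the list simply returns whatever is left, so the
stray remainder at the end gets its own (shorter) average:

```python
def chunk_averages(values, size):
    """Average each consecutive run of `size` items; the last run may be shorter."""
    averages = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]   # a short slice at the end is fine
        # float() keeps Python 2 from doing integer division here
        averages.append(sum(chunk) / float(len(chunk)))
    return averages

allvalues = [173, 1685, 1152, 263, 1623, 19]
print(chunk_averages(allvalues, 2))   # [929.0, 707.5, 821.0]
print(chunk_averages(allvalues, 4))   # [818.25, 821.0]
```

The resulting list of averages is what you would later hand to the plotter.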

