[Tutor] Traversing lists or getting the element you want.

Tue Feb 4 06:07:02 CET 2014

Thanks to all who helped on this one.

I missed in my reading that you can address an element of a list as
y[i].  Going back through the "Think Python" book, I see they have one
example of it, and spend 1/4 page on it out of the 263 pages in the book.
They spend much more time on other parts of lists like slices, methods, etc.
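
A minimal sketch of that indexing, for anyone coming from C arrays. The
values are made up, and the print() calls assume Python 3:

```python
# A list can be indexed much like a C array, starting at 0.
y = [2.0, 3.5, 5.0, 4.0]

# C-style traversal by index.
for i in range(len(y)):
    print(i, y[i])

# The more idiomatic form when both the index and the value are needed.
for i, value in enumerate(y):
    print(i, value)

first = y[0]    # first element
last = y[-1]    # negative indices count from the end of the list
```

The range(len(y)) loop is the closest match to a C for loop; enumerate()
does the same job without the explicit y[i] lookup.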

For the record, the fit for this particular problem had to be piecewise
linear, and the endpoint of one segment has to be the same point as the
beginning of the next segment, at a particular integer x[]. So I could
not do a normal linear regression with a least squares fit for each line
segment.
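
To make that constraint concrete, here is a rough sketch of evaluating
such a chain of segments. The helper name piecewise_estimate and the
sample data are made up for illustration; the breakpoints are indices
into x and y, and each segment simply joins two data points, so
neighbouring segments share an endpoint by construction:

```python
def piecewise_estimate(x, y, breakpoints, xq):
    """Estimate y at xq by linear interpolation between the data points
    at the given (increasing) breakpoint indices. Hypothetical helper."""
    for start, end in zip(breakpoints, breakpoints[1:]):
        if x[start] <= xq <= x[end]:
            # Fraction of the way from x[start] to x[end];
            # float() keeps the division exact for integer x.
            t = (xq - x[start]) / float(x[end] - x[start])
            return y[start] + (y[end] - y[start]) * t
    raise ValueError("xq lies outside the breakpoint range")


# Made-up sample data: integer x, non-linear y.
x = [165, 166, 167, 168, 169]
y = [1.0, 4.0, 9.0, 16.0, 25.0]
b = [0, 2, 4]   # two segments that meet at index 2 (x = 167)

at_joint = piecewise_estimate(x, y, b, 167)       # the shared endpoint
in_first = piecewise_estimate(x, y, b, 166)       # inside the first segment
in_second = piecewise_estimate(x, y, b, 168)      # inside the second segment
```

At the shared breakpoint the estimate equals the data value itself,
which is what rules out fitting each segment independently.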

My book also does not have anything about itertools, so I had never
heard of it. I will google it to learn more.
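
For reference, a small sketch of what itertools.combinations does, since
the quoted code below relies on it. The index range here is made up; the
function yields every way of picking r items from an iterable, in sorted
order and without repeats:

```python
import itertools

# Every way to choose 2 interior breakpoints from the indices 1..4.
combos = list(itertools.combinations(range(1, 5), 2))

for inner in combos:
    # Add fixed outer boundaries at both ends (0 and 5 in this toy case).
    print((0,) + inner + (5,))
```

Each tuple comes out already in increasing order, which is why it can be
used directly as a list of segment boundaries.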

Thanks, again,

Kip

On Mon, 2014-02-03 at 14:35 +0100, Peter Otten wrote:
> Kipton Moravec wrote:
> 
> > I am new to Python, and I do not know how to traverse lists like I
> > traverse arrays in C. This is my first program other than "Hello World".
> > I have a Raspberry Pi and they say Python is the language of choice for
> > that little machine. So I am going to try to learn it.
> > 
> > 
> > I have data in the form of x, y pairs where y = f(x) and is non linear.
> > It comes from a .csv file.
> > 
> > In this case x is an integer from 165 to 660 so I have 495 data sets.
> > 
> > I need to find the optimal locations of three values of x to piecewise
> > linear estimate the function.
> 
> import itertools
> 
> def boundaries(n, segments):
>     max_n = n-1
>     for inner in itertools.combinations(range(n), segments-2):
>         yield (0,) + inner + (max_n,)
>  
> def minimized(x, y, segments):
>     def total_error(b):
>         return sum(error(x, y, start, end) for start, end in zip(b, b[1:]))
>     return min(boundaries(len(x), segments), key=total_error)
> 
> def error(x, y, istart, iend):
>     errorsq = 0
>     for m in range(istart, iend):
>         # float() avoids integer division in Python 2 when x holds ints
>         lin_estimate = y[istart] + ((y[iend] - y[istart]) *
>                        ((x[m] - x[istart]) / float(x[iend] - x[istart])))
>         d = lin_estimate - y[m]
>         errorsq += d*d
>     return errorsq
> 
> if __name__ == "__main__":
>     # generate dataset with N random (x, y) pairs
>     import random
>     N = 100
>     random.seed(42) # want reproducible result
>     data = [
>         (random.random()*10, random.random()*100-50.)
>         for _ in range(N)]
> 
> 
>     print(minimized([x for x, y in data],
>                     [y for x, y in data], 4))
> 
> 
> As this is mostly number crunching, the Python version is probably a lot
> slower than the C code. I tried to move the inner loop into numpy with
> 
> import numpy
> 
> [...]
> 
> def error(x, y, istart, iend):
>     lin_estimate = y[istart] + ((y[iend] - y[istart]) *
>                    ((x[istart:iend] - x[istart]) / (x[iend] - x[istart])))
>     delta = lin_estimate - y[istart:iend]
>     return (delta*delta).sum()
> 
> [...]
> 
>     print(minimized(numpy.array([x for x, y in data]),
>                     numpy.array([y for x, y in data]), 4))
> 
> but that had no noticeable effect. There may be potential gains for someone 
> willing to put in more work or with better knowledge of numpy.
> 
> 
> By the way, your method of calculating the error 
> 
> > double error(int istart, int iend)
> > {
> > 
> > // linear interpolation I can optimize but left
> > // it not optimized for clarity of the function
> > 
> >     int m;
> >     double lin_estimate;
> >     double errorsq;
> >     errorsq = 0;
> > 
> >     for (m=istart; m<iend; m++)
> >     {
> >         lin_estimate = y[istart] + ((y[iend] - y[istart]) *
> >                        ((x[m] - x[istart]) / (x[iend] - x[istart])));
> > 
> >         errorsq += (lin_estimate - y[m]) * (lin_estimate - y[m]);
> >     }
> >     return (errorsq);
> > }
> 
> (which I adopted) looks odd. With a tried and tested method like a "least
> squares fit", which does not require the line (segments) to go through any
> point of the dataset, you should get better fits.
> 
> PS: Caveat emptor! I did not even plot a graph with the result to check
> plausibility; there may be embarrassing errors.
> 
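
On the least squares suggestion in the quoted message: a minimal sketch
of an ordinary least squares line through one set of points, in plain
Python. The function name least_squares_line and the sample data are
made up for illustration. A line fitted this way generally does not pass
through the first or last data point, so adjacent segments would not
automatically join, which is the constraint described at the top of
this message.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least squares line
    through the points (x[i], y[i]). Hypothetical helper."""
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1.
slope, intercept = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
```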