Identifying the start of good data in a list
George Sakkis
george.sakkis at gmail.com
Wed Aug 27 23:09:58 CEST 2008
On Aug 27, 3:00 pm, Gerard flanagan <grflana... at gmail.com> wrote:
> tkp... at hotmail.com wrote:
> > I have a list that starts with zeros, has sporadic data, and then has
> > good data. I define the point at which the data turns good to be the
> > first index with a non-zero entry that is followed by at least 4
> > consecutive non-zero data items (i.e. a week's worth of non-zero
> > data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
> > 9], I would define the point at which data turns good to be 4 (1
> > followed by 2, 3, 4, 5).
>
> > I have a simple algorithm to identify this changepoint, but it looks
> > crude: is there a cleaner, more elegant way to do this?
>
> > flag = True
> > i=-1
> > j=0
> > while flag and i < len(retHist)-1:
> >     i += 1
> >     if retHist[i] == 0:
> >         j = 0
> >     else:
> >         j += 1
> >         if j == 5:
> >             flag = False
>
> > del retHist[:i-4]
>
> > Thanks in advance for your help
>
> > Thomas Philips
>
> data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>
> def itergood(indata):
>     indata = iter(indata)
>     buf = []
>     while len(buf) < 4:
>         buf.append(indata.next())
>         if buf[-1] == 0:
>             buf[:] = []
>     for x in buf:
>         yield x
>     for x in indata:
>         yield x
>
> for d in itergood(data):
>     print d
This seems the most efficient so far for arbitrary iterables. With a
few micro-optimizations it becomes:
from itertools import chain
def itergood(indata, good_ones=4):
    indata = iter(indata); get_next = indata.next
    buf = []; append = buf.append
    while len(buf) < good_ones:
        next = get_next()
        if next: append(next)
        else: del buf[:]
    return chain(buf, indata)
$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
100 loops, best of 3: 3.09 msec per loop
And with Psyco enabled:
$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
1000 loops, best of 3: 466 usec per loop
George