Stripping non-numbers from a file parse without nested lists?

Wed Apr 1 11:05:01 EDT 2009

On Apr 1, 2:35 am, daku9... at gmail.com wrote:
> On Mar 31, 6:47 pm, "Rhodri James" <rho... at wildebst.demon.co.uk>
> wrote:
>
> > What you're doing (pace error checking) seems fine for the data
> > structures that you're using.  I'm not entirely clear what your usage
> > pattern for "dip" and "dir" is once you've got them, so I can't say
> > whether there's a more appropriate shape for them.  I am a bit curious
> > though as to why a nested list is non-ideal?
>
> > ...
> >      if "/" in word and "dip" not in word:
> >         dip_n_dir.append(word.split("/", 1))
>
> > is marginally shorter, and has the virtue of making it harder to use
> > unrelated dip and dir values together.
>
> > --
> > Rhodri James *-* Wildebeeste Herder to the Masses
>
> Rhodri,
>
> Thanks.  That works better than what I had before and I learned a new
> method of parsing what I was looking for.
>
> Now I'm on to jumping a set number of lines from a given positive
> search match:
>
> ...(lines of garbage)...
> 5656      (or some other value I want, but don't explicitly know)
> ...(18 lines of garbage)...
> search object
> ...(lines of garbage)...
>
> I've tried:
>
> def read_poles(filename):
>   index = 0
>   fh = None
>   try:
>       fh = open(filename, "r")
>       lines=fh.readlines()
>       while True:
>
>           if "search object" in lines[index]:
>               poles = int(lines[index-18])
>               print(poles)
>
>           index +=1
>
>   except(IndexError): pass
>
>   finally:
>       if fh is not None: # close file
>           fh.close()
>
> ------------------
>
> Which half works.  If it's not found, IndexError is caught and passed
> (avoids quitting on lines[index] being out of range).  The print(poles)
> properly displays the value I am looking for (_always_ 18 lines before
> the search object).
>
> However, since it is assigned using the index variable, the value of
> poles doesn't keep (poles is always zero when referenced outside of
> the read_poles function).  I'm assuming because I'm pointing to a
> certain position of an object and once index moves on, it no longer
> points to anything valid.  My python book suggested using
> copy.deepcopy, but that didn't get around the fact I am calling it on
> (index-18).
>
> Any experience jumping back (or forward) a set number of lines once a
> search object is found?  This is the only way I can think of doing it
> and it clearly has some problems.
>
> Reading the file line by line using for line in blah works for finding
> the search object, but I can't see a way of going back the 18 lines to
> grab what I need.
>
> Thanks for the help!  I'm slowly getting this mangled mess of a file
> into something automated (hand investigating the several thousand
> files I need to do would be unpleasant).

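On poles not keeping its value: the result of int(lines[index-18]) is an
ordinary integer and is not tied to the list position, so copy.deepcopy
should not be needed. The more likely cause is that read_poles never
returns poles, so the name only exists inside the function. A minimal
sketch of the index-based approach, assuming the file fits in memory
(find_poles and its marker/offset parameters are hypothetical names used
only for illustration):

```python
def find_poles(lines, marker="search object", offset=18):
    """Return the int found `offset` lines before the first line containing `marker`."""
    for index, line in enumerate(lines):
        # index >= offset guards against a negative index wrapping around
        if marker in line and index >= offset:
            return int(lines[index - offset])
    return None  # marker not found


def read_poles(filename):
    # the with block closes the file even if int() raises
    with open(filename) as fh:
        return find_poles(fh.readlines())
```

The caller then keeps the value by assigning the result, e.g.
poles = read_poles(some_filename).
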
# You could try using a deque holding the previous 18 lines and search
# using that deque.
# This is untested, but here's a try (Python >= 3.0).
from collections import deque
import itertools as it
import sys


def read_poles(filename):
    with open(filename) as f:
        line_iter = iter(f)
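        # Window of 19 lines: the current line plus the 18 before it,
        # so d[0] is always the line 18 back from the current one.
        d = deque(it.islice(line_iter, 18), maxlen=19)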

        for line in line_iter:
            d.append(line)

            if 'search object' in line:
                poles = int(d[0])
                print(poles)
                return poles
        else:
            print('No poles found in', filename, file=sys.stderr)
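
To check the sliding-window idea without a real file, the same logic can
be run on a list of lines. This is only a sketch: value_before and the
sample data are made up for illustration, and it assumes the wanted value
sits exactly `offset` lines before the match. A deque with
maxlen=offset + 1 holds the matching line plus the `offset` lines before
it, so window[0] is the line you are after.

```python
from collections import deque


def value_before(lines, marker="search object", offset=18):
    """Return the int `offset` lines before the first line containing `marker`.

    Works on any iterable of lines, so the whole file need not be in memory.
    """
    window = deque(maxlen=offset + 1)  # the matching line plus `offset` earlier lines
    for line in lines:
        window.append(line)
        # only trust window[0] once the window is full
        if marker in line and len(window) == offset + 1:
            return int(window[0])
    return None  # marker not found, or found too early in the input


# made-up sample: the value, 17 filler lines, then the marker 18 lines after it
sample = ["junk\n", "5656\n"] + ["garbage\n"] * 17 + ["search object\n", "junk\n"]
print(value_before(sample))
```

Because it only ever keeps offset + 1 lines, this also works when passed
an open file object directly instead of a list.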