finding repeated data sequences in a column

Rhodri James rhodri at wildebst.demon.co.uk
Thu May 21 22:35:41 EDT 2009


On Thu, 21 May 2009 08:55:45 +0100, yadin <conra2004 at yahoo.com> wrote:

> This is the program I wrote, but it is not working.
> I have a list of valves, and another of pressures.
> I am asked to find out which valves are using all of a given
> set of pressures (the wanted "best" pressures).
> The program is not working properly; in this case it is supposed
> to find all the valves that are using pressures 1 "and" 2 "and" 3.

So if I understand you correctly, you actually want to split your
data up by valve name to find each valve that has listed pressures of
1, 2 and 3 in that order?  That's a lot simpler, though it has to be
said that your data isn't in a terribly convenient format.

> It returns A, A2, A35....
> The correct answer is supposed to be A and A2...
> If I were asked for pressures 56 and 78, the correct answer is
> supposed to be valves G and G2...

Ah, so the target "best" pressure sequence doesn't have to be all of the
values listed.  Hmm.  Here goes...

====HERE BE CODE====

from itertools import izip, groupby

VALVES = ['A', 'A', 'A', 'G', 'G', 'G',
          'C', 'A2', 'A2', 'A2', 'F', 'G2',
          'G2', 'G2', 'A35', 'A345', 'A4']  # valve names
PRESSURES = [1, 2, 3, 4235, 56, 78,
             12, 1, 2, 3, 445, 45,
             56, 78, 1, 23, 7]  # valve pressures
TARGET = [1, 2, 3]

target_len = len(TARGET)  # Since we're using this a lot
result = []

# Pair each valve with its pressure, then group adjacent pairs by valve name
for valve, p in groupby(izip(VALVES, PRESSURES),
                        key=lambda x: x[0]):
    pressures = [x[1] for x in p]
    # Slide a window of target_len over this valve's pressures
    for i in xrange((len(pressures) - target_len) + 1):
        if pressures[i:i+target_len] == TARGET:
            result.append(valve)
            break

print "The answer you want is", result

====HERE ENDETH THE CODE====

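Not terribly pretty, largely because of having to do the sublist
matching by hand, but it should work for most "best pressures",
provided each valve's entries sit next to each other in the lists
(groupby only groups adjacent items).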
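
For anyone on Python 3, here is a rough sketch of the same approach
wrapped in a function, so the 56 and 78 case can be checked as well.
The name valves_matching is made up for illustration; the built-in
zip and range stand in for izip and xrange, which Python 3 no longer
has.

```python
from itertools import groupby

VALVES = ['A', 'A', 'A', 'G', 'G', 'G',
          'C', 'A2', 'A2', 'A2', 'F', 'G2',
          'G2', 'G2', 'A35', 'A345', 'A4']
PRESSURES = [1, 2, 3, 4235, 56, 78,
             12, 1, 2, 3, 445, 45,
             56, 78, 1, 23, 7]


def valves_matching(valves, pressures, target):
    """Return the valves whose pressures contain target as a consecutive run.

    Hypothetical helper, not part of the original program.
    """
    target = list(target)
    n = len(target)
    result = []
    # groupby only merges adjacent rows, so each valve's rows must be contiguous
    for valve, rows in groupby(zip(valves, pressures), key=lambda row: row[0]):
        readings = [pressure for _, pressure in rows]
        # any() stops at the first window that equals the target
        if any(readings[i:i + n] == target
               for i in range(len(readings) - n + 1)):
            result.append(valve)
    return result


print(valves_matching(VALVES, PRESSURES, [1, 2, 3]))   # ['A', 'A2']
print(valves_matching(VALVES, PRESSURES, [56, 78]))    # ['G', 'G2']
```

Passing the target in as an argument avoids editing a module-level
constant each time the wanted pressures change.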

The unfamiliar-looking bits are functions from the itertools
module that make this a lot simpler.  If you don't get what's
going on here, I don't blame you.  I just deleted my attempt
to explain it because it was confusing me :-)  Reading the
descriptions of izip and groupby in the standard library
documentation should make things clearer.
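
As a small illustration of what the two functions do here, this
sketch pairs up the first seven entries of the data and then groups
them.  It assumes Python 3, where the built-in zip plays the role of
izip; the variable names are only for this example.

```python
from itertools import groupby

valves = ['A', 'A', 'A', 'G', 'G', 'G', 'C']
pressures = [1, 2, 3, 4235, 56, 78, 12]

# zip pairs the two lists element by element: ('A', 1), ('A', 2), ...
pairs = list(zip(valves, pressures))

# groupby collects runs of adjacent pairs that share a key (the valve name)
grouped = [(valve, [p for _, p in rows])
           for valve, rows in groupby(pairs, key=lambda pair: pair[0])]

print(pairs[:3])   # [('A', 1), ('A', 2), ('A', 3)]
print(grouped)     # [('A', [1, 2, 3]), ('G', [4235, 56, 78]), ('C', [12])]
```

Note that groupby does not sort anything: a valve whose entries were
split across non-adjacent positions would come out as two separate
groups.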

-- 
Rhodri James *-* Wildebeeste Herder to the Masses


