String Splitter Brain Teaser
Brian van den Broek
bvande at po-box.mcgill.ca
Sun Mar 27 19:09:54 EST 2005
James Stroud said unto the world upon 2005-03-27 17:39:
> Hello,
>
> I have strings represented as a combination of an alphabet (AGCT) and an
> operator "/", that signifies degeneracy. I want to split these strings into
> lists of lists, where the degeneracies are members of the same list and
> non-degenerates are members of single item lists. An example will clarify
> this:
>
> "ATT/GATA/G"
>
> gets split to
>
> [['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]
>
> I have written a very ugly function to do this (listed below for the curious),
> but intuitively I think this should only take a couple of lines for one
> skilled in regex and/or listcomp. Any takers?
>
> James
>
> p.s. Here is the ugly function I wrote:
>
> def build_consensus(astr):
>
>     consensus = []      # the lol that will be returned
>     possibilities = []  # one element of consensus
>     consecutives = 0    # keeps track of how many in a row
>
>     for achar in astr:
>         if (achar == "/"):
>             consecutives = 0
>             continue
>         else:
>             consecutives += 1
>         if (consecutives > 1):
>             consensus.append(possibilities)
>             possibilities = [achar]
>         else:
>             possibilities.append(achar)
>     if possibilities:
>         consensus.append(possibilities)
>     return consensus
Hi,
In the spirit of "Now I have two problems", I like to avoid regular
expressions when I can. I don't think mine avoids a bit of ugly, but I,
at least, find it easier to grok (YMMV):
def build_consensus(string):
    result = [[string[0]]]  # starts list with a list of first char
    accumulate = False
    for char in string[1:]:
        if char == '/':
            accumulate = True
        else:
            if accumulate:
                # The pop removes the last list appended, and we use
                # its single item to build the new list to append.
                result.append([result.pop()[0], char])
                accumulate = False
            else:
                result.append([char])
    return result
(Since list.append returns None, this could use
    accumulate = result.append([result.pop()[0], char])
in place of the two lines in the if accumulate block, but I don't
think that is a gain worth paying for.)
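For comparison, here is a rough sketch of the regex route the original
question asks about, plus a loop variant that grows the last list in
place so that a run such as "A/G/T" stays in one group. Both assume the
alphabet is exactly AGCT and that every "/" sits between two bases; the
names build_consensus_re and build_consensus_multi are made up for this
illustration and appear nowhere else in the thread.

```python
import re


def build_consensus_re(string):
    # Assumption: the alphabet is exactly A, G, C, T.
    # Match one base, optionally followed by any number of "/base" parts.
    groups = re.findall(r'[AGCT](?:/[AGCT])*', string)
    # 'T/G' -> ['T', 'G'];  'A' -> ['A']
    return [group.split('/') for group in groups]


def build_consensus_multi(string):
    # Loop variant: extend the last list instead of popping and rebuilding,
    # so "A/G/T" yields a single three-item list.
    result = []
    accumulate = False
    for char in string:
        if char == '/':
            accumulate = True
        elif accumulate and result:
            result[-1].append(char)
            accumulate = False
        else:
            result.append([char])
            accumulate = False
    return result


print(build_consensus_re("ATT/GATA/G"))
print(build_consensus_multi("A/G/T"))
```

Neither version validates its input: the regex silently skips any
character outside AGCT, and the loop treats a leading "/" as if it were
not there.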
HTH,
Brian vdB