Loop from 'aaaa' to 'tttt' ?
Alex Martelli
aleax at aleax.it
Tue Jun 17 05:58:05 EDT 2003
Des Small wrote:
> Lars Schaps <tuffi23 at gmx.de> writes:
>
>> Hello.
>>
>> In my program in need a loop from 'aaaa' over
>> 'aaac', 'aaag', 'aaat', 'aaca' to 'tttt'.
>> (Possible characters 'a', 'c', 'g' and 't')
>>
>> One idea i had is to take a number n with the base of
>> 4 and use
>>
>> t= string.translate( '0123', 'acgt')
>> string.translate( n, t)
>>
>> But i don't know how to convert from base10 to base4.
>>
>> Has anyone a idea?
>
>
> I like to use recursive generators for this sort of thing:
>
> from __future__ import generators # Not needed in python 2.3
>
> def base_iter():
> for a in 'acgt': yield a
>
> def nbase_iter(n):
> if n==0: yield ''
> else:
> for first in base_iter():
> for rest in nbase_iter(n-1):
> yield first + rest
I think a recursive generator is the most elegant and practical
idea (even though I might disagree with some implementation
details of your specific generator). Still, "counting in base
N" is also quite acceptable -- and it should not need any base
conversion, since all you're doing is always "increment by 1"
(that's what "counting" stands for here). E.g.:
def countInBase(base, ndigits):
digits = ndigits * [0]
while 1:
curdig = ndigits - 1
while curdig >= 0 and digits[curdig] >= base-1:
digits[curdig] = 0
curdig -= 1
if curdig < 0:
break
digits[curdig] += 1
yield digits
You may want to yield tuple(digits) or digits[:] if you need to
be able to call list(countInBase(x, y)) or the like, but this
is suitable for a normal for loop. Of course, this yields items
which are lists of integers, not strings. But, the structure IS
what the OP asked for -- we just need some small adaptation,
which can be done in several ways, to give the OP just what he
wants... for example:
def codonStrings(lenghtOfStrings, codons='acgt'):
base = len(codons)-1
result = lengthOfStrings * [codons[0]]
while 1:
curindex = lenghtOfStrings - 1
while curindex >= 0 and result[curindex] >= base:
result[curindex] = 0
curindex -= 1
if curindex < 0:
break
result[curindex] += 1
yield ''.join( [codons[i] for i in result] )
As a matter of style, the outer loop's body might be recoded...:
for curindex in range(lengthOfStrings-1,-1,-1):
if result[curindex] < base:
result[curindex] += 1
break
result[curindex] = 0
else:
break
and the computation of "range(lengthOfStrings-1,-1,-1)" could,
and no doubt should, be hoisted out of the loop, giving:
def codonStrings(lenghtOfStrings, codons='acgt'):
base = len(codons)-1
indexSequence = range(lengthOfStrings-1,-1,-1)
result = lengthOfStrings * [codons[0]]
while 1:
for curindex in indexSequence:
if result[curindex] < base:
result[curindex] += 1
break
result[curindex] = 0
else:
break
yield ''.join( [codons[i] for i in result] )
but I guess it's a matter of personal taste whether this
structure is perceived as clearer or less clear than the
original one with the two nested while loops.
Alex
More information about the Python-list
mailing list