Loop from 'aaaa' to 'tttt' ?

Peter Hansen peter at engcorp.com
Mon Jun 16 19:25:29 CEST 2003


Lars Schaps wrote:
> 
> In my program I need a loop from 'aaaa' over
> 'aaac', 'aaag', 'aaat', 'aaca' to 'tttt'.
> (Possible characters 'a', 'c', 'g' and 't')
> 
> One idea I had is to take a number n with the base of
> 4 and use
> 
> t= string.maketrans( '0123', 'acgt')
> string.translate( n, t)

>>> bases = 'acgt'
>>> sets = [''.join((a, b, c, d)) for a in bases for b in bases for c in bases for d in bases]
>>> sets
['aaaa', 'aaac', 'aaag', 'aaat', 'aaca', 'aacc', 'aacg', 'aact', 'aaga', 'aagc',
 'aagg', 'aagt', 'aata', 'aatc', 'aatg', 'aatt', 'acaa', 'acac', 'acag', 'acat',
...[snip]...
 'ttgg', 'ttgt', 'ttta', 'tttc', 'tttg', 'tttt']

:-)
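
On newer Pythons (2.6 and later) the same enumeration can probably be written
with itertools.product, which avoids spelling out one "for" per position.
This is only a sketch; the names BASES and all_sequences are made up for
illustration and do not come from the original question:

```python
from itertools import product

BASES = 'acgt'  # illustrative name for the allowed characters

def all_sequences(length=4, alphabet=BASES):
    """Return every string of `length` characters drawn from `alphabet`.

    product() varies the last position fastest, so the result comes out
    in the same order as the nested loops: 'aaaa', 'aaac', ..., 'tttt'.
    """
    return [''.join(chars) for chars in product(alphabet, repeat=length)]

sequences = all_sequences()
print(len(sequences))    # 256
print(sequences[:5])     # ['aaaa', 'aaac', 'aaag', 'aaat', 'aaca']
```

If you only need to loop and never need the whole list at once, iterating
over product(...) directly saves building the list.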

By the way, if you haven't already, note that four positions with four possible
values each give 4**4 = 256 combinations, exactly the number of values a byte
can hold.  That might simplify some of what you're doing, as you could use bytes
or strings (of bytes) to represent the sequences and only translate to the above
four-character strings when you need to output something.  Put the results of
the above in a dictionary, keyed by byte value, and it would be very quick to do
a lookup:

>>> baseMapping = dict(zip(range(len(sets)), sets))

(I don't think that last works in any version prior to 2.3, and I probably
got the syntax wrong for 2.3 anyway, but you might get the point. :-)
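
If you would rather not keep a table around, the byte idea can also be
sketched as plain base-4 arithmetic, treating each character as a two-bit
digit.  The helper names encode and decode below are hypothetical, and this
assumes the ordering 'a' < 'c' < 'g' < 't' used in the list above:

```python
BASES = 'acgt'  # illustrative name; the position in this string is the base-4 digit

def encode(seq):
    """Pack a string over 'acgt' into an integer, most significant base first."""
    value = 0
    for ch in seq:
        value = value * 4 + BASES.index(ch)
    return value

def decode(value, length=4):
    """Unpack an integer back into a string of `length` bases."""
    chars = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        chars.append(BASES[digit])
    # Digits were extracted least significant first, so reverse them.
    return ''.join(reversed(chars))

print(encode('aaca'))   # 4
print(decode(255))      # tttt
```

With this ordering, encode(s) is the same number as the position of s in the
list built earlier, so both approaches agree on which byte means which
sequence.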

-Peter



