recombination variations
Peter Otten
__peter__ at web.de
Wed Dec 1 15:57:56 EST 2004
David Siedband wrote:
> The problem I'm solving is to take a sequence like 'ATSGS' and make all
> the DNA sequences it represents. The A, T, and G are fine but the S
> represents C or G. I want to take this input:
>
> [ [ 'A' ] , [ 'T' ] , [ 'C' , 'G' ], [ 'G' ] , [ 'C' , 'G' ] ]
>
> and make the list:
>
> [ 'ATCGC' , 'ATCGG' , 'ATGGC' , 'ATGGG' ]
[...]
The code you provide only addresses the first part of your problem, and so
does mine:
>>> def disambiguate(seq, alphabet):
...     return [list(alphabet.get(c, c)) for c in seq]
...
>>> alphabet = {
...     "W": "AT",
...     "S": "CG"
... }
>>> disambiguate("ATSGS", alphabet)
[['A'], ['T'], ['C', 'G'], ['G'], ['C', 'G']]
Note that "identity entries" (e.g. mapping "A" to "A") in the alphabet
dictionary are no longer necessary. The list() call in disambiguate() is
most likely superfluous, but I put it in to meet your spec accurately.
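A small sketch of both points, with a hypothetical variant name (disambiguate_strings is not part of the code above): alphabet.get(c, c) falls back to the character itself when it has no entry, and since a string is already an iterable of its characters, dropping list() yields something you can loop over in the same way.

```python
def disambiguate_strings(seq, alphabet):
    # alphabet.get(c, c) returns c itself when c has no entry,
    # so unambiguous bases need no "identity" mapping.
    return [alphabet.get(c, c) for c in seq]

alphabet = {"W": "AT", "S": "CG"}
choices = disambiguate_strings("ATSGS", alphabet)

# Each element is a string; iterating over it gives the same
# characters that the list() version would hold.
expanded = [list(s) for s in choices]
```

Here expanded has the same shape as the nested list in your spec, which is why the list() call probably buys you nothing beyond that.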
Now on to the next step :-)
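One possible shape for that next step, offered only as a sketch: it assumes a Python version that ships itertools.product (2.6 or later, which is newer than this message), and the function name expand is made up for illustration.

```python
from itertools import product

def expand(seq, alphabet):
    # One string of candidate bases per position; characters that are
    # not in the alphabet stand for themselves.
    choices = [alphabet.get(c, c) for c in seq]
    # product() picks one base from each position in every possible way;
    # the rightmost position varies fastest.
    return ["".join(combo) for combo in product(*choices)]

alphabet = {"W": "AT", "S": "CG"}
sequences = expand("ATSGS", alphabet)
print(sequences)
```

For "ATSGS" this should give the four sequences from your example, in the same order.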
Peter