[Tutor] Huge list comprehension

Danny Yoo dyoo at hashcollision.org
Mon Jun 12 21:02:40 EDT 2017


On Sun, Jun 11, 2017 at 9:54 PM, syed zaidi <syedzaidi85 at hotmail.co.uk> wrote:
> Thanks
> One reason for sharing the code was that I have to manually create over 100 variables


Don't do that.


Any time you find yourself manually repeating things over and over,
that's a sign that you need a loop structure of some sort.


Let's look at a little code.

> with open("C:/Users/INVINCIBLE/Desktop/T2D_ALL_blastout_batch.txt", 'r') as f:
[code cut ]
>         if dictionary.get(sample_name):
>             dictionary[sample_name].append(operon)
>         else:
>             dictionary[sample_name] = []
>             dictionary[sample_name].append(operon)
> locals().update(dictionary) ## converts dictionary keys to variables


Ah.  That last statement there is a really big warning sign.  Do *not*
use locals().  locals() is almost never a good thing to use.  It's
dangerous and leads to suffering.
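

To make one of the dangers concrete, here is a small sketch.  The
function name and the sample data are made up for illustration.
Inside a function, CPython does not turn keys written into locals()
into real local variables, so the trick quietly stops working as soon
as the code is moved into a function:

```python
def load_samples():
    # Hypothetical data standing in for the real dictionary.
    dictionary = {'DLF002': ['operon_a', 'operon_b']}
    locals().update(dictionary)  # looks like it creates a variable DLF002...
    try:
        return DLF002            # ...but the name lookup still fails
    except NameError:
        return None


result = load_samples()
print(result)
```

At module level the update appears to work, because there locals()
is the same dictionary as globals().  That is exactly what makes it a
trap: the code's behavior depends on where it happens to live.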


Instead, just stick with your dictionary, and use the dictionary.  If
you need a mapping from a sample name to an array of strings,
construct another dictionary to hold that information.  Then most of
the code here:

> for i in main_op_list_np:
>     if i in DLF002: DLF002_1.append('1')
>     else:DLF002_1.append('0')
>     if i in DLF004: DLF004_1.append('1')
>     else:DLF004_1.append('0')
>     if i in DLF005: DLF005_1.append('1')
>     else:DLF005_1.append('0')
>     if i in DLF006: DLF006_1.append('1')
>     else:DLF006_1.append('0')
>     if i in DLF007: DLF007_1.append('1')
>     else:DLF007_1.append('0')
>     if i in DLF008: DLF008_1.append('1')
>     else:DLF008_1.append('0')
...

will dissolve into a simple dictionary lookup, followed by an array append.
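

As a rough sketch of what that could look like: the name "presence"
and the sample data below are made up for illustration, while
"dictionary" and "main_op_list_np" stand in for the variables from
the original program.

```python
# Hypothetical stand-ins for the data built earlier in the program.
dictionary = {
    'DLF002': ['op1', 'op3'],
    'DLF004': ['op2'],
}
main_op_list_np = ['op1', 'op2', 'op3']

# One dictionary replaces DLF002_1, DLF004_1, and so on.
presence = {}
for sample_name, operons in dictionary.items():
    operon_set = set(operons)      # faster membership tests than a list
    presence[sample_name] = []
    for op in main_op_list_np:
        # Dictionary lookup, followed by an append.
        presence[sample_name].append('1' if op in operon_set else '0')

print(presence['DLF002'])
```

Adding a 101st sample then needs no new code at all: it is just one
more key in the dictionary.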




Just to compare to a similar situation, consider the following.  Let's
say that we want to compute the letter frequency of a sentence.
Here's one way we could do it:

########################
def histogram(message):
    a = 0
    b = 0
    c = 0
    d = 0
    # .... cut

    for letter in message:
        if letter == 'a':
            a = a + 1
        elif letter == 'b':
            b = b + 1
        # ... cut

    return {
        'a': a,
        'b': b,
        'c': c,
        # ... cut
    }
########################

This is only a sketch.  We can see how this would go, if we fill in
the '...' with the obvious code.  But it would also be a very bad
approach.  It's highly repetitive, and easy to mistpe: you might miss
a letr.



There's a much better approach.  We can use a dictionary that maps
from a letter to its given frequency.

#########################
def histogram(message):
    frequencies = {}
    for letter in message:
        frequencies[letter] = frequencies.get(letter, 0) + 1
    return frequencies
#########################


Unlike the initial sketch, the version that takes advantage of
dictionaries is short and simple, and we can run it:

##############
>>> histogram('the quick brown fox')
{' ': 3, 'c': 1, 'b': 1, 'e': 1, 'f': 1, 'i': 1, 'h': 1, 'k': 1, 'o': 2, 'n': 1, 'q': 1, 'r': 1, 'u': 1, 't': 1, 'w': 1, 'x': 1}
>>> histogram('abacadabra')
{'a': 5, 'c': 1, 'b': 2, 'r': 1, 'd': 1}
##############
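

As an aside, the standard library already has a tool for this kind of
counting: collections.Counter.  Here is a minimal sketch of the same
function written with it, offered as an alternative rather than as a
replacement for understanding the loop above:

```python
from collections import Counter


def histogram(message):
    # Counter tallies how many times each element of an iterable occurs;
    # dict() converts the result back into a plain dictionary.
    return dict(Counter(message))


freq = histogram('abacadabra')
print(freq['a'], freq['b'])
```

The idea is the same either way: one dictionary keyed by letter,
instead of one variable per letter.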



If you have questions, please feel free to ask.

