[Tutor] Recursively flatten the list

Peter Otten __peter__ at web.de
Thu Mar 24 09:09:28 CET 2011


Dharmit Shah wrote:

> I am learning Python and yesterday I cam across a definition wherein I was
> supposed to flatten a list recursively. I am getting the solution properly
> but wanted to know if I can optimize the code further.
> 
> #!/usr/bin/env python
> new_list=[]
> def flatten(num_list):
>     """
>       >>> flatten([2, 9, [2, 1, 13, 2], 8, [2, 6]])
>       [2, 9, 2, 1, 13, 2, 8, 2, 6]
>       >>> flatten([[9, [7, 1, 13, 2], 8], [7, 6]])
>       [9, 7, 1, 13, 2, 8, 7, 6]
>       >>> flatten([[9, [7, 1, 13, 2], 8], [2, 6]])
>       [9, 7, 1, 13, 2, 8, 2, 6]
>       >>> flatten([[5, [5, [1, 5], 5], 5], [5, 6]])
>       [5, 5, 1, 5, 5, 5, 5, 6]
>     """
>     global new_list
>     for i in num_list:
>         if type(i) == type([]):
>             new_list = flatten(i)
>         else:
>             new_list.append(i)
>     tmp = new_list
>     new_list=[]
>     return tmp
> 
> if __name__=="__main__":
>     import doctest
>     doctest.testmod()
> 
> PS - My knowledge of Python is still very basic and I am trying to dive
> into it as deeper as I can. Solutions on Stackoverflow.com were beyond my
> understandability.

You have a working solution, and you have tests to prove it. That's a good 
start!

Now let's think more tests to break it:

(1) Deeply nested lists:

items = []
for i in range(2000):
    items = [items] 
flatten(items)

Python has a limited stacksize, i. e. allows only for a limited number of 
nested function calls. There's not much you can do about that other than 
switch to a non-recursive implementation of flatten(). Most of the time it's 
not a problem in practice; you should just know about it.

(2) Lists that contain themselves:

items = []
items.append(items)
flatten(items)

This one would run forever were it not for the limited stack size. You are  
even less likely to run into that in practice, but it might still be an 
interesting challenge for you to come up with a solution for the problem.

(3) Multithreaded programs:

Because you use a global variable you cannot run two instances of the 
function simultaneously. This problem leads to intermittent bugs that are 
really hard to find. You should try to eliminate the global new_list 
variable from your flatten() function.

Unfortunately the demo is a bit messy, so don't waste too much time on 
trying to understand it. You can revisit it later if you like. Just remember 
the rule of thumb: don't use global variables if you can avoid them.

if __name__=="__main__":
    import threading
    def set_flattened(num_list, index):
        results[index] = flatten(num_list)

    A = [5, [5, [1, 5], 5], 5], [5, 6]
    B = [[50, [50, [10, 50], 50], 50], [50, 60]]
    EXPECTED = [flatten(A), flatten(B)]
    print "expected output:"
    print EXPECTED
    i = 1
    results = [None, None]
    while True:
        # run flatten() in two threads simultaneously...
        a = threading.Thread(target=set_flattened, args=(A, 0))
        b = threading.Thread(target=set_flattened, args=(B, 1))
                
        for t in a, b:
            t.start()
        for t in a, b:
            t.join()

        #... until we see an unexpected result
        if results != EXPECTED:
            print "The %dth attempt gave this preculiar result:" % i
            print results
            break
        i += 1




More information about the Tutor mailing list