unbalanced tree iteration issue

kost BebiX k.bx at ya.ru
Fri Jan 14 08:26:59 EST 2011


14.01.2011, 14:15, "Alex Boyko" <alex.kyrish at gmail.com>:
> Dear All!
>
> I have deal with large unbalanced trees and I have to implement post-order tree traversal. My first attempt is shown below ("Node" and "Tree" classes) and based on recursive generators approach.
>
> class Node():
>     def __init__(self,value):
>         self.childs = []
>         self.value = value
>
> class Tree():
>
>     def __init__(self, root):
>         self.root = root
>         self.numberCells = 1
>
>     def add(self, node, child):
>         node.childs.append(child)
>         self.numberCells+=1
>
>     def __iter__(self):
>         return self.postorder(self.root)
>
>     def postorder(self, node):
>         if node:
>             for child in node.childs:
>                 for n in self.postorder(child):
>                     yield n
>             yield node
>
> It works fine for small test trees. But, my tree has approximately 300000 nodes, and shown post order traversal with generators takes 80 sec against 1 sec with simple recursive routine:
>
> def recursiveFromTop(node):
>     for child in node.childs:
>         recursiveFromTop(child)
>         ## here I can do some computations with current node's data
>
> So, I'd like to know how should I implement (if it's possible of course) __iter__ for my tree class based on recursion without generators? Please, can You show me the ways?
> because I'm very passionate in idea iterate through my tree with simple:
>
> for node in tree:
>    do something with node
>
> Thanks in Advance!
> Best Regards!
> Alex
>
> --
> http://mail.python.org/mailman/listinfo/python-list

Well, I think it's actually because the difference is that you would not do yielding, you would just put everything in memory and then return it.

ret_val = [x for x in self.postorder(child)]
return ret_val + [self]

or something like that (but beware of memory). But that's strange. This code works fast:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys

def w(s):
    sys.stdout.write("%s" % s)
    sys.stdout.flush()

class Node():
    __slots__ = ('childs', 'value',)
    
    def __init__(self, value):
        self.childs = []
        self.value = value

    def post_order(self):
        for child in self.childs:
            yield child
        yield self

def build_tree():
    def append_1000_childs(node):
        for i in xrange(20):
            node.childs.append(Node(10))
    
    def append_n_levels(node, levels=1):
        if levels >= 1:
            append_1000_childs(node)
            if levels > 1:
                for child in node.childs:
                    append_n_levels(child, levels - 1)
    
    root = Node(10)
    append_n_levels(root, 5)
    return root
    
if __name__ == '__main__':
    from datetime import datetime

    w("building tree...")
    _t = datetime.now()
    root = build_tree()
    w("done\n")
    w(datetime.now() - _t)
    w("\n")
    
    w("doing generator post_order...")
    _t = datetime.now()
    for item in root.post_order():
        fake = item.value
    w("done\n")
    w(datetime.now() - _t)
    w("\n")

    def post_order(root):
        for child in root.childs:
            post_order(child)
            fake = item.value
    
    w("doing non-generator post_order...")
    _t = datetime.now()
    post_order(root)
    w("done\n")
    w(datetime.now() - _t)
    w("\n")

$ python postorder.py 
building tree...done
0:01:34.422288
doing generator post_order...done
0:00:00.000018
doing non-generator post_order...done
0:00:01.232272

-- 
jabber: k.bx at ya.ru



More information about the Python-list mailing list