Linear Time Tree Traversal Generator
Steve D'Aprano
steve+python at pearwood.info
Tue Sep 20 22:34:13 EDT 2016
On Wed, 21 Sep 2016 02:46 am, ROGER GRAYDON CHRISTMAN wrote:
> I am trying to find a better (i.e. more efficient) way to implement a
> generator that traverses a tree.
>
> The current model of the code (which is also used by a textbook I am
> teaching from does this)
>
> def __iter__(node):
> for x in iter(node._left):
> yield x
> yield node._value
> for x in iter(node._right)
> yield x
>
> This is nice, simple, and straightforward, but has an O(n log n) running
> time, since
> values from the leaves of the tree have to be yielded multiple times to
> the top of the tree.
I'm afraid I don't understand this. This is a standard binary tree inorder
traversal. Each node is visited once, and there are N nodes, so I make that
out to be O(N) not O(N log N). I'm afraid I can't parse your final clause:
"since values from the leaves of the tree have to be yielded
multiple times to the top of the tree"
Each leaf is yielded once, and I don't understand what you mean by yielding
to the top of the tree.
I thought that maybe I missed something obvious, so I knocked up a quick and
dirty tree structure, monkey-patched the iter() built-in to count how many
times it was called, and ran your traversal code. Here's my code:
# ----- cut -----
class Node:
def __init__(self, value):
self._left = [] # Sentinel for an empty leaf.
self._right = []
self._value = value
def __iter__(node):
# By the way, you don't have to explicitly call iter() here.
for x in iter(node._left):
yield x
yield node._value
for x in iter(node._right):
yield x
def insert(tree, value):
if value < tree._value:
if tree._left == []:
# add new leaf node
tree._left = Node(value)
else:
insert(tree._left, value)
elif value > tree._value:
if tree._right == []:
tree._right = Node(value)
else:
insert(tree._right, value)
data = list(range(10000))
import random
random.shuffle(data)
tree = Node(data[0])
for value in data[1:]:
insert(tree, value)
_iter = iter # grab a reference to the built-in
count = 0
def iter(obj): # Monkey-patch the built-in
global count
count += 1
return _iter(obj)
assert list(tree) == sorted(data)
print(count)
# ----- cut -----
which prints 20000, as expected: for each node, iter() gets called twice.
So I don't understand where your O(N log N) behaviour is coming from.
--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
More information about the Python-list
mailing list