itertools.flatten()? and copying generators/iterators.

Wed Oct 29 10:57:05 EST 2003

Alex Martelli wrote:

> The key thing is watching for difference in big-O behavior --
> and, here, we have none... just somewhat different (NOT
> by whole orders of magnitude!) multiplicative constants.
> THAT kind of performance issue is one we can live with
> more often than not!  Which, in turn, is what makes the
> ordinary use of exceptions generally tolerable from the
> point of view of performance, except in the very hottest
> bottlenecks...

You are right, but let me play a little with the code anyway:

def flatten_am(s):
    try:
        s+''
    except TypeError:
        try: it = iter(s)
        except TypeError:
            yield s
        else:
            for elem in s:
                for subelem in flatten_am(elem):
                    yield subelem
    else:
        yield s

def flatten_dict(s, isScalar):
    try:
        scalar = isScalar[s.__class__]
    except KeyError:
        t = s.__class__
        try:
            s = iter(s)
        except TypeError:
            scalar = isScalar[t] = True
        else:
            scalar = isScalar[t] = False

    if scalar:
        yield s
    else:
        for elem in s:
            for subelem in flatten_dict(elem, isScalar):
                yield subelem

# an example tree -- a mix of strings, items, some nesting
subtree = ['foo']*18 + [1,2,3]*6
tree = [ subtree*10, [ subtree * 8 ] ]

def pro_am():
    for f in flatten_am(tree): pass

def pro_dict():
    for f in flatten_dict(tree, {"".__class__: True}): pass

assert list(flatten_am(tree)) == list(flatten_dict(tree, {"".__class__:
True}))

This gives for flatten_am()
...> timeit.py -c -s"import fl" "fl.pro_am()"
100 loops, best of 3: 5.7e+03 usec per loop

versus

...> timeit.py -c -s"import fl" "fl.pro_dict()"
1000 loops, best of 3: 1.37e+03 usec per loop

for flatten_dict(). So you get some speed up at the cost of uglier though
still quite readable code and under the assumption that either all or no
instances of a class are iterable.
Of course the idea works with both isinstance() and a + "" but seems
somewhat more in the spirit of isinstance().

Peter