[Python-Dev] itertools.walk()

Bob Ippolito bob at redivi.com
Wed Mar 16 15:17:39 CET 2005


On Mar 16, 2005, at 8:37 AM, Nick Coghlan wrote:

> Bob Ippolito wrote:
>> On Mar 16, 2005, at 6:19, Raymond Hettinger wrote:
>>> Some folks on comp.lang.python have been pushing for itertools to
>>> include a flatten() operation.  Unless you guys have some thoughts on
>>> the subject, I'm inclined to accept the request.
>>>
>>> Rather than calling it flatten(), it would be called "walk" and provide
>>> a generalized capability to descend through nested iterables (similar to
>>> what os.walk does for directories).  The one wrinkle is having a
>>> stoplist argument to specify types that should be considered atomic
>>> even though they might be iterable (strings for example).
>> You could alternatively give them a way to supply their own "iter" 
>> function, like the code I demonstrate below:
>
> I think the extra flexibility ends up making the function harder to 
> comprehend and use. Here's a version with a simple stoplist:
...
> This makes it easy to reclassify certain things like dictionaries or 
> tuples as atomic elements.
>
> > # maybe there should be a bfswalk too?
>
> By putting the 'walk(subitr)' after the current itr when chaining?

Yeah

> If Raymond does decide to go for the flexible approach rather than the 
> simple one, then I'd vote for a full-featured approach like:

I don't mind that at all.  It's certainly convenient to have an easy 
stoplist.  The problem with the way you have implemented it is that 
strings will cause infinite recursion if you use the built-in iter 
(iterating a one-character string just yields that same string again), 
so if you provide your own atomic_types, you damn well better remember 
to add in basestring.
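
To make the failure mode concrete, here is a minimal sketch.  It is 
not the code from either of our messages: naive_walk is a hypothetical 
name, and it assumes a Python with "yield from" where str plays the 
role of basestring.

```python
def naive_walk(iterable, atomic_types=(str,)):
    """Recursively flatten nested iterables; atomic_types are leaves."""
    for item in iterable:
        if isinstance(item, atomic_types):
            yield item
            continue
        try:
            subitr = iter(item)
        except TypeError:
            yield item  # not iterable at all, so treat it as a leaf
        else:
            yield from naive_walk(subitr, atomic_types)


# With the default stoplist, strings are kept whole.
flat = list(naive_walk([1, [2, "ab"], (3,)]))

# Override the stoplist and forget str: "ab" yields "a", and iterating
# "a" yields "a" again, so the descent never bottoms out.
try:
    list(naive_walk(["ab"], atomic_types=(dict,)))
except RecursionError:
    hit_recursion_limit = True
else:
    hit_recursion_limit = False
```

Here flat comes out as [1, 2, 'ab', 3], and the second call dies with 
a RecursionError rather than returning.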

> def walk(iterable, depth_first=True, atomic_types=(basestring,),
>          iter_factory=iter):
>   itr = iter(iterable)
>   while True:
>     for item in itr:
>       if isinstance(item, atomic_types):
>         yield item
>         continue
>       try:
>         subitr = iter_factory(item)
>       except TypeError:
>         yield item
>       else:
>         if depth_first:
>           itr = chain(walk(subitr), itr)
>         else:
>           itr = chain(itr, walk(subitr))
>         break
>     else:
>       break

I'm not sure why it's useful to explode the stack with all that 
recursion.  Mine didn't do that.  The control flow is nearly identical, 
but your version looks more fragile (and you would get some really evil 
stack trace if iter_factory(foo) happened to raise something other than 
TypeError).
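
For illustration only, one way to avoid the recursion is to keep an 
explicit stack of iterators.  This is a sketch under assumptions, not 
the code from my earlier message: walk_iterative is a hypothetical 
name, str stands in for basestring, and only depth-first order is 
handled.

```python
def walk_iterative(iterable, atomic_types=(str,), iter_factory=iter):
    """Depth-first flatten using an explicit stack instead of recursion."""
    stack = [iter(iterable)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, atomic_types):
                yield item
                continue
            try:
                subitr = iter_factory(item)
            except TypeError:
                yield item  # not iterable, so treat it as a leaf
            else:
                # Descend: suspend the current iterator, work on the child.
                stack.append(subitr)
                break
        else:
            # Top iterator is exhausted; resume its parent.
            stack.pop()


# Nest a list far deeper than the default recursion limit.
deep = ["leaf"]
for _ in range(5000):
    deep = [deep]

shallow_result = list(walk_iterative([1, [2, [3, "ab"]], (4,)]))
deep_result = list(walk_iterative(deep))
```

Since there is only ever one generator frame, nesting depth costs list 
entries rather than stack frames, and anything other than TypeError 
raised by iter_factory propagates out of a single frame.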

-bob


