Itertools wishlists

Mon Mar 14 00:19:33 EST 2005

[bearophile]
> This was my suggestion for a possible flatten():
>
> flatten(sequence, level=-1, tuples=True, strings=False, safe=False)
> - tuples=True then it flattens tuples too.
> - strings=True then it flattens strings with len(s)>1 too.
> - safe if True it checks (with something like an iterative isrecursive)
>   for recursive references inside the sequence.
> - level allows to specify the mapping level:
>   level=0 no flattening.
>   level=1 the flattening is applied to the first level only.
>   level=2 the flattening is applied to the first and second level only.
>   level=m where m>=actual depth. This is the same as level=-1.
>   Etc.
>   And like in the indexing of lists:
>   level=-1 or None (default) means the flattening is applied up to the
>   leaves.
>   level=-2 flattens up to pre-leaves.
>   Etc.

My suggestion is to stop smoking Crack and check into rehab ;-)

Each one of the options listed is a reason that flatten() shouldn't be an
itertool.  It fails tests of obviousness, learnability, complexity of
implementation, and simplicity of API.  The options also suggest that the
abstraction is not as basic or universal as we would hope.

> > And, there is also the issue of use cases. It appears to be much more
> > fun to toy around with developing flatten() recipes than it is to work
> > on applications that require it.
>
> It's not easy to define "require" because usually there are many ways
> to solve every problem.

Perhaps "require" was the wrong word.  The issue is that there appear to be very
few real situations where flatten() would be the tool of choice.

> There are two situations that I've found can make use of the
> flatten/partition, but you can probably find better ways to do the same
> thing (and I can appreciate suggestions):
>
> 1)
> The function f returns two values, but you need a flat list as result:
> def f(x): return x, x**2
> r = flatten( f(i) for i in range(10) )
> print r

This is not a good way to store the function's results.  It unnecessarily throws
away structure.  Unless you are writing some type of serialization function, it is
likely the wrong thing to do.  Also, the example appears to be contrived and not
representative of real code.  If there is actually a need to treat multiple
return values as being undifferentiated, then your alternate solution with
extend() is the most appropriate solution.
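
For illustration, here is a minimal sketch of what an extend()-based version
might look like.  The extend() code being referred to is not quoted in this
message, so the loop below is an assumption about its shape, and the name
`pairs` is hypothetical.  It also shows the structure-preserving alternative
for comparison.

```python
def f(x):
    # Same example function as in the quoted message.
    return x, x ** 2

# Flat result built with extend(): no flatten() helper required.
r = []
for i in range(10):
    r.extend(f(i))

# Keeping each (x, x**2) pair intact preserves the structure instead.
pairs = [f(i) for i in range(10)]

print(r[:6])
print(pairs[:3])
```

The flat list interleaves inputs and squares, so any later code has to know
that convention, whereas `pairs` carries it in the data.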

> 2)
> A file with columns of numbers separated by a space/tab:
. . .
> ll = open("namefile").read().splitlines()
> r = [map(float, l.split()) for l in ll]

If you need that to be flattened one level, it would have been better to do all
the splits at once:

   # split on tabs, spaces, and newlines
   r = map(float, open('namefile').read().split())
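
As a rough, self-contained sketch of the difference, the snippet below uses an
in-memory string in place of the file (the sample data and the names `text`,
`rows`, and `flat` are made up for illustration).  It uses list comprehensions
so that it behaves the same on Python versions where map() returns an iterator.

```python
# Hypothetical sample standing in for the contents of the data file.
text = "1.0 2.0\t3.0\n4.5 5.5\t6.5\n"

# Line-by-line parsing produces one list per row (nested structure).
rows = [[float(tok) for tok in line.split()] for line in text.splitlines()]

# A single split() on all whitespace (spaces, tabs, newlines) never
# introduces the nesting, so there is nothing to flatten afterwards.
flat = [float(tok) for tok in text.split()]

print(rows)
print(flat)
```

Which form is appropriate depends on whether the row boundaries mean anything
to the rest of the program.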

Generalizing the two results, it may be fair to say that the desire to flatten
is a code smell indicating that structure is being unnecessarily destroyed or
that earlier processing introduced unwanted structure.  Let the data guide the
programming.

Raymond Hettinger