[Python-Dev] Suggestion for a new built-in - flatten
glyph at divmod.com
Fri Sep 22 21:29:28 CEST 2006
On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>I have a suggestion for a new Python built in function: 'flatten'.
This seems superficially like a good idea, but I think adding it to Python anywhere would do a lot more harm than good. I can see that consensus is already strongly against a builtin, but I think it would be bad to add it to itertools too.
Flattening always *seems* to be a trivial and obvious operation. "I just need something that takes a group of deeply structured data and turns it into a group of shallowly structured data." Everyone who has this requirement assumes that their list of implicit requirements for "flattening" is the obviously correct one.
This wouldn't be a problem except that everyone has a different idea of those requirements :).
Here are a few issues.
What do you do when you encounter a dict? You can treat it as its keys(), its values(), or its items().
What do you do when you encounter an iterable object?
What order do you flatten set()s in? (and, ha ha, do you Set the same?)
How are user-defined flattening behaviors registered? Is it a new special method, a registration API?
How do you pass information about the flattening in progress to the user-defined behaviors?
If you do something special to iterables, do you special-case strings? Why or why not?
What do you do if you encounter a function? This is kind of a trick question, since Nevow's "flattener" *calls* functions as it encounters them, then treats the *result* of calling them as further input.
If you don't think that functions are special, what about *generator* functions? How do you tell the difference? What about functions that return generators but aren't themselves generators? What about functions that return non-generator iterators? What about pre-generated generator objects (if you don't want to treat iterables as special, are generators special?).
Do you produce the output as a structured list or an iterator that works incrementally?
Also, at least Nevow uses "flatten" to mean "serialize to bytes", not "produce a flat list", and I imagine at least a few other web frameworks do as well. That starts to get into encoding issues.
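To make the disagreement concrete, here is a minimal sketch of two flatteners, each of which would look "obviously correct" to its author. The names flatten_sequences and flatten_iterables are made up for illustration; neither is an existing API or a proposal, and the code assumes a current Python 3.

```python
def flatten_sequences(obj):
    """Descend into lists and tuples only; everything else is a leaf."""
    if isinstance(obj, (list, tuple)):
        for item in obj:
            yield from flatten_sequences(item)
    else:
        yield obj


def flatten_iterables(obj):
    """Descend into anything iterable, except strings and bytes."""
    if isinstance(obj, (str, bytes)):
        yield obj  # special-cased so we do not recurse into characters
        return
    try:
        iterator = iter(obj)
    except TypeError:
        yield obj  # not iterable, so it is a leaf
        return
    for item in iterator:
        yield from flatten_iterables(item)


data = [1, ("a", [2.5]), {"k": "v"}, "xy"]
by_sequence = list(flatten_sequences(data))
by_iterable = list(flatten_iterables(data))
```

The two disagree on the dict: the first yields it whole, while the second quietly reduces it to its keys and throws the values away. Neither is wrong; they just encode different answers to the questions above.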
If you make a decision one way or another on any of these questions of policy, you are going to make flatten() useless to a significant portion of its potential userbase. The only difference between having it in the standard library and not is that if it's there, they'll first spend an hour being confused by the weird way that it deals with <insert your favorite data type here> rather than just doing the "obvious" thing, and then they'll take a minute to write the 10-line function that they need. Without it in the standard library, they'll skip to step 2 and save a lot of time.
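For a sense of scale, the kind of function meant here might look like the following. This is only one person's version, assuming the requirement is "nested lists, nothing else"; the name flatten_lists is invented for the example, and tuples, strings and dicts are deliberately left alone as leaves.

```python
def flatten_lists(nested):
    """Return a flat list of the leaves of arbitrarily nested lists."""
    flat = []
    stack = [iter(nested)]  # explicit stack, so deep nesting cannot hit the recursion limit
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                stack.append(iter(item))  # descend, resume the parent later
                break
            flat.append(item)
        else:
            stack.pop()  # current level exhausted
    return flat
```

Someone whose data uses tuples, or who wants the result produced incrementally, would write a different dozen lines, which is rather the point.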
I would love to see a unified API that figured out all of these problems, and put them together into a (non-stdlib) library that anyone interested could use for a few years to work the kinks out. Although it might be nice to have a simple "flatten" interface, I don't think that it would ever be simple enough to stick into a builtin; it would just be the default instance of the IncrementalDestructuringProcess class with the most popular (as determined by polling users of the library after a year or so) IncrementalDestructuringTypePolicy.
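As a very rough sketch of what "policy" could mean in such a library, the flattener below takes its per-type decisions as an argument instead of baking them in. Everything here is hypothetical: flatten, values_policy and keys_policy are names invented for this example, this is not how Nevow's flattener works, and it ignores most of the questions above (functions, generators, passing state to user-defined behaviors).

```python
def flatten(obj, policy):
    """Incrementally yield the leaves of obj.

    policy maps a type to a function returning the children of an
    instance of that type; objects matching no entry are leaves.
    """
    for typ, children in policy.items():
        if isinstance(obj, typ):
            for child in children(obj):
                yield from flatten(child, policy)
            return
    yield obj


# One possible policy: dicts contribute their values, sets are sorted first.
values_policy = {
    list: iter,
    tuple: iter,
    dict: lambda d: d.values(),
    set: sorted,
}

# Another: dicts contribute their keys, sets are left alone as leaves.
keys_policy = {
    list: iter,
    tuple: iter,
    dict: lambda d: d.keys(),
}

data = [1, {"a": (2, 3)}, {5, 4}]
with_values = list(flatten(data, values_policy))
with_keys = list(flatten(data, keys_policy))
```

Same input, same function, two different results depending on the policy, and a real library would need many more knobs than this.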