[Python-Dev] Suggestion for a new built-in - flatten

Sat Sep 23 02:35:04 CEST 2006

On Fri, 22 Sep 2006 20:55:18 +0100, Michael Foord <fuzzyman at voidspace.org.uk> wrote:

>glyph at divmod.com wrote:
>>On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord 
>><fuzzyman at voidspace.org.uk> wrote:

>>This wouldn't be a problem except that everyone has a different idea of 
>>those requirements:).

You didn't really address this, and it was my main point.  In fact, you more or less made my point for me.  You just assume that the type of application you have in mind right now is the only one that wants to use a flatten function, and dismiss out of hand any uses that I might have in mind.

>If you consume iterables, and only special case strings - then none of the 
>issues you raise above seem to be a problem.

You have just made two major policy decisions about the flattener without presenting a specific use case or set of use cases it is meant to be restricted to.

For example, you suggest special casing strings.  Why?  Your guideline otherwise is to follow what the iter() or list() functions do.  What about user-defined classes which subclass str and implement __iter__?
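
To make the str-subclass question concrete, here is a rough sketch of the "consume iterables, special-case strings" rule. Nothing in it comes from an actual proposal: `flatten` and `Words` are hypothetical names, and the `isinstance` check is just one assumed way to do the special-casing.

```python
def flatten(obj):
    """Hypothetical flattener: recurse into iterables, treat str as an atom."""
    if isinstance(obj, str):
        # Assumed special case: any str (including subclasses) is an element.
        yield obj
        return
    try:
        it = iter(obj)
    except TypeError:
        # Not iterable, so it is an element.
        yield obj
        return
    for item in it:
        yield from flatten(item)


class Words(str):
    """Hypothetical str subclass that iterates over words, not characters."""

    def __iter__(self):
        return iter(self.split())


nested = [1, [2, "ab"], Words("x y")]
flat = list(flatten(nested))         # the Words instance comes through whole
as_list = list(Words("x y"))         # list() honours the custom __iter__
```

With this particular special case the subclass's `__iter__` is silently ignored, while `list()` honours it. Checking `type(obj) is str` instead would flip that behaviour, and either choice is a policy decision.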

>Sets and dictionaries are both iterable.
>
>If it's not iterable it's an element.
>
>I'd prefer to see this as a built-in, lots of people seem to want it. IMHO

Can you give specific examples?  The only significant use of a flattener I'm intimately familiar with (Nevow) works absolutely nothing like what you described.

>Having it in itertools is a good compromise.

No need to compromise with me.  I am not in a position to reject your change.  No particular reason for me to make any concessions either: I'm simply trying to communicate the fact that I think this is a terrible idea, not come to an agreement with you about how progress might be made.  Absolutely no changes on this front are A-OK by me :).

You have made a case for the fact that, perhaps, you should have a utility library which all your projects could use for consistency and to avoid repeating yourself, since you have a clearly defined need for what a flattener should do.  I haven't read anything that indicates there's a good reason for this function to be in the standard library.  What are the use cases?

It's definitely better for the core language to define lots of basic types so that you can say something in a library like "returns a dict mapping strings to ints" without having a huge argument about what "dict" and "string" and "int" mean.  What's the benefit to having everyone flatten things the same way, though?  Flattening really isn't that common of an operation, and in the cases where it's needed, a unified approach would only help if you had two flattenable data-structures from different libraries which needed to be combined.  I can't say I've ever seen a case where that would happen, let alone for it to be common enough that there should be something in the core language to support it.

>>What do you do if you encounter a function?  This is kind of a trick 
>>question, since Nevow's "flattener" *calls* functions as it encounters 
>>them, then treats the *result* of calling them as further input.
>>
>Sounds like not what anyone would normally expect.

Of course not.  My point is that there is nothing that anyone would "normally" expect from a flattener except a few basic common features.  Bob's use-case is completely different from yours, for example: he's talking about flattening to support high-performance I/O.
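
As an illustration only, a flattener that calls functions and feeds their results back in might look roughly like this. It is not Nevow's code or API. `flatten_calling` is a made-up name, and restricting recursion to lists and tuples is an assumption made to keep the sketch short.

```python
def flatten_calling(obj):
    """Hypothetical flattener that calls callables and flattens their results."""
    if callable(obj):
        # Call it, then treat whatever it returns as further input.
        yield from flatten_calling(obj())
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from flatten_calling(item)
    else:
        # Everything else (strings, numbers, ...) is an element.
        yield obj


tree = [1, lambda: [2, lambda: "x"], 3]
result = list(flatten_calling(tree))
```

A flattener built on "if it's not iterable it's an element" would hand the two lambdas back untouched, so the two designs disagree on the same input.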

>What does the list constructor do with these ? Do the same.

>>> list('hello')
['h', 'e', 'l', 'l', 'o']

What more can I say?
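
Taking "do what list() does" literally has a further consequence, sketched below with a hypothetical `naive_flatten` that follows `iter()` with no special cases. A one-character string iterates to itself, so the recursion has no base case.

```python
def naive_flatten(obj):
    """Hypothetical flattener that recurses into anything iter() accepts."""
    try:
        it = iter(obj)
    except TypeError:
        yield obj
        return
    for item in it:
        yield from naive_flatten(item)


# A single character is an iterable whose only element is itself.
single_char_is_its_own_element = list("h") == ["h"]

try:
    list(naive_flatten(["hello"]))
    outcome = "finished"
except RecursionError:
    outcome = "RecursionError"
```

Numbers are fine here, since `iter()` rejects them, but any string sends the sketch into unbounded recursion until the interpreter gives up.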

>>Do you produce the output as a structured list or an iterator that works 
>>incrementally?

>Either would be fine. I had in mind a list, but converting an iterator into 
>a list is trivial.

There are applications where this makes a big difference.  Bob, for example, suggested that this should only work on structures that support the PySequence_Fast operations.
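
One place the list-versus-iterator choice shows up, as a sketch under assumed names (`iter_flatten` and `counting_pairs` are hypothetical): an incremental flattener can be stopped partway through an unbounded input, whereas a version that builds the whole list first would never return.

```python
import itertools


def iter_flatten(obj):
    """Hypothetical incremental flattener; strings are treated as elements."""
    if isinstance(obj, str):
        yield obj
        return
    try:
        it = iter(obj)
    except TypeError:
        yield obj
        return
    for item in it:
        yield from iter_flatten(item)


def counting_pairs():
    """Unbounded input: (0, [1]), (2, [3]), (4, [5]), ..."""
    n = 0
    while True:
        yield (n, [n + 1])
        n += 2


# Take just the first five flattened elements; the rest is never computed.
first_five = list(itertools.islice(iter_flatten(counting_pairs()), 5))
```

Going the other way, a flattener restricted to concrete sequences can know lengths up front, which is the kind of trade-off that depends entirely on the application.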

>>Also, at least Nevow uses "flatten" to mean "serialize to bytes", not 
>>"produce a flat list", and I imagine at least a few other web frameworks do 
>>as well.  That starts to get into encoding issues.

>Not a use of the term I've come across. On the other hand I've heard of 
>flatten in the context of nested data-structures many times.

Nevertheless, the only respondent even mildly in favor of your proposal so far also mentions flattening sequences of bytes, although not quite as directly.

>I think that you're over complicating it and that the term flatten is really 
>fairly straightforward. Especially if it's clearly documented in terms of 
>consuming iterables.

And I think that you're over-simplifying.  If you can demonstrate that there is really a broad consensus that this sort of thing is useful in a wide variety of applications, then sure, I wouldn't complain too much.  But I've spent a LOT of time thinking about what "flattening" is, and several applications that I've worked on have very different ideas about how it should work, and I see very little benefit to unifying them.  That's just the work of one programmer; I have to assume that the broader domain of all applications which do structure flattening is much more diverse.