[Python-Dev] Suggestion for a new built-in - flatten

Bob Ippolito bob at redivi.com
Fri Sep 22 22:34:23 CEST 2006


On 9/22/06, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Bob Ippolito" <bob at redivi.com> wrote:
> > On 9/22/06, Brian Harring <ferringb at gmail.com> wrote:
> > > On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote:
> > > > I think instead of adding a flatten function perhaps we should think
> > > > about adding something like Erlang's "iolist" support. The idea is
> > > > that methods like "writelines" should be able to take nested iterators
> > > > and consume any object they find that implements the buffer protocol.
> > >
> > > Which is no different than just passing in a generator/iterator that
> > > does flattening.
> > >
> > > Don't much see the point in gumming up the file protocol with this
> > > special casing; still will have requests for a flattener elsewhere.
> > >
> > > If flattening was added, should definitely be a general obj, not a
> > > special casing in one method in my opinion.
> >
> > I disagree, the reason for iolist is performance and convenience; the
> > required indirection of having to explicitly call a flattener function
> > removes some optimization potential and makes it less convenient to
> > use.
>
> Sorry Bob, but I disagree.  In the few times where I've needed to 'write
> a list of buffers to a file handle', I find iterating over the
> buffers to be sufficient.  And honestly, in all of my time dealing
> with socket and file IO, I've never needed to write a list of iterators
> of buffers.  Not to say that YAGNI, but I'd like to see an example where
> 1) it was being used in the wild, and 2) where it would be a measurable
> speedup.

The primary use for this is structured data, mostly file formats,
where you can't write the beginning until you have a bunch of
information about the entire structure such as the number of items or
the count of bytes when serialized. An efficient way to do that is
just to build a bunch of nested lists that you can use to calculate
the size (iolist_size(...) in Erlang) instead of having to write a
visitor that constructs a new flat list or writes to StringIO first. I
suppose in the most common case, for performance reasons, you would
want to restrict this to sequences only (as in PySequence_Fast)
because iolist_size(...) should be non-destructive (or else it has to
flatten into a new list anyway).
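
To make the idea above concrete, here is a minimal Python sketch, not
code from flashticle or from Erlang. The names iolist_size and
write_iolist, and the 4-byte length-prefixed record layout, are
assumptions chosen for illustration. Only lists and tuples are treated
as containers, so computing the size does not consume anything and the
same structure can be written out afterwards.

```python
import io
import struct


def iolist_size(iolist):
    """Return the total byte count of a nested sequence of bytes-like objects.

    Hypothetical helper, loosely modelled on Erlang's iolist_size.
    Only lists and tuples count as containers, so the walk is
    non-destructive.
    """
    if isinstance(iolist, (list, tuple)):
        return sum(iolist_size(item) for item in iolist)
    # Anything else must support the buffer protocol.
    return memoryview(iolist).nbytes


def write_iolist(fileobj, iolist):
    """Write every leaf buffer in order, without building a flat copy."""
    if isinstance(iolist, (list, tuple)):
        for item in iolist:
            write_iolist(fileobj, item)
    else:
        fileobj.write(iolist)


# Hypothetical record format: 4-byte big-endian length, then the body.
body = [b"tag", [b"\x00\x01", [b"payload"]], bytearray(b"!")]
record = [struct.pack(">I", iolist_size(body)), body]

out = io.BytesIO()
write_iolist(out, record)
```

The header can only be packed once the body's size is known, which is
why the body is kept as a nested structure until the final write.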

I've definitely done this before in Python, most recently here:
http://svn.red-bean.com/bob/flashticle/trunk/flashticle/

The flatten function in this case is flashticle.util.iter_only, and
it's used in flashticle.actions, flashticle.amf, flashticle.flv,
flashticle.swf, and flashticle.remoting.
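
For readers without access to that repository, a generic flattener of
the kind discussed in this thread could look roughly like the sketch
below. It is not the source of flashticle.util.iter_only; the name
flatten and its behaviour are assumptions made for illustration. Unlike
a sequence-only walk, it accepts arbitrary iterators, so it consumes
them as it goes.

```python
import io


def flatten(obj):
    """Yield the leaf buffers of arbitrarily nested iterables, in order.

    Hypothetical helper. Bytes-like leaves are checked first because
    iterating over bytes in Python 3 would yield integers.
    """
    if isinstance(obj, (bytes, bytearray, memoryview)):
        yield obj
    else:
        for item in obj:
            yield from flatten(item)


# Nested lists, tuples and one-shot iterators are all accepted.
nested = [b"a", (b"b", iter([b"c", [b"d"]]))]

out = io.BytesIO()
out.writelines(flatten(nested))
```

This is the "pass a flattening generator to writelines" approach from
earlier in the thread. It works, but it gives the writer no chance to
see the whole structure or its size up front.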

-bob

