Performance of list comprehensions vs. map
chrishbarker at home.net
Fri Sep 7 22:03:34 CEST 2001
Chris Barker wrote:
> and can't imagine how it could happen in a single thread.
Clearly my imagination was very lame when I wrote that!!!
Alex Martelli wrote:
> Let's not confuse *re-binding* of the sequence (which would have no effect
> on any of map, filter, reduce, list comprehension, for loops - each
> takes a *reference* to the sequence at the start, of course) with
> *alteration* (modification) of the sequence object.
I am starting to see a big hole in my idea for homogenous sequences...
Even if the sequence is homogenous when a function or somthing is called
( like map() ), it may not remain homogenous, so the interpreter would
have to check that there was only one reference to the object in order
to perform the optimisations I'm imagining. This wouldn't happen very
often. in fact, it could only work if you made the copy in the call:
map(fun, list[:]). That would kind of kill the kind of auto-optimizing
The only way for it to work would be to define a given sequence an
statically homogenous. These do exist: strings, for instance, and, in
fact, a homogenous tuple could not be modified either. As well as NumPy
arrays. (actually, I think you can change the type of a NumPy array on
the fly, at least with a C extension, but I don't think it can be done
So the idea of keeping a dynamic homogenous flag with lists and
dictionaries is probably pretty useless, but it may still make sense to
have such a flag for other sequences: ones that are either immutable, or
defined to be homogenous. With the native types, I suppose there is no
need for a flag, but with extension types, it could still be useful.
One more question: is it possible for a function to be mutable?? Skip
Montanaro gave a good example of how a function could be re-bound in the
middle of a for loop or list comprehesion, but is there any way a
function could be redefined without re-binding, as in:
when this is called, the function bound to the "fun" name is passed to
map, as is the tuple bound to the name "t". The tuple is immutable, and
let's pretend it were also marked as homogeous. If the function is also
immutable, than map could conceivably compile a version of fun that
worked on the type of the items in t, and then apply that function to
all the elements in t, without type checking, and looping at the speed
of C. This is the kind of optimization I am imagining.
Granted, some one would have to write the code to do this , but at the
moment I'm trying to determine whether it is even possible. Also,
apparently there are folks working on some kinds of JIT compilation that
could be used here.
Even in a for loop or list comprehension, you would have to do the extra
check to see if the function had been re-bound, but that's the only
check that would have to be done.
I'm also seeing this as something that could be used by Py2C type
approaches, or hand written extensions. You could write a function that
operates on a homogenous sequence of strings, for instance. Then the
only type checking you would have to do is check if you got such a
sequence. If the sequence was mutable, you could make a copy in your
function (A lot of NumPy extensions do that: build a PyArray from
whatever is input, if a PyArray is input, it doesn't need to get
changed) You would have a call analagous to NumPy's
PyArray_FromObject(), maybe PyHomogenousSequenceFromObject() that would
build it for you if it was not what was passed in.
This is not the least bit trivial to impliment, but it could be done bit
ChrisHBarker at home.net --- --- ---
http://members.home.net/barkerlohmann ---@@ -----@@ -----@@
------@@@ ------@@@ ------@@@
Oil Spill Modeling ------ @ ------ @ ------ @
Water Resources Engineering ------- --------- --------
Coastal and Fluvial Hydrodynamics --------------------------------------
More information about the Python-list