[Numpy-discussion] "Extended" Outer Product

Timothy Hochberg tim.hochberg at ieee.org
Tue Aug 21 17:14:00 EDT 2007


On 8/21/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 21/08/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
> > This is just a general comment on recent threads of this type and not
> > directed specifically at Chuck or anyone else.
> >
> > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> > often more memory friendly and thus faster to vectorize only the inner
> > loop and leave outer loops alone. Everything varies with the specific case
> > of course, but trying to avoid FOR loops on principle is not a good
> > strategy.
>
> Yes and no. From a performance point of view, you are certainly right;
> vectorizing is definitely not always a speedup. But for me, the main
> advantage of vectorized operations is generally clarity: C = A*B is
> clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's
> not clearer and simpler, I feel no compunction about falling back to
> list comprehensions and for loops.


I always assume that performance is what drives these questions. It would be
straightforward to code an outer equivalent in Python that hides the loops,
for anyone who cares only about clarity. Since no one who asks these
questions ever does that, I assume they must be primarily motivated by
performance.
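
As a rough illustration of what such an outer equivalent might look like, here is a minimal sketch. The names outer_apply and g are hypothetical and not part of numpy, and the sketch assumes f is already vectorized over its second argument. It keeps an explicit Python loop over the first input and vectorizes only the inner pass over the second.

```python
import numpy as np

def outer_apply(f, a, b):
    """Hypothetical helper: result[i, j] = f(a[i], b[j]).

    Assumes f accepts a scalar first argument and a 1-D array second
    argument, i.e. f is already vectorized over its second input.
    """
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    # Explicit Python loop over a; each iteration is one vectorized pass over b.
    return np.array([f(x, b) for x in a])

def g(x, y):
    # Example function built from numpy operations only.
    return np.sin(x * y + x**2)

a = np.linspace(0.0, 1.0, 4)
b = np.linspace(0.0, 2.0, 5)
result = outer_apply(g, a, b)  # 2-D array with one row per element of a
```

For a function like g, broadcasting with g(a[:, None], b[None, :]) gives the same values without the helper; the loop version only ever holds one row of temporaries at a time.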

> That said, it would often be nice to have something like
> map(f,arange(10)) for arrays; the best I've found is
> vectorize(f)(arange(10)).
>
> vectorize, of course, is a good example of my point above: it really
> just loops, in python IIRC,


I used to think that too, but then I looked at it and I believe it actually
grabs the code object out of the function and loops in C. You still have to
run the code object at each point, though, so it's not that fast. It's been a
while since I looked, so I may be totally wrong.
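
Wherever the loop itself lives, the per-element cost is easy to observe. The sketch below uses a hypothetical function, counted_double, that records every invocation, to check that the wrapped function still runs at least once per array element. It makes no claim about how vectorize is implemented internally.

```python
import numpy as np

calls = []  # one entry is appended per invocation of the Python function

def counted_double(x):
    # Hypothetical scalar function, used only to count invocations.
    calls.append(x)
    return x * 2.0

vf = np.vectorize(counted_double)
out = vf(np.arange(10))

# The Python function runs at least once for every element of the input.
n_calls = len(calls)
```

The count can exceed the number of elements, since vectorize may make an extra call to work out the output type when otypes is not given.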

> but conceptually it's extremely handy for
> doing exactly what the OP wanted. Unfortunately vectorize() does not
> yield a sufficiently ufunc-like object to support .outer(), as that
> would be extremely tidy.


I suppose someone should fix that someday. However, I still think vectorize
is an attractive nuisance in the sense that someone has a function that they
want to apply to an array and they get sucked into throwing vectorize at the
problem. More often than not, vectorize makes things slower than they need
to be. If you don't care about performance, that's fine, but I live in fear
of code like:

   from numpy import sin, vectorize

   def f(a, b):
       return sin(a*b + a**2)
   f = vectorize(f)  # unnecessary: f already accepts arrays

The original function f is a perfectly acceptable vectorized function
(assuming one uses numpy.sin), but now it's been replaced by a slower
version by passing it through vectorize. To be sure, this isn't always the
case; in cases where the function has to make choices (branching on the
values of its inputs), things get messier. Still, I'm not convinced that
vectorize doesn't hurt more than it helps.
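
A quick way to check this on a given machine is to run both versions on the same inputs. This is a minimal sketch: the names f_plain and f_wrapped and the array sizes are arbitrary choices for illustration, and the timings it prints will vary from system to system.

```python
import timeit
import numpy as np

def f_plain(a, b):
    # Built only from numpy operations, so it already works elementwise.
    return np.sin(a * b + a**2)

# The same function pushed through vectorize, as in the snippet above.
f_wrapped = np.vectorize(f_plain)

a = np.linspace(0.0, 1.0, 10_000)
b = np.linspace(1.0, 2.0, 10_000)

# Both versions should agree on the values they produce.
same = np.allclose(f_plain(a, b), f_wrapped(a, b))

# Time a few repetitions of each; only the relative size matters.
t_plain = timeit.timeit(lambda: f_plain(a, b), number=3)
t_wrapped = timeit.timeit(lambda: f_wrapped(a, b), number=3)
print(same, t_plain, t_wrapped)
```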



-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org