Re: [Numpy-discussion] [C++-sig] Overloading sqrt(5.5)*myvector
Roman Yakovenko wrote:
I don't have a "small and complete" example available, but I'll summarize from earlier posts.

VPython (vpython.org) has its own vector class to mimic the properties of 3D vectors used in physics, in the service of easy creation of 3D animations. There is a beta version which imports numpy and uses it internally; the older production version uses Numeric. The Boost python and thread libraries are used to connect the C++ VPython code to Python. There is operator overloading that includes scalar*vector and vector*scalar, both producing vector.

With Numeric, sqrt produced a float, which was a scalar for the operator overloading. With numpy, sqrt produces a numpy.float64, which is caught by vector*scalar but not by scalar*vector, so scalar*vector produces an ndarray rather than a vector, which leads to a big performance hit in existing VPython programs. The overloading and Boost code is the same in the VPython/Numeric and VPython/numpy versions. I don't know whether the problem is with numpy, with Boost, or with the combination of the two.

Here is the relevant part of the vector class:

```cpp
inline vector operator*( const double s) const throw()
{ return vector( s*x, s*y, s*z); }
```

and here is the free function for right multiplication:

```cpp
inline vector operator*( const double& s, const vector& v)
{ return vector( s*v.x, s*v.y, s*v.z); }
```

Maybe the problem is in the Boost definitions:

```cpp
py::class_<vector>("vector", py::init< py::optional<double, double, double> >())
    .def( self * double())
    .def( double() * self)
```

Left multiplication is fine, but right multiplication isn't. A colleague suggested the following Boost declarations but cautioned that he wasn't sure of the syntax for referring to operator*, and indeed this doesn't compile:

```cpp
.def( "__mul__", &vector::operator*(double), "Multiply vector times scalar")
.def( "__rmul__", &operator*(const double&, const vector&), "Multiply scalar times vector")
```

I would really appreciate a Boost or numpy expert being able to tell me what's wrong (if anything) with these forms.

However, I may have a useful workaround, as I described in a post to the numpy discussion list. A colleague suggested that I do something like this for sqrt and other such mathematical functions:

```python
def sqrt(x):
    try:
        return mathsqrt(x)
    except TypeError:
        return numpysqrt(x)
```

That is, first try the simple case of a scalar argument, handled by the math module's sqrt, and only use the numpy sqrt routine in the case of an array argument. Even with the overhead of the try/except machinery, one gets much faster square roots for scalar arguments this way than with the numpy sqrt.

Bruce Sherwood
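[A self-contained sketch of that workaround, for readers of the archive. The mathsqrt/numpysqrt names above are just local aliases for the two library functions; this version spells out the imports but is otherwise the same idea, not the actual VPython source.]

```python
# Minimal, self-contained sketch of the try/except workaround described above.
# The alias names are illustrative; the real VPython code may differ.
from math import sqrt as mathsqrt
from numpy import sqrt as numpysqrt

def sqrt(x):
    try:
        # math.sqrt handles plain scalars (and numpy scalars) quickly...
        return mathsqrt(x)
    except TypeError:
        # ...and anything it rejects (arrays, lists) falls back to numpy.
        return numpysqrt(x)
```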
Okay, I've implemented the scheme below that was proposed by Scott Daniels on the VPython mailing list, and it solves my problem. It's also much faster than using numpy directly: even with the "def" and "if" overhead, sqrt(scalar) is over 3 times faster than the numpy sqrt, and sqrt(array) is very nearly as fast as the numpy sqrt. Thanks to those who made suggestions.

There remains the question of why operator overloading of the kind I've described worked with Numeric and Boost but not with numpy and Boost. There is also the question of whether it would pay for numpy to make what is probably an exceedingly fast check and do much faster calculations of sqrt(scalar) and other such mathematical functions.

Bruce Sherwood
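[The "scheme below" is not reproduced on this archive page. Based on the description of a "def" and an "if", it presumably dispatches on the argument type rather than using try/except; the following is only a guess at its shape, and the exact test used on the VPython list may differ.]

```python
import math
import numpy

def sqrt(x):
    # Guess at the type-dispatching scheme: plain Python floats go to the
    # much faster math.sqrt, everything else (arrays, numpy scalars,
    # sequences) goes to numpy.sqrt.
    if isinstance(x, float):
        return math.sqrt(x)
    return numpy.sqrt(x)
```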
Bruce Sherwood wrote:
There is no question that it would pay. It takes time and effort to implement, though. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Robert Kern wrote:
There is a question of whether it would pay enough to be worth the time and effort. For most use (at least most of my use), I'm either working with a small number of scalars, and performance is a non-issue, or I'm working with arrays of data, and then I need the numpy versions. So I really don't see the point of optimizing for scalar arguments. IIUC, the issue at hand involved not a desire for a faster sqrt(scalar), but rather an interaction with C++ code that didn't efficiently deal with numpy scalars -- that's a totally different issue, one that may have been solved by using math.sqrt() to avoid the numpy scalar, but not one that provides an argument for cluttering up numpy with this sort of special casing.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice / (206) 526-6329 fax / (206) 526-6317 main reception
Chris.Barker@noaa.gov
Bruce Sherwood wrote:
Using math.sqrt short-circuits the ufunc approach of returning numpy scalars (with all the methods and attributes of 0-d arrays --- which is really their primary reason for being). It is also faster because it avoids the numpy ufunc machinery (which is some overhead --- the error setting control and broadcasting facility doesn't happen for free).
Thus, when the numpy.float64 comes first, its multiply implementation gets called first, and it uses the equivalent of ufunc.multiply(scalar, vector). I suspect that because your vector can be converted to an array, this procedure works (albeit more slowly than you would like), and so the vector object never gets a chance to try. A quick fix is to make it so that ufunc.multiply(scalar, vector) raises NotImplemented, which may not be desirable, either. Alternatively, the generic scalar operations should probably not be so "inclusive" and should allow the other object a chance to perform the operation more often (by returning NotImplemented). -Travis O.
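[In plain-Python terms, the NotImplemented mechanism described above works like this. The sketch below is a toy illustration, not numpy's actual implementation: if the left operand's __mul__ declines by returning NotImplemented, Python falls back to the right operand's __rmul__, which is exactly the chance the VPython vector never gets when the numpy scalar handles the multiplication itself.]

```python
class Scalar:
    """Toy stand-in for a numpy scalar; not numpy's real implementation."""
    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        if isinstance(other, (int, float)):
            return Scalar(self.value * other)
        # Declining here lets Python try other.__rmul__(self) instead.
        # An "inclusive" __mul__ that converted `other` to an array here
        # would win every time, which is the behavior described above.
        return NotImplemented

class Vec3:
    """Toy stand-in for the VPython vector."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __rmul__(self, s):
        s = s.value if isinstance(s, Scalar) else s
        return Vec3(s * self.x, s * self.y, s * self.z)

print(type(Scalar(2.0) * Vec3(1, 2, 3)))   # <class '__main__.Vec3'>
```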
On Jan 7, 2008, at 19:57, Travis E. Oliphant wrote:
That would be great. In fact, this has been (and still is) the #1 problem for me in moving from Numeric to NumPy: as soon as I have NumPy scalar objects circulating in my programs, everything that behaves like a sequence gets converted to an array in math operations.

Konrad.

--
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: hinsen@cnrs-orleans.fr
participants (5)

- Bruce Sherwood
- Christopher Barker
- Konrad Hinsen
- Robert Kern
- Travis E. Oliphant