Accessing real and imaginary parts of npy_complex(nbits)
Hi, This is my first look at the numpy internals, so I'm trying to help update pytensor to be compatible with numpy 2.0, and we have some structs that inherit from npy_complex64 and npy_complex128, which currently use .real and .imag to access real and imaginary parts. Assuming a 64 bit system, it is easy to update the code using npy_crealf (etc) for the struct that inherits from npy_complex64, and so on. (I'll put an example of the code at the bottom of the post.) Since numpy doesn't assume a 64 bit system, I'm writing some aliases for npy_crealf etc., depending on NPY_SIZEOF_FLOAT etc. I'm wondering if there is a smarter way to do this. Also, the pytensor code redefines complex arithmetic in terms of the standard "math" definitions. For C99 complex types, this can be achieved using #pragma STDC CX_LIMITED_RANGE (in theory, but really depending on the compiler). Is there any way to ask numpy to use this directive? (Googling says this makes complex arithmetic 3-5 times faster. The C99 complex arithmetic tries to avoid overflow, and this directive is only recommended if you know that won't be an issue. I don't know if we can assume that, but this code has been this way since it was added to Theano.) Example code: struct pytensor_complex64 : public npy_complex64 { typedef pytensor_complex64 complex_type; typedef npy_float32 scalar_type; complex_type operator+(const complex_type &y) const { complex_type ret; // ret.real = this->real + y.real; // ret.imag = this->imag + npy_cimagf(y); npy_csetrealf(&ret, npy_crealf(*this) + npy_crealf(y)); npy_csetimagf(&ret, npy_cimagf(*this) + npy_cimagf(y)); return ret; } complex_type operator-() const { complex_type ret; npy_csetrealf(&ret, -npy_crealf(*this)); npy_csetimagf(&ret, -npy_cimagf(*this)); return ret; } bool operator==(const complex_type &y) const { return (npy_crealf(*this) == npy_crealf(y)) && (npy_cimagf(*this) == npy_cimagf(y)); } bool operator==(const scalar_type &y) const { return (npy_crealf(*this) == y) && (npy_crealf(*this) == 0); } complex_type operator-(const complex_type &y) const { complex_type ret; npy_csetrealf(&ret, npy_crealf(*this) - npy_crealf(y)); npy_csetimagf(&ret, npy_cimagf(*this) - npy_cimagf(y)); return ret; } complex_type operator*(const complex_type &y) const { complex_type ret; npy_csetrealf(&ret, npy_crealf(*this) * npy_crealf(y) - npy_cimagf(*this) * npy_cimagf(y)); npy_csetimagf(&ret, npy_crealf(*this) * npy_cimagf(y) + npy_cimagf(*this) * npy_crealf(y)); return ret; } complex_type operator/(const complex_type &y) const { complex_type ret; scalar_type y_norm_square = npy_crealf(y) * npy_crealf(y) + npy_cimagf(y) * npy_cimagf(y); npy_csetrealf(&ret, (npy_crealf(*this) * npy_crealf(y) + npy_cimagf(*this) * npy_cimagf(y)) / y_norm_square); npy_csetimagf(&ret, (npy_crealf(*this) * npy_crealf(y) - npy_cimagf(*this) * npy_cimagf(y)) / y_norm_square); return ret; } template <typename T> complex_type &operator=(const T &y); pytensor_complex64() {} template <typename T> pytensor_complex64(const T &y) { *this = y; } template <typename TR, typename TI> pytensor_complex64(const TR &r, const TI &i) { npy_csetrealf(this, r); npy_csetimagf(this, i); } };
I can partially answer my own questions here: 1) To avoid figuring out the type underlying npy_complex64 etc, the following macro seems to work: #define set_real(X, Y) _Generic((X), \ npy_cfloat: npy_csetrealf, \ npy_cdouble: npy_csetreal, \ npy_clongdouble: npy_csetreall \ )((X), (Y)) #define set_imag(X, Y) _Generic((X), \ npy_cfloat: npy_csetimagf, \ npy_cdouble: npy_csetimag, \ npy_clongdouble: npy_csetimagl \ )((X), (Y)) #define get_real(X) _Generic((X), \ npy_cfloat: npy_crealf, \ npy_cdouble: npy_creal, \ npy_clongdouble: npy_creall \ )(X) #define get_imag(X) _Generic((X), \ npy_cfloat: npy_cimagf, \ npy_cdouble: npy_cimag, \ npy_clongdouble: npy_cimagl \ )(X) Since we're using C++ for pytensor, overloading inline functions for get_real, set_real, etc. would also be an option. For 2), I realised that scalarmath.c.src already redefines the arithmetic of complex types. I was assuming that the built in C99 operations were being used (and if __cplusplus is defined, I guess I was assuming that the structs were cast to C complex types). This post suggests this gives better performance than the built in C99 operations: https://stackoverflow.com/questions/11076924/how-to-use-cx-limited-range-on-... Using the compiler flag they suggest could mean that numpy doesn't need to redefine the arithmetic operators for complex numbers, although I don't know if there is a clang equivalent, and I suppose it isn't much code saved.
participants (1)
-
Brendan Murphy