NumPy-Discussion

numpy-discussion@python.org

November 2023

33 participants
27 discussions

Windows default integer now 64bit in main
by Sebastian Berg Nov. 3, 2023

Hi all,

just a heads up, the PR to change the default integer is merged on main. This may cause issues, especially with Cython code, because `np.int_t` cannot be reasonably defined anymore. Other code may also want to vet usage of "long" in any variation. Much code (like SciPy) simply supports any integer input, although even there the integer output may be relevant.

New NumPy defines `NPY_DEFAULT_INT` so that you can branch at runtime. For backward compatibility you could use:

    #ifndef NPY_DEFAULT_INT
    #define NPY_DEFAULT_INT NPY_LONG
    #endif

Unfortunately, I expect this to be a bit painful; please let us know if it is too painful for some reason. But OTOH the 32-bit default has been a recurring surprise and is a common reason for software written on Linux to not run on Windows.

- Sebastian
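For Python-level code affected by the change above, one defensive pattern is to stop relying on the platform default and request an explicit width. The sketch below is only an illustration, not part of the PR; the helper names `default_int_bits` and `as_int64` are hypothetical.

```python
import numpy as np


def default_int_bits():
    """Bit width of the dtype NumPy picks for a plain Python int on this build."""
    return np.array([1]).dtype.itemsize * 8


def as_int64(values):
    """Convert to a 64-bit integer array, independent of the platform default."""
    return np.asarray(values, dtype=np.int64)


if __name__ == "__main__":
    # The first number depends on the platform and NumPy version; the second does not.
    print(default_int_bits(), as_int64([1, 2, 3]).dtype)
```

Code that passes `dtype=np.int64` (or `np.intp` for indexing) explicitly behaves the same on every platform, whichever default the installed NumPy uses.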

Function that searches arrays for the first element that satisfies a condition
by rosko37 Nov. 1, 2023

I know this question has been asked before, both on this list and in several threads on Stack Overflow, etc. It's a common issue. I'm NOT asking how to do this using existing NumPy functions (that information can be found in any of those sources). What I'm asking is whether NumPy would accept inclusion of a function that does this, or whether (possibly more likely) such a proposal has already been considered and rejected for some reason.

The task is this: there's a large array and you want to find the next element after some index that satisfies some condition. Such elements are common, and the typical number of elements to be searched through is small relative to the size of the array. Therefore, it would greatly improve performance to avoid testing ALL elements against the condition once one is found that returns True. However, all built-in functions that I know of test the entire array.

One can obviously jury-rig some workarounds, for instance a "for" loop over non-overlapping slices of length slice_length that calls something like np.where(cond) on each. That outer "for" loop is much faster than a loop over individual elements, and the inner loop will go at most slice_length-1 elements past the first "hit". However, needing such a convoluted piece of code for such a simple task seems to go against the NumPy spirit of having one operation be one function of the form "func(arr)".

A proposed function for this, let's call it "np.first_true(arr, start_idx, [stop_idx])", would be best implemented at the C level, possibly in the same source file that defines np.where. I'm wondering whether, if I or someone else were to write such a function, the NumPy developers would consider merging it as a standard part of the codebase. It's possible that the idea of such a function is bad because it would violate some existing broadcasting or fancy indexing rules.
Clearly one could make it possible to pass an "axis" argument to np.first_true() that would select an axis to search over in the case of multi-dimensional arrays, and then the result would be an array of indices with one fewer dimension than the original array. So np.first_true(np.array([[1,5],[2,7],[9,10]]), cond) would return [1,1,0] for cond(x): x>4 (searching along each row). The case where no elements satisfy the condition would need to return a "signal value" like -1. But maybe there are some weird cases where there isn't a sensible return value, which may be why such a function has not been added.

-Andrew Rosko
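The slice-based workaround and the proposed return convention can be sketched in pure Python on top of existing NumPy calls. This is only an illustration of the idea, not an existing or planned NumPy API: `first_true` and `first_true_axis` are hypothetical names, the `cond` callable and `chunk` parameter are assumptions, and `start_idx` is assumed to be non-negative. Note that the axis variant below evaluates the whole array and does not short-circuit.

```python
import numpy as np


def first_true(arr, cond, start_idx=0, stop_idx=None, chunk=1024):
    """Index of the first element of 1-D arr[start_idx:stop_idx] where cond holds, else -1.

    cond is applied to one slice of at most `chunk` elements at a time, so at
    most chunk-1 elements past the first hit are ever tested.
    """
    n = len(arr) if stop_idx is None else min(stop_idx, len(arr))
    for lo in range(start_idx, n, chunk):
        mask = cond(arr[lo:min(lo + chunk, n)])
        hit = int(np.argmax(mask))  # first True, or 0 if the mask is all False
        if mask[hit]:
            return lo + hit
    return -1


def first_true_axis(arr, cond, axis=-1):
    """Per-lane first-hit index along `axis`, with -1 where no element matches.

    Unlike first_true, this tests every element (no early exit).
    """
    mask = cond(arr)
    return np.where(mask.any(axis=axis), mask.argmax(axis=axis), -1)


if __name__ == "__main__":
    a = np.arange(10_000)
    print(first_true(a, lambda x: x > 4))
    print(first_true_axis(np.array([[1, 5], [2, 7], [9, 10]]), lambda x: x > 4, axis=1))
```

The chunk size trades Python-loop overhead against wasted comparisons past the first hit; a C-level implementation as proposed in the post would avoid that trade-off entirely.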
