[Numpy-discussion] Improving Complex Comparison/Ordering in Numpy

Rakesh Vasudevan rakesh.nvasudev at gmail.com
Fri Jun 5 00:15:03 EDT 2020


Hi all,

As a follow-up to gh-15981 <https://github.com/numpy/numpy/issues/15981>, I
would like to propose a change to bring the comparison operators and
related functions for complex dtypes in line with their CPython
counterparts.

The current state of complex dtype comparisons/ordering as summarised in
the issue is as follows:

# In Python

>>> cnum = 1 + 2j
>>> cnum_two = 1 + 3j

# Doing a comparison yields
>>> cnum > cnum_two

TypeError: '>' not supported between instances of 'complex' and 'complex'


# Doing the same comparison with NumPy scalars

>>> np.array(cnum) > np.array(cnum_two)

# Yields

False

*NOTE*: only >, <, >= and <= are unsupported for complex numbers in Python;
equality (==) does work.

Similarly, sorting uses the comparison operators behind the scenes to order
complex values. Again, this diverges from the default Python behavior.
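
For illustration, here is a small self-contained check of the divergence
described above. The helper name python_orders is made up for this sketch;
the NumPy results rely on the current lexicographic (real part first, then
imaginary part) ordering of complex values.

```python
import numpy as np

a, b = 1 + 2j, 1 + 3j

def python_orders(x, y):
    """Return True if plain Python accepts x < y, False if it raises TypeError."""
    try:
        x < y
        return True
    except TypeError:
        return False

# Plain Python refuses to order complex numbers.
python_ok = python_orders(a, b)

# NumPy compares the real parts first and falls back to the imaginary parts.
numpy_lt = bool(np.array(a) < np.array(b))
numpy_gt = bool(np.array(a) > np.array(b))

# Equality is supported on both sides.
python_eq = (a == a)
numpy_eq = bool(np.array(a) == np.array(a))
```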

# In native Python
>>> clist = [cnum, cnum_two]
>>> sorted(clist, key=lambda c: (c.real, c.imag))
[(1+2j), (1+3j)]

# In NumPy

>>> np.sort(clist)  # Uses the default comparison order

# Yields the same result

# To get a CPython-like sorting call we can do the following in NumPy
# (np.lexsort treats the last key as primary, so the real part goes last)
>>> carr = np.asarray(clist)
>>> np.take_along_axis(carr, np.lexsort((carr.imag, carr.real), 0), 0)
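
One detail worth spelling out when emulating a Python key with np.lexsort:
the last key in the tuple is the primary one. The sketch below, using a
made-up three-element array, shows both key orders side by side and checks
the (real, imag) ordering against sorted() with the equivalent key.

```python
import numpy as np

carr = np.array([2 + 1j, 1 + 3j, 1 + 2j])

# Last key is primary: sort by real part, break ties on the imaginary part.
order = np.lexsort((carr.imag, carr.real))
by_real_then_imag = np.take_along_axis(carr, order, 0)

# Swapping the keys makes the imaginary part primary instead.
by_imag_then_real = carr[np.lexsort((carr.real, carr.imag))]

# Reference result from plain Python with an explicit key.
python_sorted = sorted(carr.tolist(), key=lambda c: (c.real, c.imag))
```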


This proposal aims to bring parity between Python's default handling of
complex numbers and the handling of complex types in NumPy.

This is a two-step process:


   1. Sort complex numbers in a Pythonic way, accepting key arguments, and
   deprecate usage of sort() on complex numbers without a key argument
      1. Possibly extend this to max() and min(), if it makes sense to do so.
      2. Since sort() is being updated for complex numbers, searchsorted()
      is also a good candidate for implementing this change.
   2. Once this is done, we can deprecate the usage of comparison operators
   (>, <, >=, <=) on complex dtypes




*Handling sort() for complex numbers*
There are two approaches we can take for this


   1. Update the sort() method to accept a ‘key’ kwarg. When a key value is
   passed, use lexsort to get the sorting indices and reorder the array with
   them. We could support lambda function keys like Python, but that is
   likely to be very slow.
   2. Create a new wrapper function sort_by() (placeholder name; name
   suggestions and feedback are welcome) that essentially acts as syntactic
   sugar for
      1. np.take_along_axis(carr, np.lexsort((carr.imag, carr.real), 0), 0),
      where carr is a complex array (lexsort treats the last key as primary)


   1. Improve the existing sort_complex() function with the new key-based
   functionality (though the change would only affect complex dtypes).

We could choose either approach; both have pros and cons. Approach 1 brings
the signature of sort() closer to its Python counterpart, while approach 2
draws a clearer distinction between the two ways of sorting. The
performance of sort() under approach 1 would vary depending on whether the
optional key argument is passed. I would love the community’s thoughts on
this.
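
To make approach 2 concrete, here is a minimal sketch of what such a
wrapper might look like. The name sort_by, its signature and the choice to
treat the first key as primary (as a Python key tuple would) are all
assumptions made for illustration; nothing like this exists in NumPy today.

```python
import numpy as np

def sort_by(arr, keys, axis=-1):
    """Hypothetical helper: sort arr along axis by keys, first key primary.

    keys is a sequence of arrays with the same shape as arr.
    """
    arr = np.asarray(arr)
    # np.lexsort treats the LAST key as primary, so reverse the sequence
    # to mimic the ordering of a Python key tuple.
    order = np.lexsort(tuple(reversed(keys)), axis=axis)
    return np.take_along_axis(arr, order, axis=axis)

carr = np.array([2 + 1j, 1 + 3j, 1 + 2j])

# Equivalent to sorted(carr, key=lambda c: (c.real, c.imag)).
result = sort_by(carr, (carr.real, carr.imag))
```

Because the keys are precomputed arrays rather than a Python callable, the
heavy lifting stays inside lexsort, which avoids the per-element overhead
of calling a lambda.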


*Handling min() and max() for complex numbers*

Since min and max are essentially a set of comparisons, in Python they are
not allowed on complex numbers.

>>> clist = [cnum, cnum_two]
>>> min(clist)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'complex' and 'complex'

# But using the key argument again works
>>> min(clist, key=lambda c: (c.real, c.imag))
(1+2j)

We could use a similar key kwarg for min() and max() in NumPy, but the
question remains how to handle the keys in this case. The naive way would
be to sort() on the keys and take the first or last element, which is
likely to be slow. Requesting suggestions on approaching this.
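
As one possible alternative to sorting, a keyed minimum over (real, imag)
can be found with a couple of linear passes. The sketch below is only an
illustration: complex_min is a hypothetical name, it handles just this one
fixed key, it flattens its input, and it makes no attempt to handle NaN
values.

```python
import numpy as np

def complex_min(arr):
    """Hypothetical helper: smallest element ordered by (real, imag), no sort."""
    arr = np.asarray(arr).ravel()
    if arr.size == 0:
        raise ValueError("complex_min() arg is an empty array")
    # Pass 1: keep only the entries that tie for the smallest real part.
    candidates = arr[arr.real == arr.real.min()]
    # Pass 2: among those, pick the entry with the smallest imaginary part.
    return candidates[np.argmin(candidates.imag)]

values = [1 + 3j, 2 + 0j, 1 + 2j]
smallest = complex_min(values)
expected = min(values, key=lambda c: (c.real, c.imag))
```

A matching maximum would follow the same pattern with max() and argmax().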

*Comments on isclose()*
Both Python and NumPy use the absolute value (magnitude) of the difference
to decide whether two values are close enough. Hence I do not see this
change affecting this function.
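
A quick check of this point, using cmath.isclose on the Python side (the
complex-aware counterpart of math.isclose) and np.isclose on the NumPy
side. The tolerance values are arbitrary choices for the example; note that
the two functions combine relative and absolute tolerances slightly
differently, although both work from the magnitude of the difference.

```python
import cmath
import numpy as np

a = 1 + 2j
near = a + 1e-10j   # differs from a by a magnitude of about 1e-10
far = 1 + 3j        # differs from a by a magnitude of 1

# Neither call needs an ordering on complex values, only magnitudes.
py_near = cmath.isclose(a, near, rel_tol=1e-9)
np_near = bool(np.isclose(a, near, rtol=1e-9, atol=0.0))

py_far = cmath.isclose(a, far, rel_tol=1e-9)
np_far = bool(np.isclose(a, far, rtol=1e-9, atol=0.0))
```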

Requesting feedback and suggestions on the above.

Thank you,

Rakesh