On Tue, Jun 28, 2022 at 7:21 PM Miles Cranmer <miles.cranmer@gmail.com> wrote:
Thanks for the comments Ralf!

> You cannot switch the default behavior, that will break backwards compatibility.

The default `kind=None` has no effect on the input/output behavior of the function. The only changes a user will see are in speed and memory usage. `unique` will select the new `"table"` algorithm only if it is available (integral array, no axis specified, `return_index` and `return_inverse` set to False) and the required memory allocation is not too big (arbitrarily defined as six times the allocation of the input array, similar to what the sorting method uses). Using `kind="table"` won't affect the input/output either, but it is only available for certain arrays (somewhat similar to the usage of `assume_unique` for `np.isin`). Does this sound fine to you?
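
For readers following along, here is a rough sketch of how a table-based `unique` with a sorting fallback could look. It is an illustration under assumptions, not the code from the PR: the function name `unique_table`, the `max_ratio` parameter, and the exact form of the memory check are all hypothetical, and the `uint64` case is simply handed to `np.unique`.

```python
import numpy as np


def unique_table(ar, max_ratio=6):
    """Hypothetical lookup-table unique for integer arrays (sorted output)."""
    ar = np.asarray(ar).ravel()
    if not np.issubdtype(ar.dtype, np.integer):
        raise ValueError("the table approach only applies to integer arrays")
    if ar.size == 0 or ar.dtype == np.uint64:
        # Assumption: values beyond the int64 range are not handled here.
        return np.unique(ar)

    lo, hi = int(ar.min()), int(ar.max())
    span = hi - lo + 1  # number of one-byte boolean slots the table needs

    # Assumed heuristic: give up on the table if it would need more than
    # `max_ratio` times the memory of the input, and sort instead.
    if span > max_ratio * ar.nbytes:
        return np.unique(ar)

    # Mark every value that occurs; widen first so `ar - lo` cannot overflow.
    seen = np.zeros(span, dtype=bool)
    seen[ar.astype(np.int64) - lo] = True

    # Table indices come back in increasing order, so the result is sorted.
    return (np.flatnonzero(seen) + lo).astype(ar.dtype)
```

For example, `unique_table(np.array([3, 1, 3, 2, 1]))` returns the same sorted values as `np.unique` would, while an array with a very wide value range falls through to the sorting path.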

Ah, I didn't get from the initial description that the results would still be sorted. Then I'd say that:
1. I'm not sure that the performance vs. complexity trade-off is worth it. But I'll leave that to others; Sebastian seemed happy to merge the similar change in `in1d`.
2. If you're interested in working more on this, implementing an unsorted option would be more interesting imho - significantly more performance benefits, and not just for integers or at the cost of more memory use.
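
To make the "unsorted" idea in point 2 concrete, a minimal pure-Python sketch follows. It assumes hashable elements and uses a hash table (a `dict`) instead of sorting; the name `unique_unsorted` is hypothetical and this is not a proposed NumPy API. With this particular approach the results happen to come back in first-seen order, whereas an "unsorted" option as discussed here would leave the order undefined.

```python
def unique_unsorted(values):
    """Hypothetical hash-based unique: no sorting, works for any hashable items."""
    # dict keys are unique and keep insertion order, so duplicates are
    # dropped and each value stays at the position where it first appeared.
    return list(dict.fromkeys(values))
```

Because nothing is sorted, this works for non-integer data as well, for example `unique_unsorted(["b", "a", "b"])`.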

Cheers,
Ralf



> Regarding the name, `'table'` is an implementation detail. The end user should not have to care what the data structure is that is used. I suggest to use something like "unsorted" and just explain it as the ordering of results being undefined, which can give significant performance benefits.

The names `'table'` and `'sort'` were chosen for consistency: they were recently put in place for `np.isin` and `np.in1d`, and the analogous methods used for `unique` are conceptually similar. I have no particular attachment to either name though.

Thanks again!
Miles
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org