On Sun, Aug 20, 2023 at 7:33 AM Doug Turnbull <softwaredoug@gmail.com> wrote:
First of all, I really love the docs of the C API :) It's way above what I would expect!

I was reviewing the signature possibilities for generalized ufuncs and had a question:

https://numpy.org/doc/stable/reference/c-api/generalized-ufuncs.html

I am playing with a ufunc that scores and returns some top N, where N could be specified by the user, i.e. the user might do

get_most_similar(X, y, n=10)

You can imagine situations where this could happen in similarity functions, where we want to get some Top N rows of X most similar to y. But sometimes users will want 10, or 100, or need to page through results etc. For performance reasons, I wouldn't want to maintain an index of every row of X, I'd prefer to only have to care about the top 10 or so.

I wonder what the best way to do this is?

One thought I had was to always set the output dimension to 10 for now, and handle paging on the Python side, perhaps by also having an offset parameter for my function to window into the similar results.

The second thought I had was to just get 100 instead of 10, as that probably is enough for most use cases. And users can slice out what they need. It's a little annoying in terms of perf cost, but probably not a big deal.

But it would be convenient to just let the user specify the N they want.
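
For illustration, the two fallbacks above (a fixed result size, with slicing or an offset on the Python side) might look roughly like the sketch below. It is pure NumPy rather than a C gufunc. The names top_k_similar and page are hypothetical, and a plain dot product stands in for whatever similarity score is actually used.

```python
import numpy as np

def top_k_similar(X, y, k=100):
    """Hypothetical helper: indices and scores of the k rows of X that
    score highest against y, best first. Dot product is an assumed
    stand-in for the real similarity function."""
    if k < 1:
        raise ValueError("k must be at least 1")
    scores = X @ y
    k = min(k, scores.shape[0])
    # argpartition selects the k largest scores without sorting all rows.
    idx = np.argpartition(scores, -k)[-k:]
    # Sort only the k selected rows, highest score first.
    idx = idx[np.argsort(scores[idx])[::-1]]
    return idx, scores[idx]

def page(X, y, n=10, offset=0, k=100):
    """Hypothetical paging wrapper: compute a fixed top k once, then
    window into it with offset and n on the Python side."""
    idx, scores = top_k_similar(X, y, k=k)
    return idx[offset:offset + n], scores[offset:offset + n]
```

The cost of the fixed k is paid on every call, which is the performance trade-off mentioned above; only the k selected rows are ever fully sorted.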


Thanks for the suggestion, Doug.  This is something I've thought about too.  In fact, I've drafted a proposal at https://github.com/WarrenWeckesser/numpy-notes/blob/main/enhancements/gufunc-shape-only-params.md for allowing "shape only" parameters of a gufunc.  This is the first time that I've announced that proposal on the mailing list. Any comments from NumPy devs would be appreciated.
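
As a point of comparison, and not the mechanism from the proposal: one workaround that can be tried today is to pass a dummy array whose length carries N, so that the output core dimension is tied to an input. The sketch below uses np.vectorize with a gufunc-style signature instead of a compiled gufunc. The names _top_n_core and shape_carrier are hypothetical, the dot product is an assumed score, and N is assumed not to exceed the number of rows of X.

```python
import numpy as np

def _top_n_core(X, y, shape_carrier):
    # Only the length of shape_carrier is used; its values are ignored.
    n = shape_carrier.shape[0]
    scores = X @ y  # assumed similarity: dot product
    order = np.argsort(scores)[::-1]  # row indices, best score first
    return order[:n]

# Core dimension n appears on the dummy input, which fixes the output length.
top_n = np.vectorize(_top_n_core, signature="(m,d),(d),(n)->(n)")

def get_most_similar(X, y, n=10):
    # np.empty(n) exists only to communicate n through the signature.
    return top_n(X, y, np.empty(n))
```

Because the signature broadcasts over leading dimensions, a stack of query vectors of shape (b, d) yields a (b, n) array of row indices. Allocating an array just to convey a size is the awkward part that a shape-only parameter would presumably avoid.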

Warren


Thanks for any insights!
-Doug