Changing return types in a public API?

Hello all,

I've been working towards unifying KDTree and cKDTree in gh-12382 <https://github.com/scipy/scipy/pull/12382#discussion_r451491803>. This has raised some general questions about API compatibility that I'd appreciate any thoughts on.

First, KDTree returns NumPy scalars everywhere because the results come from indexing NumPy arrays, whereas cKDTree returns Python ints wherever possible. Is it reasonable for an API to change from returning NumPy scalars to Python ints? NumPy scalars do mimic the array interface to some extent, so there is a small interface incompatibility. However, the documentation usually just says a function returns int or float, not a NumPy scalar specifically.

Secondly, KDTree uses `dtype=int` everywhere, which results in 64-bit integers on Linux but 32-bit integers on Windows. Ideally, I'd want to return 64-bit integers (or at least np.intp) on all platforms for consistency and to avoid issues with integer overflow. Again, this is behaviour that isn't documented, but code could still be relying on it implicitly.

- Peter
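For readers following the first question, here is a minimal sketch of the interface difference between a NumPy scalar and a Python int. It does not use KDTree or cKDTree; the array `indices` and the variable names are made up for illustration, standing in for whatever index array the tree code slices internally.

```python
import operator

import numpy as np

# Hypothetical stand-in for an index array held inside the tree code.
indices = np.arange(5, dtype=np.intp)

np_scalar = indices[2]    # indexing an array yields a NumPy scalar
py_int = int(np_scalar)   # the plain Python int an API could return instead

# Both compare equal and both are usable as sequence indices.
same_value = np_scalar == py_int
same_index = operator.index(np_scalar) == operator.index(py_int)

# On Python 3 the NumPy integer scalar is not a subclass of int ...
scalar_is_int = isinstance(np_scalar, int)
# ... and only the NumPy scalar carries array-like attributes such as .shape.
scalar_has_shape = hasattr(np_scalar, "shape")
int_has_shape = hasattr(py_int, "shape")

print(type(np_scalar).__name__, type(py_int).__name__)
print(same_value, same_index, scalar_is_int, scalar_has_shape, int_has_shape)
```

Code that only does arithmetic or indexing with the result would not notice a switch, while code that calls array-style attributes or methods on it would.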

Changing to a NumPy int return type seems fine in this case, especially since it can be considered a bugfix (32-bit int problems seem to come up a lot across SciPy for Windows users).

My 2c,
Eric
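As a rough illustration of the integer-size point, the sketch below prints the width that `dtype=int` resolves to on the current machine next to `np.intp` and `np.int64`, and then shows 32-bit wraparound in array arithmetic. The variable names are illustrative only. The first printed number depends on platform and NumPy version: at the time of this thread `dtype=int` was 32-bit on Windows, and NumPy 2.0 later changed that default.

```python
import numpy as np

default_int = np.dtype(int)       # what `dtype=int` resolves to here
pointer_int = np.dtype(np.intp)   # pointer-sized integer
fixed_int = np.dtype(np.int64)    # always 64-bit

# Widths in bits; the first value varies with platform and NumPy version.
print(default_int.itemsize * 8, pointer_int.itemsize * 8, fixed_int.itemsize * 8)

# Array arithmetic on 32-bit integers wraps around silently on overflow.
largest_int32 = np.iinfo(np.int32).max
wrapped = np.array([largest_int32], dtype=np.int32) + np.int32(1)
safe = np.array([largest_int32], dtype=np.int64) + np.int64(1)
print(wrapped[0], safe[0])
```

Requesting `np.intp` or `np.int64` explicitly removes the dependence on the platform's default integer.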

On Tue, Aug 4, 2020 at 6:03 PM Eric Larson <larson.eric.d@gmail.com> wrote:
Changing to a NumPy int return type seems fine in this case, especially since it can be considered a bugfix (32-bit int problems seem to come up a lot across SciPy for Windows users).

I agree, seems okay to make this change.

Cheers,
Ralf

Okay, so I'll keep KDTree returning NumPy scalars for now but change the integer sizes to match cKDTree. Thanks guys.

Cheers,
Peter

________________________________
From: SciPy-Dev <scipy-dev-bounces+peterbell10=live.co.uk@python.org> on behalf of Ralf Gommers <ralf.gommers@gmail.com>
Sent: 04 August 2020 22:58
To: SciPy Developers List <scipy-dev@python.org>
Subject: Re: [SciPy-Dev] Changing return types in a public API?

participants (4)
- Eric Larson
- Peter Bell
- Peter Bell
- Ralf Gommers