Comparison between numpy scalars returns numpy bool class and not native python bool class
It is well known that 'np.bool' is not interchangeable with python 'bool', and in fact 'issubclass(np.bool, bool)' is false. By contrast, numpy floats subclass python floats ('issubclass(np.float64, float)' is true), so I'm wondering whether the fact that scalar comparison returns an 'np.bool' breaks the Liskov substitution principle. In fact '(np.float64(1) > 0) is True' is unexpectedly false.

I was hit by this behaviour because in python structural pattern matching, the 'a > 1' subject will match neither 'True' nor 'False' if 'a' is a numpy scalar. In this short example

    import numpy as np

    a = np.float64(1)
    assert isinstance(a, float)
    match a > 1:
        case True | False:
            print('python float')
        case _:
            print('Huh?: numpy float')

the default clause is matched. If we instead set 'a = float(1)', the first clause is matched. The surprise factor is quite high here, in my opinion. (Let me add that 'True', 'False', and 'None' are special in python structural pattern matching, because they are matched by identity and not by equality.)

I'm not sure whether this behaviour can be avoided, or whether we have to live with the fact that numpy floats are to be kept well contained and never mixed with python floats.

Stefano
Apparently the reason this happens is that True, False, and None are compared using 'is' in structural pattern matching (see https://peps.python.org/pep-0634/#literal-patterns). There's no way NumPy could avoid this. First off, Python won't even let you subclass bool:
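A minimal sketch of that restriction (not the snippet from the original message): attempting to define a subclass of bool fails with a TypeError when the class is created. The class name MyBool and the variable subclass_error are hypothetical, chosen only for illustration.

```python
# bool cannot be used as a base class in CPython, so the class
# statement itself raises TypeError before any instance exists.
subclass_error = None
try:
    class MyBool(bool):  # hypothetical name, for illustration only
        pass
except TypeError as exc:
    subclass_error = str(exc)

print(subclass_error)
```

This is why a NumPy boolean scalar type cannot simply inherit from bool the way np.float64 inherits from float.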
np.bool_ objects *do* compare equal to bool:
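As an illustrative sketch (again, not the original snippet), the difference between equality and identity for the result of a scalar comparison can be checked directly. The variable names are assumptions made for this example.

```python
import numpy as np

# Comparing NumPy scalars yields a NumPy boolean scalar, not a Python bool.
result = np.float64(1) > 0

equal_to_true = bool(result == True)        # equality with Python True holds
identical_to_true = result is True          # identity with Python True does not
is_python_bool = isinstance(result, bool)   # and it is not a bool instance

print(type(result).__name__, equal_to_true, identical_to_true, is_python_bool)
```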
but that doesn't matter, because True is special-cased in structural pattern matching. The workaround is to use np.True_ and np.False_ in your pattern; with that change, the example above prints 'python float'. Fortunately, since these compare equal to Python bool, this also works when a > 1 is a normal True or False, and again prints 'python float'.

Aaron Meurer

(In reply to the message Stefano Miccoli sent via NumPy-Discussion <numpy-discussion@python.org> on Thu, Jun 27, 2024 at 3:33 PM.)
Please let me stress that the 'match/case' snippet was only a concrete example of a situation in which, say, 'f(a)' gives the correct result when 'a' is a 'float' instance and breaks down when 'a' is an 'np.float64' instance. Now, the fact that numpy floats are subclasses of python floats is quite a strong promise that this should never be the case…

Realistically this can be solved in a couple of ways.

(i) Refactoring 'f(a)' so that it is aware of the numpy float quirks… not always possible, especially if 'f(a)' belongs to an external package.
(ii) Sanitizing numpy floats, let's say by 'f(a.item())' in the calling code.
(iii) Ensuring that scalar comparisons always return python bools and not 'np.bool'.

(i) and (ii) are quite simple user-side workarounds, but sometimes the surprise factor is high, as in the given code snippet. By contrast, (iii) is a radical solution on the library side, but I'm not sure if it's worth implementing for a few edge cases. In fact 'b is True' is an anti-pattern in python, and the places in which this behaviour surfaces are probably rare.

Stefano
The fact of the matter is that a library like NumPy, which creates its own versions of things that are built into Python, is going to run into these little corners of the language where things simply cannot be hooked into. Another example of this sort of thing is that 'and' and 'or' cannot be overridden, so you can't use them on arrays. It's just a fact of life that if you use NumPy you have to learn where these little gotchas are and avoid them.

Maybe you could try to convince the CPython team that some issue is egregious enough to update the language somehow. But most of these things have been around for decades, and the Python devs have more or less decided that it isn't worth the cost to do anything about them. I honestly don't know why bool is unsubclassable, for instance, and it could be worth trying to change that for NumPy's sake (OTOH, it's highly unlikely that they would ever make it so that you could hook into 'is True').

(In reply to the message Stefano Miccoli sent via NumPy-Discussion <numpy-discussion@python.org> on Fri, Jun 28, 2024 at 2:50 AM.)
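The 'and'/'or' limitation mentioned above can be seen in a short sketch. The array values and variable names are made up for illustration; the element-wise alternatives shown are the & operator and np.logical_and.

```python
import numpy as np

x = np.array([True, False, True])
y = np.array([True, True, False])

# 'and' first asks for the truth value of x as a whole, which is
# ambiguous for an array with more than one element.
and_error = None
try:
    x and y
except ValueError as exc:
    and_error = str(exc)

# Element-wise alternatives that NumPy can hook into.
elementwise = x & y
also_elementwise = np.logical_and(x, y)
```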
The issue with proposal (iii), making scalar comparisons return Python bools, is that NumPy scalars are also supposed to behave like 0-D arrays. All the NumPy attributes like .shape are on them, and they work in any context an ndarray would. For a NumPy operation to return something without that array interface would be far more surprising and problematic than np.bool_ not being a bool is.
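A brief sketch of what would be lost: the result of a scalar comparison currently carries array attributes, which a plain Python bool does not have. Variable names here are illustrative assumptions.

```python
import numpy as np

flag = np.float64(1) > 0       # NumPy boolean scalar

shape = flag.shape             # empty tuple, as for a 0-D array
ndim = flag.ndim               # zero dimensions
as_array = np.asarray(flag)    # converts cleanly to a 0-D ndarray

# A Python bool has no such attributes.
python_bool_has_shape = hasattr(True, "shape")
```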
> (i) and (ii) are quite simple user-side workarounds, but sometimes the surprise factor is high, as in the given code snippet.
I would do any kind of user workaround as close to the "gotcha" as possible. In this case, the gotcha is that a True pattern in a match statement does not match np.True_, so you should either sanitize the expression right before the match statement or use np.True_ | np.False_ in the case, as I suggested before.

Aaron Meurer
participants (2)
- Aaron Meurer
- Stefano Miccoli