Suggestion: Extend integers to include iNaN
One of the strengths of the IEEE float (to set against its many weaknesses) is the presence of the magic value NaN. Not a Number, or NaN, is especially useful in that it is a valid value in any mathematical operation (always returning NaN) or comparison (always returning False). In functional programming this is especially useful as it allows the chain to complete after an error while retaining the fact that an error occurred (as we got NaN).
In languages such as C, integers can only represent a limited range of whole-number values, and floats a less limited (but still limited) range of values with limited accuracy. However, one of Python's strengths is that its integers can represent any whole-number value (up to the maximum available memory, and in exchange for slow performance when numbers get huge). This is accomplished by Python integers being objects rather than a fixed number of bytes.
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules as floating-point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
One specific use case that springs to mind would be for Libraries such as Pandas to return iNaN for entries that are not numbers in a column that it has been told to treat as integers. We would possibly need a flag to set this behaviour, rather than raising an Exception, or at the very least automatically (or provide a method to) set LHS integers to iNaN on such an exception.
I thought that I would throw this out to Python Ideas for some discussion of whether such a feature is:
a) Desirable?
b) Possible (I am sure that it could be done)?
c) Likely to get me kicked off the list?
-- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. --- This email has been checked for viruses by AVG.
https://www.avg.com
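For readers who want to see the float behaviour the proposal takes as its model, here is a minimal sketch using only the standard library. It shows NaN surviving a chain of arithmetic, and it also shows one wrinkle in the "comparison always returns False" summary: the "!=" operator returns True for NaN.

```python
import math

nan = float("nan")

# Arithmetic with a NaN operand yields NaN again, so the chain completes.
result = (nan + 1.0) * 2.0 - 3.0
print(math.isnan(result))                  # True

# Ordering and equality comparisons with NaN are all False ...
print(nan < 1.0, nan > 1.0, nan == nan)    # False False False

# ... but "!=" is the exception: it is True, even against itself.
print(nan != nan)                          # True
```

An integer NaN that mirrors float NaN would presumably need to decide whether "!=" follows the float rule or the "always False" rule stated in the proposal.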
On Fri, Sep 28, 2018 at 11:31 PM, Steve Barnes <gadgetsteve@live.co.uk> wrote:
One specific use case that springs to mind would be for Libraries such as Pandas to return iNaN for entries that are not numbers in a column that it has been told to treat as integers.
Pandas doesn't use Python objects to store integers, though; it uses an array of unboxed machine integers. In places where you can use Python objects to represent numbers, can't you just use float("nan") instead of iNaN? -n -- Nathaniel J. Smith -- https://vorpus.org
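The point about unboxed machine integers can be illustrated without Pandas. The sketch below uses the standard-library array module as a stand-in (an assumption made for illustration, not a description of Pandas or NumPy internals): a typed array of 64-bit integers has no spare slot for a NaN or a None.

```python
from array import array

# "q" is the typecode for signed 64-bit machine integers.
column = array("q", [1, 2, 3])
print(column.itemsize)   # bytes per element

rejected = []
for candidate in (float("nan"), None):
    try:
        column.append(candidate)
    except TypeError as exc:
        # Neither a float NaN nor None fits into a machine-integer slot.
        rejected.append(type(candidate).__name__)
        print("rejected:", exc)

print(list(column))      # [1, 2, 3]
```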
On 29/09/2018 07:52, Nathaniel Smith wrote:
On Fri, Sep 28, 2018 at 11:31 PM, Steve Barnes <gadgetsteve@live.co.uk> wrote:
One specific use case that springs to mind would be for Libraries such as Pandas to return iNaN for entries that are not numbers in a column that it has been told to treat as integers.
Pandas doesn't use Python objects to store integers, though; it uses an array of unboxed machine integers.
In places where you can use Python objects to represent numbers, can't you just use float("nan") instead of iNaN?
-n
It is a shame about Pandas not using Python integer objects (speed considerations, I would guess). Using float("nan") would possibly be incompatible with operations down the chain which might be expecting an integer or handling a float differently.
On Fri, 28 Sep 2018 23:52:22 -0700 Nathaniel Smith <njs@pobox.com> wrote:
On Fri, Sep 28, 2018 at 11:31 PM, Steve Barnes <gadgetsteve@live.co.uk> wrote:
One specific use case that springs to mind would be for Libraries such as Pandas to return iNaN for entries that are not numbers in a column that it has been told to treat as integers.
Pandas doesn't use Python objects to store integers, though; it uses an array of unboxed machine integers.
In places where you can use Python objects to represent numbers, can't you just use float("nan") instead of iNaN?
Or simply None ;-) Regards Antoine.
On Fri, Sep 28, 2018 at 11:32 PM Steve Barnes <gadgetsteve@live.co.uk> wrote:
One of the strengths of the IEEE float, (to set against its many weaknesses), is the presence of the magic value NaN. Not a Number, or NaN, is especially useful in that it is a valid value in any mathematical operation, (always returning NaN), or comparison, (always returning False). In functional programming this is especially useful as it allows the chain to complete after an error while retaining the fact that an error occurred, (as we got NaN).
The inventor of "null reference" called it a billion-dollar mistake [0]. I appreciate the Zen of Python's encouragement that "errors should never pass silently." Rather than returning iNaN, I'd prefer my program raise an exception. Besides, you can use a None if you'd like. [0] https://en.wikipedia.org/wiki/Tony_Hoare
On 29/09/2018 08:18, Michael Selik wrote:
On Fri, Sep 28, 2018 at 11:32 PM Steve Barnes <gadgetsteve@live.co.uk> wrote:
One of the strengths of the IEEE float, (to set against its many weaknesses), is the presence of the magic value NaN. Not a Number, or NaN, is especially useful in that it is a valid value in any mathematical operation, (always returning NaN), or comparison, (always returning False). In functional programming this is especially useful as it allows the chain to complete after an error while retaining the fact that an error occurred, (as we got NaN).
The inventor of "null reference" called it a billion-dollar mistake [0]. I appreciate the Zen of Python's encouragement that "errors should never pass silently." Rather than returning iNaN, I'd prefer my program raise an exception. Besides, you can use a None if you'd like.
In the embedded world (where I have spent most of my career), it is often the case that you need your code to always finish, and if an error occurred you throw the result away at the end or display the fact that you could not get a sensible answer - I am reasonably sure that the same is true of functional programming.
I am not asking that the original error pass silently (unless explicitly silenced), but rather for the option, when silencing (and hopefully logging that an error occurred), to have a value that will pass through the rest of the processing chain without raising additional exceptions, which None would be likely to do unless expressly tested for everywhere. This simplifies the overall code structure while retaining the fact that an error occurred (and the log needs to be checked), without the dangerous practice of returning a valid value and setting an error flag (checking of which is often neglected).
29.09.18 09:31, Steve Barnes wrote:
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules for the handling of floating point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
How does it differ from float('nan')?
On 29/09/2018 08:24, Serhiy Storchaka wrote:
29.09.18 09:31, Steve Barnes wrote:
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules for the handling of floating point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
How does it differ from float('nan')?
It is still an integer and would pass through any processing that expected an integer as one, (with a value of iNaN).
29.09.18 10:35, Steve Barnes wrote:
On 29/09/2018 08:24, Serhiy Storchaka wrote:
29.09.18 09:31, Steve Barnes wrote:
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules for the handling of floating point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
How does it differ from float('nan')?
It is still an integer and would pass through any processing that expected an integer as one, (with a value of iNaN).
Python is a dynamically typed language. What processing would work with iNaN but not with float('nan')?
On 29/09/2018 08:50, Serhiy Storchaka wrote:
29.09.18 10:35, Steve Barnes wrote:
On 29/09/2018 08:24, Serhiy Storchaka wrote:
29.09.18 09:31, Steve Barnes wrote:
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules for the handling of floating point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
How does it differ from float('nan')?
It is still an integer and would pass through any processing that expected an integer as one, (with a value of iNaN).
Python is a dynamically typed language. What processing would work with iNaN but not with float('nan')?
One simplistic example would be print(int(float('nan'))) (gives a ValueError) while print(int(iNaN)) should give 'nan' or maybe 'inan'.
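The float half of that example can be checked directly. This is a small sketch of current behaviour only; the helper name try_int is invented for illustration, and the iNaN half cannot be shown because no such value exists today.

```python
def try_int(value):
    """Return int(value), or the name of the exception it raises."""
    try:
        return int(value)
    except (ValueError, OverflowError) as exc:
        return type(exc).__name__

print(try_int(float("nan")))   # ValueError
print(try_int(float("inf")))   # OverflowError
print(try_int(2.9))            # 2
```

So any processing step that calls int() on its input stops a float NaN from propagating, which is the situation the example is meant to represent.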
29.09.18 11:43, Steve Barnes wrote:
On 29/09/2018 08:50, Serhiy Storchaka wrote:
Python is a dynamically typed language. What processing would work with iNaN but not with float('nan')?
One simplistic example would be print(int(float('nan'))) (gives a ValueError) while print(int(iNaN)) should give 'nan' or maybe 'inan'.
Why do you convert to int when you need a string representation? Just print(float('nan')).
On 29/09/2018 09:56, Serhiy Storchaka wrote:
29.09.18 11:43, Steve Barnes wrote:
On 29/09/2018 08:50, Serhiy Storchaka wrote:
Python is a dynamically typed language. What processing would work with iNaN but not with float('nan')?
One simplistic example would be print(int(float('nan'))) (gives a ValueError) while print(int(iNaN)) should give 'nan' or maybe 'inan'.
Why do you convert to int when you need a string representation? Just print(float('nan')).
I converted to int because I needed a whole number; this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
29.09.18 21:38, Steve Barnes wrote:
On 29/09/2018 09:56, Serhiy Storchaka wrote:
29.09.18 11:43, Steve Barnes wrote:
On 29/09/2018 08:50, Serhiy Storchaka wrote:
Python is a dynamically typed language. What processing would work with iNaN but not with float('nan')?
One simplistic example would be print(int(float('nan'))) (gives a ValueError) while print(int(iNaN)) should give 'nan' or maybe 'inan'.
Why do you convert to int when you need a string representation? Just print(float('nan')).
I converted to int because I needed a whole number; this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
float('nan') is a number (in the Python sense). No need to convert it.
On Sat, Sep 29, 2018 at 10:05:39PM +0300, Serhiy Storchaka wrote:
29.09.18 21:38, Steve Barnes wrote:
[...]
Why do you convert to int when you need a string representation? Just print(float('nan')).
I converted to int because I needed a whole number, this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
float('nan') is a number (in the Python sense). No need to convert it.
Steve just told you that he doesn't need a number, he needs a whole number (an integer), and that this represents a more complex process that includes a call to int. Why do you dismiss that and say there is no need to call int when you don't know the process involved? It *may* be that Steve could use math.floor() or math.ceil() instead, neither of which have the same meaning as calling int(). But more likely he DOES need to convert it by calling int, just as he says. Telling people that they don't understand their own code when you don't know their code is not very productive. -- Steve
Something to consider in all of this is that Python floats often *don't* produce NaNs for undefined operations, but raise exceptions instead:
>>> 1.0/0.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero
>>> math.sqrt(-1.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: math domain error
So achieving the OP's goals would not only entail adding an integer version of NaN, but either making int arithmetic behave differently from floats, or changing the way float arithmetic behaves, to produce NaNs instead of exceptions. -- Greg
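To make the contrast concrete, the following sketch puts the two raising operations from this message next to two float operations that do produce NaN quietly. The helper name outcome is invented for illustration.

```python
import math

inf = float("inf")

def outcome(thunk):
    """Run thunk() and return its result, or the name of the exception raised."""
    try:
        return thunk()
    except (ZeroDivisionError, ValueError) as exc:
        return type(exc).__name__

# Undefined operations that raise instead of returning NaN:
print(outcome(lambda: 1.0 / 0.0))         # ZeroDivisionError
print(outcome(lambda: math.sqrt(-1.0)))   # ValueError

# Undefined operations that return NaN without raising:
print(outcome(lambda: inf - inf))         # nan
print(outcome(lambda: inf * 0.0))         # nan
```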
30.09.18 04:07, Steven D'Aprano wrote:
Telling people that they don't understand their own code when you don't know their code is not very productive.
I can't tell him what he should do with his (not working) code, but it doesn't look like a good justification for changes in the Python core.
On Sun, Sep 30, 2018 at 12:09:45PM +0300, Serhiy Storchaka wrote:
30.09.18 04:07, Steven D'Aprano wrote:
Telling people that they don't understand their own code when you don't know their code is not very productive.
I can't tell him what he should do with his (not working) code, but it doesn't look like a good justification for changes in the Python core.
You don't know that his code is not working. For all you know, Steve has working code that works around the lack of an int NAN in some other, more clumsy, less elegant, ugly and slow way.
NANs are useful for when you don't want a calculation to halt on certain errors, or on missing data. That ability of a NAN to propagate through the calculation instead of halting can be useful when your data are ints, not just floats or Decimals.
Earlier, I suggested that this proposal would probably be best done as a subclass of int. It certainly should be prototyped as a subclass before we consider making a builtin int NAN. Since Steve has already agreed to work on that first, I think any further discussion would be pointless until he comes back to us. He may decide that a subclass solves his problem and no longer want a builtin int NAN.
-- Steve but not the same Steve as above...
On 30/09/2018 13:55, Steven D'Aprano wrote:
On Sun, Sep 30, 2018 at 12:09:45PM +0300, Serhiy Storchaka wrote:
30.09.18 04:07, Steven D'Aprano wrote:
Telling people that they don't understand their own code when you don't know their code is not very productive.
I can't tell him what he should do with his (not working) code, but it doesn't look like a good justification for changes in the Python core.
You don't know that his code is not working. For all you know, Steve has working code that works around the lack of an int NAN in some other, more clumsy, less elegant, ugly and slow way.
NANs are useful for when you don't want a calculation to halt on certain errors, or on missing data. That ability of a NAN to propagate through the calculation instead of halting can be useful when your data are ints, not just floats or Decimals.
Earlier, I suggested that this proposal would probably be best done as a subclass of int. It certainly should be prototyped as a subclass before we consider making a builtin int NAN. Since Steve has already agreed to work on that first, I think any further discussion would be pointless until he comes back to us. He may decide that a subclass solves his problem and no longer want a builtin int NAN.
I have had (over the years) a lot of working code with lots of checks in it and a huge number of paths through it, due to the lack of such an iNaN, or something to return for "that didn't work". Floats & complex have NaN, strings have the empty string, lists and sets can be empty, but there is no such option for integers. Hence the suggestion. I am heartened that the authors of the Decimal library also felt the need for NaN (as well as INF & -INF).
I am roughing out such a class and some test cases which will hopefully include some cases where the hoped-for advantages can be realised.
My thinking on bitwise operations is to do the same as arithmetic operations, i.e. (anything op iNaN) = iNaN and likewise for shift operations.
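As a rough idea of what such a subclass prototype might look like, here is a minimal sketch. It is not the class being roughed out in this thread: the names NaNAwareInt, INAN and is_nan, the "nan" string constructor, and the choice to follow float semantics for "!=" (True against iNaN) are all assumptions made for illustration. Arithmetic, bitwise and shift operations propagate iNaN; the other comparisons with iNaN return False.

```python
class NaNAwareInt(int):
    """Hypothetical int subclass with one extra value, iNaN (sketch only)."""

    def __new__(cls, value=0):
        is_nan = isinstance(value, str) and value.strip().lower() in ("nan", "inan")
        self = super().__new__(cls, 0 if is_nan else value)
        self._nan = is_nan          # the "NaN flag" from the proposal
        return self

    def is_nan(self):
        return self._nan

    def __repr__(self):
        return "iNaN" if self._nan else f"NaNAwareInt({int.__repr__(self)})"

    def __ne__(self, other):
        # Assumption: mirror float NaN, where "!=" is the one True comparison.
        if _is_nan(self) or _is_nan(other):
            return True
        return int.__ne__(self, other)


def _is_nan(x):
    return isinstance(x, NaNAwareInt) and x._nan


def _propagating(name):
    int_method = getattr(int, name)

    def method(self, other):
        if _is_nan(self) or _is_nan(other):
            return INAN                      # (anything op iNaN) = iNaN
        result = int_method(self, other)
        if result is NotImplemented:
            return result                    # let the other operand try
        return NaNAwareInt(result)
    return method


def _false_on_nan(name):
    int_method = getattr(int, name)

    def method(self, other):
        if _is_nan(self) or _is_nan(other):
            return False
        return int_method(self, other)
    return method


# Arithmetic, bitwise and shift operators, plus their reflected forms.
for _op in ("add", "sub", "mul", "floordiv", "mod", "and", "or", "xor", "lshift", "rshift"):
    for _prefix in ("__", "__r"):
        setattr(NaNAwareInt, f"{_prefix}{_op}__", _propagating(f"{_prefix}{_op}__"))

for _cmp in ("__lt__", "__le__", "__gt__", "__ge__", "__eq__"):
    setattr(NaNAwareInt, _cmp, _false_on_nan(_cmp))

INAN = NaNAwareInt("nan")

# One bad entry poisons the whole chain, and a single check at the end finds it.
total = NaNAwareInt(2)
for item in [3, INAN, 4]:
    total = total * item + 1
print(total.is_nan())
```

True division, pow() with a modulus, hashing and pickling are deliberately left out; a real prototype would need to decide each of those.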
Notwithstanding my observation of one case where 'nan <op> float' doesn't stay a nan, I definitely want something like iNaN. Btw, are there other operations on NaNs that do not produce NaNs?
I suspect a NaNAwareInt subclass is the easiest way to get there, but I'm agnostic on that detail.
For the very same reasons that other numeric types benefit from NaN, ints would also. I.e. I want to do a series of numeric operations on a bunch of input numbers, and it's less cumbersome to check if we went to NaN-land at the end than it is to try/except around every op.
For similar reasons, I'd like an iInf too, FWIW. It's good for an overflow value, although it's hard to get there in Python ints (would 'NaNAwareInt(1)/0' be an exception or iInf?). Bonus points for anyone who knows the actual maximum size of Python ints :-).
However, the main use I'd have for iInf is simply as a starting value in a minimization loop. E.g.
minimum = NaNAwareInt('inf')
for i in the_data:
    minimum = min(i, minimum)
    other_stuff(i, minimum, a, b, c)
I've written that code a fair number of times; usually I just pick a placeholder value that is "absurdly large relative to my domain", e.g. 'minimum = 10**100', but a clean infinity would be slightly better.
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
On Mon, Oct 1, 2018 at 12:18 AM David Mertz <mertz@gnosis.cx> wrote:
For similar reasons, I'd like an iInf too, FWIW. It's good for an overflow value, although it's hard to get there in Python ints (would 'NaNAwareInt(1)/0' be an exception or iInf?). Bonus points for anyone who knows the actual maximum size of Python ints :-).
Whatever the maximum is, it's insanely huge. I basically consider that a Python int is as large as your computer has memory to store. I can work with numbers so large that converting to string takes a notable amount of time (never mind about actually printing it to a console, just 'x = str(x)' pauses the interpreter for ages). If there's a limit, it'll probably be described as something like 2**2**2**N for some ridiculously large N. Want to share what the maximum actually is? I'm very curious! ChrisA
On Sun, Sep 30, 2018 at 10:23 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 1, 2018 at 12:18 AM David Mertz <mertz@gnosis.cx> wrote:
Bonus points for anyone who knows the actual maximum size of Python ints :-).
Whatever the maximum is, it's insanely huge. Want to share what the maximum actually is? I'm very curious!
Indeed. It's a lot bigger than any machine that will exist in my lifetime can hold. int.bit_length() is stored as a system-native integer, e.g. 64-bit, rather than recursively as a Python int. So the largest Python int is '2**sys.maxsize' (e.g. '2**(2**63-1)'). I may possibly have an off-by-one or off-by-power-of-two in there :-).
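The quantities mentioned here can be inspected from the interpreter. The sketch below only reads values that the standard library exposes; it makes no claim about exactly where CPython stores the length of an int, and the specific numbers printed depend on the build.

```python
import sys

big = 2 ** 1000
print(big.bit_length())              # 1001

# Largest value of the platform's native signed size type.
print(sys.maxsize)

# CPython-specific: ints are stored as a sequence of fixed-width "digits".
print(sys.int_info.bits_per_digit)
```

On a typical 64-bit build sys.maxsize is 2**63 - 1, which is where the '2**(2**63-1)' estimate above comes from.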
On Mon, Oct 1, 2018 at 12:44 AM David Mertz <mertz@gnosis.cx> wrote:
On Sun, Sep 30, 2018 at 10:23 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 1, 2018 at 12:18 AM David Mertz <mertz@gnosis.cx> wrote:
Bonus points for anyone who knows the actual maximum size of Python ints :-).
Whatever the maximum is, it's insanely huge. Want to share what the maximum actually is? I'm very curious!
Indeed. It's a lot bigger than any machine that will exist in my lifetime can hold.
int.bit_length() is stored as a system-native integer, e.g. 64-bit, rather than recursively as a Python int. So the largest Python int is '2**sys.maxsize' (e.g. '2**(2**63-1)'). I may possibly have an off-by-one or off-by-power-of-two in there :-).
Hah. Is that a fundamental limit based on the underlying representation, or would it mean that bit_length would bomb with an exception if the number is larger than that? I'm not sure what's going on. I have a Py3 busily calculating 2**(2**65) and it's pegging a CPU core while progressively consuming memory, but it responds to Ctrl-C, which suggests that Python bytecode is still being executed. ChrisA
On Sun, Sep 30, 2018 at 10:49 AM Chris Angelico <rosuav@gmail.com> wrote:
int.bit_length() is stored as a system-native integer, e.g. 64-bit, rather than recursively as a Python int. So the largest Python int is '2**sys.maxsize' (e.g. '2**(2**63-1)'). I may possibly have an off-by-one or off-by-power-of-two in there :-).
Hah. Is that a fundamental limit based on the underlying representation, or would it mean that bit_length would bomb with an exception if the number is larger than that?
It's implementation specific. In concept, a version of Python other than CPython 3.7 could store bit-length as either a Python int or a system-native int, to whatever recursive depth was needed to prevent overflows. Or perhaps as a linked list of native ints. Or something else. There's no sane reason to bother doing that, but there's never been a *promise* in Python semantics not to represent numbers with more than 1e19 bits in them.
I'm not sure what's going on. I have a Py3 busily calculating 2**(2**65) and it's pegging a CPU core while progressively consuming memory, but it responds to Ctrl-C, which suggests that Python bytecode is still being executed.
I'm not quite sure, but my guess is that at SOME POINT you'll get an overflow exception when the current value gets too big to store as a native int. Or maybe it'll be a segfault; I don't know. I'm also not sure if you'll see this error before or after the heat death of the universe ;-). I *am* sure that your swap space on your puny few terabyte disk will fill up before you complete the calculation, so that might be a system level crash not a caught exception.
On Mon, Oct 1, 2018 at 12:58 AM David Mertz <mertz@gnosis.cx> wrote:
I'm not sure what's going on. I have a Py3 busily calculating 2**(2**65) and it's pegging a CPU core while progressively consuming memory, but it responds to Ctrl-C, which suggests that Python bytecode is still being executed.
I'm not quite sure, but my guess is that at SOME POINT you'll get an overflow exception when the current value gets too big to store as a native int. Or maybe it'll be a segfault; I don't know.
I'm also not sure if you'll see this error before or after the heat death of the universe ;-).
I *am* sure that your swap space on your puny few terabyte disk will fill up before you complete the calculation, so that might be a system level crash not a caught exception.
Hahahaha. I was trying to compare to this:
>>> "a" * (2**63 - 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
Bam, instant. (Interestingly, trying to use 2**63 says "OverflowError: cannot fit 'int' into an index-sized integer", suggesting that "index-sized integer" is signed, even though a size can and should be unsigned.) Were there some kind of hard limit, it would be entirely possible to exceed that and get an instant error, without actually calculating all the way up there. But it looks like that doesn't happen. In any case, the colloquial definition that I usually cite ("Python can store infinitely big integers" or "integers can be as big as you have RAM to store") is within epsilon of correct :D Thanks for the info. Cool to know! ChrisA
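The index-size observation can be reproduced portably by using sys.maxsize rather than a hard-coded 2**63. This is a small sketch; the helper name repeat_or_error is invented for illustration, and the MemoryError case is left out so that the example does not try to allocate anything large.

```python
import sys

def repeat_or_error(count):
    """Try "a" * count; return the length, or the name of the exception raised."""
    try:
        return len("a" * count)
    except (OverflowError, MemoryError) as exc:
        return type(exc).__name__

print(repeat_or_error(5))                 # 5
print(repeat_or_error(sys.maxsize + 1))   # OverflowError
```

The repeat count is rejected before any memory is requested because it does not fit in the signed index type whose maximum is sys.maxsize.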
On Sun, Sep 30, 2018 at 11:04 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 1, 2018 at 12:58 AM David Mertz <mertz@gnosis.cx> wrote:
I'm not sure what's going on. I have a Py3 busily calculating 2**(2**65) and it's pegging a CPU core while progressively consuming memory, but it responds to Ctrl-C, which suggests that Python bytecode is still being executed.
I'm not quite sure, but my guess is that at SOME POINT you'll get an overflow exception when the current value gets too big to store as a native int. Or maybe it'll be a segfault; I don't know.
>>> "a" * (2**63 - 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
Bam, instant. (Interestingly, trying to use 2**63 says "OverflowError: cannot fit 'int' into an index-sized integer", suggesting that "index-sized integer" is signed, even though a size can and should be unsigned.) Were there some kind of hard limit, it would be entirely possible to exceed that and get an instant error, without actually calculating all the way up there. But it looks like that doesn't happen.
Sure, it wouldn't be THAT hard to do bounds checking in the Python implementation to make '2**(2**65)' an instant error rather than a wait-to-exhaust-swap one. But it's a corner case, and probably not worth the overhead for all the non-crazy uses of integer arithmetic.
In any case, the colloquial definition that I usually cite ("Python can store infinitely big integers" or "integers can be as big as you have RAM to store") is within epsilon of correct :D
I teach the same thing. For beginners or intermediate students, I just say "unbounded ints." Occasionally for advanced students I add the footnote. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
On 2018-09-30 10:15, David Mertz wrote:
For similar reasons, I'd like an iInf too, FWIW. It's good for an overflow value, although it's hard to get there in Python ints (would 'NaNAwareInt(1)/0' be an exception or iInf?). Bonus points for anyone who knows the actual maximum size of Python ints :-).
However, the main use I'd have for iInf is simply as a starting value in a minimization loop. E.g.
    minimum = NaNAwareInt('inf')
    for i in the_data:
        minimum = min(i, minimum)
        other_stuff(i, minimum, a, b, c)
I've written that code a fair number of times; usually I just pick a placeholder value that is "absurdly large relative to my domain", but a clean infinity would be slightly better. E.g. 'minimum = 10**100'.
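For comparison, here is a sketch of how the same loop can be written today with the float infinity, which orders correctly against ints. The sample values in the_data are made up for illustration.

```python
import math

the_data = [17, 4, 42, 8]  # hypothetical sample data

minimum = math.inf  # float infinity; compares correctly against ints
running_minima = []
for i in the_data:
    minimum = min(i, minimum)
    running_minima.append(minimum)

print(running_minima)  # [17, 4, 4, 4]
print(type(minimum))   # <class 'int'>
```

The catch is that an empty the_data leaves minimum as a float, so the result is not guaranteed to be an int. That is the gap an integer infinity would close.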
If we conceptualize iNaN as "not an integer", then we can define operators in two manners. Let ● be any operator:
1) "conservative" - operators that define a ● b == iNaN if either a or b is iNaN
2) "decisive" - operators that never return iNaN
With a decisive min(a, b), you can write the code you want without needing iInf.
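The two flavours can be sketched as follows. The names _INaN, INAN, conservative_min and decisive_min are hypothetical and exist only for this illustration. Returning iNaN when both operands are iNaN is an assumption of the sketch, since the description above does not cover that case.

```python
class _INaN:
    """Hypothetical stand-in for an integer NaN (not part of Python)."""

    def __repr__(self):
        return "iNaN"


INAN = _INaN()


def conservative_min(a, b):
    # Propagates iNaN if either operand is iNaN.
    if a is INAN or b is INAN:
        return INAN
    return min(a, b)


def decisive_min(a, b):
    # Ignores an iNaN operand when the other one is a real value.
    # Assumption: if both are iNaN there is nothing to decide, so iNaN comes back.
    if a is INAN:
        return b
    if b is INAN:
        return a
    return min(a, b)


# The minimization loop, started from iNaN instead of an infinity.
minimum = INAN
for i in [17, 4, 42]:
    minimum = decisive_min(i, minimum)

print(minimum)  # 4
```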
On 30/09/2018 15:15, David Mertz wrote:
For similar reasons, I'd like an iInf too, FWIW. It's good for an overflow value, although it's hard to get there in Python ints (would 'NaNAwareInt(1)/0' be an exception or iInf?). Bonus points for anyone who knows the actual maximum size of Python ints :-).
However, the main use I'd have for iInf is simply as a starting value in a minimization loop. E.g.
    minimum = NaNAwareInt('inf')
    for i in the_data:
        minimum = min(i, minimum)
        other_stuff(i, minimum, a, b, c)
I've written that code a fair number of times; usually I just pick a placeholder value that is "absurdly large relative to my domain", but a clean infinity would be slightly better. E.g. 'minimum = 10**100'.
The official maximum for a Python integer is x where x.bit_length()/8 == total_available_memory, (notice the word available which includes addressing limitations, other memory constraints, etc.).

Adding inf & -inf would be nice but to do so we would need a better name than NaNAwareInt.

It would also be nice if Decimal(NaNAwareInt('nan')) = Decimal('NaN'), float(NaNAwareInt('nan')) = float('nan'), etc.

I have been doing some reading up on Signalling vs. Quiet NaN and think that this convention could be well worth following, (and possibly storing some information about where the NaN was raised on first encountering a Signalling NaN (and converting it to Quiet)).
-- Steve (Gadget) Barnes
On Sun, Sep 30, 2018 at 11:01 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
Adding inf & -inf would be nice but to do so we would need a better name than NaNAwareInt.
My placeholder name is deliberately awkward. I think it gestures at the concept for discussion purposes though.
It would also be nice if Decimal(NaNAwareInt('nan')) = Decimal('NaN'), float(NaNAwareInt('nan')) = float('nan'), etc.
This seems like bad behavior given (per IEEE-754 spec):
>>> float('nan') == float('nan')
False
>>> nan = float('nan')
>>> nan == nan
False
On 30/09/2018 16:13, David Mertz wrote:
On Sun, Sep 30, 2018 at 11:01 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
Adding inf & -inf would be nice but to do so we would need a better name than NaNAwareInt.
My placeholder name is deliberately awkward. I think it gestures at the concept for discussion purposes though.
It would also be nice if Decimal(NaNAwareInt('nan')) = Decimal('NaN'), float(NaNAwareInt('nan')) = float('nan'), etc.
This seems like bad behavior given (per IEEE-754 spec):
>>> float('nan') == float('nan')
False
>>> nan = float('nan')
>>> nan == nan
False
David,
Note that my statements above had a single = i.e. float(NaNAwareInt('nan')) produces float('nan'), etc., as does:

In [42]: nan = decimal.Decimal('nan')
In [43]: decimal.Decimal(nan)
Out[43]: Decimal('NaN')
In [44]: float(nan)
Out[44]: nan

and vice-versa.
-- Steve (Gadget) Barnes
On Sun, Sep 30, 2018 at 11:17 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
Note that my statements above had a single = i.e. float(NaNAwareInt('nan')) produces float('nan'), etc., as does:
In [42]: nan = decimal.Decimal('nan')
In [43]: decimal.Decimal(nan)
Out[43]: Decimal('NaN')
In [44]: float(nan)
Out[44]: nan
I think this explanation is still a little confusing. I take it what you're getting at is that a "NaN" of any particular type (float, Decimal, complex, NanAwareInt) should be a perfectly good initializer to create a NaN of a different type using its constructor. I think that is sensible (not sure about complex). Currently we have:
>>> complex(nan)
(nan+0j)
>>> float(complex('nan'))
Traceback (most recent call last):
  File "<ipython-input-39-069ef735716e>", line 1, in <module>
    float(complex('nan'))
TypeError: can't convert complex to float

>>> complex(float('nan'))
(nan+0j)
>>> float(complex('nan'))
Traceback (most recent call last):
  File "<ipython-input-41-069ef735716e>", line 1, in <module>
    float(complex('nan'))
TypeError: can't convert complex to float

>>> from decimal import Decimal
>>> Decimal('nan')
Decimal('NaN')
>>> float(Decimal('nan'))
nan
>>> Decimal(float('nan'))
Decimal('NaN')
>>> complex(Decimal('nan'))
(nan+0j)
>>> Decimal(complex('nan'))
Traceback (most recent call last):
  File "<ipython-input-47-f48726d59102>", line 1, in <module>
    Decimal(complex('nan'))
TypeError: conversion from complex to Decimal is not supported
I don't think we can change the "cast-from-complex" behavior... even though I think it maybe should have been different from the start.
On 30/09/2018 16:26, David Mertz wrote:
On Sun, Sep 30, 2018 at 11:17 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
Note that my statements above had a single = i.e. float(NaNAwareInt('nan')) produces float('nan'), etc., as does:
In [42]: nan = decimal.Decimal('nan')
In [43]: decimal.Decimal(nan)
Out[43]: Decimal('NaN')
In [44]: float(nan)
Out[44]: nan
I think this explanation is still a little confusing. I take it what you're getting at is that a "NaN" of any particular type (float, Decimal, complex, NanAwareInt) should be a perfectly good initializer to create a NaN of a different type using its constructor.
I think that is sensible (not sure about complex). Currently we have:
>>> complex(nan)
(nan+0j)
>>> float(complex('nan'))
Traceback (most recent call last):
  File "<ipython-input-39-069ef735716e>", line 1, in <module>
    float(complex('nan'))
TypeError: can't convert complex to float

>>> complex(float('nan'))
(nan+0j)
>>> float(complex('nan'))
Traceback (most recent call last):
  File "<ipython-input-41-069ef735716e>", line 1, in <module>
    float(complex('nan'))
TypeError: can't convert complex to float

>>> from decimal import Decimal
>>> Decimal('nan')
Decimal('NaN')
>>> float(Decimal('nan'))
nan
>>> Decimal(float('nan'))
Decimal('NaN')
>>> complex(Decimal('nan'))
(nan+0j)
>>> Decimal(complex('nan'))
Traceback (most recent call last):
  File "<ipython-input-47-f48726d59102>", line 1, in <module>
    Decimal(complex('nan'))
TypeError: conversion from complex to Decimal is not supported
I don't think we can change the "cast-from-complex" behavior... even though I think it maybe should have been different from the start.
No complex can be converted to float without accessing either the real or imag component.

In [51]: cn=complex(4, float('nan'))
In [52]: cn
Out[52]: (4+nanj)
In [53]: cn.real
Out[53]: 4.0
In [54]: cn.imag
Out[54]: nan
In [55]: float(cn.imag)
Out[55]: nan
-- Steve (Gadget) Barnes
On Sun, Sep 30, 2018 at 11:35 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 1, 2018 at 1:32 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
No complex can be converted to float without accessing either the real or imag component.
Or taking its absolute value, which will return nan if either part is nan.
Well, various other operations as well as abs(). Anything that reduces a complex to a float already... I guess you could argue that behind the scenes these functions all access .real and/or .imag.
>>> import cmath
>>> float(abs(1+1j))
1.4142135623730951
>>> float(cmath.phase(1+1j))
0.7853981633974483
>>> float(cmath.isfinite(1+1j))
1.0
On Sun, Sep 30, 2018 at 11:31 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
No complex can be converted to float without accessing either the real or imag component.
Sure. Not in Python 3.7. But mathematically, it seems really straightforward to say that Complex numbers that lie on the Real line (i.e. imaginary part is zero) map in an obvious way to Real numbers.
I haven't done an inventory, but I'd guess most, but not all, other PLs do the same thing Python does.
On 30/09/2018 16:36, David Mertz wrote:
On Sun, Sep 30, 2018 at 11:31 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
No complex can be converted to float without accessing either the real or imag component.
Sure. Not in Python 3.7. But mathematically, it seems really straightforward to say that Complex numbers that lie on the Real line (i.e. imaginary part is zero) map in an obvious way to Real numbers.
I haven't done an inventory, but I'd guess most—but not all—other PLs do the same thing Python does.
Personally I agree that float(2.0+0j) should possibly be a valid value (2.0) but there is the complication, as always, of how near zero is zero. But that is a battle for another time.
-- Steve (Gadget) Barnes
On Sun, Sep 30, 2018 at 09:55:58AM -0400, David Mertz wrote:
Notwithstanding my observation of one case where 'nan <op> float' doesn't stay a nan, I definitely want something like iNaN. Btw, are there other operations on NaNs that do not produce NaNs?
Yes. The (informal?) rule applied by IEEE-754 is that if a function takes multiple arguments, and the result is entirely determined by all the non-NAN inputs, then that value ought to be returned. For example:

py> math.hypot(INF, NAN)
inf
py> 1**NAN
1.0

But generally, any operation (apart from comparisons) on a NAN is usually going to return a NAN.
I suspect a NaNAwareInt subclass is the easiest way to get there, but I'm agnostic on that detail.
I think that adding a NAN to int itself will be too controversial to be accepted :-) -- Steve
On 2018-09-30 09:41, Steve Barnes wrote:
I am roughing out such a class and some test cases which will hopefully include some cases where the hoped for advantages can be realised.
My thinking on bitwise operations is to do the same as arithmetic operations, i.e. (anything op iNaN) = iNaN and likewise for shift operations.
Steve, While you are extending a number system, can every int be truthy, while only iNan be falsey? I found that behaviour more useful because checking if there is a value is more common than checking if it is a zero value. Thank you
I am roughing out such a class and some test cases which will hopefully include some cases where the hoped for advantages can be realised.
My thinking on bitwise operations is to do the same as arithmetic operations, i.e. (anything op iNaN) = iNaN and likewise for shift operations.
Steve,
While you are extending a number system, can every int be truthy, while only iNan be falsey? I found that behaviour more useful because checking if there is a value is more common than checking if it is a zero value.
I’m not saying you’re wrong in principle but such a change to Python seems extremely disruptive. And if we’re talking about robustness of code then truthiness would be better like in Java (!) imo, where only true is true and false is false and everything else is an error. If we’re actually talking about changing the truth table of Python for basic types then this is the logical next step. But making any change to the basic types truth table is a big -1 from me. This seems like a Python 2-3 transition to me. / Anders
On 30/09/2018 15:48, Chris Angelico wrote:
On Mon, Oct 1, 2018 at 12:45 AM Anders Hovmöller <boxed@killingar.net> wrote:
But making any change to the basic types truth table is a big -1 from me. This seems like a Python 2-3 transition to me.
Far FAR worse than anything that changed in Py2->Py3.
ChrisA
I can see that breaking a LOT of existing code so would not think that it would be a practical option; I am thinking of having an isnan and/or isvalue. If the behaviour of integers were changed so that all NaN-producing operations that took non-NaN inputs were to produce a signalling NaN, i.e. also raise an Exception, with a pass on that Exception explicitly changing the Signalling NaN to a Quiet NaN, I don't see any non-NaN-aware code being forced to change.
-- Steve (Gadget) Barnes
On 2018-09-30 10:45, Anders Hovmöller wrote:
I am roughing out such a class and some test cases which will hopefully include some cases where the hoped for advantages can be realised.
My thinking on bitwise operations is to do the same as arithmetic operations, i.e. (anything op iNaN) = iNaN and likewise for shift operations.
Steve,
While you are extending a number system, can every int be truthy, while only iNan be falsey? I found that behaviour more useful because checking if there is a value is more common than checking if it is a zero value.
I’m not saying you’re wrong in principle but such a change to Python seems extremely disruptive. And if we’re talking about robustness of code then truthiness would be better like in Java (!) imo, where only true is true and false is false and everything else is an error. If we’re actually talking about changing the truth table of Python for basic types then this is the logical next step.
But making any change to the basic types truth table is a big -1 from me. This seems like a Python 2-3 transition to me.
Sorry, I can see I was unclear: I was only asking that the new number system (and the class that implements it) have truthy defined differently. My imagination never considered extending ints with iNaN. I would imagine the iNaN checks on every int operation to be noticeably slower, so out of the question.
On Sat, 29 Sep 2018 at 19:38, Steve Barnes <gadgetsteve@live.co.uk> wrote:
On 29/09/2018 09:56, Serhiy Storchaka wrote:
29.09.18 11:43, Steve Barnes пише:
On 29/09/2018 08:50, Serhiy Storchaka wrote:
Python is a dynamically typed language. What is such processing that would work with iNaN, but doesn't work with float('nan')?
One simplistic example would be print(int(float('nan'))) (gives a ValueError) while print(int(iNaN)) should give 'nan' or maybe 'inan'.
Why do you convert to int when you need a string representation? Just print(float('nan')).
I converted to int because I needed a whole number, this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
Your requirement to have a whole number cannot meaningfully be satisfied if your input is nan so an exception is the most useful result. -- Oscar
On Sat, Sep 29, 2018 at 09:43:42PM +0100, Oscar Benjamin wrote:
On Sat, 29 Sep 2018 at 19:38, Steve Barnes <gadgetsteve@live.co.uk> wrote:
I converted to int because I needed a whole number, this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
Your requirement to have a whole number cannot meaningfully be satisfied if your input is nan so an exception is the most useful result.
Not to Steve it isn't. Be careful about making value judgements like that: Steve is asking for an integer NAN because for *him* an integer NAN is more useful than an exception. You shouldn't tell him that he is wrong, unless you know his use-case and his code, which you don't. -- Steve
On Sun, 30 Sep 2018 at 02:01, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 29, 2018 at 09:43:42PM +0100, Oscar Benjamin wrote:
On Sat, 29 Sep 2018 at 19:38, Steve Barnes <gadgetsteve@live.co.uk> wrote:
I converted to int because I needed a whole number, this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
Your requirement to have a whole number cannot meaningfully be satisfied if your input is nan so an exception is the most useful result.
Not to Steve it isn't.
Be careful about making value judgements like that: Steve is asking for an integer NAN because for *him* an integer NAN is more useful than an exception. You shouldn't tell him that he is wrong, unless you know his use-case and his code, which you don't.
Then he can catch the exception and do something else. If I called int(x) because my subsequent code "needed a whole number" then I would definitely not want to end up with a nan. The proposal requested is that int(x) could return something other than a well defined integer. That would break a lot of code! In what way is iNaN superior to a plain nan? In C this sort of thing makes sense but in Python there's no reason you can't just use float('nan'). (This was raised by Serhiy earlier in the thread, resulting in Steve saying that he wants int(float('nan')) to return iNaN which then results in the quoted context above). I don't mean to make a judgment about Steve's use-cases: I have read the messages in this thread and I haven't yet seen a use-case for this proposal. -- Oscar
On Mon, Oct 1, 2018 at 8:53 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Sun, 30 Sep 2018 at 02:01, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 29, 2018 at 09:43:42PM +0100, Oscar Benjamin wrote:
On Sat, 29 Sep 2018 at 19:38, Steve Barnes <gadgetsteve@live.co.uk> wrote:
I converted to int because I needed a whole number, this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
Your requirement to have a whole number cannot meaningfully be satisfied if your input is nan so an exception is the most useful result.
Not to Steve it isn't.
Be careful about making value judgements like that: Steve is asking for an integer NAN because for *him* an integer NAN is more useful than an exception. You shouldn't tell him that he is wrong, unless you know his use-case and his code, which you don't.
Then he can catch the exception and do something else. If I called int(x) because my subsequent code "needed a whole number" then I would definitely not want to end up with a nan. The proposal requested is that int(x) could return something other than a well defined integer. That would break a lot of code!
At no point was the behaviour of int(x) ever proposed to be changed. Don't overreact here. The recommended use-case was for a library to return iNaN instead of None when it is unable to return an actual value. ChrisA
On Mon, 1 Oct 2018 at 00:00, Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 1, 2018 at 8:53 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Sun, 30 Sep 2018 at 02:01, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 29, 2018 at 09:43:42PM +0100, Oscar Benjamin wrote:
On Sat, 29 Sep 2018 at 19:38, Steve Barnes <gadgetsteve@live.co.uk> wrote:
I converted to int because I needed a whole number, this was intended to represent some more complex process where a value is converted to a whole number down in the depths of the processing.
Your requirement to have a whole number cannot meaningfully be satisfied if your input is nan so an exception is the most useful result.
Not to Steve it isn't.
Be careful about making value judgements like that: Steve is asking for an integer NAN because for *him* an integer NAN is more useful than an exception. You shouldn't tell him that he is wrong, unless you know his use-case and his code, which you don't.
Then he can catch the exception and do something else. If I called int(x) because my subsequent code "needed a whole number" then I would definitely not want to end up with a nan. The proposal requested is that int(x) could return something other than a well defined integer. That would break a lot of code!
At no point was the behaviour of int(x) ever proposed to be changed. Don't overreact here.
The context got trimmed a bit too much. You can see here the messages preceding what is quoted above: https://mail.python.org/pipermail/python-ideas/2018-September/053840.html Quoting here:
One simplistic example would be print(int(float('nan'))) (gives a ValueError) while print(int(iNaN)) should give 'nan' or maybe 'inan'
That would mean that the result of int(x) is no longer guaranteed to be a well-defined integer.
The recommended use-case was for a library to return iNaN instead of None when it is unable to return an actual value.
You can already use None or float('nan') for this. You can also create your own singleton nan object if you want. When I said I haven't seen a use-case what I mean is that no one has presented a situation where the existing language facilities are considered insufficient (apart from the suggestion about int(iNaN) that I refer to above). -- Oscar
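A minimal sketch of the "own singleton nan object" option, assuming float-NaN-like semantics: arithmetic propagates and ordered comparisons are False. The names NaNType and NAN are hypothetical, and only a few operators are covered.

```python
class NaNType:
    """Hypothetical user-defined 'not a number' singleton."""

    _instance = None

    def __new__(cls):
        # Always hand back the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "NaN"

    def _propagate(self, other):
        # Any supported arithmetic involving NaN yields NaN.
        return self

    __add__ = __radd__ = __sub__ = __rsub__ = _propagate
    __mul__ = __rmul__ = _propagate

    def __eq__(self, other):
        # Equality and ordering comparisons are always False, as with float NaN.
        return False

    __lt__ = __le__ = __gt__ = __ge__ = __eq__
    __hash__ = object.__hash__  # stay hashable despite defining __eq__


NAN = NaNType()

print(NAN + 1, 2 * NAN)  # NaN NaN
print(NAN == NAN)        # False
print(1 < NAN)           # False
```

Such an object is not an int, so isinstance(NAN, int) is False, and any code that requires a real integer still has to check for it explicitly.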
On Sat, Sep 29, 2018 at 10:50:24AM +0300, Serhiy Storchaka wrote:
How does it differ from float('nan')?
It is still an integer and would pass through any processing that expected an integer as one, (with a value of iNaN).
Python is a dynamically typed language. What is such processing that would work with iNaN, but doesn't work with float('nan')?
The most obvious difference is that any code which checks for isinstance(x, int) will fail with a float NAN. If you use MyPy for static type checking, passing a float NAN to something annotated to only accept ints will be flagged as an error.

Bitwise operators don't work:

py> NAN = float("nan")
py> NAN & 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'float' and 'int'

Now I'm not sure what Steve expects NANs to do with bitwise operators. But raising TypeError is probably not what we want.

A few more operations which aren't supported by floats:

NAN.numerator
NAN.denominator
NAN.from_bytes
NAN.bit_length
NAN.to_bytes

-- Steve
29.09.18 15:19, Steven D'Aprano пише:
On Sat, Sep 29, 2018 at 10:50:24AM +0300, Serhiy Storchaka wrote:
How does it differ from float('nan')?
It is still an integer and would pass through any processing that expected an integer as one, (with a value of iNaN).
Python is a dynamically typed language. What is such processing that would work with iNaN, but doesn't work with float('nan')?
The most obvious difference is that any code which checks for isinstance(x, int) will fail with a float NAN.
Yes, an explicit check. But why do you need an explicit check? What will you do with True returned for iNaN? Can you convert it to a machine integer or use it as length or index?
If you use MyPy for static type checking, passing a float NAN to something annotated to only accept ints will be flagged as an error.
I think that passing iNaN to most functions which expect int is an error. Does MyPy support something like "int | iNaN"? Then it should be used for functions which accept int and iNaN.
Bitwise operators don't work:
py> NAN = float("nan")
py> NAN & 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'float' and 'int'
Now I'm not sure what Steve expects NANs to do with bitwise operators. But raising TypeError is probably not what we want.
Since these operations make no sense, it makes no sense to discuss them.
A few more operations which aren't supported by floats:
NAN.numerator NAN.denominator
Do you often use these attributes of ints?
NAN.from_bytes NAN.bit_length NAN.to_bytes
What is the meaning of this?
On Sat, Sep 29, 2018 at 2:32 AM Steve Barnes <gadgetsteve@live.co.uk> wrote:
One of the strengths of the IEEE float, (to set against its many weaknesses), is the presence of the magic value NaN. Not a Number, or NaN, is especially useful in that it is a valid value in any mathematical operation, (always returning NaN), or comparison, (always returning False).
>>> nan = float('nan')
>>> nan**0
1.0
... most operations. :-)
On Sat, Sep 29, 2018 at 06:31:46AM +0000, Steve Barnes wrote:
One of the strengths of the IEEE float, (to set against its many weaknesses),
o_O Since I'm old enough to (just barely) remember the chaos and horror of numeric programming before IEEE-754, I find that comment rather shocking. I'm curious what you think those weaknesses are.
is the presence of the magic value NaN. Not a Number, or NaN, is especially useful in that it is a valid value in any mathematical operation, (always returning NaN), or comparison, (always returning False).
Almost so. But the exceptions don't matter for this discussion.
In functional programming this is especially useful as it allows the chain to complete after an error while retaining the fact that an error occurred, (as we got NaN).
Not just functional programming. [...]
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules for the handling of floating point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
Alas, a bit length of 0 is zero:

py> (0).bit_length()
0

I too have often wished that integers would include three special values, namely plus and minus infinity and a NAN. On the other hand, that would add some complexity to the type, and make them harder to learn and use. Perhaps it would be better to subclass int and put the special values in the subclass.

A subclass could be written in Python, and would act as a good proof of concept, demonstrating the behaviour of iNAN. For example, what would you expect iNAN & 1 to return?

Back in the 1990s, Apple Computers introduced their implementation of IEEE-754, called "SANE" (Standard Apple Numeric Environment). It included a 64-bit integer format "comp", which included a single NAN value (but no infinities), so there is definitely prior art to having an integer iNAN value.

Likewise, R includes a special NA value, distinct from IEEE-754 NANs, which we could think of as something very vaguely like an integer NAN.

https://stat.ethz.ch/R-manual/R-devel/library/base/html/NA.html

-- Steve
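A rough sketch of what such a proof-of-concept subclass might look like. It assumes the "every operation on iNaN gives iNaN, every comparison is False" rule from the original proposal. The class name NaNAwareInt is the thread's placeholder; the _is_nan flag and the helpers _arith and _compare are hypothetical, only a handful of operators are wired up, and nothing here is an existing or proposed API.

```python
def _arith(name):
    int_op = getattr(int, name)

    def method(self, other):
        # Any operation touching iNaN yields iNaN.
        if self._is_nan or getattr(other, "_is_nan", False):
            return NaNAwareInt("nan")
        result = int_op(self, other)
        # Let Python fall back to the other operand (e.g. a float) if needed.
        return result if result is NotImplemented else NaNAwareInt(result)

    return method


def _compare(name):
    int_op = getattr(int, name)

    def method(self, other):
        # Comparisons with iNaN are False, except != which is True (as for float NaN).
        if self._is_nan or getattr(other, "_is_nan", False):
            return name == "__ne__"
        return int_op(self, other)

    return method


class NaNAwareInt(int):
    """Hypothetical int subclass carrying a NaN flag."""

    def __new__(cls, value=0):
        is_nan = getattr(value, "_is_nan", False) or (
            isinstance(value, str) and value.strip().lower() == "nan"
        )
        # The underlying int payload of iNaN is 0; the flag is what marks it.
        self = super().__new__(cls, 0 if is_nan else value)
        self._is_nan = is_nan
        return self

    def __repr__(self):
        return "iNaN" if self._is_nan else int.__repr__(self)

    __hash__ = int.__hash__


# Wire up arithmetic, bitwise and shift operators, including reflected forms.
for _op in ("add", "sub", "mul", "and", "or", "xor", "lshift", "rshift"):
    setattr(NaNAwareInt, f"__{_op}__", _arith(f"__{_op}__"))
    setattr(NaNAwareInt, f"__r{_op}__", _arith(f"__r{_op}__"))

for _op in ("eq", "ne", "lt", "le", "gt", "ge"):
    setattr(NaNAwareInt, f"__{_op}__", _compare(f"__{_op}__"))

inan = NaNAwareInt("nan")
print(inan & 1, 1 & inan, inan + 5)  # iNaN iNaN iNaN
print(inan == inan, inan < 1)        # False False
print(NaNAwareInt(6) & 3)            # 2
```

In this sketch iNaN & 1 is iNaN, and isinstance(inan, int) is True, which is the property a float NaN cannot offer. Because the payload is 0, inan.bit_length() is also 0, so the flag, not the bit length, has to carry the NaN-ness.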
On 29/09/2018 14:00, Steven D'Aprano wrote:
On Sat, Sep 29, 2018 at 06:31:46AM +0000, Steve Barnes wrote:
One of the strengths of the IEEE float, (to set against its many weaknesses),
o_O
Since I'm old enough to (just barely) remember the chaos and horror of numeric programming before IEEE-754, I find that comment rather shocking.
I'm curious what you think those weaknesses are.
I am likewise old enough - the weaknesses that I am thinking of include:
- Variable precision
- Non-linearity around zero
- Common real world, (decimal), values being irrational, 0.1 anybody?
- In many cases being very limited range (the number of people who get caught out on statistical calculations involving permutations).
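Three of the points listed above can be reproduced in a few lines. This is a sketch assuming IEEE-754 double precision floats, which is what CPython uses on common platforms; the "0.1" item shows up as 0.1 having no exact binary representation. The variable names are only for illustration.

```python
from decimal import Decimal
import math

# 0.1 cannot be stored exactly in binary floating point.
sum_matches = (0.1 + 0.2 == 0.3)
print(sum_matches)   # False
print(Decimal(0.1))  # the exact value actually stored for 0.1

# Precision varies with magnitude: above 2**53 not every integer is representable.
absorbed = (2.0**53 + 1 == 2.0**53)
print(absorbed)      # True

# Limited range: 171! fits easily in an int but not in a float.
try:
    float(math.factorial(171))
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)    # True
```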
is the presence of the magic value NaN. Not a Number, or NaN, is especially useful in that it is a valid value in any mathematical operation, (always returning NaN), or comparison, (always returning False).
Almost so. But the exceptions don't matter for this discussion.
Indeed, (and possibly those exceptions should be matched in many cases).
In functional programming this is especially useful as it allows the chain to complete after an error while retaining the fact that an error occurred, (as we got NaN).
Not just functional programming.
True, functional programming was the most obvious to spring to mind but in a lot of my code I need to return some value even in the case of an exception so as to allow the system to carry on running.
[...]
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules for the handling of floating point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
Alas, a bit length of 0 is zero:
py> (0).bit_length()
0
I still think that iNaN.bit_length() should return 0 but obviously that would not be enough in itself to denote iNaN.
I too have often wished that integers would include three special values, namely plus and minus infinity and a NAN. On the other hand, that would add some complexity to the type, and make them harder to learn and use. Perhaps it would be better to subclass int and put the special values in the subclass.
I thought of including pINF & nINF in the original email and then decided to take on a single dragon at a time.
A subclass could be written in Python, and would act as a good proof of concept, demonstrating the behaviour of iNAN. For example, what would you expect iNAN & 1 to return?
I am thinking of trying to put together an overload of integer with iNaN overloads for all of the dunder operations as a proof of concept.
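Something along these lines might serve as a starting point. It is only a rough sketch: the class name `IntNaN`, the choice of 0 as the underlying value, and the answer it gives for `iNaN & 1` are all assumptions made for illustration, not settled semantics.

```python
class IntNaN(int):
    """Hypothetical integer NaN: absorbs arithmetic, fails every comparison."""

    def __new__(cls):
        # The underlying int value is arbitrary; 0 makes bit_length() return 0.
        return super().__new__(cls, 0)

    def __repr__(self):
        return "iNaN"

    # Comparisons mirror float NaN: everything is False except !=.
    def _false(self, other):
        return False

    __eq__ = __lt__ = __le__ = __gt__ = __ge__ = _false

    def __ne__(self, other):
        return True

    # Defining __eq__ removes the inherited hash, so restore an identity-based one.
    __hash__ = object.__hash__

    def __truediv__(self, other):
        # int / int gives a float, so hand back the float NaN here.
        return float("nan")

    __rtruediv__ = __truediv__

    def _propagate(self, *args):
        return self


# Binary operators (and their reflected forms) all return the iNaN operand.
for _op in ("add", "sub", "mul", "floordiv", "mod", "pow",
            "lshift", "rshift", "and", "or", "xor"):
    setattr(IntNaN, f"__{_op}__", IntNaN._propagate)
    setattr(IntNaN, f"__r{_op}__", IntNaN._propagate)

# Unary operators propagate as well.
for _op in ("neg", "pos", "abs", "invert"):
    setattr(IntNaN, f"__{_op}__", IntNaN._propagate)

iNaN = IntNaN()

print(iNaN + 1, 2 * iNaN, iNaN & 1, -iNaN)   # iNaN iNaN iNaN iNaN
print(iNaN == iNaN, iNaN < 5, 5 >= iNaN)     # False False False
print(iNaN != iNaN, iNaN.bit_length())       # True 0
```

Because `IntNaN` is a subclass of `int`, Python tries its reflected methods first, which is why `2 * iNaN` and `5 >= iNaN` work without touching `int` itself. The sketch leaves gaps: `int(iNaN)`, `bool(iNaN)`, formatting and use as an index all still see the underlying 0.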
Back in the 1990s, Apple Computers introduced their implementation of IEEE-754, called "SANE" (Standard Apple Numeric Environment). It included a 64-bit integer format "comp", which included a single NAN value (but no infinities), so there is definitely prior art to having an integer iNAN value.
I had forgotten this.
Likewise, R includes a special NA value, distinct from IEEE-754 NANs, which we could think of as something very vaguely like an integer NAN.
https://stat.ethz.ch/R-manual/R-devel/library/base/html/NA.html
Not done enough R programming to have come across it.
Thanks!
-- Steve (Gadget) Barnes
On 9/29/18 2:31 AM, Steve Barnes wrote:
One of the strengths of the IEEE float, (to set against its many weaknesses), is the presence of the magic value NaN. Not a Number, or NaN, is especially useful in that it is a valid value in any mathematical operation, (always returning NaN), or comparison, (always returning False). In functional programming this is especially useful as it allows the chain to complete after an error while retaining the fact that an error occurred, (as we got NaN).
In languages such as C, integers can only be used to represent a limited range of values in integers and a less limited range of values, (but still limited), with a limited accuracy. However, one of Python's strengths is that its integers can represent any whole number value, (up to the maximum available memory and in exchange for slow performance when numbers get huge). This is accomplished by Python integers being objects rather than a fixed number of bytes.
I think that it should be relatively simple to extend the Python integer class to have a NaN flag, possibly by having a bit length of 0, and have it follow the same rules for the handling of floating point NaN, i.e. any mathematical operation on an iNaN returns an iNaN and any comparison with one returns False.
One specific use case that springs to mind would be for Libraries such as Pandas to return iNaN for entries that are not numbers in a column that it has been told to treat as integers.
We would possibly need a flag to set this behaviour, rather than raising an Exception, or at the very least automatically (or provide a method to) set LHS integers to iNaN on such an exception.
I thought that I would throw this out to Python Ideas for some discussion of whether such a feature is: a) Desirable? b) Possible, (I am sure that it could be done)? c) Likely to get me kicked off of the list?
I would think that a possibly better solution would be the creation of a NAN type (similar to None) that implements this sort of property. That way the feature can be added to integers, rationals, and any other numeric types that exist (why do just integers need this addition?).
-- Richard Damon
On Sat, Sep 29, 2018 at 09:11:41AM -0400, Richard Damon wrote:
I would think that a possibly better solution would be the creation of a NAN type (similar to None) that implements this sort of property. That way the feature can be added to integers, rationals, and any other numeric types that exist (why do just integers need this addition?).
Having NAN be a separate type wouldn't help. If x needs to be an int, it can't be a separate NAN object because that's the wrong type. If ints had a NAN value, then rationals would automatically also get a NAN value, simply by using NAN/1. That's similar to the way that the complex type automatically gets NANs, on the basis that either the real or imaginary part can be a float NAN:
py> complex(1, float('nan'))
(1+nanj)
-- Steve
Hi Steve
You've suggested that we add to Python an integer NaN object, similar to the already existing float NaN object.
>>> x = float('nan')
>>> x, type(x)
(nan, <class 'float'>)
I've learnt that decimal also has a NaN object, but not fractions. https://stackoverflow.com/questions/19374254/
>>> from decimal import Decimal
>>> y = Decimal('nan')
>>> y, type(y)
(Decimal('NaN'), <class 'decimal.Decimal'>)
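To make the difference between the two standard-library types concrete, here is a short sketch. The helper `fraction_accepts` is made up for illustration, and the Decimal behaviour shown assumes the default context, where InvalidOperation is trapped.

```python
from decimal import Decimal, InvalidOperation
from fractions import Fraction

d_nan = Decimal("nan")

# A quiet Decimal NaN propagates through arithmetic and is unequal to itself.
print((d_nan + 1).is_nan())   # True
print(d_nan == d_nan)         # False

# Ordering comparisons are stricter than for floats: they signal InvalidOperation.
try:
    d_nan < 1
except InvalidOperation:
    ordering_raises = True
else:
    ordering_raises = False
print(ordering_raises)        # True

# Fraction has no NaN: both the string and the float spellings are rejected.
def fraction_accepts(value):
    try:
        Fraction(value)
    except ValueError:
        return False
    return True

print(fraction_accepts("nan"), fraction_accepts(float("nan")))   # False False
```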
Numpy has its own fixed size int classes
>>> from numpy import int16
>>> x = int16(2358); x, type(x)
(2358, <class 'numpy.int16'>)
>>> x = x * x; x, type(x)
__main__:1: RuntimeWarning: overflow encountered in short_scalars
(-10396, <class 'numpy.int16'>)
I'm confident that one could create classes similar to numpy.int16, except that
>>> z = int16('nan')
Traceback (most recent call last):
ValueError: invalid literal for int() with base 10: 'nan'
would not raise an exception (and would have the semantics you wish for). I wonder, would this be sufficient for the use cases you have in mind?
-- Jonathan
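As a rough illustration of what such a class might look like, here is a pure-Python sketch. Everything in it is an assumption made for the example: the name `NaNInt16`, the use of the most negative 16-bit value as the NaN sentinel, and the decision to turn overflow into NaN instead of wrapping as numpy.int16 does. It is not NumPy API.

```python
import operator


class NaNInt16:
    """Hypothetical 16-bit integer that reserves one bit pattern for NaN."""

    _NAN = -2**15                        # sentinel: most negative 16-bit value
    _MIN, _MAX = -2**15 + 1, 2**15 - 1   # usable range once the sentinel is reserved

    def __init__(self, value):
        if isinstance(value, NaNInt16):
            self._raw = value._raw
            return
        try:
            number = int(value)
        except (TypeError, ValueError, OverflowError):
            number = None                # 'nan', float NaN, infinities, junk
        if number is None or not self._MIN <= number <= self._MAX:
            self._raw = self._NAN        # unparseable or out of range becomes NaN
        else:
            self._raw = number

    def is_nan(self):
        return self._raw == self._NAN

    def _combine(self, other, op):
        other = NaNInt16(other)
        if self.is_nan() or other.is_nan():
            return NaNInt16("nan")
        return NaNInt16(op(self._raw, other._raw))

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)

    def __eq__(self, other):
        other = NaNInt16(other)
        if self.is_nan() or other.is_nan():
            return False                 # NaN compares unequal to everything
        return self._raw == other._raw

    def __repr__(self):
        return "NaNInt16(nan)" if self.is_nan() else f"NaNInt16({self._raw})"


z = NaNInt16("nan")
x = NaNInt16(2358)
print(z, x * x, NaNInt16(2) + 3)   # NaNInt16(nan) NaNInt16(nan) NaNInt16(5)
```

Here `x * x` yields NaN because 2358 squared does not fit in 16 bits, where numpy.int16 wraps around to -10396 with a warning.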
participants (14)
- Anders Hovmöller
- Antoine Pitrou
- Chris Angelico
- David Mertz
- Greg Ewing
- Jonathan Fine
- Kyle Lahnakoski
- Michael Selik
- Nathaniel Smith
- Oscar Benjamin
- Richard Damon
- Serhiy Storchaka
- Steve Barnes
- Steven D'Aprano