status of long double support and what to do about it
Hi all,

We have to do something about long double support. This is something I wanted to propose a long time ago already, and moving build systems has resurfaced the pain yet again.

This is not a full proposal yet, but the start of a discussion and a gradual plan of attack.

The problem
-----------

The main problem is that long double support is *extremely* painful to maintain, probably far more than is justified. I could write a very long story about that, but instead I'll just illustrate with some of the key points:

(1) `long double` is the main reason why we're having such a hard time building wheels on Windows, for SciPy in particular. This is because MSVC makes long double 64-bit, while Mingw-w64 defaults to 80-bit. So we have to deal with Mingw-w64 toolchains, proposed compiler patches, etc. This alone has been a massive time sink. A couple of threads:
https://github.com/numpy/numpy/issues/20348
https://discuss.scientific-python.org/t/releasing-or-not-32-bit-windows-whee...
The first issue linked above is one of the key ones, with a lot of detail about the fundamental problems with `long double`. The Scientific Python thread focused more on Fortran; however, that Fortran + Windows problem is at least partly the fault of `long double`. And Fortran may be rejuvenated with LFortran and fortran-lang.org; `long double` is a dead end.

(2) `long double` is not a well-defined format. We support 9 specific binary representations, and have a ton of code floating around to check for those and to manually fiddle with individual bits in long double numbers. Part of that is the immediate pain point for me right now: in the configure stage of the build we consume object files produced by the compiler and parse them, matching some known bit patterns. This check is so weird that it's the only one that I cannot implement in Meson (short of porting the hundreds of lines of Python code for it to C); see https://github.com/mesonbuild/meson/issues/11068 for details. To get an idea of the complexity here:
https://github.com/numpy/numpy/blob/9e144f7c1598221510d49d8c6b79c66dc000edf6...
https://github.com/numpy/numpy/blob/9e144f7c1598221510d49d8c6b79c66dc000edf6...
https://github.com/numpy/numpy/blob/main/numpy/core/src/npymath/npy_math_com...
https://github.com/numpy/numpy/blob/main/numpy/core/src/multiarray/dragon4.c...
Typically `long double` has multiple branches, and requires more code than float/double.

(3) We spend a lot of time dealing with issues and PRs to keep `long double` working, as well as dealing with hard-to-diagnose build issues. This sometimes even stops people from building/contributing, especially on Windows. Some recent examples:
https://github.com/numpy/numpy/pull/20360
https://github.com/numpy/numpy/pull/18536
https://github.com/numpy/numpy/pull/21813
https://github.com/numpy/numpy/pull/22405
https://github.com/numpy/numpy/pull/19950
https://github.com/numpy/numpy/pull/18330/commits/aa9fd3c7cb
https://github.com/scipy/scipy/issues/16769
https://github.com/numpy/numpy/issues/14574

(4) `long double` isn't all that useful. On both Windows and macOS `long double` is 64-bit, which means it is just a poor alias of `double`. So it does literally nothing for the majority of our users, except confuse them and take up extra memory. On Linux, `long double` has 80-bit precision, which means it doesn't do all that much there either, just a modest bump in precision.
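To see concretely what `long double` means on a given machine, here is an illustrative check; the whole point is that the output varies by OS and compiler:

    # Illustrative check of what `long double` is on the current platform:
    import numpy as np

    ld = np.finfo(np.longdouble)
    print(np.dtype(np.longdouble).itemsize * 8)  # storage bits: 64, 96 or 128
    print(ld.precision)  # decimal digits: ~15 where it aliases double, 18 for 80-bit x87
    print(ld.eps)        # machine epsilon for the longdouble format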
Let me also note that it's not just the user-visible dtypes that we have to consider; long double types are also baked into the libnpymath static library that we ship with NumPy. That's a thing we have to do something about anyway (shipping static libraries is not the best idea, see https://github.com/numpy/numpy/issues/20880). We just have to make sure not to forget about it when thinking about solutions here.

Potential solutions
-------------------

(A) The ideal solution would be to have a proper, portable quad-precision (128 bits of precision) dtype. It's now possible to write that externally, after all the work that Sebastian and others have put into the dtype infrastructure. The dtype itself already exists (https://github.com/peytondmurray/quaddtype, maybe there are more implementations floating around). It just needs the people who have an actual need for it to drive it. It's still a significant amount of work, so I'll not go into this one further right now.

(B) Full deprecation and removal of all `long double` support from NumPy (and SciPy), irrespective of whether the quad dtype comes to life.

Given the current state, I'm personally convinced that that is easily justified. However, I know some folks are going to be hesitant, given that we don't know how many remaining users we have or what use cases they have. So let's see if we can find more granular solutions (note: these are ideas, not fully researched solutions that we can pick from with a guarantee they'll work out well).

(C) Only support `long double` where it has proper compiler support (C99/C++11), so using it "just works", and remove all the support for old formats and for accessing bit representations directly. This also implies making some currently optional functions mandatory. For example, the issue I ran into today showed up at runtime in a fallback path for systems that don't have `strtold_l`. We don't test such fallback paths in CI, so they're going to be fragile anyway.

(D) Add a build mode with a command-line flag, where we typedef `long double` to `double`. I'll note that we already touched on that once (see https://github.com/numpy/numpy/issues/20348); I'm not sure though whether it's fundamentally a bad idea or we just didn't pursue it deliberately enough.

Next steps
----------

First, it would be great to hear from folks who have use cases for long double support in NumPy. So far we have very little knowledge of that; we only see the problems and the work it causes us.

Second, I'm going to have to add support for (C) or (D) temporarily to the Meson build anyway, as we run into things. It can be worked around if we really have to, by implementing support for long double format detection in Meson, or by rewriting all the detection logic so it's all-in-one in C. But that takes a significant amount of time.

Third, let's figure out which way we'd like to go. Do you see alternative solutions? Or like any of the ones I listed more than others? Are you willing to jump in and work on a quad dtype?

Cheers,
Ralf
Thanks Ralf, this indeed strikes a chord from the SciPy point of view too.

In my limited opinion, it's not just the dtype itself: everything this dtype is going to be used in also defines what a proper implementation looks like. Upstream (LAPACK, FFT, and so on) and downstream (practically everything else) do not have generic support for it on all platforms. Hence we are providing a convenience to some users and then taking it away again by not providing the necessary machinery, essentially making it a storage format (a similar but less problematic situation than with half) with some valuable and convenient arithmetic on top.

Hence I'd lean towards (B), but I'm curious how we could pull off (C), because that is going to be tricky for Cythonization etc., which would, probably in practice, force downcasting to double to do anything with it. Or maybe I'm missing the context of its utilization.

On Thu, Nov 17, 2022 at 11:12 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
<snip>
On Thu, Nov 17, 2022 at 3:15 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
<snip>
I would agree that extended precision is pretty useless. IIRC, it was mostly intended as an accurate way to produce double precision results, an idea that was eventually dropped as not very useful. I'd happily do away with subnormal doubles as well; they were another not very useful idea. And strictly speaking, we should not support IBM double-double either, as it is not in the IEEE standard.

That said, I would like to have a quad precision type. That precision is useful for some things, and I have a dream that someday it can be used for a time type. Unfortunately, last time I looked around, none of the available implementations had a NumPy-compatible license.

The tricky thing here is to not break downstream projects, but that may be unavoidable. I suspect the fallout will not be that bad.

Chuck
On 11/17/22 7:13 PM, Charles R Harris wrote:
<snip>
I would agree that extended precision is pretty useless, IIRC, it was mostly intended as an accurate way to produce double precision results. That idea was eventually dropped as not very useful. I'd happily do away with subnormal doubles as well, they were another not very useful idea. And strictly speaking, we should not support IBM double-double either, it is not in the IEEE standard.
That said, I would like to have a quad precision type. That precision is useful for some things, and I have a dream that someday it can be used for a time type. Unfortunately, last time I looked around, none of the available implementations had a NumPy compatible license.
The tricky thing here is to not break downstream projects, but that may be unavoidable. I suspect the fallout will not be that bad.
Chuck
A quick response from one of the leaders of a team that requires 80-bit extended precision for astronomical work...

"Extended precision is pretty useless" unless you need it. And the high-precision pulsar timing community needs it. Standard double precision (64-bit) values do not contain enough precision for us to pass relative astronomical times via a single float (the precision ends up being at the ~1 microsec level over decades of time differences, and we need it at the ~1-10 ns level), nor can we store the measured spin frequencies of our millisecond pulsars, or do calculations on them, with enough precision. Those spin frequencies can have 16-17 digits of base-10 precision (i.e. we measure them to that precision). This is why we use 80-bit floats extensively (usually via Linux, but also on non-M1 Mac hardware if you use the correct compilers).

NumPy is a key component of the PINT software to do high-precision pulsar timing, and we use it partly *because* it has long double support (with 80-bit extended precision): https://github.com/nanograv/PINT
And see the published paper here, particularly Sec 3.3.1 and footnote #42: https://ui.adsabs.harvard.edu/abs/2021ApJ...911...45L/abstract

Going to software quad precision would certainly work, but it would definitely make things much slower for our matrix and vector math.

We would definitely love to see a solution that lets us get the extra precision we need on other platforms besides (primarily) Intel/AMD64 + Linux, but giving up extended precision on those platforms would *definitely* hurt. I can tell you that the pulsar community would definitely be against option "B". And I suspect that there are other users out there as well.

Scott

NANOGrav Chair
www.nanograv.org

--
Scott M. Ransom            Address: NRAO
Phone: (434) 296-0320               520 Edgemont Rd.
email: sransom@nrao.edu             Charlottesville, VA 22903 USA
GPG Fingerprint: A40A 94F2 3F48 4136 3AC4 9598 92D5 25CB 22A6 7B65
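To make the precision point above concrete, here is a minimal sketch (illustrative magnitudes only, not code from PINT) of float64 versus 80-bit longdouble resolution at decade-scale time spans:

    # Minimal sketch (not from PINT) of time resolution at ~decade spans;
    # on platforms where longdouble aliases double, both lines print the same:
    import numpy as np

    dt = 10 * 365.25 * 86400.0              # ~10 years in seconds, ~3.16e8
    print(np.spacing(np.float64(dt)))       # ~6e-8 s: float64 cannot resolve 1-10 ns
    print(np.spacing(np.longdouble(dt)))    # ~3e-11 s on 80-bit platforms: it can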
On Thu, Nov 17, 2022 at 5:29 PM Scott Ransom <sransom@nrao.edu> wrote:
<snip>
Hi Scott,

Thanks for sharing your feedback! Would you or some of your colleagues be open to helping maintain a library that adds the 80-bit extended precision dtype into NumPy? This would be a variation of Ralf's "option A".

Best,
Stephan
On Thu, 2022-11-17 at 17:48 -0800, Stephan Hoyer wrote:
On Thu, Nov 17, 2022 at 5:29 PM Scott Ransom <sransom@nrao.edu> wrote:
<snip>
Hi Scott,
Thanks for sharing your feedback!
Would you or some of your colleagues be open to helping maintain a library that adds the 80-bit extended precision dtype into NumPy? This would be a variation of Ralf's "option A."
As you know, I am hesitant, mainly because we don't clearly know how many users are affected. To explain the hesitation:

* This change is "critical": for affected users their code just breaks, without even a clear path to fix it.
* The number of affected users is rather limited, but at this time we do not know that it is very, very small :(. (Not counting users who could modify their code to use doubles.)

Now, given that, it falls into a category that to me makes the break big enough to be a tough choice. We don't normally break users in a critical way, and when we do, I think it is either for an important bug fix elsewhere or because we suspect there are so few users that we could practically help everyone individually to find a solution.

Fortunately, I do think either (A) or (C) would make the situation much simpler:

(A) is great because it should give an alternative for many (maybe not all) users. I.e. for them the change has big impact, but hopefully is not critical. Considering this discussion, we might need a longdouble DType and not just a quad-precision one to actually achieve that.

(C) would effectively reduce the number of affected users to be exceedingly small (I think).

So to me personally, given how tricky continuing support seems to be, either (A) or (C) seems sufficient to make it pretty clear cut (it is still a large compatibility change that should be formally accepted as a brief NEP, similar to yanking the financial functions). Without any "mitigation" it is a tough decision to make. Maybe there is no decision to make, because longdouble is a house of cards bound to collapse unless dedicated maintainers speak up...

Cheers,

Sebastian

PS: One problem we may have is API/ABI compatibility when we yank things out. I do think NumPy 2.0 is on the horizon, so that should help. ABI incompatibility should (IMO) be avoided at almost all cost, which means we may need a way to make sure that if you compile for NumPy 2.0 but against an old NumPy version, you still have a compatibility header in place which ensures that e.g. `npy_longdouble` is not even defined (or its use is at least carefully curated). So I do suspect and hope that we can pull off this (and other such changes) as an "API break", but in a way that allows compiling once and being ABI compatible with both old and new NumPy.
On Thu, Nov 17, 2022 at 6:30 PM Scott Ransom <sransom@nrao.edu> wrote:
<snip>
Pulsar timing is one reason I wanted a quad precision time type. I thought Astropy was using a self-implemented double-double type to work around that?

Chuck
On 11/17/22 8:53 PM, Charles R Harris wrote:
<snip>
That is correct. For non-compute-intensive time calculations, Astropy has a Time object that internally uses two 64-bit floats. We use it, and it works great for high-precision timekeeping over astronomical times. *However*, it ain't fast. So you can't do fast matrix/vector math on time differences where your precision exceeds a single 64-bit float. That's exactly where we are with extended precision for our pulsar timing work.

Scott
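For context on the double-double idea mentioned above: every operation is built out of error-free float64 transformations, which is what makes it correct but slow. A minimal sketch of the core building block (Knuth's two-sum; illustrative, not Astropy's actual code):

    # Minimal sketch of the double-double building block (not Astropy's code):
    import numpy as np

    def two_sum(a, b):
        # Error-free transformation: s + e == a + b exactly, with s = fl(a + b)
        s = a + b
        bv = s - a
        e = (a - (s - bv)) + (b - bv)
        return s, e

    hi, lo = two_sum(np.float64(1e9), np.float64(1e-9))
    print(hi, lo)  # 1000000000.0 1e-09: the tiny part survives in the error term

Each double-double add or multiply needs several such float64 operations, versus a single hardware instruction for an 80-bit extended-precision operation.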
Hi Scott,

Thanks for providing input! Do you have a minimal example that shows the type of calculations you would like to be faster with extended precision? Just so we are clear on the goal for this use case.

Would you or some of your colleagues be open to helping maintain a library that adds the 80-bit extended precision dtype into NumPy? This would be a variation of Ralf's "option A". The NumPy community has resources and assistance for onboarding new contributors, and I'm happy to provide additional assistance.

Ralf - Are there other use cases that you have already identified?

Best,
Mike

Mike McCarty
Software Engineering Manager
NVIDIA <https://www.nvidia.com>

On Thu, Nov 17, 2022 at 9:04 PM Scott Ransom <sransom@nrao.edu> wrote:
<snip>
Hi Mike,

Good to hear from you!

On 11/17/22 10:36 PM, Mike McCarty wrote:
Hi Scott - Thanks for providing input! Do you have a minimal example that shows the type of calculations you would like to be faster with extended precision? Just so we are clear on the goal for this use case.
The majority of what we do is have a vector of extended precision times that we do various kinds of algebra on, using many NumPy routines and vectorized custom functions. We don't usually have to do too many LAPACK-style heavy-duty matrix operations in extended precision.
Would you or some of your colleagues be open to helping maintain a library that adds the 80-bit extended precision dtype into NumPy? This would be a variation of Ralf's "option A."
The NumPy community has resources and assistance for onboarding new contributors, and I'm happy to provide additional assistance.
There are possibly a couple of us who might be willing to have a look at that. It is important enough for what we do that we really need a solution.

Scott
Hello,

I am interested in contributing to this. I am a contributor to the PINT pulsar timing package Scott mentioned, but I am new to contributing to NumPy, and I am not sure where or how to begin. Are there beginner-level resources that I can look at?

Regards,

Abhimanyu Susobhanan
Postdoctoral Fellow
Centre for Gravitation, Cosmology, and Astrophysics
University of Wisconsin-Milwaukee
On Sun, Dec 4, 2022 at 9:36 AM <abhisrkckl@gmail.com> wrote:
<snip>
I think the first thing to do is have a discussion in the Astropy community. Dependence on extended precision is fragile at best: it isn't available on all platforms and probably doesn't offer enough precision to be future proof. Then gather some performance data.

1. What computations are needed?
2. Can the computational algorithms be improved?
3. Would parallel computation help?
4. How much faster is extended precision?
5. Where are the performance bottlenecks in the double-double time type?
6. Is there hope of improving the double-double performance?

My own hope would be that a combination of parallel computation and lower-level code for double-double would be a viable solution.

Chuck
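As a concrete starting point for question 4 in the list above, a hedged micro-benchmark sketch; the numbers are machine-dependent, and on platforms where longdouble aliases double the comparison is meaningless:

    # Rough micro-benchmark sketch for "how much faster is extended precision?":
    import numpy as np
    import timeit

    x = np.random.rand(1_000_000)
    xld = x.astype(np.longdouble)

    for name, arr in [("float64", x), ("longdouble", xld)]:
        # Elementwise work similar to pulsar-timing style vector math
        t = timeit.timeit(lambda arr=arr: np.sin(arr) + arr * arr, number=10)
        print(f"{name}: {t:.3f} s")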
Hi Scott,

may I ask which kind of vector / matrix operation in extended precision (np.longdouble) is supported in PINT? It can't be backed by the underlying BLAS library, as extended precision functions are not supported there.

Carl

On Tue, Dec 6, 2022 at 15:24, Charles R Harris <charlesr.harris@gmail.com> wrote:
<snip>
On 12/14/22 3:01 PM, Carl Kleffner wrote:
Hi Scott,
may I ask which kind of vector / matrix operation in extended precision (np.longdouble) is supported in PINT? It can't be backed by the underlying BLAS library, as extended precision functions are not supported there.
Carl
Hi Carl,

The vast majority of them are simply a vector of np.longdoubles (which are usually times or time differences) which are operated on by regular NumPy math functions to compute new quantities: for example, polynomial computations based on time, many trigonometric operations on the times for orbit calculations, things like that.

We don't really do big longdouble matrix operations.

Scott
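A hypothetical sketch of that kind of computation (the values are made up for illustration; this is not code from PINT):

    # Illustrative longdouble vector math of the kind described above:
    import numpy as np

    dt = np.linspace(0, 3.0e8, 1000, dtype=np.longdouble)  # seconds over ~a decade
    f0 = np.longdouble("339.123456789012345")  # example spin frequency (Hz), ~18 digits
    f1 = np.longdouble(-1.6e-15)               # example spin-down rate (Hz/s)

    # Pulse phase as a polynomial in time; plain NumPy ufuncs stay in longdouble:
    phase = f0 * dt + 0.5 * f1 * dt**2
    frac = phase - np.floor(phase)             # fractional phase
    print(phase.dtype, frac[:3])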
It sounds like a good place to start looking for improvements would be addition and multiplication.

Chuck
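[To flesh out where the double-double multiplication cost comes from: an error-free product without FMA requires Dekker's algorithm with Veltkamp splitting, roughly an order of magnitude more operations than one hardware extended-precision multiply. A minimal, unoptimized sketch:

    def split(a, factor=2.0**27 + 1.0):
        # Veltkamp split: a == hi + lo, each half fitting in ~26 significand bits.
        c = factor * a
        hi = c - (c - a)
        lo = a - hi
        return hi, lo

    def two_prod(a, b):
        # Error-free transformation (Dekker): p + e == a * b exactly.
        p = a * b
        a_hi, a_lo = split(a)
        b_hi, b_lo = split(b)
        e = ((a_hi * b_hi - p) + a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo
        return p, e

On hardware with FMA the error term collapses to a single fma(a, b, -p), which is why vectorized FMA-based double-double kernels are the usual route to making this fast.]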
Around 2015-2016 I created a fork of NumPy with IEEE 128-bit float support, but I never upstreamed it because I couldn't see a way to make it not depend on GCC/libquadmath (the licensing issue Chuck brought up). I wonder if it's possible now, with the dtype changes, to have different dtypes on different platforms (it didn't appear to be the case when I looked then), which would avoid the need to distribute third-party libraries to cover certain platforms.

Regards,
James

On Fri, 18 Nov 2022 at 12:28, Scott Ransom <sransom@nrao.edu> wrote:
On 11/17/22 7:13 PM, Charles R Harris wrote:
I would agree that extended precision is pretty useless. IIRC, it was mostly intended as an accurate way to produce double-precision results, an idea that was eventually dropped as not very useful. I'd happily do away with subnormal doubles as well; they were another not-very-useful idea. And strictly speaking, we should not support IBM double-double either, as it is not in the IEEE standard.
That said, I would like to have a quad-precision type. That precision is useful for some things, and I have a dream that someday it can be used for a time type. Unfortunately, last time I looked around, none of the available implementations had a NumPy-compatible license.
The tricky thing here is to not break downstream projects, but that may be unavoidable. I suspect the fallout will not be that bad.
Chuck
A quick response from one of the leaders of a team that requires 80-bit extended precision for astronomical work...

"Extended precision is pretty useless" unless you need it. And the high-precision pulsar timing community needs it. Standard double-precision (64-bit) values do not contain enough precision for us to pass relative astronomical times around as a single float: the precision ends up at the ~1 microsecond level over decades of time differences, and we need it at the ~1-10 ns level. Nor can we store the measured spin frequencies of our millisecond pulsars (or do calculations on them) with enough precision; those spin frequencies can have 16-17 digits of base-10 precision (i.e. we measure them to that precision). This is why we use 80-bit floats extensively (usually via Linux, but also on non-M1 Mac hardware if you use the correct compilers).
NumPy is a key component of the PINT software for high-precision pulsar timing, and we use it partly *because* it has long double support (with 80-bit extended precision): https://github.com/nanograv/PINT
See the published paper here, particularly Sec 3.3.1 and footnote #42: https://ui.adsabs.harvard.edu/abs/2021ApJ...911...45L/abstract
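[The precision claim above is easy to sanity-check with NumPy: the ULP (spacing) of a ~30-year time span expressed in seconds is on the order of 0.1 microseconds in float64, but well under a nanosecond in 80-bit extended precision. A back-of-the-envelope check; the printed values are as on x86 Linux, and on platforms where long double is 64-bit the two numbers come out identical:

    import numpy as np

    span = 30 * 365.25 * 86400.0              # ~9.5e8 seconds of elapsed time
    print(np.spacing(np.float64(span)))       # ~1.2e-07 s -> ~120 ns resolution
    print(np.spacing(np.longdouble(span)))    # ~5.8e-11 s -> ~0.06 ns resolution]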
Going to software quad precision would certainly work, but it would definitely make things much slower for our matrix and vector math.
We would love to see a solution that lets us get the extra precision we need on platforms other than (primarily) Intel/AMD64 + Linux, but giving up extended precision on those platforms would *definitely* hurt. I can tell you that the pulsar community would be against option "B", and I suspect that there are other users out there as well.
Scott
NANOGrav Chair
www.nanograv.org
--
Scott M. Ransom
Address: NRAO, 520 Edgemont Rd., Charlottesville, VA 22903 USA
Phone: (434) 296-0320
Email: sransom@nrao.edu
GPG Fingerprint: A40A 94F2 3F48 4136 3AC4 9598 92D5 25CB 22A6 7B65
participants (10)
- abhisrkckl@gmail.com
- Carl Kleffner
- Charles R Harris
- Ilhan Polat
- James Tocknell
- Mike McCarty
- Ralf Gommers
- Scott Ransom
- Sebastian Berg
- Stephan Hoyer