make_lsq_spline and conversion to float
Hello,

When using make_lsq_spline(x,y,t) with x being int64, I am running into the issue that the x-array is internally converted to float. While the input x-values are strictly ordered as int64, an exception is raised after the internal conversion to float, because some of the converted values become identical to each other.

I was wondering about the reason for that internal conversion to float. It seems to exist only to handle complex dtypes, or is there some other mechanism in the Cox-de Boor algorithm or the Cholesky step that requires it? It is not obvious to me.

Right now I can work around this problem by shifting the x-values so that the numbers are smaller and the conversion does no harm. However, this makes things really slow, or very unwieldy in the code.

Maybe it is feasible to modify _get_dtype a little?

Thanks and cheers, Thomas.
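A minimal sketch of the effect described in this message, assuming x-values just above 2**53 (the array is hypothetical and is not the poster's data). float64 has a 53-bit significand, so above that magnitude neighbouring integers can round to the same float:

```python
import numpy as np

# Hypothetical data: strictly increasing int64 values starting at 2**53,
# where float64 can no longer represent every integer exactly.
x = 2**53 + np.arange(6, dtype=np.int64)

# The same kind of conversion that happens when an integer array is cast to float.
x_float = x.astype(np.float64)

print(np.all(np.diff(x) > 0))        # strictly increasing as int64?
print(np.all(np.diff(x_float) > 0))  # still strictly increasing as float64?
print(np.unique(x_float).size, "distinct float values out of", x.size)
```

With values like these, the integer array is strictly increasing while the converted array contains repeated entries, which is the situation that triggers the exception.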
Hi,

Several reasons, in fact. First and foremost, the linear algebra is delegated to LAPACK, which only works in floating point. If you're using integer values so large that the conversion to float64 is lossy, essentially any integer arithmetic on them is nearly guaranteed to overflow. So the computations should stay in floating point. There might be a case for preserving float32 dtypes, but so far no use case has been reported, so that is a separate question.

All in all, you're best off shifting and scaling the inputs yourself. I'm actually surprised that this makes things slow: it is only O(N), while the linear algebra part should definitely be more costly.

Cheers,
Evgeni

(In reply to Thomas Hilger's message of Tue, Oct 26, 2021, 9:21 AM.)
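The advice to shift the inputs yourself could look roughly like the sketch below. It assumes hypothetical data (large int64 abscissae and a noisy sine), a cubic spline, and an arbitrary choice of ten interior knots; the helper `evaluate` is also made up for illustration. The key point is that the subtraction happens while the values are still int64, so it is exact, and only the small differences are converted to float64.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

rng = np.random.default_rng(0)

# Hypothetical data: large, strictly increasing int64 abscissae.
x = 2**53 + np.arange(200, dtype=np.int64)
y = np.sin(np.arange(200) / 20.0) + 0.01 * rng.standard_normal(200)

# Shift in the integer domain first (exact), then convert to float64.
x0 = x[0]
xs = (x - x0).astype(np.float64)

# Knot vector with (k + 1)-fold boundary knots and a few interior knots
# (the number of interior knots is an arbitrary choice for this sketch).
k = 3
interior = np.linspace(xs[0], xs[-1], 12)[1:-1]
t = np.r_[(xs[0],) * (k + 1), interior, (xs[-1],) * (k + 1)]

spl = make_lsq_spline(xs, y, t, k=k)

def evaluate(x_int):
    """Evaluate the fitted spline at int64 abscissae, applying the same shift."""
    shifted = np.asarray(x_int, dtype=np.int64) - x0
    return spl(shifted.astype(np.float64))

print(np.abs(evaluate(x) - y).max())
```

The shift is a single O(N) pass over x. Scaling, for example dividing xs by xs[-1], can be added in the same way as long as the same transform is applied again at evaluation time.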
Thanks a lot for the explanation.

With this information, I am now wondering as well why it became so much slower. I can only guess that it is because the problem is somewhat nested (~500000 splines) and a lot happens in the surrounding code. So the problem is somewhere else. Maybe the whole thing got a bit clumsy and can be structured differently... time to rethink what actually needs to be done.

Cheers, Thomas.

(In reply to Evgeni Burovski's message of Tue, Oct 26, 2021, 15:25.)
_______________________________________________ SciPy-User mailing list -- scipy-user@python.org To unsubscribe send an email to scipy-user-leave@python.org https://mail.python.org/mailman3/lists/scipy-user.python.org/ Member address: thomas.hilger@gmail.com
participants (2)
- Evgeni Burovski
- Thomas Hilger