Question about 64-bit integers being cast to double precision
What is the opinion of people here regarding the casting of 64-bit integers to double precision?

In scipy (as in Numeric), there is the concept of casting "safely" to a type. This concept is used when choosing a ufunc, for example. My understanding is that a 64-bit integer cannot be cast safely to a double-precision floating-point number, because precision is lost in the conversion. However, a signed 64-bit integer can usually be cast safely to a long double precision floating-point number.

This is not too big a deal on 32-bit systems, where people rarely request 64-bit integers. However, on some 64-bit systems (where the C long is 64-bit), Python's default integer is 64-bit. Therefore, a simple expression like sqrt(2), which requires conversion to floating point, will look for the first floating-point type to which a 64-bit integer can be cast safely. This can only be a long double. The result is that on 64-bit systems, the long double type gets used a lot more.

Is this acceptable? Expected? What do those of you on 64-bit systems think?

-Travis
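(For concreteness, here is a minimal pure-Python illustration of the precision loss in question; nothing scipy-specific is assumed.)

```python
# A C double has a 53-bit significand, so 64-bit integers above 2**53
# cannot survive the cast to double exactly.
n = 2**53 + 1                  # needs 54 significant bits
print(float(n) == n)           # False: the cast rounds down to 2**53
print(float(2**53) == 2**53)   # True: 2**53 itself is exactly representable
```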
Travis Oliphant wrote:
In scipy (as in Numeric), there is the concept of "Casting safely" to a type. This concept is used when choosing a ufunc, for example.
My understanding is that a 64-bit integer cannot be cast safely to a double-precision floating point number, because precision is lost in the conversion...The result is that on 64-bit systems, the long double type gets used a lot more. Is this acceptable? expected? What do those of you on 64-bit systems think?
I am not on a 64-bit system, but I can give you the perspective of someone who's thought a lot about floating-point precision in the context of both my research and of teaching classes on numerical analysis for physics majors. To take your example, and looking at it from an experimentalist's viewpoint: sqrt(2), where 2 is an integer, has only one significant figure, and so casting it to a long double seems like extreme overkill.

The numerical analysis community has probably had the greatest influence on the design of Fortran, and there sqrt(2) (2 integer) is simply not defined. The user must specify sqrt(2.0) to get a REAL result and sqrt(2.0d0) to get a DOUBLE PRECISION result. These usually map to IEEE 32- and 64-bit REALs today, respectively, on 32-bit hardware, and to IEEE 64- and 128-bit (is there such a thing?) on 64-bit hardware. I imagine that if there were an integer square root function in Fortran, it would simply round to the nearest integer. In addition, the idea of "casting safely" would, it seems to me, also require sqrt(2) to return a double on a 32-bit machine.

The question, I think, is part of the larger question: to what extent should the language leave precision issues under the user's control, and to what extent should it make decisions automatically? A lot of the behind-the-scenes stuff which goes on in all the Fortran routines from Netlib which are now part of Scipy involves using the machine precision to decide on step sizes and other algorithmic choices. These choices become wrong if the underlying language changes precision without telling the user, a la C's old habit of automatically casting all floats to doubles.

With all that, my vote on Travis's specific question: if conversion of an N-bit integer in scipy_core is required, it gets converted to an N-bit float. The only cases in which precision will be lost are when the integer is large enough to require more than (N-e) bits for its representation, where e is the number of bits in the exponent of the floating-point representation. Those who really need to control precision should, in my view, create arrays of the appropriate type to begin with. I suppose these sorts of questions are why there are now special-purpose libraries for fixed-precision numbers.
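(The (N-e) rule, checked for N = 32 with present-day NumPy — an assumption, since the thread predates it. float32 has e = 8 exponent bits, leaving 24 significant bits.)

```python
import numpy as np

n = np.int32(2**24)                # exactly 24 significant bits
print(np.float32(n) == n)          # True: still exact after the cast
print(np.float32(n + 1) == n + 1)  # False: 25 bits needed, rounds back to 2**24
```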
Stephen Walton wrote:
Travis Oliphant wrote:
In scipy (as in Numeric), there is the concept of "Casting safely" to a type. This concept is used when choosing a ufunc, for example.
My understanding is that a 64-bit integer cannot be cast safely to a double-precision floating point number, because precision is lost in the conversion...The result is that on 64-bit systems, the long double type gets used a lot more. Is this acceptable? expected? What do those of you on 64-bit systems think?
I am not on a 64 bit system but can give you the perspective of someone who's thought a lot about floating point precision in the context of both my research and of teaching classes on numerical analysis for physics majors. To take your example, and looking at it from an experimentalist's viewpoint, sqrt(2) where 2 is an integer has only one significant figure, and so casting it to a long double seems like extreme overkill.
I agree, which is why it concerned me when I saw it. But, it is consistent with the rest of the casting features.
With all that, my vote on Travis's specific question: if conversion of an N-bit integer in scipy_core is required, it gets converted to an N-bit float. The only cases in which precision will be lost is if the integer is large enough to require more than (N-e) bits for its representation, where e is the number of bits in the exponent of the floating point representation.
Yes, it is only for large integers that problems arise. I like this scheme: it would be very easy to implement, and it would provide a consistent interface. The only problem is that it would mean that on current 32-bit systems, sqrt(2) would cast 2 to a "single-precision" float and return a single-precision result. If that is not a problem, then great... Otherwise, a more complicated (and less consistent) rule like

integer   float
===============
8-bit     32-bit
16-bit    32-bit
32-bit    64-bit
64-bit    64-bit

would be needed (this is also not too hard to do).

-Travis
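(A quick sketch of why the table pairs these widths: a float whose significand has m bits represents every integer up to 2**m exactly, with m = 24 for float32 and m = 53 for float64. The helper below is hypothetical, not scipy API.)

```python
SIG_BITS = {32: 24, 64: 53}  # float width -> significand bits (implicit bit included)

def narrowest_float(int_bits):
    """Narrowest IEEE float (32 or 64 bits) holding every value of a signed int type."""
    for float_bits, m in sorted(SIG_BITS.items()):
        if int_bits - 1 <= m:  # a signed N-bit int has N-1 magnitude bits
            return float_bits
    return 64  # nothing holds int64 exactly; fall back to 64-bit, as in the table

print({n: narrowest_float(n) for n in (8, 16, 32, 64)})
# {8: 32, 16: 32, 32: 64, 64: 64} -- matching the table above
```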
Travis Oliphant wrote:
With all that, my vote on Travis's specific question: if conversion of an N-bit integer in scipy_core is required, it gets converted to an N-bit float. The only cases in which precision will be lost is if the integer is large enough to require more than (N-e) bits for its representation, where e is the number of bits in the exponent of the floating point representation.
Yes, it is only for large integers that problems arise. I like this scheme and it would be very easy to implement, and it would provide a consistent interface.
The only problem is that it would mean that on current 32-bit systems
sqrt(2) would cast 2 to a "single-precision" float and return a single-precision result.
If that is not a problem, then great...
Otherwise, a more complicated (and less consistent) rule like
integer   float
===============
8-bit     32-bit
16-bit    32-bit
32-bit    64-bit
64-bit    64-bit
would be needed (this is also not too hard to do).
Here's a different way to think about this issue: instead of thinking in terms of bit-width, let's look at it in terms of exact vs. inexact numbers. Integers are exact, and their bit size only affects the range of values that is representable.

If we look at it this way, then it seems to me justifiable to suggest that sqrt(2) should upcast to the highest-available-precision floating-point format. Obviously this can have an enormous memory impact if we're talking about a big array of numbers instead of sqrt(2), so I'm not 100% sure it's the right solution. However, I think that the rule 'if you apply "floating point" operations to integer inputs, the system will upcast the integers to give you as much precision as possible' is a reasonable one. Users needing tight memory control could always first convert their small integers to the smallest existing floats, and then operate on those.

Just my 1e-2

Cheers,

f
On Wed, 2005-10-12 at 16:33 -0600, Fernando Perez wrote:
Travis Oliphant wrote:
With all that, my vote on Travis's specific question: if conversion of an N-bit integer in scipy_core is required, it gets converted to an N-bit float. The only cases in which precision will be lost is if the integer is large enough to require more than (N-e) bits for its representation, where e is the number of bits in the exponent of the floating point representation.
Yes, it is only for large integers that problems arise. I like this scheme and it would be very easy to implement, and it would provide a consistent interface.
The only problem is that it would mean that on current 32-bit systems
sqrt(2) would cast 2 to a "single-precision" float and return a single-precision result.
If that is not a problem, then great...
Otherwise, a more complicated (and less consistent) rule like
integer   float
===============
8-bit     32-bit
16-bit    32-bit
32-bit    64-bit
64-bit    64-bit
would be needed (this is also not too hard to do).
Here's a different way to think about this issue: instead of thinking in terms of bit-width, let's look at it in terms of exact vs. inexact numbers. Integers are exact, and their bit size only affects the range of values that is representable.
If we look at it this way, then it seems to me justifiable to suggest that sqrt(2) should upcast to the highest-available-precision floating-point format. Obviously this can have an enormous memory impact if we're talking about a big array of numbers instead of sqrt(2), so I'm not 100% sure it's the right solution. However, I think that the rule 'if you apply "floating point" operations to integer inputs, the system will upcast the integers to give you as much precision as possible' is a reasonable one. Users needing tight memory control could always first convert their small integers to the smallest existing floats, and then operate on those.
I think it is a good idea to keep double as the default, if only because Python expects it. If someone needs more control over the precision of arrays, why not do as C does and add functions sqrtf and sqrtl?

Chuck
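(Chuck's premise is easy to confirm in present-day Python — sys.float_info postdates this thread, so treat it as an illustration: a Python float is a C double, with a 53-bit significand and an 11-bit exponent.)

```python
import sys

# A Python float is a C double: 53 significand bits, binary exponent up to 1024.
print(sys.float_info.mant_dig)  # 53
print(sys.float_info.max_exp)   # 1024
```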
Charles R Harris wrote:
I think it is a good idea to keep double as the default, if only because Python expects it. If someone needs more control over the precision of arrays, why not do as C does and add functions sqrtf and sqrtl?
Usages like sqrtf() and sqrtl() begin to look like pre-1975 Fortran, before generic functions were introduced. I can't change Python's basic behavior, but would rather that sqrt(scipy_integer_array) be simply disallowed in favor of requiring the user to explicitly change the type of the array to float, double, or long double.
On Wed, 12 Oct 2005, Charles R Harris wrote:
On Wed, 2005-10-12 at 16:33 -0600, Fernando Perez wrote:
Travis Oliphant wrote:
With all that, my vote on Travis's specific question: if conversion of an N-bit integer in scipy_core is required, it gets converted to an N-bit float. The only cases in which precision will be lost is if the integer is large enough to require more than (N-e) bits for its representation, where e is the number of bits in the exponent of the floating point representation.
Yes, it is only for large integers that problems arise. I like this scheme and it would be very easy to implement, and it would provide a consistent interface.
The only problem is that it would mean that on current 32-bit systems
sqrt(2) would cast 2 to a "single-precision" float and return a single-precision result.
If that is not a problem, then great...
Otherwise, a more complicated (and less consistent) rule like
integer   float
===============
8-bit     32-bit
16-bit    32-bit
32-bit    64-bit
64-bit    64-bit
would be needed (this is also not too hard to do).
Here's a different way to think about this issue: instead of thinking in terms of bit-width, let's look at it in terms of exact vs. inexact numbers. Integers are exact, and their bit size only affects the range of values that is representable.
If we look at it this way, then it seems to me justifiable to suggest that sqrt(2) should upcast to the highest-available-precision floating-point format. Obviously this can have an enormous memory impact if we're talking about a big array of numbers instead of sqrt(2), so I'm not 100% sure it's the right solution. However, I think that the rule 'if you apply "floating point" operations to integer inputs, the system will upcast the integers to give you as much precision as possible' is a reasonable one. Users needing tight memory control could always first convert their small integers to the smallest existing floats, and then operate on those.
I think it is a good idea to keep double as the default, if only because Python expects it. If someone needs more control over the precision of arrays, why not do as C does and add functions sqrtf and sqrtl?
I also think that double should be kept as the default. If I understand things correctly, both normal Python and all the libraries for scipy can only deal with that at the moment.

The need for long double precision (and even multiple-precision arithmetic) does arise in some situations, but I am not sure if this will be the default in the near future. Still, it would be great if there were a `long double` version of scipy on those platforms which support this natively. This would require long double versions of the basic math/cmath functions, of cephes (and all the other routines from scipy.special), of fft, ATLAS, root finding, etc. etc. This would require major work, I fear, as for example several constants are hard-coded to work for double precision and nothing else. Does this mean that one would need a ``parallel`` installation of scipy_long_double, to do

import scipy_long_double as scipy

to perform all computations using `long double` (possibly after some modifications to the array declarations)?

If double precision is kept as the default, a conversion of a large integer would raise an OverflowError, as is done right now.

Best, Arnd
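(As an aside, what `long double` actually buys you varies by platform; on many x86 builds it is 80-bit extended precision. A quick check with modern NumPy — an assumed environment:)

```python
import numpy as np

# Shows the precision and range of the platform's long double.
print(np.finfo(np.longdouble))
```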
Arnd Baecker wrote:
I also think that double should be kept as the default. If I understand things correctly, both normal Python and all the libraries for scipy can only deal with that at the moment.
I respectfully disagree that double should be the default target for upcasts. This is a holdover from C and was a bad decision when made for that language. And, as Pearu points out, it has dire consequences for storage. If I get a 16 Megapixel image from HST with two-byte integers, I definitely would not want that image upcast to 64 or, heaven forfend, 128 bits the first time I did an operation on it.
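(The storage arithmetic behind that example, spelled out:)

```python
# A 16-megapixel image of 2-byte integers, and what each upcast target costs.
pixels = 16 * 2**20
for name, nbytes in [("int16", 2), ("float64", 8), ("128-bit float", 16)]:
    print(name, pixels * nbytes // 2**20, "MiB")  # 32, 128, 256 MiB
```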
On Wed, 2005-10-26 at 09:00 -0700, Stephen Walton wrote:
Arnd Baecker wrote:
I also think that double should be kept as the default. If I understand things correctly, both normal Python and all the libraries for scipy can only deal with that at the moment.
I respectfully disagree that double should be the default target for upcasts. This is a holdover from C and was a bad decision when made for that language. And, as Pearu points out, it has dire consequences for storage. If I get a 16 Megapixel image from HST with two-byte integers, I definitely would not want that image upcast to 64 or, heaven forfend, 128 bits the first time I did an operation on it.
I think there are two goals here: 1) it just works, and 2) it is efficient. These goals are not always compatible. In order to just work, certain defaults need to be assumed; Python works like that, and it is one of the reasons it is so convenient. On the other hand, efficiency, space efficiency in particular, requires greater control on the part of the programmer, who has to take the trouble to pick the types he wants to use, making a trade between precision, space, and speed. So I think that we should choose reasonable defaults that carry on the Python spirit, while leaving open options for the programmer who wants more control. How to do this without making a mess is the question. Now, Python does the following:
>>> from math import *
>>> sqrt(2)
1.4142135623730951
and if we are going to overload sqrt we should keep this precision. Do we really want to make a distinction in this case between math.sqrt and Numeric.sqrt? I myself don't think so. On the other hand, it is reasonable that scipy not promote float types in this situation. Integral types remain a problem. What about uint8 vs uint64, for instance? Maybe we should either require a cast of integral types to a float type for arrays, or define distinct functions like sqrtf and sqrtl to handle this. I note that a complaint has been made that this is unwieldy and a throwback, but I don't think so. The integer case is, after all, ambiguous. The automatic selection of type only really makes sense for floats, or if we explicitly state that maximum precision, but no more than necessary, should be maintained. But what happens then for int64 when we have a machine whose default float is double-double?

Chuck
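(A sketch of what sqrtf and sqrtl might look like as thin wrappers; these are the hypothetical functions from the discussion, written against modern NumPy, not an existing scipy API.)

```python
import numpy as np

def sqrtf(a):
    """Single-precision sqrt regardless of the input's type (hypothetical)."""
    return np.sqrt(np.asarray(a, dtype=np.float32))

def sqrtl(a):
    """Long-double sqrt, where the platform provides one (hypothetical)."""
    return np.sqrt(np.asarray(a, dtype=np.longdouble))

print(sqrtf(2).dtype)  # float32
print(sqrtl(2).dtype)  # float128 on many platforms; platform-dependent
```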
Charles R Harris wrote: [...]
Now, python does the following:
>>> from math import *
>>> sqrt(2)
1.4142135623730951
and if we are going to overload sqrt we should keep this precision. Do we really want to make a distinction in this case between math.sqrt and Numeric.sqrt ? I myself don't think so. On the other hand, it is reasonable that scipy not promote float types in this situation. Integral types remain a problem. What about uint8 vs uint64 for instance?
Again, I find it simplest to think about this problem in terms of exact/approximate numbers. All integer types (of any bit-width) are exact; all float numbers are approximate. The question is then how to handle functions, which can be (in terms of their domain/range relation):

1. f : exact -> exact
2. f : exact -> approximate

etc.

My argument is that for #2, there should be upcasting to the widest possible approximate type, in an attempt to preserve as much of the original information as we can. For example, sqrt(2) should upcast to double, because truncation to integer makes very little practical sense.

The case of accumulators is special, because they are of type 1 above, but the result may not (and often doesn't) fit in the input type. Travis already agreed that in this case an upcast was a reasonable compromise.

However, for functions of the kind

3. f : approx -> approx

there should in general be no upcasting (except for accumulators, as we've discussed). Doing a*b on two float arrays should certainly not produce an enormous result, which may not even fit in memory.

Just my opinion.

Cheers,

f
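(For reference, both cases behave this way in present-day NumPy — an assumed environment, not the 2005 scipy_core under discussion:)

```python
import numpy as np

a = np.ones(3, dtype=np.float32)
print((a * a).dtype)         # float32: approx -> approx, no upcast (case 3)
print(np.sqrt(np.int64(2)))  # 1.4142135623730951: exact -> approx, upcast to double (case 2)
```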
On 10/26/05, Fernando Perez <Fernando.Perez@colorado.edu> wrote:
Charles R Harris wrote:
[...]
Now, python does the following:
from math import * sqrt(2)
1.4142135623730951
and if we are going to overload sqrt we should keep this precision. Do we really want to make a distinction in this case between math.sqrt and Numeric.sqrt ? I myself don't think so. On the other hand, it is reasonable that scipy not promote float types in this situation. Integral types remain a problem. What about uint8 vs uint64 for instance?
Again, I find it simplest to think about this problem in terms of exact/approximate numbers. All integer types (of any bit-width) are exact, all float numbers are approximate. The question is then how to handle functions, which can be (in terms of their domain/range relation):
1. f : exact -> exact
2. f : exact -> approximate
etc.
My argument is that for #2, there should be upcasting to the widest possible approximate type, in an attempt to preserve as much of the original information as we can. For example, sqrt(2) should upcast to double, because truncation to integer makes very little practical sense.
Yes, I agree with this. The only problem I see is if someone wants to save space when taking the sqrt of an integral array. There are at least three possibilities:

1. cast the result to a float
2. cast the argument to a float
3. use a special sqrtf function

The first two options use more temporary space, take more time, and look uglier (IMHO). On the other hand, the needed commands are already implemented. The last option is clear and concise, but needs a new ufunc.

Chuck
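(The three possibilities side by side, in sketch form; the NumPy spelling is a modern assumption, and sqrtf remains the hypothetical new ufunc:)

```python
import numpy as np

a = np.arange(10, dtype=np.int64)

r1 = np.sqrt(a).astype(np.float32)   # 1. cast the result (via a float64 temporary)
r2 = np.sqrt(a.astype(np.float32))   # 2. cast the argument (via a float32 copy)
# r3 = sqrtf(a)                      # 3. a dedicated single-precision ufunc
```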
Charles R Harris wrote:
Yes, I agree with this. The only problem I see is if someone wants to save space when taking the sqrt of an integral array. There are at least three possibilities:
1. cast the result to a float
2. cast the argument to a float
3. use a special sqrtf function
The first two options use more temporary space, take more time, and look uglier (IMHO). On the other hand, the needed commands are already implemented. The last option is clear and concise, but needs a new ufunc.
Well, while I agree with the recently posted design guideline from Guido of using different functions for different purposes rather than flags, this may be a case where a flag would be a good choice, especially because we already have a conceptual precedent in the accumulators for specifying the return type via a flag: a.sum(rtype=int).

Since the 'mental slot' is already in scipy users' heads for saying 'modify the default output of this function to accumulate/store data in a different type', I think it would be reasonable to offer

sqrt(a, rtype=float)

as an optional way to prevent automatic upcasting in cases where users want that kind of very fine-grained control. This can be done uniformly across the library, rather than growing a zillion foof/food/foo* post-fixed forms of every ufunc in the library.

We would then have:

- A basic principle for how upcasting is done, driven by the idea of 'protecting precision even at the cost of storage'. This principle forces sqrt(2) to be a double and an int_array.sum() to accumulate to a wider type.

- A uniform mechanism for overriding upcasting across the library, via the rtype flag.

If most/all of scipy implements this, it seems like a small learning price to pay for a reasonable balance between convenience, correctness and efficiency. Or am I missing some usage case that this would not satisfy?

Cheers,

f
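(As it happens, NumPy's ufuncs eventually grew a very similar knob under the name dtype; the check below assumes a modern NumPy and is offered only as a point of comparison with the rtype spelling proposed here.)

```python
import numpy as np

a = np.arange(4, dtype=np.int64)
print(np.sqrt(a).dtype)                    # float64: the default upcast
print(np.sqrt(a, dtype=np.float32).dtype)  # float32: the caller overrides it
```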
On 10/26/05, Fernando Perez <Fernando.Perez@colorado.edu> wrote:
Charles R Harris wrote:
Yes, I agree with this. The only problem I see is if someone wants to save space when taking the sqrt of an integral array. There are at least three possibilities:
1. cast the result to a float
2. cast the argument to a float
3. use a special sqrtf function
The first two options use more temporary space, take more time, and look uglier (IMHO). On the other hand, the needed commands are already implemented. The last option is clear and concise, but needs a new ufunc.
Well, while I agree with the recently posted design guideline from Guido of using different functions for different purposes rather than flags, this may be a case where a flag would be a good choice. Especially because we already have a conceptual precedent for the accumulators of specifying the return type via a flag: a.sum(rtype=int).
Since the 'mental slot' is already in scipy's users heads for saying 'modify the default output of this function to accumulate/store data in a different type', I think it would be reasonable to offer
sqrt(a,rtype=float)
as an optional way to prevent automatic upcasting in cases where users want that kind of very fine level control. This can be done uniformly across the library, rather than growing a zillion foof/food/foo* post-fixed forms of every ufunc in the library.
We would then have:
- A basic principle for how upcasting is done, driven by the idea of 'protecting precision even at the cost of storage'. This principle forces sqrt(2) to be a double and an int_array.sum() to accumulate to a wider type.
- A uniform mechanism for overriding upcasting across the library, via the rtype flag. If most/all of scipy implements this, it seems like a small price of learning to pay for a reasonable balance between convenience, correctness and efficiency.
Yes, I think that would work well. Most of us, most of the time, could then rely on the unmodified functions to do the right thing. On the rare occasion that space really mattered, there would be a fallback position. It would also be easy to use a global type string, mytype = 'Float32', and call everything critical with rtype=mytype. That would make it easy to change the behaviour of fairly large programs.

Chuck
Charles R Harris wrote:
Since the 'mental slot' is already in scipy's users heads for saying 'modify the default output of this function to accumulate/store data in a different type', I think it would be reasonable to offer
sqrt(a,rtype=float)
This would require some rewriting of the internals, which might be tricky to get right because of the conflict with the optional output arguments that are already available.

Look at sqrt.types. This shows you the types that actually have functions available. Everything else has to be cast to something. Right now, the rule is basically: don't cast unless we can do so "safely," where the notion of "safely" is defined in a switch statement.

I suppose some way to bypass this default function selection and pick the function you specify instead might be useful, especially because, with the way type conversion is handled now through a buffer, it is a lot different (for large arrays) to cast during the calculation of a ufunc than to do sqrt(a.astype(float)), which would make a copy of the data first.

-Travis
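(The same attribute survives on every ufunc in present-day NumPy — an assumed environment:)

```python
import numpy as np

# The inner loops sqrt actually provides; everything else must be cast.
print(np.sqrt.types)
# e.g. ['e->e', 'f->f', 'd->d', 'g->g', 'F->F', 'D->D', 'G->G', 'O->O']
```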
Charles R Harris wrote:
On 10/26/05, Fernando Perez <Fernando.Perez@colorado.edu> wrote:
Charles R Harris wrote:
Since the 'mental slot' is already in scipy's users heads for saying 'modify the default output of this function to accumulate/store data in a different type', I think it would be reasonable to offer
sqrt(a,rtype=float)
as an optional way to prevent automatic upcasting in cases where users want that kind of very fine level control. This can be done uniformly across the library, rather than growing a zillion foof/food/foo* post-fixed forms of every ufunc in the library.
We would then have:
- A basic principle for how upcasting is done, driven by the idea of 'protecting precision even at the cost of storage'. This principle forces sqrt(2) to be a double and an int_array.sum() to accumulate to a wider type.
- A uniform mechanism for overriding upcasting across the library, via the rtype flag. If most/all of scipy implements this, it seems like a small price of learning to pay for a reasonable balance between convenience, correctness and efficiency.
Yes, I think that would work well. Most of us, most of the time, could then rely on the unmodified functions to do the right thing. On the rare occasion that space really mattered, there would be a fallback position. It would also be easy to use a global type string mytype = 'Float32' and call everything critical with rtype=mytype. That would make it easy to change the behaviour of fairly large programs.
I agree that this would be a nice consistent interface (if we can implement it) :)

I've added text to the docstrings for a.sum() and a.mean() to reflect their new behaviour (re. thread on int8 array operations) and the role of the 'rtype' argument there. Let me know if you think anything's wrong. Otherwise we could aim to migrate gradually to similar behaviour with other functions.

I'm not sure that 'rtype' (for 'return type'?) is the most accurate name. For a.mean() the rtype is currently the type used for intermediate calculations (in a.sum()), not the return type. (The return type is float, even if the 'rtype' is int, and I agree with this behaviour.) The same is true, in a sense, for a.sum(). The second example in the new a.sum() docstring is:

>>> array([0.5, 1.5]).sum(rtype=int32)
1

where the floats are downcast to int32 before the sum. My guess is that a user who goes to the trouble of specifying a non-default data type for an operation is at least as interested in the data type of the intermediate operations as in the return type. Perhaps we should think instead about the data types used for intermediate operations, as sum() and mean() do now, and rename the argument 'itype'.

Another option would be to change the behaviour of a.sum() and a.mean() so they really do return the given type. But I'm not keen on this, since we can already achieve this without any 'rtype' argument by casting the output to the desired type, and this leaves us less control over what actually goes on behind the scenes...

Comments?!

-- Ed
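(For reference, Ed's example carries over to present-day NumPy, where the argument became dtype; the environment is an assumption:)

```python
import numpy as np

# Each element is cast to int32 before summing: int32(0.5) + int32(1.5) = 0 + 1.
print(np.array([0.5, 1.5]).sum(dtype=np.int32))  # 1
```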
Fernando Perez wrote:
Charles R Harris wrote:
Yes, I agree with this. The only problem I see is if someone wants to save space when taking the sqrt of an integral array. There are at least three possiblilities:
1. cast the result to a float 2. cast the argument to a float 3. use a special sqrtf function
The first two options use more temporary space, take more time, and look uglier (IMHO). On the other hand, the needed commands are already implemented. The last option is clear and concise, but needs a new ufunc.
Well, while I agree with the recently posted design guideline from Guido of using different functions for different purposes rather than flags, this may be a case where a flag would be a good choice. Especially because we already have a conceptual precedent for the accumulators of specifying the return type via a flag: a.sum(rtype=int).
Since the 'mental slot' is already in scipy's users heads for saying 'modify the default output of this function to accumulate/store data in a different type', I think it would be reasonable to offer
sqrt(a,rtype=float)
One thing we could do is take advantage of the "indexing capabilities" of the ufunc object, which are unused at the moment, and do something like

sqrt[float](a)

where the argument to the index would be the desired output type, or something.

-Travis
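(A toy sketch of the indexing idea; this wrapper is hypothetical, not a real ufunc feature, and a real implementation would select the matching inner loop rather than casting after the fact.)

```python
import numpy as np

class TypedUfunc:
    """Wrap a ufunc so that indexing fixes the output type, as in sqrt[float](a)."""
    def __init__(self, ufunc):
        self._ufunc = ufunc
    def __call__(self, *args, **kwargs):
        return self._ufunc(*args, **kwargs)
    def __getitem__(self, out_type):
        return lambda *args: self._ufunc(*args).astype(out_type)

sqrt = TypedUfunc(np.sqrt)
print(sqrt[np.float32](np.arange(4)).dtype)  # float32
```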
Travis Oliphant wrote:
Fernando Perez wrote:
Since the 'mental slot' is already in scipy's users heads for saying 'modify the default output of this function to accumulate/store data in a different type', I think it would be reasonable to offer
sqrt(a,rtype=float)
One thing we could do is take advantage of the "indexing capabilities" of the ufunc object which are unused at the moment and do something like
sqrt[float](a)
Where the argument to the index would be the desired output type or something.
This one is a bit intriguing. I kind of like it, but I worry that it's a bit too unique a usage. I've never seen this kind of use elsewhere in Python code 'in the wild', and I wonder if it's not too orthogonal to common usage to force people to learn this particular special case.

Cheers,

f
On 10/31/05, Fernando Perez <Fernando.Perez@colorado.edu> wrote:
Travis Oliphant wrote:
Fernando Perez wrote:
Since the 'mental slot' is already in scipy's users heads for saying 'modify the default output of this function to accumulate/store data in a different type', I think it would be reasonable to offer
sqrt(a,rtype=float)
One thing we could do is take advantage of the "indexing capabilities" of the ufunc object which are unused at the moment and do something like
sqrt[float](a)
Where the argument to the index would be the desired output type or something.
This one is a bit intriguing. I kind of like it, but I worry that it's a bit too unique of a usage. I've never seen this kind of use elsewhere in python code 'in the wild', and I wonder if it's not too orthogonal to common usage to force people to learn this particular special case.
I kind of like it too. I'm not too worried about the special usage case, because controlling types in Python is itself a special usage. I guess another question is why the indexing capability was originally added.

Chuck
I kind of like it too. I'm not too worried about the special usage case, because controlling types in Python is itself a special usage. I guess another question is why the indexing capability was originally added.
I'm not sure what you mean; perhaps my wording was confusing. There is currently no indexing capability on ufunc objects. But, because ufuncs are Python types, we could add indexing capability to accommodate a use such as this one. Most likely the use case would return a special ufunc object that would handle casting differently than the default.

-Travis
Charles R Harris wrote:
On Wed, 2005-10-26 at 09:00 -0700, Stephen Walton wrote:
Arnd Baecker wrote:
I also think that double should be kept as the default. If I understand things correctly, both normal Python and all the libraries for scipy can only deal with that at the moment.
I respectfully disagree that double should be the default target for upcasts. This is a holdover from C and was a bad decision when made for that language. And, as Pearu points out, it has dire consequences for storage. If I get a 16 Megapixel image from HST with two-byte integers, I definitely would not want that image upcast to 64 or, heaven forfend, 128 bits the first time I did an operation on it.
I think there are two goals here: 1) it just works, and 2) it is efficient. These goals are not always compatible. In order to just work, certain defaults need to be assumed; Python works like that, and it is one of the reasons it is so convenient. On the other hand, efficiency, space efficiency in particular, requires greater control on the part of the programmer, who has to take the trouble to pick the types he wants to use, making a trade between precision, space, and speed. So I think that we should choose reasonable defaults that carry on the Python spirit, while leaving open options for the programmer who wants more control. How to do this without making a mess is the question.
Maybe the arrays could have some 'manual type control' flag (which could be switched on, e.g., when a type is stated explicitly in an array constructor) - then 1) everything would just work, and 2) a user could always set 'manual on', causing all ops on that array to return an array of the same (or given (via rtype?)) type. I know, it still does not solve how to do the 'it just works' part.

with just my 2 cents,
r.
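(A toy sketch of the 'manual type control' idea; the wrapper is hypothetical, not a real array flag, and simply forces results back to the input's type.)

```python
import numpy as np

def manual(ufunc, a):
    """'Manual on': apply the ufunc, then return to the input's own dtype."""
    a = np.asarray(a)
    return ufunc(a).astype(a.dtype)

x = np.arange(4, dtype=np.int16)
print(manual(np.sqrt, x))  # [0 1 1 1]: same type in, same type out (truncated)
```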
participants (7)

- Arnd Baecker
- Charles R Harris
- Ed Schofield
- Fernando Perez
- Robert Cimrman
- Stephen Walton
- Travis Oliphant