[Numpy-discussion] CASTABLE flag
Scott Ransom
sransom at nrao.edu
Mon Jan 7 14:51:21 EST 2008
On Monday 07 January 2008 02:13:56 pm Charles R Harris wrote:
> On Jan 7, 2008 12:00 PM, Travis E. Oliphant <oliphant at enthought.com> wrote:
> > Charles R Harris wrote:
> > > Hi All,
> > >
> > > I'm thinking that one way to make the automatic type conversion a
> > > bit safer to use would be to add a CASTABLE flag to arrays. Then
> > > we could write something like
> > >
> > > a[...] = typecast(b)
> > >
> > > where typecast returns a view of b with the CASTABLE flag set so
> > > that the assignment operator can check whether to implement the
> > > current behavior or to raise an error. Maybe this could even be
> > > part of the dtype scalar, although that might cause a lot of
> > > problems with the current default conversions. What do folks
> > > think?
> >
> > That is an interesting approach. The issue raised earlier, of having
> > to convert lines of code that currently work and rely on implicit
> > casting (like ndimage), would still be there, but it would not cause
> > unnecessary data copying, and it would help with this complaint
> > (which I've heard before and have some sympathy towards).
> >
> > I'm intrigued.
>
> Maybe we could also set a global flag, typesafe, that in the current
> Numpy version would default to false, giving current behavior, but
> could be set true to get the new behavior. Then when Numpy 1.1 comes
> out we could make the global default true. That way folks could keep
> the current code working until they are ready to use the typesafe
> feature.
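To make the proposal concrete, here is a rough, hypothetical sketch of how such a typecast wrapper could behave. Neither `typecast` nor a CASTABLE flag exists in NumPy; every name below is invented for illustration, and the "is this cast safe?" check is approximated with the real np.can_cast:

```python
import numpy as np

# Hypothetical sketch only: NumPy has no `typecast` or CASTABLE flag.
# The wrapper marks an array as "okay to downcast"; safe_assign refuses
# lossy (non-same-kind) casts unless the source was wrapped.

class _Castable:
    """Wraps an array the user has explicitly allowed to be downcast."""
    def __init__(self, arr):
        self.arr = np.asarray(arr)

def typecast(arr):
    return _Castable(arr)

def safe_assign(dest, src):
    """Assign src into dest, raising on implicit lossy casts."""
    if isinstance(src, _Castable):
        dest[...] = src.arr          # user opted in: allow the cast
        return
    src = np.asarray(src)
    if not np.can_cast(src.dtype, dest.dtype, casting='same_kind'):
        raise TypeError(f"implicit cast from {src.dtype} to {dest.dtype}")
    dest[...] = src

a = np.zeros(3, dtype=np.int64)
safe_assign(a, typecast([0.5, 1.5, 2.5]))   # explicit: truncates to ints
print(a)                                    # [0 1 2]
```

Unwrapped assignment of the same float data would raise, which is the "typesafe" behavior being discussed.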
I'm a bit confused as to which types of casting you are proposing to
change. As several people have pointed out, users very often _want_ to
"lose information". And as I pointed out, that is one of the reasons we
are all using numpy today as opposed to Numeric!
I'd bet that the vast majority of the people on this list believe that
the OP's problem of complex numbers being silently cast to floats is a
real problem. Fine. We should be able to raise an exception in that
case.
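For what it's worth, NumPy's existing np.can_cast already distinguishes that lossy complex-to-float case from the two cases below; a check along these lines (just a sketch of how such an exception could be backed) works today:

```python
import numpy as np

# complex -> float drops the imaginary part and is not a "same kind" cast:
print(np.can_cast(np.complex128, np.float64, casting='same_kind'))  # False
# float64 -> float32 (case 1 below) stays within the float kind:
print(np.can_cast(np.float64, np.float32, casting='same_kind'))     # True
# float64 -> int64 (case 2 below) crosses kinds as well:
print(np.can_cast(np.float64, np.int64, casting='same_kind'))       # False
```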
However, two other very common cases of "lost information" are not
obviously problems; for many of us they are the _preferred_ actions.
The examples are:
1. Performing floating-point math in higher precision, but casting to a
lower-precision float when that is the type of the array on the lhs of
the assignment. For example:
In [22]: a = arange(5, dtype=float32)
In [23]: a += arange(5.0)
In [24]: a
Out[24]: array([ 0., 2., 4., 6., 8.], dtype=float32)
To me, that is fantastic. I've obviously explicitly requested that I
want "a" to hold 32-bit floats. And if I'm careful and use in-place
math, I get 32-bit floats at the end (and no problems with large
temporaries or memory doubling by an automatic cast to float64).
In [25]: a = a + arange(5.0)
In [26]: a
Out[26]: array([ 0., 3., 6., 9., 12.])
In this case, "a" is rebound from 32 bits to 64 bits because I'm not
using in-place math. The temporary array created on the rhs determines
the type of the new assignment. Once again, I think this is good.
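The two transcripts above condense into a small runnable script (plain numpy, nothing assumed beyond what the sessions show):

```python
import numpy as np

a = np.arange(5, dtype=np.float32)
a += np.arange(5.0)            # float64 rhs cast down into the float32 lhs
print(a.dtype)                 # float32
print(a)                       # [0. 2. 4. 6. 8.]

a = a + np.arange(5.0)         # not in-place: a new array is created,
print(a.dtype)                 # and its dtype is float64
print(a)                       # [ 0.  3.  6.  9. 12.]
```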
2. Similarly, if I want to stuff floats into an int array:
In [28]: a
Out[28]: array([0, 1, 2, 3, 4])
In [29]: a += 2.5
In [30]: a
Out[30]: array([2, 3, 4, 5, 6])
Here, I get C-like truncation/casting of my originally integer array
because I'm using in-place math. This is often a very useful behavior.
In [31]: a = a + 2.5
In [32]: a
Out[32]: array([ 4.5, 5.5, 6.5, 7.5, 8.5])
But here, without the in-place math, "a" gets converted to doubles.
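As a runnable sketch of case 2: note that later NumPy releases did turn the silent in-place int-plus-float into an error, so reproducing the truncating transcript above now requires an explicit casting='unsafe'; the non-in-place form is unchanged:

```python
import numpy as np

a = np.arange(5)                     # int array: [0 1 2 3 4]
# In the NumPy of this thread, `a += 2.5` truncated back to int; modern
# NumPy raises instead, so the old behavior needs an explicit unsafe cast:
np.add(a, 2.5, out=a, casting='unsafe')
print(a)                             # [2 3 4 5 6]

b = a + 2.5                          # not in-place: new float64 array
print(b)                             # [4.5 5.5 6.5 7.5 8.5]
```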
I can certainly say that in my code (which is used by a fair number of
people in my field), each of these use cases is common. And I think
they are among the _strengths_ of numpy.
I will be very disappointed if this default behavior changes.
Scott
--
Scott M. Ransom             Address: NRAO
Phone: (434) 296-0320                520 Edgemont Rd.
email: sransom at nrao.edu            Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989