Hi all,
So one of the things exposed in the numpy namespace are objects called np.int, np.float, np.bool, etc.
These are commonly used -- in fact, just yesterday on another project I saw a senior person reviewing a pull request instruct a more junior person that they should use np.float instead of float or np.float64. But AFAICT everyone who is actually using them is doing this based on a very easy-to-fall-for misconception, i.e., that these objects have something to do with numpy.
In fact they are just aliases for the regular builtin Python types: 'int', 'float', 'bool', etc. NumPy *does have* special numpy-specific types -- but these are called np.int_, np.float_, np.bool_, etc.
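A minimal sketch of what NumPy does when it is handed a builtin Python type as a dtype. It assumes only that NumPy is installed, and it deliberately avoids touching np.int / np.float themselves so that it does not depend on whether a given release exposes them. The variable names are illustrative.

```python
import numpy as np

# A builtin Python type passed as dtype is translated to a NumPy type.
float_dtype = np.dtype(float)
int_dtype = np.dtype(int)
bool_dtype = np.dtype(bool)

print(float_dtype)  # float64
print(int_dtype)    # int32 or int64, depending on platform and NumPy version
print(bool_dtype)   # bool

# Elements of such an array are NumPy scalars, not plain Python floats.
element = np.zeros(1, dtype=float)[0]
print(type(element) is float)       # False
print(type(element) is np.float64)  # True
```

So writing dtype=float never stores Python float objects; the builtin only acts as a spelling for one of NumPy's own types.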
Apparently they were set up this way back in numpy 0.something, as a backwards compatibility (!) hack: https://github.com/numpy/numpy/pull/6103#issuecomment-123801937
Now, 10+ years later, they continue to confuse people on a regular, ongoing basis, and new users are still being taught misleading "facts" about them. I suggest that we should deprecate them, with no fixed schedule for actually removing them. (I have no idea if/when people will actually stop using them to the point that we can get away with removing them entirely, but in the meantime we should at least be publicizing that any code which is using them is almost certainly based on a misunderstanding.)
The technical challenge here is that historically it has simply been impossible to deprecate a global constant like this without using version-specific hacks or accepting unacceptable slowdowns on every attribute access. But, python 3.5 finally adds the necessary machinery to do this in a future-proof way, so now it can be done safely across all versions of Python that we care about, including future unreleased versions: https://github.com/njsmith/metamodule/
Hence: https://github.com/numpy/numpy/pull/6103
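For readers who want to see the general idea, here is a rough sketch of deprecating a module-level name by swapping the module's class, which Python 3.5 and later allow. This is not metamodule's actual code: DeprecatingModule, the _deprecated table, and the stand-in module "demo" are all hypothetical names made up for the illustration.

```python
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Module subclass that warns when selected attributes are looked up."""

    # Hypothetical table: deprecated name -> (value, warning message).
    _deprecated = {}

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so ordinary attributes
        # are not slowed down.
        try:
            value, message = type(self)._deprecated[name]
        except KeyError:
            raise AttributeError(name) from None
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        return value


# Build a stand-in module instead of touching a real package.
demo = types.ModuleType("demo")
demo.pi = 3.14159
DeprecatingModule._deprecated["float"] = (
    float,
    "demo.float is just the builtin float",
)
demo.__class__ = DeprecatingModule  # assignable on modules since Python 3.5

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    alias = demo.float  # triggers the DeprecationWarning
```

Because the deprecated name is not stored in the module's own dictionary, only lookups of that name pay the extra cost.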
Thoughts?
-n
P.S.: using metamodule.py also gives us the option of making np.testing lazily imported, which last time this came up was benchmarked to improve numpy's import speed by ~35% [1] -- not too bad given that most production code will never touch np.testing. But this is just a teaser postscript; I'm not proposing that we actually do this at this time :-).
[1] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063147.html
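The lazy-import idea from the postscript can be sketched with the same module-subclass trick. This is only an illustration under stated assumptions: LazyAttrModule and the stand-in package "pkg" are hypothetical, and the standard library's unittest is used in place of np.testing so the example does not depend on NumPy's layout.

```python
import importlib
import types


class LazyAttrModule(types.ModuleType):
    """Module subclass that imports selected submodules on first access."""

    # Hypothetical mapping: attribute name -> module to import on first use.
    _lazy = {"testing": "unittest"}

    def __getattr__(self, name):
        try:
            target = type(self)._lazy[name]
        except KeyError:
            raise AttributeError(name) from None
        module = importlib.import_module(target)
        # Cache on the instance so __getattr__ is not hit again.
        setattr(self, name, module)
        return module


pkg = types.ModuleType("pkg")
pkg.__class__ = LazyAttrModule

loaded_before = "testing" in vars(pkg)  # nothing imported yet
testing = pkg.testing                   # first access performs the import
loaded_after = "testing" in vars(pkg)   # now cached on the module
```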
On 07/23/2015 04:29 AM, Nathaniel Smith wrote:
Hi all,
So one of the things exposed in the numpy namespace are objects called np.int, np.float, np.bool, etc.
These are commonly used -- in fact, just yesterday on another project I saw a senior person reviewing a pull request instruct a more junior person that they should use np.float instead of float or np.float64. But AFAICT everyone who is actually using them is doing this based on a very easy-to-fall-for misconception, i.e., that these objects have something to do with numpy.
I don't see the issue. They are just aliases so how is np.float worse than just float? To me this does not seem worth the bother of deprecation. An argument could be made for deprecating creating dtypes from python builtin types as they are ambiguous (C float != python float) and platform dependent. E.g. dtype=int is just an endless source of bugs. But this is also so invasive that the deprecation would never be completed and just be a bother to everyone.
So -1 from me.
P.S.: using metamodule.py also gives us the option of making np.testing lazily imported, which last time this came up was benchmarked to improve numpy's import speed by ~35% [1] -- not too bad given that most production code will never touch np.testing. But this is just a teaser postscript; I'm not proposing that we actually do this at this time :-).
[1] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063147.html
I doubt these numbers from 2012 are still correct. When the import was last profiled last year, there were two main offenders, add_docs and np.polynomial. Both have been fixed in 1.9. I don't recall np.testing even showing up.
Julian Taylor jtaylor.debian@googlemail.com wrote:
I don't see the issue. They are just aliases so how is np.float worse than just float?
I have burned my fingers on it.
Since np.double is a C double I assumed np.float is a C float. It is not.
np.int has the same problem by being a C long. Pure evil. Most users of NumPy probably expect the np.foobar dtype to map to the corresponding foobar C type. This is actually inconsistent and plain dangerous.
It would be much better if dtype=float meant Python float, dtype=np.float meant C float, dtype=int meant Python int, and dtype=np.int meant C int.
Sturla
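A quick way to check which C type a dtype spelling actually lines up with is to compare item sizes against ctypes. This is a sketch, assuming NumPy is installed; it uses the builtin float rather than np.float (the two are the same object wherever np.float exists), and the helper name itemsize is made up here.

```python
import ctypes
import numpy as np


def itemsize(dtype_like):
    """Size in bytes of one element of the given dtype."""
    return np.dtype(dtype_like).itemsize


# dtype=float follows the Python float, which is a C double.
float_matches_c_double = itemsize(float) == ctypes.sizeof(ctypes.c_double)
float_matches_c_float = itemsize(float) == ctypes.sizeof(ctypes.c_float)

# The names that really track C float and C int.
single_matches_c_float = itemsize(np.single) == ctypes.sizeof(ctypes.c_float)
intc_matches_c_int = itemsize(np.intc) == ctypes.sizeof(ctypes.c_int)

print(float_matches_c_double)  # True
print(float_matches_c_float)   # False
```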
On Fri, Jul 24, 2015 at 10:03 AM, Sturla Molden sturla.molden@gmail.com wrote:
I don't see the issue. They are just aliases so how is np.float worse than just float?
I have burned my fingers on it.
I must have too -- but I don't recall, because I am VERY careful about not using np.float, np.int, etc... but I do have to constantly evangelize and correct code others put in my code base.
This really is very, very, ugly.
we get away with np.float, because every OS/compiler that gets any regular use has np.float == a c double, which is always 64 bit.
but, as Sturla points out, np.int being a C long is a disaster!
So +inf on deprecating this, though I have no opinion about the mechanism.
And sadly, it will be a long time before we can actually remove them, so the evangelizing and code reviews will need to continue for a long time...
-Chris
NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Chris Barker chris.barker@noaa.gov wrote:
we get away with np.float, because every OS/compiler that gets any regular use has np.float == a c double, which is always 64 bit.
Not if we are passing an array of np.float to a C routine that expects float*, e.g. in OpenGL, BLAS or LAPACK. That will for sure give crazy results, just hang, or segfault.
I got away with posting a PR with a "bugfix" which supposedly should fix a case of precision loss in a SciPy routine, because I thought np.float was np.float32 and not np.float64 (which it is). But it did make me feel rather stupid.
Sturla
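To make the float* failure mode concrete, here is a small standard-library sketch that reinterprets the bytes of two 64-bit doubles as 32-bit floats, which is roughly what a C routine expecting float* would read. It fixes little-endian byte order so the result is the same everywhere; no real BLAS or OpenGL call is involved.

```python
import struct

values = [1.0, 2.0]

# 16 bytes holding two little-endian float64 values.
raw = struct.pack("<2d", *values)

# What a consumer sees if it treats the same bytes as float32 data.
as_floats = struct.unpack("<4f", raw)
print(as_floats)  # (0.0, 1.875, 0.0, 2.0)

# Converting explicitly to single precision gives the intended data.
converted = struct.unpack("<2f", struct.pack("<2f", *values))
print(converted)  # (1.0, 2.0)
```

The reader sees twice as many elements with unrelated values, and no error is raised at any point.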
On Sun, Jul 26, 2015 at 11:19 AM, Sturla Molden sturla.molden@gmail.com wrote:
we get away with np.float, because every OS/compiler that gets any regular use has np.float == a c double, which is always 64 bit.
Not if we are passing an array of np.float to a C routine that expects float*, e.g. in OpenGL, BLAS or LAPACK. That will for sure give crazy results, just hang, or segfault.
well, yes, it is confusing, but at least consistent. So if you use it once correctly in your Python-C transition code, it should work the same way everywhere. As opposed to a np.int which is a python int, which is (if I have this right):
32 bits on all (most) 32 bit platforms
64 bits on 64 bit Linux and OS-X
32 bits on 64 bit Windows (also if compiled by cygwin??)
And who knows on a Cray or ARM, or???
Ouch!!!
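The widths listed above can be checked on any given machine with ctypes from the standard library. A small sketch; the dictionary name sizes is just for illustration, and the printed numbers depend on the platform it runs on.

```python
import ctypes

# Sizes in bytes of the native C types on the current platform.
sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
    "void*": ctypes.sizeof(ctypes.c_void_p),
}

for name, nbytes in sizes.items():
    print(f"{name:10s} {nbytes * 8} bits")
```

On 64 bit Linux and OS-X this typically reports a 64 bit long, while on 64 bit Windows with MSVC the long stays at 32 bits even though pointers are 64 bits.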
Anyway -- we agree on this -- having the python types in the numpy namespace is confusing and dangerous -- even if it will take forever to deprecate them!
-CHB
Chris Barker chris.barker@noaa.gov wrote:
32 bits on all (most) 32 bit platforms
64 bits on 64 bit Linux and OS-X
32 bits on 64 bit Windows (also if compiled by cygwin??)
sizeof(long) is 8 on 64-bit Cygwin. This is to make sure it is inconsistent with MSVC and MinGW-w64, and make sure there will always be ABI mismatches unless the header files are modified accordingly.
OTOH, it is the only sane 64-bit compiler on Windows. You can actually take code written for 64 bit Linux or OSX and expect that it will work correctly.
Sturla
Been using numpy in its various forms since like 2005. burned on int, int_ just today with boost.python / ndarray conversions and a number of times before that. intc being C's int!? Didn't even know it existed till today. This isn't the first time, esp with float. Bool is actually expected for me and I'd prefer it stay 1 byte for storage efficiency - I'll use a long if I want it machine word wide.
This really needs changing though. scientific researchers don't catch this subtlety and expect it to be just like the c and matlab types they know a little about. I can't even keep it straight in all circumstances, how can I expect them to? This makes all the newcomers face the same pain and introduce more bugs into otherwise good code.
+1 Change it now like ripping off a bandaid. Match C11/C++11 types and solve much pain past present and future in exchange for a few lashings for the remainder of the year. Thankfully stdint like types have existed for quite some time so protocol descriptions have been correct most of the time.
-Jason
On 31.07.2015 08:24, Jason Newton wrote:
Been using numpy in its various forms since like 2005. burned on int, int_ just today with boost.python / ndarray conversions and a number of times before that. intc being C's int!? Didn't even know it existed till today. This isn't the first time, esp with float. Bool is actually expected for me and I'd prefer it stay 1 byte for storage efficiency - I'll use a long if I want it machine word wide.
A long is only machine word wide on posix, on Windows it's not. This nonsense is unfortunately also in numpy. It also affects dtype=int. The correct word size type is actually np.intp.
btw. if something needs deprecating it is np.float128, this is the most confusing type name in all of numpy as its precision is actually 80 bit in most cases (x86), 64 bit sometimes (arm) and very rarely actually 128 bit (sparc).
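The gap between the name and the real precision can be inspected with np.finfo. A sketch, assuming NumPy is installed; it goes through np.longdouble because np.float128 is not defined on every platform, and the output differs from machine to machine.

```python
import numpy as np

info = np.finfo(np.longdouble)

# Bits occupied in memory versus bits that actually carry the mantissa.
storage_bits = np.dtype(np.longdouble).itemsize * 8
mantissa_bits = info.nmant

print("dtype name:   ", np.dtype(np.longdouble).name)
print("storage bits: ", storage_bits)
print("mantissa bits:", mantissa_bits)
```

On a typical x86-64 Linux build the storage is 128 bits while the mantissa has 63 stored bits, i.e. the 80 bit extended format padded out to 16 bytes.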
On 31/07/15 09:38, Julian Taylor wrote:
A long is only machine word wide on posix, on Windows it's not.
Actually it is the opposite. A pointer is 64 bit on AMD64, but the native integer and pointer offset is only 32 bit. But it does not matter because it is int that should be machine word sized, not long, which it is on both platforms.
Sturla
On Sun, Aug 2, 2015 at 5:13 AM, Sturla Molden sturla.molden@gmail.com wrote:
A long is only machine word wide on posix, on Windows it's not.
Actually it is the opposite. A pointer is 64 bit on AMD64, but the native integer and pointer offset is only 32 bit. But it does not matter because it is int that should be machine word sized, not long, which it is on both platforms.
All this illustrates that there is a lot of platform dependence and complexity to the "standard" C types.
I suppose it's a good thing -- you can use something like "int" in C code, and presto! more precision in the future when you re-compile on a newer system.
However, for any code that needs some kind of binary compatibility between systems (or is dynamic, like python -- i.e. types are declared at run-time, not compile time), the "fixed width" types are a lot safer (or at least easier to reason about). So we have two issues with numpy:
1) confusing python types with C types -- e.g. np.int is currently a python integer, NOT a C int -- I think this is a little too confusing, and should be deprecated. (and np.long -- even more confusing!!!)
2) The vagaries of the standard C types: int, long, etc (spelled np.intc, which is an int32 on my machine, anyway) [NOTE: is there a C long dtype? I can't find it at the moment...]
It's probably a good idea to keep these, particularly for interfacing with C code (like my example of calling C code that uses int). Though it would be good to make sure the docstrings make it clear what they are.
However, I'd like to see a recommended practice of using sized types wherever you can:
uint8
int32
float32
float64
etc....
not sure how to propagate that practice, but I'd love to see it become common.
Should we add aliases for the stdint names?
np.int_32_t, etc???
might be good to adhere to an established standard.
-CHB
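The difference between sized and platform-dependent types can be seen without NumPy, using the struct module: its standard mode ("<") has fixed sizes, while native mode ("@") follows the local C compiler. A sketch; the dictionary names are illustrative.

```python
import struct

# Standard mode: sizes are fixed by the struct module on every platform.
# b=int8, h=int16, i=int32, q=int64, f=float32, d=float64
fixed = {code: struct.calcsize("<" + code) for code in "bhiqfd"}

# Native mode: sizes follow the platform's C short, int, long, long long.
native = {code: struct.calcsize("@" + code) for code in "hilq"}

print(fixed)   # {'b': 1, 'h': 2, 'i': 4, 'q': 8, 'f': 4, 'd': 8}
print(native)  # the 'l' entry is 4 or 8 depending on the platform
```

Data described with the sized codes has the same layout everywhere, which is the property the sized dtypes give to arrays.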
On 08/03/2015 12:25 PM, Chris Barker wrote:
- The vagaries of the standard C types: int, long, etc (spelled np.intc, which is an int32 on my machine, anyway) [NOTE: is there a C long dtype? I can't find it at the moment...]
Numpy does define "the platform dependent C integer types short, long, longlong and their unsigned versions" according to the docs. size_t is the same size as intc.
Even though float and double are virtually always IEEE single and double precision, maybe for consistency we should also define np.floatc, np.doublec and np.longdoublec?
Allan
On 03/08/15 18:25, Chris Barker wrote:
- The vagaries of the standard C types: int, long, etc (spelled np.intc, which is an int32 on my machine, anyway) [NOTE: is there a C long dtype? I can't find it at the moment...]
There is, it is called np.int. This just illustrates the problem...
Sturla
On Mon, Aug 3, 2015 at 11:05 AM, Sturla Molden sturla.molden@gmail.com wrote:
On 03/08/15 18:25, Chris Barker wrote:
[NOTE: is there a C long dtype? I can't find it at the moment...]
There is, it is called np.int.
well, IIUC, np.int is the python integer type, which is a C long in all the implementations of cPython that I know about -- but is that a guarantee? in the future as well? For instance, if it were up to me, I'd use an int_64_t on all 64 bit platforms, rather than having that odd 32 bit on Windows, 64 bit on *nix silliness....
This just illustrates the problem...
So another minor proposal: add a numpy.longc type, which would be platform C long. (and probably just an alias to something already there).
-Chris
On 03/08/15 20:51, Chris Barker wrote:
well, IIUC, np.int is the python integer type, which is a C long in all the implementations of cPython that I know about -- but is that a guarantee? in the future as well?
It is a Python int on Python 2.
On Python 3 dtype=np.int means the dtype will be C long, because a Python int has no size limit. But np.int aliases Python int. And creating an array with dtype=int therefore does not create an array of Python int, it creates an array of C long. To actually get dtype=int we have to write dtype=object, which is just crazy.
Sturla
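The mismatch between an unbounded Python int and a fixed-width C long can be shown with ctypes alone. A sketch; ctypes integer types do no overflow checking, so the value silently wraps. The variable names are illustrative.

```python
import ctypes

# A Python 3 int has no size limit.
big = 2 ** 100
print(big.bit_length())  # 101

# A C long is 32 or 64 bits wide, depending on the platform.
long_bits = ctypes.sizeof(ctypes.c_long) * 8
max_long = 2 ** (long_bits - 1) - 1

# One past the maximum wraps around to the most negative value.
wrapped = ctypes.c_long(max_long + 1).value
print(wrapped == -(max_long + 1))  # True
```

An array of C long therefore cannot represent every Python int, which is why an object array is the only way to store true Python ints.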
On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote:
On 03/08/15 20:51, Chris Barker wrote:
well, IIUC, np.int is the python integer type, which is a C long in all the implementations of cPython that I know about -- but is that a guarantee? in the future as well?
It is a Python int on Python 2.
On Python 3 dtype=np.int means the dtype will be C long, because a Python int has no size limit. But np.int aliases Python int. And creating an array with dtype=int therefore does not create an array of Python int, it creates an array of C long. To actually get dtype=int we have to write dtype=object, which is just crazy.
It seems there may be a few half-truths flying around in this thread; see http://docs.scipy.org/doc/numpy/user/basics.types.html
and also note the sentence below the table (maybe the table should also note these):
Additionally to intc the platform dependent C integer types short, long, longlong and their unsigned versions are defined.
- Sebastian
On Tue, Aug 4, 2015 at 4:39 AM, Sebastian Berg sebastian@sipsolutions.net wrote:
On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote:
On 03/08/15 20:51, Chris Barker wrote:
well, IIUC, np.int is the python integer type, which is a C long in all the implementations of cPython that I know about -- but is that a guarantee? in the future as well?
It is a Python int on Python 2.
On Python 3 dtype=np.int means the dtype will be C long, because a Python int has no size limit. But np.int aliases Python int. And creating an array with dtype=int therefore does not create an array of Python int, it creates an array of C long. To actually get dtype=int we have to write dtype=object, which is just crazy.
It seems there may be a few half-truths flying around in this thread; see http://docs.scipy.org/doc/numpy/user/basics.types.html
Quote:
"Note that, above, we use the *Python* float object as a dtype. NumPy knows that int refers to np.int_, bool means np.bool_, that float is np.float_ and complex is np.complex_. The other data-types do not have Python equivalents."
Is there a conflict with the current thread?
Josef (I'm not a C person, so most of this is outside my scope, except for watching bugfixes to make older code work for larger datasets. Use `intp`, Luke.)
On Di, 2015-08-04 at 05:57 -0400, josef.pktd@gmail.com wrote:
On Tue, Aug 4, 2015 at 4:39 AM, Sebastian Berg sebastian@sipsolutions.net wrote:
On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote:
On 03/08/15 20:51, Chris Barker wrote:
well, IIUC, np.int is the python integer type, which is a C long in all the implementations of cPython that I know about -- but is that a guarantee? in the future as well?
It is a Python int on Python 2.
On Python 3 dtype=np.int means the dtype will be C long, because a Python int has no size limit. But np.int aliases Python int. And creating an array with dtype=int therefore does not create an array of Python int, it creates an array of C long. To actually get dtype=int we have to write dtype=object, which is just crazy.
It seems there may be a few half-truths flying around in this thread; see http://docs.scipy.org/doc/numpy/user/basics.types.html
Quote:
"Note that, above, we use the Python float object as a dtype. NumPy knows that int refers to np.int_, bool means np.bool_, that float is np.float_ and complex is np.complex_. The other data-types do not have Python equivalents."
Is there a conflict with the current thread?
No, but I had the impression that the C compatible type names "short", "cint", "long", etc. were forgotten.
Josef
(I'm not a C person, so most of this is outside my scope, except for watching bugfixes to make older code work for larger datasets. Use `intp`, Luke.)
On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote:
On 03/08/15 20:51, Chris Barker wrote:
well, IIUC, np.int is the python integer type, which is a C long in all the implementations of cPython that I know about -- but is that a guarantee? in the future as well?
It is a Python int on Python 2.
On Python 3 dtype=np.int means the dtype will be C long, because a Python int has no size limit. But np.int aliases Python int. And creating an array with dtype=int therefore does not create an array of Python int, it creates an array of C long. To actually get dtype=int we have to write dtype=object, which is just crazy.
PS: I guess longdouble/complexlongdouble (and its floatXXX variants) are missing. And it might be a good place to note that floatXXX is not IEEE floatXXX.
On Thu, Jul 30, 2015 at 11:24 PM, Jason Newton nevion@gmail.com wrote:
This really needs changing though. scientific researchers don't catch this subtlety and expect it to be just like the c and matlab types they know a little about.
well, C types are a %&$ nightmare as well! In fact, one of the biggest issues comes from cPython's use of a C "long" for an integer -- which is not clearly defined. If you are writing code that needs any kind of binary compatibility, cross platform compatibility, and particularly if you want to be able to distribute pre-compiled binaries of extensions, etc, then you'd better use well-defined types.
numpy has had well-defined types for ages, but it is a shame that it's so easy to use the poorly-defined ones.
I can't even keep it straight in all circumstances, how can I expect them to? This makes all the newcomers face the same pain and introduce more bugs into otherwise good code.
indeed.
+1 Change it now like ripping off a bandaid. Match C11/C++11 types and solve much pain past present and future in exchange for a few lashings for the remainder of the year.
Sorry -- I'm not sure what C11 types are -- is "int", "long", etc, deprecated? If so, then yes.
What about Fortran -- I've been out of that loop for ages -- does semi-modern Fortran use well defined integer types?
Is it possible to deprecate a bunch of the built-in numpy dtypes? Without annoying the heck out of everyone -- because there is a LOT of code out there that just uses np.float, np.int, etc.....
An argument could be made for deprecating creating dtypes from python builtin types as they are ambiguous (C float != python float) and platform dependent. E.g. dtype=int is just an endless source of bugs. But this is also so invasive that the deprecation would never be completed and just be a bother to everyone.
yeah, that is a big concern. :-(
-Chris
Chris Barker chris.barker@noaa.gov wrote:
What about Fortran -- I've been out of that loop for ages -- does semi-modern Fortran use well defined integer types?
Modern Fortran is completely sane.
INTEGER without kind number (Fortran 77) is the fastest integer on the CPU. On AMD64 that is 32 bit, because it is designed to use a 64 bit pointer with a 32 bit offset. (That is also why Microsoft decided to use a 32 bit long, because it by definition is the fastest integer of at least 32 bits. One can actually claim that the C standard is violated with a 64 bit long on AMD64.) Because of this we use a 32 bit integer in BLAS and LAPACK linked to NumPy and SciPy.
The function KIND (Fortran 90) allows us to query the kind number of a given variable, e.g. to find out the size of INTEGER and REAL.
The function SELECTED_INT_KIND (Fortran 90) returns the kind number of smallest integer with a specified range.
The function SELECTED_REAL_KIND (Fortran 90) returns the kind number of smallest float with a given range and precision. The returned kind number can be used for REAL and COMPLEX.
KIND, SELECTED_INT_KIND and SELECTED_REAL_KIND will all return compile-time constants, and can be used to declare other variables if the return value is stored in a variable with the attribute PARAMETER. This allows the programmer to get the REAL, COMPLEX or INTEGER the algorithm needs numerically, without thinking about how big they need to be in bits.
ISO_C_BINDING is a Fortran 2003 module which contains kind numbers corresponding to all C types, including size_t and void*, C structs, an attribute for using pass-by-value semantics, controlling the C name to avoid name mangling, as well as functions for converting between C and Fortran pointers. It allows portable interop between C and Fortran (either calling C from Fortran or calling Fortran from C).
ISO_FORTRAN_ENV is a Fortran 2003 and 2008 module. In F2003 it contains kind numbers for integers with specified size: INT8, INT16, INT32, and INT64. In F2008 it also contains kind numbers for IEEE floating point types: REAL32, REAL64, and REAL128. The kind numbers for floating point types can also be used to declare complex numbers.
So with modern Fortran we have a completely portable and unambiguous type system.
C11/C++11 is sane as well, but not quite as sane as that of modern Fortran.
Sturla
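For readers who do not know Fortran, the idea behind SELECTED_INT_KIND can be mimicked in a few lines of Python: ask for a decimal range and get back the smallest integer that covers it. This is only an analogy under stated assumptions. The function name selected_int_bits is made up, it returns a width in bits rather than a Fortran kind number, and it assumes the candidate widths are 8, 16, 32 and 64.

```python
def selected_int_bits(r):
    """Smallest width in bits holding every n with -10**r < n < 10**r.

    Returns -1 when none of the assumed widths (8, 16, 32, 64) is large
    enough, mirroring how SELECTED_INT_KIND reports an unavailable kind.
    """
    largest_needed = 10 ** r - 1
    for bits in (8, 16, 32, 64):
        if largest_needed <= 2 ** (bits - 1) - 1:
            return bits
    return -1


print(selected_int_bits(2))   # 8
print(selected_int_bits(9))   # 32
print(selected_int_bits(10))  # 64
print(selected_int_bits(19))  # -1
```

The point is that the caller states the numeric range the algorithm needs and never writes a platform-specific type name.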
So one more bit of anecdotal evidence:
I just today revived some Cython code I wrote a couple years ago and haven't tested since.
It wraps a C library that uses a lot of "int" typed values.
Turns out I was passing in numpy arrays that I had typed as "np.int". It worked OK two years ago when I was testing only on 32 bit pythons, but today I got a bunch of failed tests on 64 bit OS-X -- a np.int is now a C long!
I really thought I knew better, even a couple years ago, but I guess it's just too easy to slip up there.
Yeah to Cython for keeping types straight (I got a run-time error). And Yeah to me for having at least some basic tests.
But Boo to numpy for a very easy to confuse type API.
-Chris
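The kind of run-time check that catches this mistake can be sketched with the buffer protocol from the standard library. This is not Cython's actual implementation, only an illustration of the idea; require_c_int_buffer is a hypothetical helper, and array.array stands in for a NumPy array.

```python
import array
import ctypes


def require_c_int_buffer(buf):
    """Return a memoryview of buf, or raise TypeError unless it holds C ints."""
    view = memoryview(buf)
    if view.format != "i" or view.itemsize != ctypes.sizeof(ctypes.c_int):
        raise TypeError(
            "expected a buffer of C int, got format %r with itemsize %d"
            % (view.format, view.itemsize)
        )
    return view


# A buffer of C int is accepted.
ok = require_c_int_buffer(array.array("i", [1, 2, 3]))

# A buffer of C long long is rejected instead of being misread.
try:
    require_c_int_buffer(array.array("q", [1, 2, 3]))
    rejected = False
except TypeError:
    rejected = True
```

Checking the declared element type, and not just the byte size, also catches the case where two different C types happen to have the same width on one platform.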
Chris Barker - NOAA Federal chris.barker@noaa.gov wrote:
Turns out I was passing in numpy arrays that I had typed as "np.int". It worked OK two years ago when I was testing only on 32 bit pythons, but today I got a bunch of failed tests on 64 bit OS-X -- a np.int is now a C long!
It has always been C long. It is the C long that varies between platforms.
Sturla
Turns out I was passing in numpy arrays that I had typed as "np.int". It worked OK two years ago when I was testing only on 32 bit pythons, but today I got a bunch of failed tests on 64 bit OS-X -- a np.int is now a C long!
It has always been C long. It is the C long that varies between platforms.
Of course, it's that a c long was a c int on the platform I wrote the code on the first time.
Which is part of the problem with C -- if two types happen to be the same, the compiler is perfectly happy. But that was an error in the first place, it never should have passed.
But that's just me.
;-)
Anyway, as far as concrete proposals go. I say we deprecate the Python types in the numpy namespace (i.e int and float)
Other than that, I'm not sure there's any problem.
-Chris
Chris Barker - NOAA Federal chris.barker@noaa.gov wrote:
Which is part of the problem with C -- if two types happen to be the same, the compiler is perfectly happy.
That int and long int are the same is no more problematic than int and signed int being the same.
Sturla
--
Kind regards Nick Papior
On 31 Jul 2015 17:53, "Chris Barker" chris.barker@noaa.gov wrote:
On Thu, Jul 30, 2015 at 11:24 PM, Jason Newton nevion@gmail.com wrote:
This really needs changing though. scientific researchers don't catch
this subtlety and expect it to be just like the c and matlab types they know a little about.
well, C types are a %&$ nightmare as well! In fact, one of the biggest
issues comes from cPython's use of a C "long" for an integer -- which is not clearly defined. If you are writing code that needs any kind of binary compatibility, cross platform compatibility, and particularly if you want to be abel to distribute pre-compiled binaries of extensions, etc, then you'd better use well-defined types.
numpy has had well-defined types for ages, but it is a shame that it's so
easy to use the poorly-defined ones.
I can't even keep it straight in all circumstances, how can I expect
them to? This makes all the newcomers face the same pain and introduce more bugs into otherwise good code.
indeed.
+1 Change it now like ripping off a bandaid. Match C11/C++11 types and
solve much pain past present and future in exchange for a few lashings for the remainder of the year.
Sorry -- I'm not sure what C11 types are -- is "int", "long", etc,
deprecated? If so, then yes.
What about Fortran -- I've been out of that loop for ages -- does semi-modern Fortran use well defined integer types?
Yes, this is much like the C equivalent: integer is int, real is float; for long and double, constant castings are needed.
Is it possible to deprecate a bunch of the built-in numpy dtypes? Without annoying the heck out of everyone -- because there is a LOT of code out there that just uses np.float, np.int, etc.....
An argument could be made for deprecating creating dtypes from python builtin types as they are ambiguous (C float != python float) and platform dependent. E.g. dtype=int is just an endless source of bugs. But this is also so invasive that the deprecation would never be completed and just be a bother to everyone.
yeah, that is a big concern. :-(
-Chris
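A small sketch of the dtype=int / dtype=float ambiguity discussed above, assuming NumPy is installed. It only inspects what the builtin types map to and compares that with explicitly sized dtypes; the variable names are illustrative. The width reported for dtype=int should not be relied on, since it depends on the platform and on the NumPy version.

```python
import numpy as np

# Builtin Python types are mapped onto NumPy dtypes when passed as dtype=...
float_dtype = np.dtype(float)   # Python float is a C double, so this is float64
int_dtype = np.dtype(int)       # width depends on platform and NumPy version

print("dtype=float ->", float_dtype)
print("dtype=int   ->", int_dtype, f"({int_dtype.itemsize * 8} bits)")

# Explicitly sized dtypes mean the same thing everywhere.
a = np.zeros(3, dtype=np.int64)
b = np.zeros(3, dtype=np.float32)  # the width of a C "float"
print(a.dtype, a.itemsize, b.dtype, b.itemsize)
```

Note that dtype=float gives the C double width, not the C float width, which is the "C float != python float" trap.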
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
On Fri, Jul 31, 2015 at 5:19 PM, Nick Papior nickpapior@gmail.com wrote:
--
Kind regards Nick Papior
On 31 Jul 2015 17:53, "Chris Barker" chris.barker@noaa.gov wrote:
On Thu, Jul 30, 2015 at 11:24 PM, Jason Newton nevion@gmail.com wrote:
This really needs changing though. scientific researchers don't catch
this subtlety and expect it to be just like the c and matlab types they know a little about.
well, C types are a %&$ nightmare as well! In fact, one of the biggest issues comes from CPython's use of a C "long" for an integer -- which is not clearly defined. If you are writing code that needs any kind of binary compatibility, cross platform compatibility, and particularly if you want to be able to distribute pre-compiled binaries of extensions, etc, then you'd better use well-defined types.
There was some truth to this, but if you, like the majority of scientific researchers, only produce code for x86 or x86_64 on Windows and Linux... as long as you aren't treating pointers as ints, everything behaves in accordance with general expectations. The standards did and still do allow for a bit of flux, but things like OpenCL [ https://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/scalarDataTypes.h... ] made this really strict so we stop writing ifdefs to deal with varying bitwidths and just implement the algorithms - which is typically a researcher’s top priority.
I'd say I use the strongly defined types (e.g. int/float32) whenever doing protocol or communications work - it makes complete sense there. But often for computation, especially when interfacing with C extensions, it makes more sense for the developer to use types/typenames that ought to match 1:1 with C in every case.
-Jason
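As an illustration of the protocol case mentioned above, here is a sketch using the standard-library struct module, which has both flavours: "standard" formats with fixed sizes and "native" formats that follow the local C compiler. The record layout (an int32 id followed by a float32 value) is a made-up example, not any particular protocol.

```python
import struct

# Standard ("<" = little-endian) formats have fixed sizes on every platform.
fixed_i = struct.calcsize("<i")   # 4 bytes
fixed_l = struct.calcsize("<l")   # 4 bytes
fixed_q = struct.calcsize("<q")   # 8 bytes

# Native ("@") formats follow the local C compiler, so "l" is a C long.
native_l = struct.calcsize("@l")

print("fixed i/l/q:", fixed_i, fixed_l, fixed_q)
print("native long:", native_l)

# A hypothetical wire record: int32 id followed by a float32 value.
record = struct.pack("<if", 7, 1.5)
record_id, value = struct.unpack("<if", record)
print(len(record), record_id, value)
```

The fixed formats are the ones to use for anything that crosses a machine boundary; the native ones are the analogue of matching the C types 1:1 when talking to an extension built on the same machine.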
On Jul 24, 2015 08:55, "Julian Taylor" jtaylor.debian@googlemail.com wrote:
On 07/23/2015 04:29 AM, Nathaniel Smith wrote:
Hi all,
So one of the things exposed in the numpy namespace are objects called np.int np.float np.bool etc.
These are commonly used -- in fact, just yesterday on another project I saw a senior person reviewing a pull request instruct a more junior person that they should use np.float instead of float or np.float64. But AFAICT everyone who is actually using them is doing this based on a very easy-to-fall-for misconception, i.e., that these objects have something to do with numpy.
I don't see the issue. They are just aliases so how is np.float worse than just float?
Because np.float systematically confuses people in a way that plain float does not. Which is problematic given that we have a lot of users who aren't expert programmers and are easily confused.
To me this does not seem worth the bother of deprecation. An argument could be made for deprecating creating dtypes from python builtin types as they are ambiguous (C float != python float) and platform dependent. E.g. dtype=int is just an endless source of bugs. But this is also so invasive that the deprecation would never be completed and just be a bother to everyone.
Yeah, I don't see any way to ever make dtype=int an error, though I can see an argument for making it unconditionally int64 or intp. That's a separate discussion... but every step we can make to simplify these names makes it easier to untangle the overall knot, IMHO. (E.g. if people have different expectations about what int and np.int should mean -- as they obviously do -- then changing the meaning of both of them is harder than deprecating one and then changing the other, so this deprecation puts us in a better position even if it doesn't immediately help much.)
So -1 from me.
Do you really mean this as a true veto? While some of the thread has gotten a bit confused about how much of a change we're actually talking about, AFAICT everyone else is very much in favor of this deprecation, including testimony from multiple specific users who have gotten burned.
-n
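For anyone who wants to check the alias claim on their own install, a minimal sketch, assuming NumPy is available. The lookup of np.float is done defensively because, depending on the NumPy release, the name may be a plain alias, may emit a deprecation warning on access, or may be missing altogether; the name `old_alias` is only for illustration.

```python
import warnings
import numpy as np

# The NumPy scalar types are real NumPy classes, distinct from the builtins.
print(np.float64 is float)             # False
print(issubclass(np.float64, float))   # True: float64 subclasses Python float
print(np.bool_ is bool)                # False

# Look the old alias up defensively: it may be missing or warn on access.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    old_alias = getattr(np, "float", None)

if old_alias is None:
    print("np.float is not available in this NumPy")
else:
    print("np.float is the builtin float:", old_alias is float)
```

Where the alias exists, the last line reports that it is the builtin itself, which is why it adds nothing over writing float.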