Hi,

There have been some changes recently in the umath code which break windows 64 compilation, and I don't understand their rationale either. I have myself spent quite a good deal of time to make sure this works on many platforms/toolchains, by fixing the config distutils command and by making sure that platform specificities are contained in a very localized part of the code. It may not be very well documented (see below), but may I ask that next time someone wants to change this file, they ask for review before putting it directly in the trunk?

thanks,

David

How to deal with platform oddities:
-----------------------------------

Basically, the code to replace missing C99 math funcs is, for a hypothetical double foo(double) function:

    #ifndef HAVE_FOO
    #undef foo
    static double npy_foo(double a)
    {
        /* define a npy_foo function with the same requirements as C99 foo */
    }
    #define npy_foo foo
    #else
    double foo(double);
    #endif

I think this code is wrong on several accounts:

- we should not undef foo if foo is available: if foo is available at that point, it is a bug in the configuration, and should not be dealt with in the code. Some cases may be complicated (IEEE754-related functionality, which is sometimes a macro, sometimes a function, etc.), but those should be dealt with as very narrow special cases.
- we should not declare our own functions: function declarations are not portable, and vary among OS/toolchains. Some toolchains use intrinsics, some use non-standard inline mechanisms, etc., which can crash the resulting binary because of a discrepancy between our code's calling conventions and the library's conventions. The reported problem with the VS compiler on amd64 is caused by this exact problem.

Unless there is a strong rationale otherwise, I would like us to follow how "autoconfed" projects do it. They have long experience dealing with platform idiosyncrasies, and the above method is not the one they follow.
They follow the simple:

    #ifndef HAVE_FOO
    /* define foo */
    #endif

And deal with platform oddities in the *configuration* code instead of directly in the source. That really makes my life easier when I deal with windows compilers, which are painful enough to deal with as it is.
On Tue, Dec 16, 2008 at 8:59 PM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Unless there is a strong rationale otherwise, I would like us to follow how "autoconfed" projects do it. They have long experience dealing with platform idiosyncrasies, and the above method is not the one they follow.
Yes, the rationale was to fix compilation on windows 64 with msvc and on etch on SPARC, both of which were working after the changes. You are, of course, free to break these builds again. However, I designated space at the top of the file for compiler/distro specific defines; I think you should use it, there is a reason other folks do. The macro undefs could be moved, but I preferred to generate an error if there was a conflict with the standard C function prototypes. We can't use inline code for these functions as they are passed to the generic loops as function pointers. I assume compilers have some way of recognizing this case and perhaps generating function code on the fly. If so, we need to figure out how to detect that. Chuck
Charles R Harris wrote:
Yes, the rationale was to fix compilation on windows 64 with msvc and on etch on SPARC, both of which were working after the changes.
It does not work at the moment on windows at least :) But more essentially, I don't see why you declared those functions: can you explain your intention? I don't understand the rationale.
You are, of course, free to break these builds again. However, I designated space at the top of the file for compiler/distro specific defines, I think you should use them, there is a reason other folks do.
The problem is twofold:

- by declaring functions everywhere in the code, you are effectively spreading toolchain-specific oddities through the whole source file. This is not good, IMHO: those should be detected at the configuration stage, and dealt with in the source code using that information. That's how every autoconf project does it. If a function is actually a macro, this should be detected at configuration.
- declarations are toolchain specific; it is actually worse, they even depend on the compiler flags. That is at least the case with MS compilers. So there is no way to guarantee that your declaration matches the math runtime's (the compiler crash reported is caused exactly by this).
The macro undef could be moved, but I preferred to generate an error if there was a conflict with the standard C function prototypes.
We can't use inline code for these functions as they are passed to the generic loops as function pointers.
Yes, I believe this is another problem with declaring functions: if we use, say, cosl, and cosl is an inline function in the runtime, then by re-declaring it we are telling the compiler that it is not inline anymore. The compiler then no longer knows that you can't take the address of cosl, unless it detects the mismatch between the runtime declaration and ours and treats it as an error (I am not sure whether this is always an error with MS compilers; it may only be a warning on some versions - it is certainly not handled gracefully every time, since the linker crashes in some cases). David
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
On Tue, Dec 16, 2008 at 10:56 PM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
It does not work at the moment on windows at least :) But more essentially, I don't see why you declared those functions: can you explain your intention? I don't understand the rationale.
The declarations were for the SPARC. Originally I had them up in an ifdef up top, but I got curious what different machines would do. They shouldn't cause a problem unless something is pretty strange. The undefs I put where they are for similar reasons, but there was a strong temptation to move them into the if statement where they used to be. Let's say curiosity got the best of me there. They shouldn't affect anything but macros, and I didn't want the function declarations to be interpreted as macros.
You are, of course, free to break these builds again. However, I designated space at the top of the file for compiler/distro specific defines, I think you should use them, there is a reason other folks do.
The problem is twofold:

- by declaring functions everywhere in the code, you are effectively spreading toolchain-specific oddities through the whole source file. This is not good, IMHO: those should be detected at the configuration stage, and dealt with in the source code using that information. That's how every autoconf project does it. If a function is actually a macro, this should be detected at configuration.
- declarations are toolchain specific; it is actually worse, they even depend on the compiler flags. That is at least the case with MS compilers. So there is no way to guarantee that your declaration matches the math runtime's (the compiler crash reported is caused exactly by this).
Worth knowing ;) It works on the windows buildbot but that is running python 2.4. Speaking of which, the BSD buildbot needs nose (I don't know what happened to it), the windows box is showing the same old permissions problem, and one of the SPARC buildbots just times out unless you build during the right time of day. We are just hobbling along at the moment.
The macro undef could be moved, but I preferred to generate an error if there was a conflict with the standard C function prototypes.
We can't use inline code for these functions as they are passed to the generic loops as function pointers.
Yes, I believe this is another problem with declaring functions: if we use, say, cosl, and cosl is an inline function in the runtime, then by re-declaring it we are telling the compiler that it is not inline anymore. The compiler then no longer knows that you can't take the address of cosl, unless it detects the mismatch between the runtime declaration and ours and treats it as an error (I am not sure whether this is always an error with MS compilers; it may only be a warning on some versions - it is certainly not handled gracefully every time, since the linker crashes in some cases).
Sorry for the late reply, the network was down. Chuck
Charles R Harris wrote:
The declarations were for the SPARC. Originally I had them up in an ifdef up top, but I got curious what different machines would do.
I still don't understand what exact problem they solve. Since the declarations are only present when HAVE_FOO is defined, the only problems I can see are a problem in the detection code or a platform bug (I seem to remember that for SPARC, this was a platform error, right?). In either case, it should be solved elsewhere (at worst, for platform-specific issues, within #if PLATFORM/#endif).
They shouldn't cause a problem unless something is pretty strange.
They do; the default rule should be not to put any external declarations, because they are heavily toolchain/platform specific. I removed a lot of them from the old code when I refactored this code, and putting them back almost totally negates that effort :) To quote python code itself (pyport.h):

/**************************************************************************
Prototypes that are missing from the standard include files on some systems
(and possibly only some versions of such systems.)

Please be conservative with adding new ones, document them and enclose them
in platform-specific #ifdefs.
**************************************************************************/
The undefs I put where they are for similar reasons, but there was a strong temptation to move them into the if statement where they used to be.
Could you be more specific? I want to know the actual error they were solving.
Let's say curiosity got the best of me there. They shouldn't affect anything but macros, and I didn't want the function declarations to be interpreted as macros.
"Shouldn't affect" is not good enough :) The default rule should be to avoid relying on those distinctions at all, and only care when they matter. Doing it the other way around does not work; there will always be some strange platform that breaks most assumptions, as rational as they may be.
Worth knowing ;) It works on the windows buildbot but that is running python 2.4.
Ah, it is 2.4! I was wondering about the exact combination. It does not work with the platform SDK 6.1 (which includes the 64 bits compiler), and this results in a compiler segfault. The problem is particularly pernicious, since the segfault is not seen directly, but put in a temp file which itself causes problems because two processes try to access it... One of the nicest build failures I have ever seen :)
Speaking of which, the BSD buildbot needs nose (I don't know what happened to it), the windows box is showing the same old permissions problem, and one of the SPARC buildbots just times out unless you build during the right time of day. We are just hobbling along at the moment.
Windows problems at least are not specific to the buildbot.
Sorry for the late reply, the network was down.
No problem, David
Hi David,

On Wed, Dec 17, 2008 at 8:58 PM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Charles R Harris wrote:
The declarations were for the SPARC. Originally I had them up in an ifdef up top, but I got curious what different machines would do.
I still don't understand what exact problem they solve. Since the declarations are only present when HAVE_FOO is defined, the only problems I can see are a problem in the detection code or a platform bug (I seem to remember that for SPARC, this was a platform error, right?). In either case, it should be solved elsewhere (at worst, for platform-specific issues, within #if PLATFORM/#endif).
They shouldn't cause a problem unless something is pretty strange.
They do; the default rule should be not to put any external declarations, because they are heavily toolchain/platform specific. I removed a lot of them from the old code when I refactored this code, and putting them back almost totally negates that effort :) To quote python code itself (pyport.h):
/**************************************************************************
Prototypes that are missing from the standard include files on some systems
(and possibly only some versions of such systems.)

Please be conservative with adding new ones, document them and enclose them
in platform-specific #ifdefs.
**************************************************************************/
That's how I did it originally, that's why that section is up top. But I got curious. So that can be fixed.
The undefs I put where they are for similar reasons, but there was a strong temptation to move them into the if statement where they used to be.
Could you be more specific? I want to know the actual error they were solving.
The undefs need to be there when the functions are defined by numpy, so they only need to be in the same #if block as those definitions. I moved them out to cover the function declarations also, but if those are put in their own block for SPARC then they aren't needed.
Let's say curiosity got the best of me there. They shouldn't affect anything but macros, and I didn't want the function declarations to be interpreted as macros.
"Shouldn't affect" is not good enough :) The default rule should be to avoid relying on those distinctions at all, and only care when they matter. Doing it the other way around does not work; there will always be some strange platform that breaks most assumptions, as rational as they may be.
Worth knowing ;) It works on the windows buildbot but that is running python 2.4.
Ah, it is 2.4! I was wondering about the exact combination. It does not work with the platform SDK 6.1 (which includes the 64 bits compiler), and this results in a compiler segfault. The problem is particularly pernicious, since the segfault is not seen directly, but put in a temp file which itself causes problems because two processes try to access it... One of the nicest build failures I have ever seen :)
The window buildbot was working, went off line for a few weeks, and showed failures on return. It is a VMWare version, so maybe something was changed in between.
Speaking of which, the BSD buildbot needs nose (I don't know what happened to it), the windows box is showing the same old permissions problem, and one of the SPARC buildbots just times out unless you build during the right time of day. We are just hobbling along at the moment.
Windows problems at least are not specific to the buildbot.
Sorry for the late reply, the network was down.
No problem,
And I still have network problems... What will the world do if the networks collapse? Chuck
Hi Chuck,
On Fri, Dec 19, 2008 at 2:15 AM, Charles R Harris wrote:
The undefs need to be there when the functions are defined by numpy, so they only need to be in the same #if block as those definitions. I moved them out to cover the function declarations also, but if those are put in their own block for SPARC then they aren't needed.
But then it just hides the problem instead of solving it. If we are in the #if block and the symbol is defined, it is a bug in the configuration stage, and it should be dealt with there - if it is a bug in the toolchain (say the symbol is in the library, but not declared in the header), then it should be dealt with for that exact platform only. It is not nit-picking, because the latter way means it won't break any other platform :) It still should be used sparingly, though (the SPARC problem is a good example of where it should be used). cheers, David
On Thu, Dec 18, 2008 at 9:12 PM, David Cournapeau wrote:
Hi Chuck,
On Fri, Dec 19, 2008 at 2:15 AM, Charles R Harris wrote:

The undefs need to be there when the functions are defined by numpy, so they only need to be in the same #if block as those definitions. I moved them out to cover the function declarations also, but if those are put in their own block for SPARC then they aren't needed.
But then it just hides the problem instead of solving it. If we are in the #if block and the symbol is defined, it is a bug in the configuration stage, and it should be dealt with there - if it is a bug in the toolchain (say the symbol is in the library, but not declared in the header), then it should be dealt with for that exact platform only.
It is not nit-picking, because the latter way means it won't break any other platform :) It still should be used sparingly, though (the SPARC problem is a good example of where it should be used).
True, it should be solved in the configuration stage, but what if it isn't? I suppose an error message might be the desired result. If you want to remove the undefs to see what happens, that's fine with me. They were inherited from the old code in any case. Chuck
participants (3)
- Charles R Harris
- David Cournapeau
- David Cournapeau