Currently the numpy build system(s) support two ways of building numpy: either by compiling a giant concatenated C file, or by the more conventional route of first compiling each .c file to a .o file, and then linking those together. I gather from comments in the source code that the former is the traditional method, and the latter is the newer "experimental" approach.

It's easy to break one of these builds without breaking the other (I just did this with the NA branch, and David had to clean up after me), and I don't see what value we really get from having both options -- it seems to just double the size of the test matrix without adding value. Now that the separate build seems to be fully supported, maybe it's time to finish the "experiment" and pick one approach to support going forward?

I guess the arguments for each would be:

- The monolithic build in principle allows for some extra inter-procedural optimization. I won't believe this until I see benchmarks, though; numpy doesn't have a lot of tiny inline-able function calls or anything like that.
- The separate build is probably more convenient for developers, allowing faster rebuilds.

Numpy builds fast enough for me that I'm not too worried about which approach we use, but it definitely seems worthwhile to reduce the number of configurations we have to support one way or the other.

-N
On Wed, Jun 27, 2012 at 7:17 PM, Nathaniel Smith <njs@pobox.com> wrote:
It's easy to break one of these builds without breaking the other (I just did this with the NA branch, and David had to clean up after me), and I don't see what value we really get from having both options -- it seems to just double the size of the test matrix without adding value.
There is unfortunately a big value in it: there is no standard way in C to share symbols within a library without polluting the whole process namespace, except on windows where the default is to export nothing. Most compilers support it (I actually know of none that does not support it in some way or another), but that's platform-specific.

I do find the multi-file support useful when developing (it does not make the full build faster, but I find partial rebuild too slow without it).

David
On Wed, Jun 27, 2012 at 7:50 PM, David Cournapeau <cournape@gmail.com> wrote:
There is unfortunately a big value in it: there is no standard way in C to share symbols within a library without polluting the whole process namespace, except on windows where the default is to export nothing.
Most compilers support it (I actually know of none that does not support it in some way or another), but that's platform-specific.
IIRC this isn't too tricky to arrange for with gcc, but why is this an issue in the first place for a Python extension module? Extension modules are opened without RTLD_GLOBAL, which means that they *never* export any symbols. At least, that's how it should work on Linux and most Unix-alikes; I don't know much about OS X's linker, except that it's unusual in other ways. -N
On Wed, Jun 27, 2012 at 8:07 PM, Nathaniel Smith <njs@pobox.com> wrote:
IIRC this isn't too tricky to arrange for with gcc
No, which is why this is supported for gcc and windows :)
, but why is this an issue in the first place for a Python extension module? Extension modules are opened without RTLD_GLOBAL, which means that they *never* export any symbols. At least, that's how it should work on Linux and most Unix-alikes; I don't know much about OS X's linker, except that it's unusual in other ways.
The pragmatic answer is that if it were not an issue, python itself would not bother with it. Every single extension module in python itself is built from a single compilation unit. This is also why we have this awful system to export the numpy C API with an array of function pointers instead of simply exporting things in a standard way. See this: http://docs.python.org/release/2.5.3/ext/using-cobjects.html

Looking quickly at the 2.7.3 sources, the more detailed answer is that python actually does not use RTLD_LOCAL (nor RTLD_GLOBAL), and what happens when neither of them is used is implementation-dependent. It seems to be RTLD_LOCAL on linux, and RTLD_GLOBAL on mac os x. There may also be consequences on the use of RTLD_LOCAL in embedded mode (I have ancient and bad memories with matlab related to this, but I forgot the details).

David
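[For the curious, the flags in question are easy to inspect from Python itself; everything below is stdlib, and the result varies by platform, which is exactly the implementation-dependence being discussed.]

```python
# Inspect the dlopen() flags CPython will use when loading extension modules.
import os
import sys

flags = sys.getdlopenflags()
# On Linux, RTLD_LOCAL is literally 0, so "local" loading is simply the
# absence of the RTLD_GLOBAL bit in these flags.
print("raw flags:", flags)
print("RTLD_GLOBAL set:", bool(flags & os.RTLD_GLOBAL))
```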
On Wed, Jun 27, 2012 at 8:29 PM, David Cournapeau <cournape@gmail.com> wrote:
The pragmatic answer is that if it were not an issue, python itself would not bother with it. Every single extension module in python itself is built from a single compilation unit. This is also why we have this awful system to export the numpy C API with array of function pointers instead of simply exporting things in a standard way.
The array-of-function-pointers is solving the opposite problem, of exporting functions *without* having global symbols.
Looking quickly at the 2.7.3 sources, the more detailed answer is that python actually does not use RTLD_LOCAL (nor RTLD_GLOBAL), and what happens when neither of them is used is implementation-dependent. It seems to be RTLD_LOCAL on linux, and RTLD_GLOBAL on mac os x. There may also be consequences on the use of RTLD_LOCAL in embedded mode (I have ancient and bad memories with matlab related to this, but I forgot the details).
See, I knew OS X was quirky :-). That's what I get for trusting dlopen(3). But seriously, what compilers do we support that don't have -fvisibility=hidden? ...Is there even a list of compilers we support available anywhere? -N
On 06/27/2012 09:53 PM, Nathaniel Smith wrote:
But seriously, what compilers do we support that don't have -fvisibility=hidden? ...Is there even a list of compilers we support available anywhere?
You could at the very least switch the default for a couple of releases, introducing a new flag with a "please email numpy-discussion if you use this" note, and see if anybody complains? Dag
On Wed, Jun 27, 2012 at 8:57 PM, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
You could at the very least switch the default for a couple of releases, introducing a new flag with a "please email numpy-discussion if you use this" note, and see if anybody complains?
Yes, we could. That's actually why I set up travis-CI to build both configurations in the first place :) (see https://github.com/numpy/numpy/issues/315) David
On Wed, Jun 27, 2012 at 8:53 PM, Nathaniel Smith <njs@pobox.com> wrote:
The array-of-function-pointers is solving the opposite problem, of exporting functions *without* having global symbols.
I meant that the lack of standard around symbols and namespaces is why we have to do those hacks. Most platforms have much better solutions to those problems.
But seriously, what compilers do we support that don't have -fvisibility=hidden? ...Is there even a list of compilers we support available anywhere?
Well, I am not sure how all this is handled on the big guys (Blue Gene and co), for one. There is also the issue of the consequences of statically linking numpy to python: I don't know what they are (I would actually like to make statically linked numpy into python easier, not harder).

David
On Wed, Jun 27, 2012 at 9:05 PM, David Cournapeau <cournape@gmail.com> wrote:
Well, I am not sure how all this is handled on the big guys (Blue Gene and co), for one.

There is also the issue of the consequences of statically linking numpy to python: I don't know what they are (I would actually like to make statically linked numpy into python easier, not harder).
All the docs I can find in a quick google seem to say that bluegene doesn't do shared libraries at all, though those may be out of date.

Also, it looks like our current approach is not doing a great job of avoiding symbol table pollution... despite all the NPY_NO_EXPORTS all over the source, I still count ~170 exported symbols on Linux with numpy 1.6, many of them with non-namespaced names ("_n_to_n_data_copy", "_next", "npy_tan", etc.) Of course this is fixable, but it's interesting that no-one has noticed. (Current master brings this up to ~300 exported symbols.)

It sounds like as far as our "officially supported" platforms go (linux/windows/osx with gcc/msvc), the ideal approach would be to use -fvisibility=hidden or --retain-symbols-file to convince gcc to hide symbols by default, like msvc does. That would let us remove cruft from the source code, produce a more reliable result, and let us use the more convenient separate build, with no real downsides. (Static linking is trickier because no-one uses it anymore so the docs aren't great, but I think on Linux at least you could accomplish the equivalent by building the static library with 'ld -r ... -o tmp-multiarray.a; objcopy --keep-global-symbol=initmultiarray tmp-multiarray.a multiarray.a'.)

Of course there are presumably other platforms that we don't support or test on, but where we have users anyway. Building on such a platform sort of intrinsically requires build system hacks, and some equivalent to the above may well be available (e.g. I know icc supports -fvisibility). So while I'm not going to do anything about this myself in the near future, I'd argue that it would be a good idea to:

- Switch the build-system to export nothing by default when using gcc, using -fvisibility=hidden
- Switch the default build to "separate"
- Leave in the single-file build, but not "officially supported", i.e., we're happy to get patches but it's not used on any systems that we can actually test ourselves. (I suspect it's less fragile than the separate build anyway, since name clashes are less common than forgotten include files.)

-N
On Sun, Jul 1, 2012 at 6:36 PM, Nathaniel Smith <njs@pobox.com> wrote:
It sounds like as far as our "officially supported" platforms go (linux/windows/osx with gcc/msvc), then the ideal approach would be to use -fvisibility=hidden or --retain-symbols-file to convince gcc to hide symbols by default, like msvc does. That would let us remove cruft from the source code, produce a more reliable result, and let us use the more convenient separate build, with no real downsides.
What cruft would it allow us to remove? Whatever method we use, we need a whitelist of symbols to export.

On the exported list I see on mac, most of them are either from npymath (npy prefix) or npysort (no prefix, I think this should be added). Once those are ignored as they should be, there are < 30 symbols exported.
(Static linking is trickier because no-one uses it anymore so the docs aren't great, but I think on Linux at least you could accomplish the equivalent by building the static library with 'ld -r ... -o tmp-multiarray.a; objcopy --keep-global-symbol=initmultiarray tmp-multiarray.a multiarray.a'.)
I am not sure why you say that static linking is not used anymore: I have met some people who do statically link numpy into python.
Of course there are presumably other platforms that we don't support or test on, but where we have users anyway. Building on such a platform sort of intrinsically requires build system hacks, and some equivalent to the above may well be available (e.g. I know icc supports -fvisibility). So I while I'm not going to do anything about this myself in the near future, I'd argue that it would be a good idea to: - Switch the build-system to export nothing by default when using gcc, using -fvisibility=hidden - Switch the default build to "separate" - Leave in the single-file build, but not "officially supported", i.e., we're happy to get patches but it's not used on any systems that we can actually test ourselves. (I suspect it's less fragile than the separate build anyway, since name clashes are less common than forgotten include files.)
I am fine with making the separate build the default (I have a patch somewhere that does that on supported platforms), but not with using -fvisibility=hidden. When I implemented the initial support around this, fvisibility was buggy on some platforms, including mingw 3.x.

I don't think changing what our implementation does here is worthwhile given that it works, and fvisibility=hidden has no big advantages (you would still need to mark the functions to be exported).

David
On Sun, Jul 1, 2012 at 7:36 PM, David Cournapeau <cournape@gmail.com> wrote:
What cruft would it allow us to remove? Whatever method we use, we need a whitelist of symbols to export.
No, right now we don't have a whitelist, we have a blacklist -- every time we add a new function or global variable, we have to remember to add a NPY_NO_EXPORT tag to its definition. Except the evidence says that we don't do that reliably. (Everyone always sucks at maintaining blacklists, that's the nature of blacklists.) I'm saying that we'd be better off if we did have a whitelist. Especially since the CPython API makes maintaining this whitelist so very trivial -- each module exports exactly one symbol!
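[A whitelist is also exactly what a GNU ld version script expresses; a hypothetical multiarray.map, passed to the link with -Wl,--version-script=multiarray.map, would be the entire list:]

```
/* hypothetical multiarray.map: export only the module init symbol */
{
  global:
    initmultiarray;
  local:
    *;
};
```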
I am not sure why you say that static linking is not used anymore: I have met some people who do statically link numpy into python.
Yes, of course, or I wouldn't have bothered researching it. But this research would have been easier if there were enough of a user base that the tools makers actually paid any attention to supporting this use case, is all I was saying :-).
I am fine with making the separate build the default (I have a patch somewhere that does that on supported platforms), but not with using -fvisibility=hidden. When I implemented the initial support around this, fvisibility was buggy on some platforms, including mingw 3.x
It's true that mingw doesn't support -fvisibility=hidden, but that's because it would be a no-op; windows already works that way by default...
I don't think changing what our implementation does here is worthwhile given that it works, and fvisibility=hidden has no big advantages (you would still need to mark the functions to be exported).
But there are only about 10 functions that we need to export, and that list never changes; OTOH there are tons and tons of functions that we want to *not* export, and that list changes constantly. -N
On Sun, Jul 1, 2012 at 8:32 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Sun, Jul 1, 2012 at 7:36 PM, David Cournapeau <cournape@gmail.com> wrote:
On Sun, Jul 1, 2012 at 6:36 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Jun 27, 2012 at 9:05 PM, David Cournapeau <cournape@gmail.com> wrote:
On Wed, Jun 27, 2012 at 8:53 PM, Nathaniel Smith <njs@pobox.com> wrote:
But seriously, what compilers do we support that don't have -fvisibility=hidden? ...Is there even a list of compilers we support available anywhere?
Well, I am not sure how all this is handled on the big guys (bluegene and co), for one.
There is also the issue of the consequences for statically linking numpy into python: I don't know what they are (I would actually like to make statically linking numpy into python easier, not harder).
All the docs I can find in a quick google seem to say that bluegene doesn't do shared libraries at all, though those may be out of date.
Also, it looks like our current approach is not doing a great job of avoiding symbol table pollution... despite all the NPY_NO_EXPORT markers all over the source, I still count ~170 exported symbols on Linux with numpy 1.6, many of them with non-namespaced names ("_n_to_n_data_copy", "_next", "npy_tan", etc.). Of course this is fixable, but it's interesting that no-one has noticed. (Current master brings this up to ~300 exported symbols.)
It sounds like as far as our "officially supported" platforms go (linux/windows/osx with gcc/msvc), then the ideal approach would be to use -fvisibility=hidden or --retain-symbols-file to convince gcc to hide symbols by default, like msvc does. That would let us remove cruft from the source code, produce a more reliable result, and let us use the more convenient separate build, with no real downsides.
What cruft would it allow us to remove? Whatever method we use, we need a whitelist of symbols to export.
No, right now we don't have a whitelist, we have a blacklist -- every time we add a new function or global variable, we have to remember to add a NPY_NO_EXPORT tag to its definition. Except the evidence says that we don't do that reliably. (Everyone always sucks at maintaining blacklists; that's the nature of blacklists.) I'm saying that we'd be better off if we did have a whitelist. Especially since the CPython API makes maintaining this whitelist so very trivial -- each module exports exactly one symbol!
There may be some confusion about what NPY_NO_EXPORT does: it marks a function that can be used between compilation units but is not exported. The choice is between static and NPY_NO_EXPORT, not between NPY_NO_EXPORT and nothing. In that sense, marking something NPY_NO_EXPORT is a whitelist. If we were to use -fvisibility=hidden, we would still need to mark those functions static (as it would otherwise publish functions in the single-file build).
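The distinction being drawn here (usable across compilation units, yet not exported) can be sketched as follows; file and function names are invented, and it assumes gcc and binutils on Linux:

```shell
# A hidden-visibility function, unlike a static one, can be called from a
# second .c file linked into the same shared object -- but it still does
# not show up in the object's dynamic symbol table.
cat > impl.c <<'EOF'
__attribute__((visibility("hidden"))) int shared_helper(void) { return 7; }
EOF

cat > api.c <<'EOF'
int shared_helper(void);  /* defined in impl.c */
__attribute__((visibility("default"))) int public_api(void) {
    return shared_helper();  /* cross-file call links fine */
}
EOF

gcc -shared -fPIC -o libsplit.so impl.c api.c

# public_api is exported; shared_helper is not, despite being non-static.
nm -D --defined-only libsplit.so
```

Had shared_helper been declared static, the call from api.c would fail to link at all, which is why the separate build cannot simply make everything static.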
It's true that mingw doesn't support -fvisibility=hidden, but that's because it would be a no-op; windows already works that way by default...
That's not my understanding: gcc behaves on windows as on linux (it would break too much software that is the usual target of mingw otherwise), but the -fvisibility flag is broken on gcc 3.x. The more recent mingw supposedly handles this better, but we can't use gcc 4.x because of another issue regarding private dll sharing :) David
On Sun, Jul 1, 2012 at 9:17 PM, David Cournapeau <cournape@gmail.com> wrote:
There may be some confusion about what NPY_NO_EXPORT does: it marks a function that can be used between compilation units but is not exported. The choice is between static and NPY_NO_EXPORT, not between NPY_NO_EXPORT and nothing. In that sense, marking something NPY_NO_EXPORT is a whitelist.
If we were to use -fvisibility=hidden, we would still need to mark those functions static (as it would otherwise publish functions in the single file build).
To be clear, this subthread started with the caveat *as far as our "officially supported" platforms go* -- I'm not saying that we should go around and remove all the NPY_NO_EXPORT macros tomorrow. However, the only reason they're actually needed is for supporting platforms where you can't control symbol visibility from the linker, and AFAICT we have no examples of such platforms to hand. So I'm questioning the wisdom of maintaining multiple parallel build systems etc. just for this hypothetical benefit.
That's not my understanding: gcc behaves on windows as on linux (it would break too much software that is the usual target of mingw otherwise), but the -fvisibility flag is broken on gcc 3.x. The more recent mingw supposedly handles this better, but we can't use gcc 4.x because of another issue regarding private dll sharing :)
I don't have windows to test, but everyone else on the internet seems to think mingw works the way I said, with __declspec and all... you aren't thinking of cygwin, are you? (see e.g. http://mingw.org/wiki/sampleDLL) -N
On Mon, Jul 2, 2012 at 11:34 PM, Nathaniel Smith <njs@pobox.com> wrote:
To be clear, this subthread started with the caveat *as far as our "officially supported" platforms go* -- I'm not saying that we should go around and remove all the NPY_NO_EXPORT macros tomorrow.
However, the only reason they're actually needed is for supporting platforms where you can't control symbol visibility from the linker, and AFAICT we have no examples of such platforms to hand.
I gave you one: mingw 3.x. Actually, reading a bit more, it seems this is not specific to mingw but affects all gcc < 4 (http://gcc.gnu.org/gcc-4.0/changes.html#visibility)
I don't have windows to test, but everyone else on the internet seems to think mingw works the way I said, with __declspec and all... you aren't thinking of cygwin, are you? (see e.g. http://mingw.org/wiki/sampleDLL)
Well, I did check myself, but looking more into it, I was tricked by nm output, which makes little sense on windows w.r.t. visibility with dlls. You can define the same function in multiple dlls, and they will all appear as public symbols (T label with nm), but the windows linker will not see them when linking an executable. I am still biased toward the conservative option, especially as it is still followed by pretty much every C extension out there (including python itself); I trust their experience in dealing with cross-platform issues more than ours. I cannot find my patch for detecting platforms where this can safely become the default, so I will prepare a new one. David
participants (3)
- Dag Sverre Seljebotn
- David Cournapeau
- Nathaniel Smith