On Sat, May 7, 2016 at 6:17 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
>> Do you expect that
>> projects ... should (somehow) contain simplified instructions on how to
>> build the various C/Fortran extensions supplied in the bundle as
>> source code?
>
> Essentially, yes. I'm not sure how achievable it would
> be, but ideally that's what I'd like.

I think we've all come to the conclusion that that's simply not possible -- build configuration cannot be purely declarative in the general case. If you are building a package, you are going to be running arbitrary code (and really, if you're running a compiler, you're doing that anyway, so there isn't an additional security issue here -- if you trust the code, you might as well trust the build script).

> On a unix system, most of the time they would all be in
> well-known locations. If I install something in an unusual
> place or in an unusual way, I'm going to have to tell
> something about it anyway. I don't see how an executable
> setup file provided by the package author is going to
> magically figure out all the weird stuff I've done.

You can look at any number of packages -- essentially, they do what configure scripts do with autotools: they try various hacky ways to find the stuff they're looking for.

And users really want this -- maybe on a "normal" *nix system there isn't much to it, but on OS-X there sure is -- people may have hand-installed the dependencies, or they may have used fink, or macports, or homebrew, or... and, except for the hand-install case, they may have no idea what the heck they did and where it is.
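
As a rough illustration of that kind of probing (the prefix list and the function name are my own assumptions for the sketch, not taken from any particular package):

```python
import os

# Illustrative guesses at install prefixes; real packages probe many more.
CANDIDATE_PREFIXES = [
    "/usr",           # system default
    "/usr/local",     # hand installs, Homebrew on Intel Macs
    "/opt/homebrew",  # Homebrew on Apple Silicon
    "/opt/local",     # MacPorts
    "/sw",            # Fink (older default)
]

def find_header(header, prefixes=CANDIDATE_PREFIXES):
    """Return the first include directory that contains `header`, or None."""
    for prefix in prefixes:
        include_dir = os.path.join(prefix, "include")
        if os.path.isfile(os.path.join(include_dir, header)):
            return include_dir
    return None
```

Something like find_header("png.h") then either returns a directory to pass to the compiler or None, at which point the user has to be asked.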

> I don't know if there are conventions for such things on
> Windows. I suspect not, in which case manual input is
> going to be needed one way or another.

Windows is even worse -- not only no conventions, but also no common package manager, either (at least OS-X is limited to four :-) )

> Not all of it, only the parts that strictly have to be
> performed as part of the build step of the install
> process, to use your terminology. That's a fairly
> restricted subset of the whole problem of compiling
> software.

I don't think it's possible (or desirable) to make a clear distinction between "the build step of the install process" and building in general.

> Could a purely declarative config file be flexible
> enough to handle this? I don't know. The distutils
> API is actually pretty declarative already if you
> use it in a straightforward way.

Indeed -- and for the common cases, that works fine, but there's always SOMETHING that some weird case is going to need.

You could, I suppose, separate out the configuration step from the build step -- analogous to ./configure vs. make.

So the configure step would generate a purely declarative config file of some sort, and then the build step would use that. In the simple case, there might be no need for a configure step at all.
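
A minimal sketch of that split, assuming a JSON-style dict as the declarative format (the function names, the extension name, and the source file are all hypothetical):

```python
import json
import os

def configure(search_dirs, library):
    """'Configure' step: run arbitrary probing code, emit plain data."""
    found = [d for d in search_dirs if os.path.isdir(d)]
    return {
        "name": "example_ext",         # hypothetical extension name
        "sources": ["example_ext.c"],  # hypothetical source file
        "library_dirs": found,
        "libraries": [library],
    }

def compiler_args(config):
    """'Build' step: turn the declarative config into compiler-style arguments."""
    args = list(config["sources"])
    args += ["-L" + d for d in config["library_dirs"]]
    args += ["-l" + lib for lib in config["libraries"]]
    return args

# Round-trip through JSON to show that the config is pure data.
config = json.loads(json.dumps(configure([os.getcwd(), "/no/such/dir"], "m")))
print(compiler_args(config))
```

In the simple case someone could write the dict by hand and skip configure() entirely.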

Though I'm not sure even this is possible -- for instance, numpy uses a heavily enhanced distutils to do its thing. Cython extends distutils to understand the "build the C from the pyx" step, etc. This is still used declaratively, but it's third-party code that is doing part of the build step -- so in order to make the build system extensible, you need it to run code...
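
To make that concrete, here is a toy sketch of a declarative source list that still depends on registered third-party code. The registry and hook names are hypothetical, and the .pyx hook only renames the file rather than actually running Cython:

```python
import os

# Maps a source suffix to a (hypothetical) hook that turns it into C.
TRANSLATORS = {}

def register_translator(suffix):
    """Decorator that registers a hook for files ending in `suffix`."""
    def decorator(func):
        TRANSLATORS[suffix] = func
        return func
    return decorator

@register_translator(".pyx")
def pyx_to_c(path):
    # Stand-in for a third-party tool such as Cython; it only renames here.
    return os.path.splitext(path)[0] + ".c"

def resolve_sources(sources):
    """Apply registered hooks so the build step only ever sees C files."""
    resolved = []
    for path in sources:
        hook = TRANSLATORS.get(os.path.splitext(path)[1])
        resolved.append(hook(path) if hook else path)
    return resolved
```

The list passed to resolve_sources() is pure data, but what happens to each entry is decided by whatever code got registered.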

Anyway, I thought it was clear that we need to clearly distinguish between building and installing/packaging, etc. -- both from an API perspective and a user perspective. So at this point, we need to establish an API to the build system (probably compatible with what we have: setup.py build), but we leave it up to the build system to figure out how to do its thing -- maybe someone will come up with a purely declarative system -- who knows?
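
Just to show the shape such an API could take (everything here is hypothetical: the class names, the single build() method, and the toy backend that only computes a file name):

```python
from abc import ABC, abstractmethod

class BuildBackend(ABC):
    """Hypothetical interface an installer calls without knowing the build tool."""

    @abstractmethod
    def build(self, source_dir, output_dir):
        """Build the project in source_dir; return the path of the built artifact."""

class EchoBackend(BuildBackend):
    # Toy backend: builds nothing, just reports where an artifact would go.
    def build(self, source_dir, output_dir):
        name = source_dir.strip("/").split("/")[-1]
        return f"{output_dir}/{name}.whl"

def install_from_source(backend, source_dir, output_dir="dist"):
    """Installer side: delegate the build, then handle the resulting artifact."""
    artifact = backend.build(source_dir, output_dir)
    return f"installing {artifact}"
```

The installer only depends on BuildBackend, so a setuptools-based backend and a purely declarative one could be swapped without touching it.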

>  Running Pyrex to generate .c files
> from .pyx files is one that I've encountered.
> (I encouraged that one myself by including a distutils
> extension for it, which I later decided had been a
> mistake.)

I don't think it was a mistake :-) -- that's got to be done some time -- why add another layer???

> That's nice, but it wouldn't help me when I encounter
> a package that *hadn't* been set up to use gregsbuild. :-(

sure -- but we can't have a one-build-system-to-rule-them-all until we can first have a way to have ANY build system other than setuptools :-)

Key issue here:

Right now, in order to make things easier for users, and due to history, the whole build/package/install processes are fully intermingled. I think this is a BAD THING, for two reasons:

1) you can't replace any of the tools individually (except I suppose the packaging part, as it's the last step)

2) users don't know what's going on, or what's going to happen when they try to "install" something:

It's great that you can "just run pip install" and it often "just works" -- but when it doesn't it's kind of ugly. And there are security concerns: When a user runs:

pip install some_package

They don't know what's going to happen. pip goes and looks on PyPI to see if it's there.

If it's there, it looks for a binary wheel that's compatible with the current system. If there is one, then it gets installed (and hopefully works :-) )

If there is no compatible wheel, then it downloads the source archive from PyPI, and tries to build and install it, by monkey patching setuptools, and then running "setup.py install".

At this point, essentially arbitrary code is being run (which makes my IT folks nervous, though that code came from the same place a binary would have...)
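
The decision described above can be sketched roughly like this. It is a simplification, not pip's actual code: the index is a hypothetical dict standing in for PyPI, and tag matching is reduced to a string comparison.

```python
def choose_install_action(index, package, platform_tag):
    """Describe what a pip-like installer would do for `package`.

    `index` is a hypothetical stand-in for PyPI: a dict mapping package
    names to {"wheels": [tag, ...], "sdist": bool}.
    """
    entry = index.get(package)
    if entry is None:
        return "error: not found on index"
    for tag in entry["wheels"]:
        if tag in (platform_tag, "any"):
            return f"install prebuilt wheel ({tag})"
    if entry["sdist"]:
        return "download sdist and run setup.py (arbitrary code)"
    return "error: no compatible distribution"
```

The user types the same command in every case, and only the last successful branch involves running the package author's build code.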

This whole thing is great for pure python packages (which don't really need "building" anyway), but not so great for anything that needs compilation, and particularly anything that needs third-party libs to compile.

I'd like to see binary packaging more clearly separated from source (needs-to-be-built) packaging -- so the systems can be more flexible, and so it's clear to users what exactly is happening or what's wrong when it doesn't work.

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov