On Thu, Jun 5, 2014 at 11:41 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
> @nathaniel IIRC, one of the objections to the missing values work was that
> it changed the underlying array object by adding a couple of variables to
> the structure. I'm willing to do that sort of thing, but it would be good to
> have general agreement that that is acceptable.

I can't think of a reason why adding new variables to the structure *per
se* would be objectionable to anyone? IIRC the objection you're
thinking of wasn't to the existence of new variables, but to their
effect on compatibility: their semantics meant that every piece of
legacy C code that worked with ndarrays had to be updated to check for
the new variables before it could correctly interpret the ->data
field, and if it wasn't updated it would just return incorrect
results. And there wasn't really a clear story for how we were going
to detect and fix all this legacy code. This specific kind of
compatibility break does seem pretty objectionable, but that's because
of the details of its behaviour, not because variables in general are
problematic, I think.
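
To make that failure mode concrete, here is a toy sketch in plain
Python. It is not numpy code: ToyArray, its mask field, legacy_sum and
updated_sum are all hypothetical names made up for illustration. The
point is only that a consumer which predates the new field keeps
running and quietly returns a wrong answer instead of failing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyArray:
    """Hypothetical stand-in for an array struct (not numpy's real layout)."""
    data: List[float]
    # Hypothetical field added later: True marks a valid element.
    # None means "no mask", i.e. the old behaviour.
    mask: Optional[List[bool]] = None


def legacy_sum(arr: ToyArray) -> float:
    """Written before `mask` existed: reads `data` and nothing else."""
    return sum(arr.data)


def updated_sum(arr: ToyArray) -> float:
    """Checks the new field before interpreting `data`."""
    if arr.mask is None:
        return sum(arr.data)
    return sum(v for v, ok in zip(arr.data, arr.mask) if ok)


arr = ToyArray(data=[1.0, 2.0, 3.0], mask=[True, False, True])
print(legacy_sum(arr))   # counts the masked-out 2.0 as well
print(updated_sum(arr))  # skips the masked-out element
```

Nothing in legacy_sum errors out, which is exactly why this kind of
break is hard to detect in existing code.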

> As to blaze/dynd, I'd like to steal bits here and there, and maybe in the
> long term base numpy on top of it with a compatibility layer. There is a lot
> of thought and effort that has gone into those projects and we should use
> what we can. As is, I think numpy is good for another five to ten years and
> will probably hang on for fifteen, but it will be outdated by the end of
> that period. Like great whites, we need to keep swimming just to have
> oxygen. Software projects tend to be obligate ram ventilators.

I worry a bit that this could become a self-fulfilling prophecy.
Plenty of software survives longer than that; the Linux kernel hasn't
had a "real" major number increase [1] since 2.6.0, more than 10 years
ago, and it's still an extraordinarily vital project. Partly this is
because they have resources we don't etc., but partly it's just
because they've decided that incremental change is how they're going
to do things, and approached each new feature with that in mind. And
in ten years they haven't yet found any features that required a
serious compatibility break.

This is a pretty minor worry though -- we don't have to agree about
what will happen in 10 years to agree about what to do now :-).

[1] http://www.pcmag.com/article2/0,2817,2388926,00.asp

> The Python 3 experience is definitely something we want to avoid. And while
> blaze does big data and offers some nice features, I don't know that it
> offers compelling reasons to upgrade to the more ordinary user at this time,
> so I'd like to sort of slip it into numpy if possible.
>
> If we do start moving numpy forward in more radical steps, we should try to
> have some agreement beforehand as to what sort of changes are acceptable.
> For instance, to maintain backward compatibility, is it sufficient that a
> recompile will do the job, or do we require forward compatibility for
> extensions compiled against earlier releases?

I find it hard to discuss these things in general, since specific
compatibility issues usually involve complicated trade-offs -- will
every package have to recompile or just some of them, if they don't
will it be a nice error message or a segfault, is there some way we
can issue warnings ahead of time for the offending behaviour, etc.
etc.

That said, my vote is that if there's a change that (a) can't be done
some other way, (b) requires a recompile, (c) doesn't cause segfaults
but rather produces some sensible error message like "ABI mismatch
please recompile", (d) is a change that's worth the bother (this
determination to include at least canvassing the list to check that
users in general agree that it's worth it), then yeah we should do it.
I don't anticipate that this will happen very often given how far
we've gotten without it, but yeah.

Changing the ABI 'safely' (i.e. raising a Python exception on a
mismatch) is already handled in numpy. We can always increase the ABI
version if we think it is worth it.
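
As a rough illustration of the idea (a toy model in plain Python, not
numpy's actual implementation): an extension records the ABI version it
was compiled against, and at import time that number is compared with
the running library's version. The constant value, the function name
check_abi and the message wording below are all assumptions made up for
this sketch.

```python
# Hypothetical ABI version of the running library (value is illustrative).
RUNTIME_ABI_VERSION = 0x0000000A


def check_abi(compiled_against: int, runtime: int = RUNTIME_ABI_VERSION) -> None:
    """Raise a Python exception, rather than crash later, on ABI mismatch."""
    if compiled_against != runtime:
        raise RuntimeError(
            "ABI mismatch: module compiled against ABI version "
            f"{compiled_against:#x} but the running library is {runtime:#x}; "
            "please recompile"
        )


# An extension built against the current ABI imports fine.
check_abi(RUNTIME_ABI_VERSION)

# One built against an older ABI gets a clear error at import time.
try:
    check_abi(RUNTIME_ABI_VERSION - 1)
except RuntimeError as exc:
    print(exc)
```

Bumping the version is then a one-line change on the library side, and
every stale extension fails with a readable message instead of a
segfault.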

David