[Numpy-discussion] Created NumPy 1.7.x branch

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Sat Jun 23 03:32:08 EDT 2012


On 06/23/2012 05:14 AM, Charles R Harris wrote:
>
>
> On Fri, Jun 22, 2012 at 2:42 PM, Travis Oliphant <travis at continuum.io
> <mailto:travis at continuum.io>> wrote:
>
>>
>>     The usual practice is to announce a schedule first.
>
>     I just did announce the schedule.
>
>
> What has been done in the past is that an intent to fork is announced
> some two weeks in advance so that people can weigh in on what needs to
> be done before the fork. The immediate fork was a bit hasty. Likewise,
> when I suggested going to the github issue tracking, I opened a
> discussion on needed tags, but voila, there it was with an incomplete
> set and no discussion. That too seemed hasty.
>
>>
>>         There is time before the first Release candidate to make
>>         changes on the 1.7.x branch.   If you want to make the changes
>>         on master, and just indicate the Pull requests, Ondrej can
>>         make sure they are added to the 1.7.x. branch by Monday.    We
>>         can also delay the first Release Candidate by a few days to
>>         next Wednesday and then bump everything 3 days if that will
>>         help.     There will be a follow-on 1.8 release before the end
>>         of the year --- so there is time to make changes for that
>>         release as well.    The next release will not take a year to
>>         get out, so we shouldn't feel pressured to get *everything* in
>>         this release.
>>
>>
>>     What are we going to do for 1.8?
>
>     Let's get 1.7 out the door first.
>
>
> Mark proposed a schedule for the next several releases, I'd like to know
> if we are going to follow it.
>
>
>>
>>     Yes, the functions will give warnings otherwise.
>
>     I think this needs to be revisited.  I don't think these changes are
>     necessary for *every* use of macros.   It can cause a lot of effort
>     for people downstream without concrete benefit.
>
>
> The idea is to slowly move towards hiding the innards of the array type.
> This has been under discussion since 1.3 came out. It is certainly the
> case that not all macros need to go away.
>
>
>>
>>           That's not as nice to type.
>>
>>
>>     So? The point is to have correctness, not ease of typing.
>
>     I'm not sure if a pun was intended there or not.    C is not a safe
>     and fully-typed system.    That is one of its weaknesses according
>     to many.  But, I would submit that not being forced to give
>     everything a "type" (and recognizing the tradeoffs that implies) is
>     also one reason it gets used.
>
>
> C was famous for bugs due to the lack of function prototypes. This was
> fixed with C99 and the stricter typing was a great help.
>
>
>
>>
>>          Is that assuming that PyArray_NDIM will become a function and
>>         need a specific object type for its argument (and everything
>>         else cast....).   That's one clear disadvantage of inline
>>         functions versus macros in my mind:  no automatic polymorphism.
>>
>>
>>     That's a disadvantage of Python. The virtue of inline functions is
>>     precisely type checking.
>
>     Right, but we need to be more conscientious about this.   Not every
>     use of Macros should be replaced by inline function calls and the
>     requisite *forced* type-checking.   Type-checking is not
>     *universally* a virtue --- if it were, nobody would use Python.
>
>>
>>         I don't think type safety is a big win for macros like these.
>>             We need to be more judicious about which macros are
>>         scheduled for function inlining.  Some just don't benefit from
>>         the type-safety implications as much as others do, and you end
>>         up requiring everyone to change their code downstream for no
>>         real reason.
>>
>>         These sorts of changes really feel to me like unnecessary
>>         spelling changes that require work from extension writers who
>>         now have to modify their code with no real gain.   There seems
>>         to be a lot of that going on in the code base and I'm not
>>         really convinced that it's useful for end-users.
>>
>>
>>     Good style and type checking are useful. Numpy needs more of both.
>
>     You can assert it, but it doesn't make it so. "Good style" depends
>     on what you are trying to accomplish and on your point of view.
>       NumPy's style is not the product of one person, it's been adapted
>     from multiple styles and inherits quite a bit from Python's style.
>     I don't make any claims for it other than it allowed me to write it
>     with the time and experience I had 7 years ago.    We obviously
>     disagree about this point.  I'm sorry about that.  I'm pretty
>     flexible usually --- that's probably one of your big criticisms of
>     my "style".
>
>
> Curiously, my criticism would be more that you are inflexible, slow to
> change old habits.
>
>
>     But, one of the things I feel quite strongly about is how hard we
>     make it for NumPy users to upgrade.    There are two specific things
>     I disagree with pretty strongly:
>
>     1) Changing defined macros that should work the same on
>     PyArrayObjects or PyObjects to now *require* types --- if we want to
>     introduce new macros that require types then we can --- as long as
>     it just provides warnings but still compiles then I suppose I could
>     find this acceptable.
>
>     2) Changing MACROS to require semicolons when they were previously
>     not needed.    I'm going to be very hard-nosed about this one.
>
>>
>>         I'm going to be a lot more resistant to that sort of change in
>>         the code base when I see it.
>>
>>
>>     Numpy is a team effort. There are people out there who write
>>     better code than you do, you should learn from them.
>
>     Exactly!  It's a team effort.   I'm part of that team as well, and
>     while I don't always have strong opinions about things, when I do,
>     I'm going to voice them.
>
>     I've learned long ago there are people that write better code than
>     me.    There are people that write better code than you.
>
>
> Of course. Writing code is not my profession, and even if it were, there
> are people out there who would be immeasurably better. I have tried to
> improve my style over the years by reading books and browsing code by
> people who are better than me. I also recognize common bad habits naive
> coders tend to pick up when they start out, not least because I have at
> one time or another had many of the same bad habits.
>
>     That is not the question here at all.     The question here is not
>     requiring a *re-write* of code in order to get their extensions to
>     compile using NumPy headers.    We should not be making people
>     change their code to get their extensions to compile in NumPy 1.X
>
>
> I think a bit of rewrite here and there along the way is more palatable
> than a big change coming in as one big lump, especially if the changes
> are done with a long term goal in mind. We are working towards a Numpy
> 2, but we can't just go off for a year or two and write it, we have to
> get there step by step. And that requires a plan.

To me you sound like you expect that people just need to change, say,

PyArray_SHAPE(obj)

to

PyArray_SHAPE((PyArrayObject*)obj)

But that's not the reality. The reality is that most users of the NumPy 
C API are required to do:

#if WHATEVERNUMPYVERSIONDEFINE > 0x...
PyArray_SHAPE((PyArrayObject*)obj)
#else
PyArray_SHAPE(obj)
#endif

or, perhaps, PyArray_SHAPE(CAST_IF_NEW_NUMPY obj).

Or perhaps write a shim wrapper to insulate themselves from the NumPy API.

At least if you want to compile cleanly, without warnings, against the 
last ~3 versions of NumPy -- which any good developer wishes (unless 
there are *features* in newer versions that make a hard dependency on 
the newest version logical). Thus, cleaning up the NumPy API makes 
users' code much uglier and more difficult to read.
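
To make the shim-wrapper option concrete, here is a rough sketch of what 
such a compatibility header might look like. It is not taken from any 
real project. The types are stand-ins so that it compiles without the 
Python or NumPy headers (real code would get PyObject and PyArrayObject 
from <Python.h> and <numpy/arrayobject.h>, and the shape would be 
npy_intp*, not long*). SKETCH_NUMPY_VERSION, SKETCH_TYPED_API_SINCE, 
COMPAT_ARRAY_CAST, compat_array_shape and compat_array_ndim are all 
made-up names, and the two PyArray_* definitions only imitate the 
"macro" and "typed inline function" flavours being discussed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types so the sketch compiles on its own (assumption:
 * real code takes these from the Python and NumPy headers). */
typedef struct { int refcount; } PyObject;
typedef struct {
    PyObject head;   /* first member, so PyObject* <-> PyArrayObject* casts work */
    int nd;
    long dims[4];
} PyArrayObject;

/* Hypothetical version define, playing the role of
 * WHATEVERNUMPYVERSIONDEFINE; the threshold is equally made up. */
#ifndef SKETCH_NUMPY_VERSION
#define SKETCH_NUMPY_VERSION 0x0107
#endif
#define SKETCH_TYPED_API_SINCE 0x0107

#if SKETCH_NUMPY_VERSION >= SKETCH_TYPED_API_SINCE
/* Imitation of a "new" API: typed inline functions, caller must cast. */
static inline long *PyArray_SHAPE(PyArrayObject *arr) { return arr->dims; }
static inline int PyArray_NDIM(PyArrayObject *arr) { return arr->nd; }
#define COMPAT_ARRAY_CAST(obj) ((PyArrayObject *)(obj))
#else
/* Imitation of an "old" API: macros that cast internally. */
#define PyArray_SHAPE(obj) (((PyArrayObject *)(obj))->dims)
#define PyArray_NDIM(obj) (((PyArrayObject *)(obj))->nd)
#define COMPAT_ARRAY_CAST(obj) (obj)
#endif

/* The shim: the only place that knows about the version test. */
static inline long *compat_array_shape(PyObject *obj)
{
    assert(obj != NULL);
    return PyArray_SHAPE(COMPAT_ARRAY_CAST(obj));
}

static inline int compat_array_ndim(PyObject *obj)
{
    assert(obj != NULL);
    return PyArray_NDIM(COMPAT_ARRAY_CAST(obj));
}
```

Extension code would then call compat_array_shape(obj) everywhere and 
never spell out the cast or the #if itself. The cost is that every 
project ends up carrying and maintaining a header like this.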

"Gradual changes along the way" means there will be lots of different 
#if tests like that, which is at least harder to remember and work with 
than a single #if test for 1.x vs 2.x.

Dag


