[Numpy-discussion] Proposed Roadmap Overview

David Cournapeau cournape at gmail.com
Sat Feb 18 16:17:03 EST 2012


On Sat, Feb 18, 2012 at 8:45 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Sat, Feb 18, 2012 at 1:39 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sat, Feb 18, 2012 at 12:35 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> > On Sat, Feb 18, 2012 at 12:21 PM, Matthew Brett
>> > <matthew.brett at gmail.com>
>> > wrote:
>> >>
>> >> Hi.
>> >>
>> >> On Sat, Feb 18, 2012 at 12:18 AM, Christopher Jordan-Squire
>> >> <cjordan1 at uw.edu> wrote:
>> >> > On Fri, Feb 17, 2012 at 11:31 PM, Matthew Brett
>> >> > <matthew.brett at gmail.com> wrote:
>> >> >> Hi,
>> >> >>
>> >> >> On Fri, Feb 17, 2012 at 10:18 PM, Christopher Jordan-Squire
>> >> >> <cjordan1 at uw.edu> wrote:
>> >> >>> On Fri, Feb 17, 2012 at 8:30 PM, Sturla Molden <sturla at molden.no>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>>
>> >> >>>> Den 18. feb. 2012 kl. 05:01 skrev Jason Grout
>> >> >>>> <jason-sage at creativetrax.com>:
>> >> >>>>
>> >> >>>>> On 2/17/12 9:54 PM, Sturla Molden wrote:
>> >> >>>>>> We would have to write a C++ programming tutorial that is based
>> >> >>>>>> on
>> >> >>>>>> Python knowledge instead of C knowledge.
>> >> >>>>>
>> >> >>>>> I personally would love such a thing.  It's been a while since I
>> >> >>>>> did
>> >> >>>>> anything nontrivial on my own in C++.
>> >> >>>>>
>> >> >>>>
>> >> >>>> One example: How do we code multiple return values?
>> >> >>>>
>> >> >>>> In Python:
>> >> >>>> - Return a tuple.
>> >> >>>>
>> >> >>>> In C:
>> >> >>>> - Use pointers (evilness)
>> >> >>>>
>> >> >>>> In C++:
>> >> >>>> - Return a std::tuple, as you would in Python.
>> >> >>>> - Use references, as you would in Fortran or Pascal.
>> >> >>>> - Use pointers, as you would in C.
>> >> >>>>
>> >> >>>> C++ textbooks always pick the last...
>> >> >>>>
>> >> >>>> I would show the first and the second method, and perhaps
>> >> >>>> intentionally forget the last.
>> >> >>>>
>> >> >>>> Sturla
>> >> >>>>
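
To make the first two C++ options listed above concrete, here is a minimal sketch, not code from NumPy or from anyone in this thread. The function names (min_max_tuple, min_max_ref, demo) are hypothetical, the input is assumed to be non-empty, and only C++11 features are used.

```cpp
#include <cassert>
#include <tuple>
#include <vector>

// Style 1: return a std::tuple, much like returning a tuple in Python.
// Hypothetical helper; assumes `values` is non-empty.
std::tuple<int, int> min_max_tuple(const std::vector<int>& values) {
    int lo = values.front();
    int hi = values.front();
    for (int v : values) {
        if (v < lo) lo = v;
        if (v > hi) hi = v;
    }
    return std::make_tuple(lo, hi);
}

// Style 2: write the results through reference parameters,
// as one would in Fortran or Pascal. Also assumes non-empty input.
void min_max_ref(const std::vector<int>& values, int& lo, int& hi) {
    lo = values.front();
    hi = values.front();
    for (int v : values) {
        if (v < lo) lo = v;
        if (v > hi) hi = v;
    }
}

// Small usage demo: both styles should agree on the same data.
void demo() {
    const std::vector<int> data = {3, -1, 4, 1, 5};

    int lo = 0, hi = 0;
    std::tie(lo, hi) = min_max_tuple(data);  // unpack, like `lo, hi = f(x)`
    assert(lo == -1 && hi == 5);

    int lo2 = 0, hi2 = 0;
    min_max_ref(data, lo2, hi2);
    assert(lo2 == lo && hi2 == hi);
}
```

The tuple version keeps inputs and outputs visibly separate at the call site, while the reference version avoids building a temporary; which reads better is partly a matter of taste.
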
>> >> >>
>> >> >>> On the flip side, cython looked pretty...but I didn't get the
>> >> >>> performance gains I wanted, and had to spend a lot of time figuring
>> >> >>> out if it was cython, needing to add types, buggy support for
>> >> >>> numpy,
>> >> >>> or actually the algorithm.
>> >> >>
>> >> >> At the time, was the numpy support buggy?  I personally haven't had
>> >> >> many problems with Cython and numpy.
>> >> >>
>> >> >
>> >> > It's not that the support WAS buggy, it's that it wasn't clear to me
>> >> > what was going on and where my performance bottleneck was. Even after
>> >> > microbenchmarking with ipython, using timeit and prun, and using the
>> >> > cython code visualization tool. Ultimately I don't think it was
>> >> > cython, so perhaps my comment was a bit unfair. But it was
>> >> > unfortunately difficult to verify that. Of course, as you say,
>> >> > diagnosing and solving such issues would become easier with more
>> >> > cython experience.
>> >> >
>> >> >>> The C files generated by cython were
>> >> >>> enormous and difficult to read. They really weren't meant for human
>> >> >>> consumption.
>> >> >>
>> >> >> Yes, it takes some practice to get used to what Cython will do, and
>> >> >> how to optimize the output.
>> >> >>
>> >> >>> As Sturla has said, regardless of the quality of the
>> >> >>> current product, it isn't stable.
>> >> >>
>> >> >> I've personally found it more or less rock solid.  Could you say
>> >> >> what
>> >> >> you mean by "it isn't stable"?
>> >> >>
>> >> >
>> >> > I just meant what Sturla said, nothing more:
>> >> >
>> >> > "Cython is still 0.16, it is still unfinished. We cannot base NumPy
>> >> > on
>> >> > an unfinished compiler."
>> >>
>> >> Y'all mean, it has a zero at the beginning of the version number and
>> >> it is still adding new features?  Yes, that is correct, but it seems
>> >> more reasonable to me to phrase that as 'active development' rather
>> >> than 'unstable', because they take considerable care to be backwards
>> >> compatible, have a large automated Cython test suite, and a major
>> >> stress-tester in the Sage test suite.
>> >>
>> >
>> > Matthew,
>> >
>> > No one in their right mind would build a large performance library using
>> > Cython, it just isn't the right tool. For what it was designed for -
>> > wrapping existing c code or writing small and simple things close to
>> > Python
>> > - it does very well, but it was never designed for making core C/C++
>> > libraries and in that role it just gets in the way.
>>
>> I believe the proposal is to refactor the lowest levels in pure C and
>> move some or most of the library superstructure to Cython.
>
>
> Go for it.

The proposal of moving to a core C + cython has been discussed by
multiple contributors. It is certainly a valid proposal. *I* have
worked on this (npymath, separate compilation), although certainly not
as much as I would have wanted to. I think much can be done in that
vein. Using the "shut up if you don't do it" argument is a straw man
(and uncalled for).

Moving away from subjective considerations on how to do things, is
there a way that one can see the pros/cons of each approach? For the
C++ approach, I would really like to see which C++ is being
considered. I was. Once the choice is done, going back would be quite
hard, so I can't see how we could go for it just because some people
prefer it without very clear technical arguments.

Saying that C++ is more readable, or scales better, is frankly a very
weak argument and too subjective to be convincing. There are too many
projects way more complex than numpy that have been done in either C
or C++.

David
