[SciPy-dev] SciPy Foundation

David Cournapeau david at ar.media.kyoto-u.ac.jp
Tue Aug 4 04:35:02 EDT 2009


Sebastian Walter wrote:
> 2 cents from an outsider who thought about contributing to
> scipy/scikits (but didn't (yet)):
>
> I think it is a good idea to make scipy easy to use for beginners.
> However, after reading this thread, I have the impression that it is
> not the goal to provide state of the art algorithms but rather making
> Scipy as popular as possible by putting money and effort into the
> "marketing" of Scipy.
> Don't get me wrong, I think there are some good reasons why a project
> should strive for a large user base. Some of the best projects are
> popular.
> Alas, correlation does not imply causality.
>
> I, for instance, would rather see more effort put into getting state of
> the art algorithms implemented in Scipy, because that's something
> that would make a real difference in my research work. Of course,
> targeting the "clueless Matlab" users is quite pointless if that is
> what you are after.
>   

One point which has not been mentioned concerning matlab-like
environments - maybe it is obvious and everyone implicitly acknowledges
it - is that Mathworks is a 30-year-old company, with > 1000 people today.

Building something like matlab, with a good GUI and top-notch
documentation, takes a huge amount of resources, of which the 'useful'
code is only a fraction. I of course don't know the details of matlab's
implementation, but I know that for music-oriented software (which needs
a good UI to sell well, and has non-trivial computational requirements,
so the comparison is not totally stupid), the graphical code is 80% of
the code. This ratio is consistent with the big open source audio
applications as well (ardour, rosegarden). Worse, being cross-platform
makes the problem much more difficult. In the music software market, mac
os x is rarely ignored (~ 40-50% of the market I believe), so people
need to support two platforms, and that's really a lot of work. For
scientific software, I think you can go the non-native route for the
graphical toolkit, though.

Also, very little open source software is successful as far as good GUIs
are concerned (I don't want to enter into a debate here, but there are
good documents/studies on this topic). You need financial incentive for
this, so only projects backed by big companies have managed to pull it
off.

IOW, I am pretty pessimistic about being a 'matlab' clone. We should
rather shoot for what makes numpy/scipy better (extensibility, cross
platform, actual language, etc...), because really, matlab will always
be a much better matlab than us. Price and licensing are not good enough
to justify migration - if what you want is a free matlab clone, why not
use octave or scilab anyway?

That does NOT mean that we should not aim at making the software more
accessible. I, and I guess other developers, are definitely interested
in a more product-like, integrated stack, to lower the barrier to entry.
I for example am really tired of the installation problems
consistently reported. I feel like we cover mac os x and windows pretty
well now, but the linux situation is still dreadful. I have a few ideas
on how to improve the situation, but they all require quite a bit of
work/infrastructure. I hope that soon, the scenario "I see this cool
python script on the internet, it requires this numpy/scipy thing, can I
try it in 2 minutes?" will be a reality.

> Then you really get some "killer applications". I could name a few
> people who are coding some cool state of the art algorithms but waste
> so much time because they started coding directly in C++. In the
> meantime, they could have implemented the algorithms in Python _and_
> in C++. If scipy had something really good that Matlab etc. do not
> have: guess what ppl would do....
>   

Yes, there are a lot of people who still don't know that there are
languages outside Fortran, C and C++. In my field, I still see some
people who implement parsers in C...

> 1) an easy, modular and flexible build system (fortran, c, c++, D,
> swig, boost:python, cython,...)
>   

You mean like numscons :) ? Adding D support to numscons should be easy.
For example, I added initial cython support in a couple of minutes
during the cython talk at SciPy08; adding new languages is relatively
easy thanks to scons.

> 2) very low entry barrier for possible contributors:
>   a simple checkout, then  ./manage.py startapp  mycoolmodule
>   and everything is ready to go ( "Start coding in 5 minutes!")
>   

There are various pieces to enable this (in-place build, the develop
command of setuptools, virtualenv/pip/easy_install), but yes, the
situation is kind of messy. For scikits, that's not so difficult - you
should be able to implement a trivial scikit by copying the
scikits.example package and starting from there.
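
The "copy the example package and start from there" step could be
scripted along the following lines. This is only a rough sketch, not an
existing scikits tool: the function name start_scikit, the assumed
layout (a scikits/<name> directory inside the template) and the simple
text replacement are all hypothetical and would need adjusting to the
real example package.

```python
import shutil
from pathlib import Path


def start_scikit(template_dir, target_dir, old_name, new_name):
    """Copy a template scikit and rename its inner package.

    Hypothetical helper: assumes the template contains a
    scikits/<old_name> package directory.
    """
    template_dir = Path(template_dir)
    target_dir = Path(target_dir)

    # copytree raises if target_dir already exists, so existing work
    # is never overwritten silently.
    shutil.copytree(template_dir, target_dir)

    old_pkg = target_dir / "scikits" / old_name
    new_pkg = target_dir / "scikits" / new_name
    if old_pkg.is_dir():
        old_pkg.rename(new_pkg)

    # Crude fix-up of dotted references such as "scikits.example" in
    # Python sources (setup.py, imports). Other file types are left alone.
    old_ref = "scikits." + old_name
    new_ref = "scikits." + new_name
    for path in target_dir.rglob("*.py"):
        text = path.read_text()
        if old_ref in text:
            path.write_text(text.replace(old_ref, new_ref))

    return new_pkg
```

Calling start_scikit("example", "mycoolmodule", "example",
"mycoolmodule") would then give a renamed copy to start coding in; the
build and test steps are still up to the usual tools.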

One problem is that it is technically impossible to build in place and
test in one go because of a nose limitation ATM (for some reason, nose
fails to import a package if it is in the current directory).

> 3) a distributed version control system (e.g. git). SVN really scares me off...
>   

That's a sensitive issue; I think we should avoid starting this one here
:) That said, you can use git-svn - several core developers use it
for numpy/scipy dev, and we distribute an official import:

http://projects.scipy.org/numpy/browse_git

At least I have not touched svn for numpy/scipy development for > 6
months now, except to check releases when I tag them.

> 4) standardized unit tests
>   

What do you mean exactly here? We use nose for testing; what do you
consider "non standard"?
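
For reference, a nose-style test is just a module-level function whose
name starts with test_ and which uses plain assert statements; nose
collects such functions automatically. A minimal sketch follows, in
which moving_average is a made-up function used only for illustration
and is not part of numpy or scipy:

```python
def moving_average(values, window):
    """Simple moving average over a list (hypothetical example function)."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / float(window)
            for i in range(len(values) - window + 1)]


def test_moving_average_basic():
    # Windows are [1, 2], [2, 3] and [3, 4].
    assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]


def test_moving_average_rejects_bad_window():
    # A zero-length window makes no sense and must raise.
    try:
        moving_average([1, 2, 3], 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Saved in a file whose name starts with test_, these functions are
picked up by running nosetests in the containing directory.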

> 5) automated documentation generation
>   

It is almost automated now - but an example for scikits is missing in
the example package :)
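
Until such an example exists, here is a rough idea of what it might
look like, assuming the documentation is generated from docstrings and
that the docstrings follow the NumPy section layout (Parameters,
Returns, Examples). The function clip_to_unit is invented for
illustration and is not taken from scikits.example.

```python
import inspect


def clip_to_unit(x):
    """Clip a number to the interval [0, 1].

    Parameters
    ----------
    x : float
        Value to clip.

    Returns
    -------
    float
        ``x`` limited to the range [0, 1].

    Examples
    --------
    >>> clip_to_unit(1.5)
    1.0
    >>> clip_to_unit(-0.2)
    0.0
    """
    return min(max(x, 0.0), 1.0)


# Documentation tools work from the cleaned-up docstring text, which is
# what inspect.getdoc returns; the first line serves as the summary.
summary = inspect.getdoc(clip_to_unit).splitlines()[0]
```

The Examples section doubles as a doctest, so the documented behaviour
can be checked mechanically.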

cheers,

David


