About alternatives to Matlab

Mark Morss mfmorss at aep.com
Tue Dec 5 10:37:06 EST 2006


Hans >Langtangen<, rather.
Mark Morss wrote:
> I doubt that anyone would dispute that even as boosted by Numpy/Scipy,
> Python will almost certainly be notably slower than moderately
> well-written code in a compiled language.  The reason Numpy exists,
> however, is not to deliver the best possible speed, but to deliver
> enough speed to make it possible to solve large numerical problems with
> the powerful and flexible Python language.  As observed by Hans
> Latangen in _Python Scripting for Computational Science_, 2nd ed.,
> Springer 2005, scientific computing is more than number crunching:
>
> "Very often programming is about shuffling data in and out of different
> tools, converting one data format to another, extracting numerical data
> from a text, and administering numerical experiments involving a large
> number of data files and directories.  Such tasks are much faster to
> accomplish in a language like Python than in Fortran, C, C++ or Java."
>
> He might well have mentioned the importance of developing nice-looking
> reports once the analysis is complete, and that development is simpler
> and more flexible in Python than a compiled language.
>
> In principle, I agree that heavy-duty number-crunching, at least if it
> has to be repeated again and again, should be accomplished by means of
> a compiled language.  However, if someone has to solve many different
> problems just one time, or just a few times (for example if you are an
> engineering consultant), there is an excellent argument for using
> Python + Numpy.  Unless it affects feasibility, I opine, computational
> speed is important primarily in context of regular production, e.g.,
> computing the daily "value at risk" in a commodity trading portfolio,
> or making daily weather predictions.
>
> I am aware of the power and flexibility of the OCaml language, and
> particularly that an OCaml user can easily switch back and forth
> between interpreted and compiled implementation.  I'm attacted to OCaml
> and, indeed, I'm in the process of reading Smith's (unfortunately not
> very well-written) _Practical OCaml_.  However, I also understand that
> OCaml supports only double-precision implementation of real numbers;
> that its implementation of arrays is a little clunky compared to
> Fortran 95 or Numpy (and I suspect not as fast as Fortran's); and that
> the available libraries, while powerful, are by no means as
> comprehensive as those available for Python.  For example, I am unaware
> of the existance of an HDF5 interface for OCaml.
>
> In summary, I think that OCaml is an excellent language, but I think
> that the question of whether to use it in preference to Python+Numpy
> for general-purpose numerical analysis must rest on much more than its
> computational speed.
>
> Jon Harrop wrote:
> > Filip Wasilewski wrote:
> > > Besides of that this code is irrelevant to the original one and your
> > > further conclusions may not be perfectly correct. Please learn first
> > > about the topic of your benchmark and different variants of wavelet
> > > transform, namely difference between lifting scheme and dwt, and try
> > > posting some relevant examples or use other topic for your benchmarks.
> >
> > Your lifting implementation is slightly longer and slower than the
> > non-lifting one but its characteristics are identical. For n=2^25 I get:
> >
> > 1.88s C++ (816 non-whitespace bytes)
> > 2.00s OCaml (741 b)
> > 2.33s F# (741 b)
> > 9.83s Your Python (1,002 b)
> >
> > The original python was 789 bytes.
> >
> > > Neglecting this issues, I wonder if it is equally easy to juggle
> > > arbitrary multidimensional arrays in a statically typed language and do
> > > operations on selected axis directly from the interactive interpreter
> > > like in the NumPy example from my other post -
> > > http://groups.google.com/group/comp.lang.python/msg/a71a5bf00a88ad57 ?
> > > I don't know much OCaml so it would be great if you could elaborate on
> > > this.
> >
> > It is probably just as easy. Instead of dynamic typing you have parametric
> > polymorphism. If you want to make your function generic over arithmetic
> > type then you can pass in the arithmetic operators.
> >
> > >> 0.56s C++ (direct arrays)
> > >> 0.61s F# (direct arrays)
> > >> 0.62s OCaml (direct arrays)
> > >> 1.38s OCaml (slices)
> > >> 2.38s Python (slices)
> > >> 10s Mathematica 5.1
> > >>
> > >> Note that all implementations are safe (e.g. C++ uses a.at(i) instead of
> > >> a[i]).
> > >
> > > So why not use C++ instead of all others if the speed really matters?
> > > What is your point here?
> >
> > 1. Benchmarks should not just include two inappropriate languages.
> > 2. Speed aside, the other languages are more concise.
> >
> > > Could you please share full benchmark code, so we could make
> > > conclusions too?
> >
> > I'll paste the whole programs at the end...
> >
> > > I would like to get to know about your benchmark
> > > methodology. I wonder if you are allocating the input data really
> > > dynamically or whether it's size is a priori knowledge for the
> > > compiler.
> >
> > The knowledge is there for the compiler to use but I don't believe any of
> > them exploit it.
> >
> > >> In this specific context (discrete wavelet transform benchmark), I'd have
> > >> said that speed was the most important thing after correctness.
> > >
> > > Not necessarily, if you are doing research with different kinds of
> > > wavelets and need a general, flexible and easily adaptable tool or just
> > > the computation speed is not considered a bottleneck.
> >
> > Brevity is probably next.
> >
> > > Language speed is a great advantage, but if it always was the most
> > > important, people would talk directly to the CPUs or knit DSP chips in
> > > theirs backyards whenever they need to solve a calculation problem.
> >
> > Sure.
> >
> > > Please don't make this discussion heading towards `what is the fastest
> > > way of doing d4 transform and why OCaml rules` instead of `what is the
> > > optimal way of doing things under given circumstances and still have a
> > > free afternoon ;-)`.
> >
> > Comments like that seem to be based on the fundamental misconception that
> > writing C++ like this is hard.
> >
> > > Different tasks need different, well-balanced
> > > measures. Neither Python nor OCalm nor any other language is a panacea
> > > for every single problem.
> >
> > Absolutely. OCaml is as good as the next (compiled) language in this case.
> > Python and Matlab seem to be particularly bad at this task.
> >
> > Here's my complete OCaml:
> >
> > let a = sqrt 3. and b = 4. *. sqrt 2.
> > let h0, h1, h2, h3 =
> >   (1. +. a) /. b, (3. +. a) /. b, (3. -. a) /. b, (1. -. a) /. b
> > let g0, g1, g2, g3 = h3, -.h2, h1, -.h0
> >
> > let rec d4_aux tmp a n =
> >   let n2 = n lsr 1 in
> >   for i=0 to n2-2 do
> >     tmp.(i) <- a.(i*2)*.h0+.a.(i*2+1)*.h1+.a.(i*2+2)*.h2+.a.(i*2+3)*.h3;
> >     tmp.(i+n2) <- a.(i*2)*.g0+.a.(i*2+1)*.g1+.a.(i*2+2)*.g2+.a.(i*2+3)*.g3;
> >   done;
> >   tmp.(n2-1) <- a.(n-2)*.h0 +. a.(n-1)*.h1 +. a.(0)*.h2 +. a.(1)*.h3;
> >   tmp.(2*n2-1) <- a.(n-2)*.g0 +. a.(n-1)*.g1 +. a.(0)*.g2 +. a.(1)*.g3;
> >   Array.blit tmp 0 a 0 n;
> >   if n > 4 then d4_aux tmp a (n lsr 1)
> >
> > let d4 a =
> >   let tmp = Array.make (Array.length a) 0. in
> >   d4_aux tmp a (Array.length a)
> >
> > let () =
> >   print_endline "Generating test data...";
> >   let x = Array.init (1 lsl 25) (fun _ -> Random.float 1.) in
> >   print_endline "Benchmarking...";
> >   let t1 = Sys.time() in
> >   ignore(d4 x);
> >   Printf.printf "Elapsed time is %.6f seconds\n" (Sys.time() -. t1)
> >
> > and C++:
> >
> > #include <iostream>
> > #include <vector>
> > #include <ctime>
> > #include <cmath>
> >
> > using namespace std;
> >
> > double a = sqrt(3), b = 4 * sqrt(2);
> > double h0 = (1 + a) / b, h1 = (3 + a) / b, h2 = (3 - a) / b, h3 = (1 - a) /
> > b;
> > double g0 = h3, g1 = -h2, g2 = h1, g3 = -h0;
> >
> > void d4(vector<double> &a) {
> >   vector<double> tmp(a.size());
> >   for (int n = a.size(); n >= 4; n >>= 1) {
> >     int half = n >> 1;
> >
> >     for (int i=0; i<n-2; i+=2) {
> >       tmp.at(i/2) = a.at(i)*h0 + a.at(i+1)*h1 + a.at(i+2)*h2 + a.at(i+3)*h3;
> >       tmp.at(i/2+half) = a.at(i)*h3 - a.at(i+1)*h2 + a.at(i+2)*h1 -
> > a.at(i+3)*h0
> > ;
> >     }
> >
> >     tmp.at(half-1) = a.at(n-2)*h0 + a.at(n-1)*h1 + a.at(0)*h2 + a.at(1)*h3;
> >     tmp.at(2*half-1) = a.at(n-2)*h3 - a.at(n-1)*h2 + a.at(0)*h1 -
> > a.at(1)*h0;
> >
> >     copy(tmp.begin(), tmp.begin() + n, a.begin());
> >   }
> > }
> >
> > int main() {
> >   cout << "Generating data...\n";
> >   vector<double> a(1 << 25);
> >   for (int i=0; i<a.size(); ++i) a.at(i) = rand();
> >   cout << "Benchmarking...\n";
> >   double t1 = clock();
> >   d4(a);
> >   cout << "Elapsed time " << (clock() - t1) / CLOCKS_PER_SEC << "s\n";
> > }
> >
> > --
> > Dr Jon D Harrop, Flying Frog Consultancy
> > Objective CAML for Scientists
> > http://www.ffconsultancy.com/products/ocaml_for_scientists/index.html?usenet




More information about the Python-list mailing list