why () is () and [] is [] work in other way?

Fri Apr 27 02:21:24 EDT 2012

Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
> I'm seeing code generated by the Haskell GHC compiler being 2-4 times 
> slower than code from the C gcc compiler, and on average using 2-3 times 
> as much memory (and as much as 7 times).

Alioth isn't such a great comparison, because on the one hand you get
very carefully tuned, unidiomatic code for each language; but on the
other, you're somewhat constrained by the benchmark specs.  Obviously C
is not much above assembler, so you can write almost-optimal programs if
you code close enough to the metal and suffer enough.  If you're talking
about coding reasonably straightforwardly, C usually does beat Haskell
(once you've debugged the core dumps...) but there are exceptions to
that.

> Feel free to find your own set of benchmarks that show the opposite. I'd 
> be interested to see under what conditions Haskell might be faster than C.

Haskell wasn't included in this multi-way comparison, but Ocaml beat C
by a significant factor at a straightforward vector arithmetic loop,
because it didn't have to pessimize around possible pointer aliasing:

  http://scienceblogs.com/goodmath/2006/11/the_c_is_efficient_language_fa.php

GHC should be able to do similar things.

Also, here's a sort of cheating Haskell example: the straightforward
Haskell Fibonacci code is slower than C, but just sprinkle in a few
parallelism keywords and run it on your quad core cpu:

  http://donsbot.wordpress.com/2007/11/29/use-those-extra-cores-and-beat-c-today

Note the Haskell code in that example is using arbitrary-precision
integers while C is using int64's.  Yes, you could beat the GHC speed by
writing a lot more C code to manage Posix threads, locks, etc., but in
Haskell two do two things in parallel you can just say "par".

There is also work going on to support parallel listcomps (just like
regular ones but they run on multiple cores), and vector combinators
that offload the computation to a GPU.  Those things are quite hard to
do in plain C, though there are some specialty libraries for it.

Finally, a less-cheating example (this is from 2007 and I think things
are even better now):

http://neilmitchell.blogspot.com/2007/07/making-haskell-faster-than-c.html

Gives a Haskell word count program

   main = print . length . words =<< getContents

which could also be written (if the syntax looks better to you):

   main = do
     text <- getContents
     print (length (words text))

The comparison C code is:

    int main() {
     int i = 0;
     int c, last_space = 1, this_space;
     while ((c = getchar()) != EOF) {
      this_space = isspace(c);
      if (last_space && !this_space)
       i++;
      last_space = this_space;
     }
     printf("%i\n", i);
     return 0;
    }

and GHC/Supero beats the C code by about 10% even though both use
getchar.  The blog post explains, you could speed up the C code by
writing a rather contorted version, unrolling it into two separate
loops, one for sequences of spaces and one for non-spaces, and jumping
back and forth between the loops instead of using the last_space
variable.  That is basically the code that Supero figures out how to
generate: two separate loops with transitions in the right places,
starting from very straightforward high-level input.

I'm not really good at Haskell even after fooling with it on and off for
several years now, and it certainly can't beat Python for ease-of-use
without a lot of experience.  But in the hands of experts it is
incredibly powerful.  It makes Python seem almost like a toy.