gfortran/g77+f2py vs gcc+Cython speed comparison
Hi,

I needed to write a 2D Ising model simulation for school, and I decided to compare the possible ways of doing it. So I of course wrote it in Python, then rewrote it in Fortran + f2py, and also in Cython. Which is better? Read below. :) But for the impatient: I am going to use Cython, for the reasons below.

CCing to Cython, numpy (f2py is discussed there), and sage-devel (there are people there who could be interested in these kinds of comparisons).

The code is available at:

http://hg.sharesource.org/isingmodel/

How to play with that - just do this (after installing Mercurial):

$ hg clone http://hg.sharesource.org/isingmodel/
[...]
$ cd isingmodel
$ hg up db7dd01cdc26  # just to be sure that we are talking about the same revision / code
$ make
[...]
$ time python simulate.py
[...]
real 0m2.026s
user 0m1.988s
sys 0m0.020s

This runs Cython code. Then apply this patch to run fortran code instead:

$ hg di
diff -r db7dd01cdc26 simulate.py
--- a/simulate.py Sun Dec 23 02:23:30 2007 +0100
+++ b/simulate.py Sun Dec 23 02:24:33 2007 +0100
@@ -31,8 +31,8 @@ def MC(mu = 1, temp = 2, dim = 20, steps
     J=1 #coupling constant
     k=1 #Boltzman constant
-    #from mcising import mc
-    from pyising import mc
+    from mcising import mc
+    #from pyising import mc
     B = D1(A)
     mc(B, dim, steps, temp, H, mu, J, k)
     return D2(B)

And then again (and apply the patch below, otherwise it might not work for you):

$ time python simulate.py
[...]
real 0m3.600s
user 0m3.528s
sys 0m0.052s

So it's a lot slower. We are comparing many things here - wrappers, my fortran coding skills vs Cython C code generation, and gcc vs gfortran. So I wrote to the numpy mailing list. First Travis (the author of numpy) suggested:

"
My experience with similar kinds of comparisons is that gnu fortran compilers are not very good, especially on 2-d problems. Try using a different fortran compiler to see if speeds improve.
"

Then Pearu (the author of f2py) suggested:

"
Though the problem is 2D, your implementations are essentially 1D.
If you would treat the array A as a 2D array (and avoid calling subroutine p) then you would gain some 7% speed up in Fortran.

When using -DF2PY_REPORT_ATEXIT for f2py then a summary of timings will be printed out about how much time was spent in Fortran code and how much in the interface. In the given case I get (nsteps=50000):

Overall time spent in ...
(a) wrapped (Fortran/C) functions : 1962 msec
(b) f2py interface, 60 calls : 0 msec
(c) call-back (Python) functions : 0 msec
(d) f2py call-back interface, 0 calls : 0 msec
(e) wrapped (Fortran/C) functions (acctual) : 1962 msec

that is, most of the time is spent in Fortran function and no time in wrapper. The conclusion is that the cause of the difference in timings is not in f2py or cpython generated interfaces but in Fortran and C codes and/or compilers.

Some idiom used in Fortran code is just slower than in C. For example, in C code you are doing calculations using float precision but in Fortran you are forcing double precision.

HTH,
Pearu

PS: Here follows a setup.py file that I used to build the extension modules instead of the Makefile:

#file: setup.py
def configuration(parent_package='',top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('',parent_package,top_path)
    config.add_extension('mcising',
                         sources=['mcising.f'],
                         define_macros = [('F2PY_REPORT_ATEXIT',1)]
                         )
    #config.add_extension('pyising', sources=['pyising.pyx'])
    return config
from numpy.distutils.core import setup
setup(configuration = configuration)
"

and then quickly added

"
When using g77 compiler instead of gfortran, I get a speed up 4.8 times.

Btw, a line in a if statement of the fortran code should read `A(p(i,j,N)) = - A(p(i,j,N))`.
"

(btw I have no idea how it could work for me without the A(p(i,j,N)) = - A(p(i,j,N)) fix, quite embarrassing)

So then we discussed it on #sympy IRC:

* Now talking on #sympy
<ondrej> hi pearu, thanks a lot for testing it!
<ondrej> 4.8 speedup, jesus christ.
so the gfortran sucks a lot
<pearu> hey ondrej
* ondrej is trying g77
<pearu> gfortran has advantage of being Fortran 90 compiler
<ondrej> g77 is deprecated in Debian and dead upstream (imho)
<pearu> yes but I guess those guys who are maintaining g77/gfortran, those do not crunch numbers much;)
<pearu> g77 has a long development history and is well optimized
<ondrej> I fear the fortran is a bad investment for new projects
<ondrej> I know g77 is well optimized, but it's dead
<pearu> gfortran is a write from scratch and is too young
<ondrej> do you think many people still write fortran? I use it just because g77 was faster than gcc
<pearu> g77 is certainly not dead, scientists use it a lot because of its speed
<ondrej> btw `A(p(i,j,N)) = - A(p(i,j,N))`.
<ondrej> means my fortran code was broken
<ondrej> doesn't it?
<pearu> some think that fortran is a dead language, some use it a lot because lots of code is written in fortran over several decades
<ondrej> yes
<pearu> yes, it was, I got segfaults because of that
<ondrej> I don't know what to think myself
<pearu> it depends on application
<pearu> if it is a research app then use fortran because of speed
<ondrej> but as you can see, you need to use g77
<ondrej> and not gfortran
<ondrej> (but all Debian is moving to use gfortran)
<pearu> I use gfortran only for f90 code
<ondrej> hm
<pearu> I think Debian is wrong in short term, may be in future gfortran will be faster, but not at the moment
<ondrej> hm, that's bad
<ondrej> but is someone developing g77?
<pearu> no, I don't think so
<ondrej> debian moved to gfortran, because there are problems with g77 being unmaintained
<pearu> but it is a complete Fortran 77 compiler and a very good one
<ondrej> hm.
I think when one wants speed, one should use some commercial compilers
<pearu> probably because new architectures are developed and g77 is not updated to use their features
<ondrej> and when one wants robustness, one should use a free compiler that is maintained (gfortran)
<pearu> commercial compilers does not mean fast compilers, in general
<ondrej> I didn't try intel, so I don't know how it compares to g77
<pearu> using intel restricts one to be on Intel platform
<ondrej> the g77 doesn't support -fdefault-real-8 for example
<pearu> AMD is faster in floating point that most scientists use
<pearu> why is this flag important, one can always write real*8
<ondrej> indeed, g77 is 1.8s and gcc 2.0s on my comp
<ondrej> as to real*8 - I thought a good strategy is to write real and let the compiler choose the precision
<pearu> try to use float in fortran as you do in C
<ondrej> ok
<pearu> this will mess up f2py-ing as f2py assumes that real is real*4
<ondrej> yes -
<ondrej> real*4 is float, isn't it?
<pearu> yes
<ondrej> so I'll just use real*4 everywhere
<pearu> yep
<ondrej> ok
<ondrej> btw, I was writing a python script for automatically converting real*4 to real*8 and back
<ondrej> but it's freaking hard to parse fortran
<ondrej> (I needed to also work with "real", etc)
<ondrej> I only managed it to work in simple cases
<pearu> just do re.sub
<ondrej> it's not that easy
<ondrej> there are constructs like real(something)
<ondrej> that shouldn't be converted
<ondrej> but real T should be converted
<ondrej> etc.
<ondrej> so using real*4 with g77 slowed the code down from 1.8s to 1.9s
<pearu> re.sub should be able to handle these, use cb argument, for instance
<ondrej> ok
<pearu> you are using random numbers, could this make also a difference
<ondrej> I think it does
<ondrej> (I had to use glibc fast random to speed C up)
<pearu> a, ok, that also explains things
<ondrej> shit, the gfortran is really slow
<ondrej> 3.5s on my comp
<ondrej> no matter which precision
<pearu> try to disable random to see if loops are faster in C or fortran
<ondrej> ok
<ondrej> so gfortran is now 1.3s
<ondrej> and g77 1.04s
* ondrej is trying C
<ondrej> C is 1.05s
<pearu> so gfortran random is slow
<ondrej> yes
<ondrej> g77 random is fast
<ondrej> C random is slower than g77
<ondrej> it depends on the quality of the random generator as well
<ondrej> there are huge differences in that
<pearu> and so gfortran itself is not too bad
<pearu> :)
<ondrej> but as you can see - there is no point for me to use g77 if I can achieve the same speed with C
<ondrej> and gfortran is slower than both
<pearu> yep
<ondrej> ok, going to write this to the list
<pearu> can you try ifc
<ondrej> I'll also send it to the Cython guys, they'll like the result. :)
<ondrej> ifc = intel fortran compiler?
<pearu> yes
<ondrej> I like opensource and Debian
<ondrej> I think this clearly shows, that fortran is dead for me. I think gcc is pretty good too.
<pearu> note that these results are not thanks to pyrex /cpython but thanks to compilers:)
<pearu> i like to be fair;)
<ondrej> (yes, it's just compilers comparison, not f2py vs Cython/pyrex)
<pearu> yep
<ondrej> but there are two ways a user can choose - either cython+gcc, or f2py+fortran
<pearu> next time we should probably have this discussion in scipy irc
<ondrej> and if he wants to use the most recent default free compilers, at least I will choose cython+gcc
<pearu> there are more ways, weave, etc etc
<ondrej> I am going to paste this to my email about that
<pearu> ok
<ondrej> (if you are ok with that)
<pearu> ok with me

So, what do you think of that?

Ondrej
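On the real*4 / real*8 conversion from the IRC log above: Pearu's suggestion of re.sub with a callback might look roughly like the sketch below. The function name set_real_kind and the regular expression are illustrative assumptions, not code from the isingmodel repository. The sketch works on raw text, so it would also rewrite the word "real" inside Fortran comments or string literals, and it deliberately leaves other kinds such as real*16 untouched.

```python
import re

# "real", optionally followed by "*4" or "*8".  The trailing negative
# lookahead rejects "real(" (the conversion intrinsic) and "real*<other>".
_REAL_DECL = re.compile(
    r"\b(?P<kw>real)\b(?:\s*\*\s*(?:4|8)\b)?(?!\s*[(*])",
    re.IGNORECASE,
)


def set_real_kind(source, kind):
    """Rewrite real / real*4 / real*8 type keywords to real*<kind>."""
    def replace(match):
        # The callback keeps the keyword exactly as written (real vs REAL).
        return "%s*%d" % (match.group("kw"), kind)

    return _REAL_DECL.sub(replace, source)


if __name__ == "__main__":
    fortran = (
        "      real T\n"
        "      real*4 A(100)\n"
        "      x = real(n)\n"
    )
    print(set_real_kind(fortran, 8))
```

Declarations such as `real T` and `real*4 A(100)` are rewritten, while the call `real(n)` is left alone, which is the distinction raised in the chat.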
On Dec 23, 2007 12:57 PM, Ondrej Certik <ondrej@certik.cz> wrote:
Hi,
I needed to write a 2D Ising model simulation for school, and I decided to compare the possible ways of doing it. So I of course wrote it in Python, then rewrote it in Fortran + f2py, and also in Cython. Which is better? Read below. :) But for the impatient: I am going to use Cython, for the reasons below.
I recently tried a similar experiment with the opposite result :)
So it's a lot slower.
This is very surprising indeed. I have just tried to run your code but can't get the fortran generated module to work. But I wonder how you got such a difference, because in my recent comparisons I generally got about 0-2 % faster (i.e. negligible) results with gfortran compared to gcc and a much bigger difference (in favour of fortran) with the Intel compilers. In fact for simple tests gcc and gfortran produce almost identical assembler code (use the -S switch to check). For the more mature compilers (i.e. Intel) fortran is faster as it is inherently easier to optimize (easier data dependency analysis and vectorisation). The main reason I can think of is the gfortran version you are using and the optimisation switches. Any version 4.0.x is seriously suboptimal (first released version). Big gains were made from 4.0-4.1 and 4.1-4.2. I would suggest 4.2 as the minimum version to use. The development version 4.3 is faster still (also for gcc) because of recent vectorisation work.
When using g77 compiler instead of gfortran, I get a speed up 4.8 times.
The difference between g77 and gfortran with version 4.3 is now very small, plus gfortran is going to benefit from all the new vectorisation and optimisation work in gcc. It will not be long before it overtakes for good. In addition you can use modern fortran 95/2003 code which makes life much easier! In addition to all that, it is very actively maintained and destined to be the default compiler on all linux systems as it is bundled with gcc.
So, what do you think of that?
If you are using a modern gfortran version then I suggest you have found a bug, in which case I would take it to the gfortran mailing list. If not, then try version 4.3 which can be downloaded from the gfortran wiki: gcc.gnu.org/wiki/GFortran

In the meantime, after Christmas, I'll try and get your code working and see for myself the difference.

Season's greetings,
John
--
Telephone: (+44) (0) 7739 105209
Hi John, thanks for the response.
This is very surprising indeed. I have just tried to run your code but can't get the fortran generated module to work. But I wonder how you
Yes, I really didn't try to make it robust, I just made it work for me on Debian unstable.
got such a difference, because in my recent comparisons I generally got about 0-2 % faster (i.e. negligible) results with gfortran compared to gcc and a much bigger difference (in favour of fortran) with the Intel compilers.
In fact for simple tests gcc and gfortran produce almost identical assembler code (use the -S switch to check). For the more mature compilers (i.e. Intel) fortran is faster as it is inherently easier to optimize (easier data dependency analysis and vectorisation).
The main reason I can think of is the gfortran version you are using and the optimisation switches.
Any version 4.0.x is seriously suboptimal (first released version). Big gains were made from 4.0-4.1 and 4.1-4.2. I would suggest 4.2 as the minimum version to use. The development version 4.3 is faster still (also for gcc) because of recent vectorisation work.
I use:

$ gfortran --version
GNU Fortran (GCC) 4.2.3 20071123 (prerelease) (Debian 4.2.2-4)
Copyright (C) 2007 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING
When using g77 compiler instead of gfortran, I get a speed up 4.8 times.
The difference between g77 and gfortran with version 4.3 is now very small, plus gfortran is going to benefit from all the new vectorisation and optimisation work in gcc. It will not be long before it overtakes for good. In addition you can use modern fortran 95/2003 code which makes life much easier! In addition to all that, it is very actively maintained and destined to be the default compiler on all linux systems as it is bundled with gcc.
Yes, I also think gfortran is the way to go, since g77 is dead upstream. But currently, the g77 still gives quite a lot faster code than gfortran, but I hope it will change.
So, what do you think of that?
If you are using a modern gfortran version then I suggest you have found a bug, in which case I would take it to the gfortran mailing list. If not, then try version 4.3 which can be downloaded from the gfortran wiki: gcc.gnu.org/wiki/GFortran In the meantime, after Christmas, I'll try and get your code working and see for myself the difference.
So your suggestion is to prepare a clean fortran example that, when compiled with gfortran, runs a lot slower than if compiled with g77? Yep, I can do that.

So what is your personal opinion about fortran vs C for scientific computing? From the social point of view, I much prefer C, since a lot more people know it, but as you say, the good fortran compilers are a lot better than the C compilers. So I don't know. Probably depends on the project and the code I want to reuse.

Ondrej
On Dec 24, 2007 1:29 PM, Ondrej Certik <ondrej@certik.cz> wrote:
Hi John,
When using g77 compiler instead of gfortran, I get a speed up 4.8 times.
The difference between g77 and gfortran with version 4.3 is now very small, plus gfortran is going to benefit from all the new vectorisation and optimisation work in gcc. It will not be long before it overtakes for good. In addition you can use modern fortran 95/2003 code which makes life much easier! In addition to all that, it is very actively maintained and destined to be the default compiler on all linux systems as it is bundled with gcc.
Yes, I also think gfortran is the way to go, since g77 is dead upstream. But currently, the g77 still gives quite a lot faster code than gfortran, but I hope it will change.
Yes, I just proved myself wrong (not a rare occurrence...), on my slow laptop:

With gcc (4.3.0 20071223 (experimental)):
real 0m3.272s
user 0m3.120s
sys 0m0.044s

With gfortran (4.3.0 20071223 (experimental)):
real 0m4.294s
user 0m3.976s
sys 0m0.020s

With g77 (3.4.6):
real 0m2.928s
user 0m2.812s
sys 0m0.044s

If g77=1.0 then gcc=1.11 and gfortran=1.41 in user time (far from 4.8 times though!!)

This is in contrast to my recent experiments between gfortran and gcc, though that code used f95 stuff extensively so was never tested against g77.
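The ratios relative to g77 can be recomputed from the user times quoted above. This is a minimal sketch; the dictionary and function names are illustrative and not part of the thread. With these inputs the user-time ratios come out at roughly 1.11 for gcc and 1.41 for gfortran.

```python
# User-mode times in seconds, copied from the three runs quoted above.
user_seconds = {"g77": 2.812, "gcc": 3.120, "gfortran": 3.976}


def relative_to(baseline, timings):
    """Return each timing divided by the baseline compiler's timing."""
    base = timings[baseline]
    return {name: seconds / base for name, seconds in timings.items()}


ratios = relative_to("g77", user_seconds)

# Print the compilers from fastest to slowest.
for name in sorted(ratios, key=ratios.get):
    print("%-9s %.2f" % (name, ratios[name]))
```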
So your suggestion is to prepare a clean fortran example, that, when compiled with gfortran, runs a lot slower, than if compiled with g77? Yep, I can do that.
The most interesting will be to compile with g77 and gfortran with the options to spit out the generated assembler code, then we can see exactly where the difference is. I'll dig back in the gfortran mailing lists and see previous conclusions of the g77 vs gfortran debate (which occur often). You may just be exercising a particularly unoptimized part of gfortran.
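One way to script the assembler comparison suggested above is sketched below. It assumes the compilers accept the usual GCC driver flags (-S stops after generating assembly, -o names the output file); the -O2 level, the output file naming, and both function names are assumptions, since the thread does not show the flags used by the Makefile.

```python
import shutil
import subprocess


def assembler_command(compiler, source, opt_level="-O2"):
    """Build the argv that asks a GCC-family driver to emit assembly only."""
    stem = source.rsplit(".", 1)[0]
    output = "%s.%s.s" % (stem, compiler)  # e.g. mcising.gfortran.s
    return [compiler, opt_level, "-S", source, "-o", output]


def dump_assembler(source, compilers=("g77", "gfortran")):
    """Run every installed compiler on source; return the .s files written."""
    written = []
    for compiler in compilers:
        if shutil.which(compiler) is None:
            continue  # this compiler is not installed on this machine
        command = assembler_command(compiler, source)
        subprocess.run(command, check=True)
        written.append(command[-1])
    return written


if __name__ == "__main__":
    # Only print the commands here; call dump_assembler("mcising.f")
    # from the repository directory to actually run them.
    for name in ("g77", "gfortran"):
        print(" ".join(assembler_command(name, "mcising.f")))
```

The two resulting .s files can then be compared with an ordinary diff tool to see where the generated code diverges.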
So what is your personal opinion about fortran vs C for scientific computing? From the social point of view, I much prefer C, since a lot more people know it, but as you say, the good fortran compilers are a lot better than the C compilers. So I don't know. Probably depends on the project and the code I want to reuse.
If you haven't guessed I'm pro fortran! My path was C++ -> Matlab -> Scipy (which I still use for prototyping and short calculations) -> fortran for big simulations. Of course, the wonderful f2py helps combining the last two steps! The turning point came when I discovered modern object orientated fortran features and particularly the array intrinsics/operations in fortran 95/2003. I realised that I could code my stuff in almost as few lines as Matlab and Scipy and also have the fastest execution (if using commercial compilers). My hope (slightly set back today) is that gfortran will mature into being as fast/faster than gcc/g77 for most code. In that case, with gcc on every linux system, then the social opposition should disappear. Though this still will not help against people who think f77 is fortran, ignoring the last 30 years of development. Anyway, I really must go home now... John
participants (2)
- John Travers
- Ondrej Certik