python speed

Donn Cave donn at
Wed Nov 30 23:19:41 CET 2005

In article <86sltdn6ma.fsf at>, Mike Meyer <mwm at> wrote:

> "Harald Armin Massa" <haraldarminmassa at> writes:
> >>Faster than assembly? LOL... :)
> > why not? Of course, a simple script like "copy 200 bytes from left to
> > right" can be handoptimized in assembler and run at optimum speed.
> > Maybe there is even a special processor command to do that.
> Chances are, version 1 of the system doesn't have the command. Version
> 2 does, but it's no better than the obvious hand-coded loop. Version 3
> finally makes it faster than the hand-coded loop, if you assume you
> have the instruction. If you have to test to see if you can use it,
> the hand-coded version is equally fast. Version 4 makes it faster even
> if you do the test, so you want to use it if you can. Of course, by
> then there'll be a *different* command that can do the same thing, and
> is faster in some conditions.
> Dealing with this in assembler is a PITA. If you're generating code on
> the fly, you generate the correct version for the CPU you're running
> on, and that's that. It'll run at least as fast as hand-coded
> assembler on every CPU, and faster on some.
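
To make the quoted point concrete, here is a rough sketch of picking the
implementation once at run time instead of hard-coding one version. It is
only an illustration in plain Python, not real code generation: the
feature name "block_copy" and the functions copy_loop, copy_block and
select_copy are all made up for the example, and the slice assignment
merely stands in for a hypothetical special copy instruction.

```python
def copy_loop(src, dst, n):
    """Portable fallback: copy n bytes one at a time."""
    for i in range(n):
        dst[i] = src[i]


def copy_block(src, dst, n):
    """Stand-in for a special 'copy' instruction: one slice assignment."""
    dst[:n] = src[:n]


def select_copy(cpu_features):
    """Choose a copy routine once, from a hypothetical set of feature names."""
    if "block_copy" in cpu_features:
        return copy_block
    return copy_loop


# "Copy 200 bytes from left to right" on two imaginary CPUs.
src = bytearray(range(200))
dst_old = bytearray(200)
dst_new = bytearray(200)

select_copy(set())(src, dst_old, 200)            # CPU without the instruction
select_copy({"block_copy"})(src, dst_new, 200)   # CPU with it
```

The caller never tests for the feature inside the copy itself; the choice
is made once up front, which is the property the quoted text attributes
to generating code for the CPU you are actually running on.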

Actually I think the post you quote went on to make a similar point.

I read yesterday morning in the paper that the Goto Basic Linear
Algebra Subroutines, by a Mr. Kazushige Goto, are still the most
efficient library of functions for their purpose for use in
supercomputing applications.  Apparently hand-optimized assembler
for specific processors.
(actually from the NY Times, apparently)

   Donn Cave, donn at
