DecInt's division algorithm is also completely general. But I would never claim that Python code is faster than assembler. I believe that a careful implementation of a good algorithm matters more than the raw speed of the language or the efficiency of the compiler. Python makes it easy to implement algorithms.