negative modular exponentiation
Hello,

SF patch 936813 (fast modular exponentiation) introduces some optimization of exponentiation with longs, including a new feature: in pow(x,y,z), the exponent y is allowed to be negative when it makes mathematical sense. The result of (x ** y) % z is then a number that is the inverse of (x ** -y) % z, i.e. multiplying the two gives 1 modulo z. This is a very convenient feature if you are doing modulo arithmetic. Quoting Trevor Perrin, the patch's author:

    Libraries like GMP and LibTomMath work the same way. Being able to take inverses mod a number is useful for cryptography (e.g. RSA, DSA, and Elgamal).

As this is a new feature it should be approved of. If it is, I'd volunteer to review the patch and make the necessary extension to the plain integer's pow() so that they work the same way. E.g.

>>> pow(2, -1, 9)
5

because (5*2) % 9 == 1. Currently this gives a TypeError (why not a ValueError?) so changing the behaviour seems safe.

Armin
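For readers who want this inverse on a Python without the feature, here is a minimal sketch using the extended Euclidean algorithm; the function name modinv is mine, not part of the patch.

```python
# What pow(x, -1, z) computes: the modular inverse of x mod z,
# found here with the extended Euclidean algorithm.

def modinv(x, z):
    """Return a such that (a * x) % z == 1, or raise ValueError."""
    # Invariant: old_r == old_s * x (mod z); we stop when old_r is the gcd.
    old_r, r = x % z, z
    old_s, s = 1, 0
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("base is not invertible for the given modulus")
    return old_s % z

print(modinv(2, 9))             # 5, since (5 * 2) % 9 == 1
print((modinv(2, 9) * 2) % 9)   # 1
```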
This seems of little practical value (especially since real crypto wizards only complain that Python's longs are so much slower than gmp), but I can't see how it could possibly break something, so a +0 from me. (Hm, it could break the expectation that pow(x, y, z), when defined, equals x**y%z. At the very least the docs need to be updated.)

Guido van Rossum (home page: http://www.python.org/~guido/)
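Guido's caveat can be seen directly in modern Python (3.8+, where three-argument pow() did eventually gain this feature): with a negative exponent, x ** y % z falls back to float arithmetic, so the identity with three-argument pow() no longer holds.

```python
# In Python 3.8+, three-argument pow() accepts a negative exponent,
# but the identity pow(x, y, z) == x ** y % z breaks for negative y:
print(pow(2, -1, 9))   # 5   (modular inverse of 2 mod 9)
print(2 ** -1 % 9)     # 0.5 (float arithmetic, not modular)
```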
[Armin Rigo]
SF patch 936813 (fast modular exponentiation) introduces some optimization of exponentiation with longs, including a new feature: in pow(x,y,z), the exponent y is allowed to be negative when it makes mathematical sense. The result of (x ** y) % z is then a number that is the inverse of (x ** -y) % z, i.e. multiplying the two gives 1 modulo z.
...
As this is a new feature it should be approved of.
It's fine by me, and new features are finer by me than adding ever more obscure optimizations in longobject.c. The Python long implementation has a great track record wrt portability and lack of bugs, and gonzo optimizations work against that. For example, did Python really need the Karatsuba multiplication gimmick? I added the patch for it after a lot of reworking, but overall I'm not sure it was a net win; and the current patch in question boosts the smallest integer at which Karatsuba triggers to such a large value that anyone reaching it would probably be much better off using a GMP wrapper anyway.
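As a sketch of the optimization Tim is referring to: Karatsuba multiplication replaces four half-size multiplies with three. The cutoff and bit-length splitting below are illustrative only; CPython's longobject.c works on arrays of digits with a tuned threshold.

```python
# A minimal Karatsuba multiplication sketch for non-negative ints.
# The cutoff is illustrative, not CPython's actual value.

CUTOFF_BITS = 64  # below this, fall back to the naive product

def karatsuba(x, y):
    if x.bit_length() <= CUTOFF_BITS or y.bit_length() <= CUTOFF_BITS:
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> n, x & ((1 << n) - 1)  # split x into high/low halves
    yh, yl = y >> n, y & ((1 << n) - 1)
    a = karatsuba(xh, yh)                # high * high
    b = karatsuba(xl, yl)                # low * low
    c = karatsuba(xh + xl, yh + yl)      # cross terms via ONE extra multiply
    # x*y = a*2^(2n) + (c - a - b)*2^n + b  -- three multiplies, not four
    return (a << (2 * n)) + ((c - a - b) << n) + b

x, y = 3 ** 300, 7 ** 250
assert karatsuba(x, y) == x * y
```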
If it is, I'd volunteer to review the patch
One thing I noticed a few times in the patch was neglecting to check C calls for error returns. Even _PyLong_NumBits() can fail!
and make the necessary extension to the plain integer's pow() so that they work the same way.
E.g.
>>> pow(2, -1, 9)
5
because (5*2)%9 == 1. Currently this gives a TypeError (why not a ValueError?) so changing the behaviour seems safe.
ValueError would make more sense to me too. I expect it was TypeError for compatibility with older exceptions raised by the float version.
Hi Armin, Guido, Tim, everyone,

Thanks for looking at this patch! I've submitted a few other crypto-related patches too. They're all waiting eagerly for good, loving people to adopt them :)... They've all got basically the same motivation. If I explain that, it might help people decide whether they're worth it or not:

Rationale for some crypto patches in 2.4

Almost all crypto protocols require the same fast or OS-dependent primitives: modular exponentiation, hashing, ciphers, and entropy-harvesting. Everything else can be done in Python. To get these primitives in Python you could install M2Crypto (needing OpenSSL and SWIG) or pycrypto (needing GMP). But that complicates software distribution. So I was hoping the few missing primitives could be added to 2.4, enabling fast and portable pure-Python crypto. I've submitted the below patches. The first 2 aren't that important:

#923643, long <-> bytestring: Convenience methods so longs are easier to read/write in network protocols:

    n = long(someString, 256)  # str -> long
    s = n.tostring()           # long -> str

#935454, sha256 module: Since SHA-256 is preferred over SHA-1 for certain things these days.

#934711, platform-specific entropy: An os.urandom() function providing the platform equivalent of /dev/urandom (CryptGenRandom on Windows, presumably SecureRandom on Java).

#936813, faster modular exponentiation...

The only thing left is block ciphers (AES, maybe 3DES). Those would be easy (just copy from pycrypto) but I think this is courting enough controversy at the moment..

Anyways, about modular exponentiation: On crypto-sized numbers (1000-8000 bits) on a P4, Python modexp is ~6-12x slower than GMP. With this patch it's only ~3-4x slower, which is on par with another C library I tested (LibTomCrypt). Modexp pretty much determines how long an SSL or IPsec handshake takes, or signing/decrypting a PGP message. These often take a sizeable fraction of a second, so they're worth speeding up.

[Tim]
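Trevor's long <-> bytestring spelling above is from his patch and never landed in that form; the same conversions exist in modern Python as int.to_bytes() / int.from_bytes() (since 3.2). A sketch of the equivalent:

```python
# The long <-> bytestring conversions proposed in #923643, spelled with
# the int.to_bytes()/int.from_bytes() methods modern Python provides.

n = 0x1234567890ABCDEF

# int -> bytes: length is the minimal number of whole bytes
s = n.to_bytes((n.bit_length() + 7) // 8, "big")
print(s)  # b'\x124Vx\x90\xab\xcd\xef'

# bytes -> int: the round trip recovers the original value
assert int.from_bytes(s, "big") == n
```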
The Python long implementation has a great track record wrt portability and lack of bugs, and gonzo optimizations work against that.
Good points. The speedup is worth it to me, but it certainly adds complexity (though these are boring, straightforward optimizations that everyone does, and are described in lots of books. I ain't good enough to do gonzo :)
For example, did Python really need the Karatsuba multiplication gimmick? I added the patch for it after a lot of reworking, but overall I'm not sure it was a net win;
It seems a good speedup to me, particularly if the cutoff is increased: on 4000-8000 bit numbers it gives around 10-30% faster multiplication, and on bigger numbers it's wildly faster.
and the current patch in question boosts the smallest integer at which Karatsuba triggers to such a large value that anyone reaching it would probably be much better off using a GMP wrapper anyway.
You'd be better off using GMP at pretty much any value. But if you want to widely distribute a tool or product that uses crypto, it's a pain for every user to have to do that.
One thing I noticed a few times in the patch was neglecting to check C calls for error returns. Even _PyLong_NumBits() can fail!
Darn, thanks. The 2 problems I see are in the long <-> bytestring code (which I left in the longobject.c patch), not the modexp code, which I went over more carefully. I'll fix and resubmit the long <-> bytestring patch..

Trevor
participants (4)

- Armin Rigo
- Guido van Rossum
- Tim Peters
- Trevor Perrin