On Saturday, 13 August 2011 18:32:58 Antonio Cuni wrote:
> On 12/08/11 17:49, David Naylor wrote:
>> Would it not be a simple matter of changing the __(get|set)state method to
>> use a tuple or even an int(long)?
>
> yes, I think it should be enough. I'm going on vacation soon and I won't have
> a look at it right now, so if anybody wants to work on it, he's very welcome
> (hint, hint :-)).

See attached for my naive attempt (I did not run any unit tests on the code). It provides between 4.5x and 13.4x improvement in hash speed. If method 1 is acceptable I could implement it properly.

If you look at the __hash__ method for datetime you will notice three return statements. The performance of those statements is as follows, based on:

    import datetime
    # bench: timing helper module, not shown here

    @bench.bench
    def hashdate():
        res = 0
        for i in range(10000000):
            now = datetime.datetime(i // 10000 + 1, (i % 10000) % 12 + 1, (i % 100) % 28 + 1)
            res ^= hash(now)
        return res

    hashdate()

Method 1 (direct integer compute):
    hashdate: 0.70 seconds
Method 2 (hash of __getstate()):
    hashdate: 2.39 seconds
Method 3 (unity):
    hashdate: 0.68 seconds
Method 4 (original):
    hashdate: 10.93 seconds (python: 12.60 seconds)

And back to my original "benchmark" with the change of `key = i`:

# python iforkey.py
ifdict: [2.8676719665527344, 2.872897148132324, 2.8396730422973633]
keydict: [2.3266799449920654, 2.3431849479675293, 2.3421859741210938]
defaultdict: [3.706634044647217, 3.6940698623657227, 3.7520179748535156]

# pypy iforkey.py (original)
ifdict: [29.201794147491455, 29.047310829162598, 29.34461998939514]
keydict: [14.939809083938599, 15.250468015670776, 15.542209148406982]
defaultdict: [15.11891484260559, 15.064191102981567, 14.94817304611206]

# pypy iforkey (method 1)
ifdict: [7.455403804779053, 7.376722097396851, 7.447360038757324]
keydict: [3.9056499004364014, 3.833178997039795, 3.8482401371002197]
defaultdict: [3.9568910598754883, 3.8757669925689697, 3.88435697555542]

# pypy iforkey.py (method 2)
ifdict: [11.993246078491211, 11.865861892700195, 11.916783094406128]
keydict: [6.141685962677002, 6.092236042022705, 6.082683086395264]
defaultdict: [6.376708030700684, 6.337490081787109, 6.361854791641235]

So, it appears pypy is failing to speed up this contrived example...
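
To make the difference between method 1 and method 2 concrete, here is a rough sketch of the two ideas. It is not the attached patch: the names pack_fields, state_tuple and FastHashDatetime are made up for illustration, and it assumes naive datetimes only (no tzinfo), since aware datetimes compare by UTC offset and would need extra handling.

```python
import datetime


def pack_fields(dt):
    """Sketch of 'method 1': combine the fields arithmetically into one integer.

    Assumes a naive datetime; the result is the number of microseconds
    since the (proleptic) ordinal day 0, so distinct values never collide.
    """
    days = dt.toordinal()
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second
    return (days * 86400 + seconds) * 1000000 + dt.microsecond


def state_tuple(dt):
    """Sketch of the tuple-based state suggested in the quoted question."""
    return (dt.year, dt.month, dt.day,
            dt.hour, dt.minute, dt.second, dt.microsecond)


class FastHashDatetime(datetime.datetime):
    """Hypothetical subclass that hashes the packed integer directly."""

    def __hash__(self):
        # Hashing a single int avoids building an intermediate state object.
        return hash(pack_fields(self))


a = FastHashDatetime(2011, 8, 13, 18, 32, 58)
b = FastHashDatetime(2011, 8, 13, 18, 32, 58)
lookup = {a: "found"}
print(lookup[b])  # equal datetimes must land in the same dict slot
```

The tuple variant would simply use hash(state_tuple(self)) instead; the timings above suggest the extra object construction is what makes that path slower, although this sketch does not measure anything itself.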