[pypy-dev] Python vs pypy: interesting performance difference [dict.setdefault]

David Naylor naylor.b.david at gmail.com
Sat Aug 13 20:14:20 CEST 2011


On Saturday, 13 August 2011 18:32:58 Antonio Cuni wrote:
> On 12/08/11 17:49, David Naylor wrote:
> > Would it not be a simple matter of changing the __(get|set)state method
> > to use a tuple or even an int(long)?
> 
> yes, I think it should be enough. I'm going on vacation soon and I won't
> have a look at it right now, so if anybody wants to work on it, he's very
> welcome (hint, hint :-)).

See attached for my naive attempt (I did not run any unit tests on the code).  It provides between a 4.5x and 13.4x improvement in hash speed.
If method 1 is acceptable, I could implement it properly.

If you look at the __hash__ method for datetime you will notice three return statements.  The performance of each of those statements is as follows,
based on:

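import datetime
# bench: timing helper module providing the bench decorator (not shown)

@bench.bench
def hashdate():
    res = 0
    for i in range(10000000):
        # build a valid date: year >= 1, month in 1..12, day in 1..28
        now = datetime.datetime(i // 10000 + 1, (i % 10000) % 12 + 1, (i % 100) % 28 + 1)
        res ^= hash(now)
    return res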

hashdate()

Method 1 (direct integer compute):
hashdate: 0.70 seconds

Method 2 (hash of __getstate()):
hashdate: 2.39 seconds

Method 3 (unity):
hashdate: 0.68 seconds

Method 4 (original):
hashdate: 10.93 seconds (python: 12.60 seconds)
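
To illustrate the difference between methods 1 and 2, here is a rough sketch of the two ideas. It is not the code from datetime.diff: the helper names (state_tuple, pack_fields, hash_direct, hash_via_state) are made up for this example, it only handles naive datetimes, and the real patch may pack the fields differently.

```python
import datetime

def state_tuple(dt):
    # Hypothetical stand-in for a tuple-based __getstate: the fields that
    # identify a naive datetime.
    return (dt.year, dt.month, dt.day,
            dt.hour, dt.minute, dt.second, dt.microsecond)

def hash_via_state(dt):
    # "Method 2" idea: build a state object, then hash it.
    return hash(state_tuple(dt))

def pack_fields(dt):
    # "Method 1" idea: fold the fields into one integer with plain
    # arithmetic (mixed radix), allocating no intermediate object.
    days = (dt.year * 12 + (dt.month - 1)) * 31 + (dt.day - 1)
    seconds = (dt.hour * 60 + dt.minute) * 60 + dt.second
    return (days * 86400 + seconds) * 1000000 + dt.microsecond

def hash_direct(dt):
    return hash(pack_fields(dt))

a = datetime.datetime(2011, 8, 13, 20, 14, 20)
b = datetime.datetime(2011, 8, 13, 20, 14, 20)
c = datetime.datetime(2011, 8, 14)

# Equal values must hash equal under either scheme.
print(hash_direct(a) == hash_direct(b), hash_via_state(a) == hash_via_state(b))
# Distinct values pack to distinct integers.
print(pack_fields(a) != pack_fields(c))
```

The direct version only does integer arithmetic on attributes, which is presumably the kind of code the JIT handles well; the state-based version has to build a tuple first.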

And back to my original "benchmark" with the change of `key = i`:

# python iforkey.py
ifdict: [2.8676719665527344, 2.872897148132324, 2.8396730422973633]
keydict: [2.3266799449920654, 2.3431849479675293, 2.3421859741210938]
defaultdict: [3.706634044647217, 3.6940698623657227, 3.7520179748535156]

# pypy iforkey.py (original)
ifdict: [29.201794147491455, 29.047310829162598, 29.34461998939514]
keydict: [14.939809083938599, 15.250468015670776, 15.542209148406982]
defaultdict: [15.11891484260559, 15.064191102981567, 14.94817304611206]

# pypy iforkey.py (method 1)
ifdict: [7.455403804779053, 7.376722097396851, 7.447360038757324]
keydict: [3.9056499004364014, 3.833178997039795, 3.8482401371002197]
defaultdict: [3.9568910598754883, 3.8757669925689697, 3.88435697555542]

# pypy iforkey.py (method 2)
ifdict: [11.993246078491211, 11.865861892700195, 11.916783094406128]
keydict: [6.141685962677002, 6.092236042022705, 6.082683086395264]
defaultdict: [6.376708030700684, 6.337490081787109, 6.361854791641235]

So, it appears pypy is failing to speed up this contrived example...
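
For anyone who wants the ratios rather than the raw numbers, a small script along these lines computes them from the timings above. The values are copied from the runs listed in this message, rounded to three decimals; taking the best (minimum) of the three runs is my choice here, not something the benchmark itself does.

```python
# Timings in seconds, copied from the runs above and rounded to 3 decimals.
timings = {
    "python": {
        "ifdict": [2.868, 2.873, 2.840],
        "keydict": [2.327, 2.343, 2.342],
        "defaultdict": [3.707, 3.694, 3.752],
    },
    "pypy original": {
        "ifdict": [29.202, 29.047, 29.345],
        "keydict": [14.940, 15.250, 15.542],
        "defaultdict": [15.119, 15.064, 14.948],
    },
    "pypy method 1": {
        "ifdict": [7.455, 7.377, 7.447],
        "keydict": [3.906, 3.833, 3.848],
        "defaultdict": [3.957, 3.876, 3.884],
    },
    "pypy method 2": {
        "ifdict": [11.993, 11.866, 11.917],
        "keydict": [6.142, 6.092, 6.083],
        "defaultdict": [6.377, 6.337, 6.362],
    },
}

def ratio(slower, faster, test):
    # How many times longer `slower` takes than `faster`, best run of each.
    return min(timings[slower][test]) / min(timings[faster][test])

for test in ("ifdict", "keydict", "defaultdict"):
    print("%-12s method 1 vs original: %.2fx faster, vs python: %.2fx slower" % (
        test,
        ratio("pypy original", "pypy method 1", test),
        ratio("pypy method 1", "python", test),
    ))
```

Method 1 is a large improvement over the original pypy numbers, but every pypy variant still takes longer than python on this benchmark.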
-------------- next part --------------
A non-text attachment was scrubbed...
Name: datetime.diff
Type: text/x-patch
Size: 5791 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/pypy-dev/attachments/20110813/ed6fff4c/attachment.bin>

