Python vs pypy: interesting performance difference [dict.setdefault]
Hi,

I needed to create a cache of date and time objects and I wondered what was the best way to handle the cache. For comparison I put together the following test:

<file="iforkey.py">
import datetime
import random
import timeit

ranges = [datetime.datetime(2011, 1, random.randint(1, 31)) for i in xrange(1000)]

def ifdict():
    cache = {}
    b = []
    for i in ranges:
        key = i.day
        if key in cache:
            b.append(cache[key])
        else:
            date = i.date()
            cache[key] = date
            b.append(date)

def keydict():
    cache = {}
    b = []
    for i in ranges:
        key = i.day
        try:
            b.append(cache[key])
        except KeyError:
            date = i.date()
            cache[key] = date
            b.append(date)

def defaultdict():
    cache = {}
    b = []
    for i in ranges:
        b.append(cache.setdefault(i, i.date()))

print "ifdict:", timeit.repeat("ifdict()", "from __main__ import ifdict", number=10000)
print "keydict:", timeit.repeat("keydict()", "from __main__ import keydict", number=10000)
print "defaultdict:", timeit.repeat("defaultdict()", "from __main__ import defaultdict", number=10000)
</file>

# python iforkey.py
ifdict: [2.432887077331543, 2.4002890586853027, 2.397233009338379]
keydict: [2.3483030796051025, 2.358638048171997, 2.314802885055542]
defaultdict: [3.5384328365325928, 3.5329859256744385, 3.5728111267089844]

# pypy iforkey.py (pypy 1.5)
ifdict: [0.8129069805145264, 0.74648118019104, 0.7432689666748047]
keydict: [0.5187451839447021, 0.4662129878997803, 0.4504108428955078]
defaultdict: [37.98510789871216, 37.859113931655884, 37.92770600318909]

PyPy displays a significant slowdown in the defaultdict function, but otherwise shows its usual speedup. To check the cause I replaced i.date() with i.day and found no major difference in times. It appears dict.setdefault (or its interaction with the JIT) is causing the slowdown.

Regards
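One subtlety in the defaultdict variant above is independent of PyPy entirely: setdefault is an ordinary method call, so its default argument is evaluated on every iteration, even when the key is already in the cache. A small sketch (Python 3, with a hypothetical counting helper standing in for i.date()) makes this visible:

```python
import datetime

calls = {"n": 0}

def make_date(dt):
    # Hypothetical counting helper standing in for i.date().
    calls["n"] += 1
    return dt.date()

cache = {}
dt = datetime.datetime(2011, 1, 15)
for _ in range(3):
    # The default argument is evaluated before setdefault runs,
    # so make_date is called on every iteration, cached or not.
    cache.setdefault(dt.day, make_date(dt))
```

After the loop, make_date has run three times even though the cache holds a single entry.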
Hello David, On 10/08/11 21:27, David Naylor wrote:
> Hi,
>
> I needed to create a cache of date and time objects and I wondered what was the best way to handle the cache. For comparison I put together the following test:
>
> [cut]
>
> Pypy displays significant slowdown in the defaultdict function, otherwise displays its usual speedup. To check what is the cause I replaced i.date() with i.day and found no major difference in times. It appears dict.setdefault (or its interaction with the JIT) is causing a slowdown.
I don't think that setdefault is the culprit here, as shown by this benchmark:

@bench.bench
def setdef():
    d = {}
    for i in range(10000000):
        d.setdefault(i, i)
    return d

@bench.bench
def tryexcept():
    d = {}
    for i in range(10000000):
        try:
            d[i]
        except KeyError:
            d[i] = i
    return d

setdef()
tryexcept()

$ python dictbench.py
setdef: 2.03 seconds
tryexcept: 8.54 seconds

$ pypy-c dictbench.py
setdef: 1.31 seconds
tryexcept: 1.37 seconds

As you can see, in PyPy there is almost no difference between using a try/except or using setdefault. What is very slow on PyPy seems to be hashing datetime objects:

import datetime

@bench.bench
def hashdate():
    res = 0
    for i in range(100000):
        now = datetime.datetime.now()
        res ^= hash(now)
    return res

hashdate()

$ pypy-c dictbench.py
hashdate: 0.83 seconds
$ python dictbench.py
hashdate: 0.22 seconds

I had a quick look at the code (which in PyPy is written at applevel) and it does a lot of nonsense. In particular, __hash__ calls __getstate, which formats a dynamically created string just to call hash() on it. I suppose that this code can (and should) be optimized a lot. I may try to look at it, but it's unclear when, since I'm about to go on vacation.

ciao,
Anto
On 12/08/11 14:51, Antonio Cuni wrote:
> @bench.bench
for reference, here is the implementation of the "bench" decorator: https://bitbucket.org/antocuni/env/src/1b11491fab79/pypath/bench.py
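For readers who don't want to fetch the linked file, a minimal stand-in with the same shape is easy to write (this is an assumption about the decorator's behaviour, not the linked implementation):

```python
import functools
import time

def bench(func):
    # Minimal stand-in: time one call and print the duration,
    # in the "name: N.NN seconds" style seen in the thread.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print("%s: %.2f seconds" % (func.__name__, time.time() - start))
        return result
    return wrapper

@bench
def count():
    return sum(range(1000))
```

Calling count() returns its usual result and prints a line like "count: 0.00 seconds".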
On Friday, 12 August 2011 14:51:36 Antonio Cuni wrote:
> Hello David,
>
> On 10/08/11 21:27, David Naylor wrote:
>> Hi,
>> I needed to create a cache of date and time objects and I wondered what was the best way to handle the cache. For comparison I put together the following test:
>> [cut]
>> Pypy displays significant slowdown in the defaultdict function, otherwise displays its usual speedup. To check what is the cause I replaced i.date() with i.day and found no major difference in times. It appears dict.setdefault (or its interaction with the JIT) is causing a slowdown.
>
> I don't think that setdefault is the culprit here, as shown by this benchmark:
I made a mistake in the script: I intended to use the day number as the key (as in the other tests). Changing the critical line to "b.append(cache.setdefault(i.day, i.date()))" gives results comparable to python (factoring in the speedup).

<snip/>
> as you can see, in PyPy there is almost no difference between using a try/except or using setdefault.
I would argue that with an expensive value operation the try/except will be much faster than setdefault (where cache hits are above 0). Using my fixed script I get:

keydict: [0.3298788070678711, 0.28450703620910645, 0.28931379318237305]
defaultdict: [0.47180604934692383, 0.4183311462402344, 0.4172670841217041]

indicating setdefault is about 40% slower (1.4 times slower), which I attribute to the cost of i.date().
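This point can be made concrete without timing anything: count how often the expensive call actually runs under each strategy (a small Python 3 sketch; the expensive() counter is a hypothetical stand-in for i.date()):

```python
import datetime

counts = {"setdefault": 0, "tryexcept": 0}

def expensive(dt, which):
    # Hypothetical stand-in for i.date(), tallying how often it runs.
    counts[which] += 1
    return dt.date()

# 100 datetimes spread over only 5 distinct days.
dts = [datetime.datetime(2011, 1, (i % 5) + 1) for i in range(100)]

cache = {}
for dt in dts:
    # setdefault builds the value on every iteration, hit or miss.
    cache.setdefault(dt.day, expensive(dt, "setdefault"))

cache = {}
for dt in dts:
    try:
        cache[dt.day]
    except KeyError:
        # try/except pays for the value only on the 5 misses.
        cache[dt.day] = expensive(dt, "tryexcept")
```

With a 95% hit rate the try/except version performs 5 expensive calls against setdefault's 100, which is where the 40% gap in the timings above comes from.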
> I had a quick look at the code (which in PyPy is written at applevel) and it does a lot of nonsense. In particular, __hash__ calls __getstate which formats a dynamically created string, just to call hash() on it. I suppose that this code can (and should) be optimized a lot. I may try to look at it but it's unclear when, since I'm about to go on vacation.
Would it not be a simple matter of changing the __(get|set)state method to use a tuple or even an int(long)?
On 12/08/11 17:49, David Naylor wrote:
> Would it not be a simple matter of changing the __(get|set)state method to use a tuple or even an int(long)?
yes, I think it should be enough. I'm going on vacation soon and I won't have a look at it right now, so if anybody wants to work on it, he's very welcome (hint, hint :-)).

ciao,
Anto
On Saturday, 13 August 2011 18:32:58 Antonio Cuni wrote:
> On 12/08/11 17:49, David Naylor wrote:
>> Would it not be a simple matter of changing the __(get|set)state method to use a tuple or even an int(long)?
>
> yes, I think it should be enough. I'm going on vacation soon and I won't have a look at it right now, so if anybody wants to work on it, he's very welcome (hint, hint :-)).
See attached for my naive attempt (I have not run any unit tests on the code). It provides between a 4.5x and 13.4x improvement in hash speed. If method 1 is acceptable I could implement it properly.

If you look at the __hash__ method for datetime you will notice three return statements. The performance of those statements is as follows, based on:

@bench.bench
def hashdate():
    res = 0
    for i in range(10000000):
        now = datetime.datetime(i // 10000 + 1, (i % 10000) % 12 + 1, (i % 100) % 28 + 1)
        res ^= hash(now)
    return res

hashdate()

Method 1 (direct integer compute): hashdate: 0.70 seconds
Method 2 (hash of __getstate()): hashdate: 2.39 seconds
Method 3 (unity): hashdate: 0.68 seconds
Method 4 (original): hashdate: 10.93 seconds (python: 12.60 seconds)

And back to my original "benchmark" with the change of `key = i`:

# python iforkey.py
ifdict: [2.8676719665527344, 2.872897148132324, 2.8396730422973633]
keydict: [2.3266799449920654, 2.3431849479675293, 2.3421859741210938]
defaultdict: [3.706634044647217, 3.6940698623657227, 3.7520179748535156]

# pypy iforkey.py (original)
ifdict: [29.201794147491455, 29.047310829162598, 29.34461998939514]
keydict: [14.939809083938599, 15.250468015670776, 15.542209148406982]
defaultdict: [15.11891484260559, 15.064191102981567, 14.94817304611206]

# pypy iforkey.py (method 1)
ifdict: [7.455403804779053, 7.376722097396851, 7.447360038757324]
keydict: [3.9056499004364014, 3.833178997039795, 3.8482401371002197]
defaultdict: [3.9568910598754883, 3.8757669925689697, 3.88435697555542]

# pypy iforkey.py (method 2)
ifdict: [11.993246078491211, 11.865861892700195, 11.916783094406128]
keydict: [6.141685962677002, 6.092236042022705, 6.082683086395264]
defaultdict: [6.376708030700684, 6.337490081787109, 6.361854791641235]

So, it appears pypy is failing to speed up this contrived example...
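The attachment itself doesn't survive in the archive, but a "direct integer compute" hash in the spirit of method 1 might look like the following (a hypothetical sketch, not David's actual patch): pack the fields into a single integer so that equal datetimes trivially hash equal.

```python
def direct_hash(year, month, day, hour=0, minute=0, second=0, microsecond=0):
    # Hypothetical sketch of a "method 1"-style hash: pack the fields
    # into one integer. The packing is injective over the valid field
    # ranges (month 1..12, day 1..31, etc.), so equal field values give
    # equal hashes by construction, and distinct dates give distinct
    # packed values.
    packed = year
    packed = packed * 13 + month          # month in 1..12
    packed = packed * 32 + day            # day in 1..31
    packed = packed * 24 + hour
    packed = packed * 60 + minute
    packed = packed * 60 + second
    packed = packed * 1000000 + microsecond
    return hash(packed)
```

Hashing the packed integer rather than a formatted string is what removes the applevel string-building overhead that Antonio identified.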
Hi David, On Sat, Aug 13, 2011 at 8:14 PM, David Naylor <naylor.b.david@gmail.com> wrote:
> So, it appears pypy is failing to speed up this contrived example...
I think that it is expected, because the hash is computed entirely as pure Python code in the case of PyPy (doing integer arithmetic with overflow checks) whereas it is written in C in the case of CPython. I find it rather good that we get overall close to the speed of CPython, given that we just have datetime.py... Try using datetime.py on top of CPython too :-)

More seriously, if we want to be really better with datetime objects, as opposed to "good enough for now", then I fear we need to rewrite datetime.py in RPython, like CPython did rewrite it in C. Or some intermediate solution is possible too: having an internal class which implements only basic operations (like hash) and which is subclassed in pure Python. But I would happily leave that task to someone with an absolutely real need for high-performance datetime objects; for me "good enough for now" is good enough, for now...

A bientôt,

Armin.
On Mon, Aug 15, 2011 at 2:05 AM, Armin Rigo <arigo@tunes.org> wrote:
> Hi David,
>
> On Sat, Aug 13, 2011 at 8:14 PM, David Naylor <naylor.b.david@gmail.com> wrote:
>> So, it appears pypy is failing to speed up this contrived example...
>
> I think that it is expected, because the hash is computed entirely as pure Python code in the case of PyPy (doing integer arithmetic with overflow checks) whereas it is written in C in the case of CPython. I find it rather good that we get overall close to the speed of CPython, given that we just have datetime.py... Try using datetime.py on top of CPython too :-)
>
> More seriously, if we want to be really better with datetime objects, as opposed to "good enough for now", then I fear we need to rewrite datetime.py in RPython, like CPython did rewrite it in C. Or some intermediate solution is possible too, having an internal class which implements only basic operations (like hash) and which is subclassed in pure Python. But I would happily leave that task to someone with an absolutely real need for high-performance datetime objects; for me "good enough for now" is good enough, for now...
>
> A bientôt,
> Armin.
I'd like to express that, unless we have a very compelling reason, we should try to keep more stuff in pure Python, as opposed to RPython. Mostly because it speeds up translation ;) (also easier to test, easier to write, etc.).

Alex

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
Hi Alex, On Mon, Aug 15, 2011 at 3:36 PM, Alex Gaynor <alex.gaynor@gmail.com> wrote:
> I'd like to express that, unless we have a very compelling reason, we should try to keep more stuff in pure python, as opposed to RPython. Mostly because it speeds up translation ;) (also easier to test, easier to write, etc.).
I agree. I was really talking about someone needing absolutely best performance for datetime objects. I'm sure that getting "only average" performance is more than enough for all use cases apart from microbenchmarks.

A bientôt,

Armin.
On Monday, 15 August 2011 09:05:22 Armin Rigo wrote:
> Hi David,
>
> On Sat, Aug 13, 2011 at 8:14 PM, David Naylor <naylor.b.david@gmail.com> wrote:
>> So, it appears pypy is failing to speed up this contrived example...
>
> I think that it is expected, because the hash is computed entirely as pure Python code in the case of PyPy (doing integer arithmetic with overflow checks) whereas it is written in C in the case of CPython. I find it rather good that we get overall close to the speed of CPython, given that we just have datetime.py... Try using datetime.py on top of CPython too :-)
>
> More seriously, if we want to be really better with datetime objects, as opposed to "good enough for now", then I fear we need to rewrite datetime.py in RPython, like CPython did rewrite it in C. Or some intermediate solution is possible too, having an internal class which implements only basic operations (like hash) and which is subclassed in pure Python. But I would happily leave that task to someone with an absolutely real need for high-performance datetime objects; for me "good enough for now" is good enough, for now...
For me the performance of datetime objects' hashing is sufficient, but I think the python code could use some performance improvements. Is my approach using a direct computation to type long acceptable (in principle)? If so, I can refine it and submit a patch.
On 15/08/11 15:36, Alex Gaynor wrote:
> I'd like to express that, unless we have a very compelling reason, we should try to keep more stuff in pure python, as opposed to RPython. Mostly because it speeds up translation ;) (also easier to test, easier to write, etc.).
or, on the other hand, we should try to improve the JIT so that pure python code can be as fast as rpython :-)
Hi David, On Mon, Aug 15, 2011 at 6:20 PM, David Naylor <naylor.b.david@gmail.com> wrote:
> For me the performance of datetime object's hashing is sufficient but I think the python code could use some performance improvements. Is my approach using a direct computation to type long acceptable (in principle)? If so I can refine it and submit a patch.
Yes, replacing the hash with a faster-to-compute one is fine. It's best performance-wise if you can avoid using Python longs. As far as I know it just needs some random-looking xor-ing and shifting of the fields. Note, of course, that you must carefully satisfy the property that for any objects x and y, if "x == y" then "hash(x) == hash(y)".

A bientôt,

Armin.
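In that spirit, a machine-int-friendly mix could look like the following (a hypothetical sketch of the suggestion, not the patch that was eventually committed); masking keeps intermediate values within a fixed width, and since the result depends only on the field values, the x == y implies hash(x) == hash(y) property holds automatically:

```python
import datetime

def field_hash(dt):
    # Hypothetical multiply/xor mix of the datetime fields, masked to
    # 32 bits so intermediate values stay machine-int-sized. The 1000003
    # multiplier is borrowed from CPython's string/tuple hashing; any
    # odd multiplier with good bit dispersion would do.
    mask = 0xffffffff
    h = dt.year
    for field in (dt.month, dt.day, dt.hour, dt.minute,
                  dt.second, dt.microsecond):
        h = ((h * 1000003) & mask) ^ field
    return h
```

Because the mix reads only the broken-out fields, two datetimes that compare equal necessarily produce the same hash, satisfying the invariant Armin points out.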
On Tuesday, 16 August 2011 15:27:30 Armin Rigo wrote:
> Hi David,
>
> On Mon, Aug 15, 2011 at 6:20 PM, David Naylor <naylor.b.david@gmail.com> wrote:
>> For me the performance of datetime object's hashing is sufficient but I think the python code could use some performance improvements. Is my approach using a direct computation to type long acceptable (in principle)? If so I can refine it and submit a patch.
>
> Yes, replacing the hash with a faster-to-compute one is fine. It's best performance-wise if you can avoid using Python longs. As far as I know it just needs some random-looking xor-ing and shifting of the fields. Note, of course, that you must carefully satisfy the property that for any objects x and y, if "x == y" then "hash(x) == hash(y)".
Below is the patch, and results, for my proposed hash methods for datetime.datetime (easily adaptable to include tzinfo and the other datetime objects). I tried to make the hash safe for both 32-bit and 64-bit systems, and beyond. The results are:

# python datetest.py (datetime.py)
hash_unity: 35.83 seconds
hash_unity: 44.60 seconds
hash_datetime: 65.58 seconds
hash_datetime: 53.95 seconds

# python datetest.py
hash_unity: 5.70 seconds
hash_unity: 5.69 seconds
hash_datetime: 4.88 seconds
hash_datetime: 4.90 seconds

# pypy datetest.py
hash_unity: 0.74 seconds
hash_unity: 0.63 seconds
hash_datetime: 11.74 seconds
hash_datetime: 11.47 seconds

# pypy datetest.py (patched datetime.py)
hash_unity: 0.73 seconds
hash_unity: 0.62 seconds
hash_datetime: 0.76 seconds
hash_datetime: 0.64 seconds

So, based on my patch there is a 7.7x improvement over python and a 17.9x improvement over the previous pypy implementation. If the above approach is acceptable, shall I complete the patch?

Regards
Hi David, On Thu, Aug 25, 2011 at 9:44 PM, David Naylor <naylor.b.david@gmail.com> wrote:
> Below is the patch, and results, for my proposed hash methods for datetime.datetime (and easily adaptable to include tzinfo and the other datetime objects). I tried to make the hash safe for both 32bit and 64bit systems, and beyond.
Yes, the patch looks good to me. I can definitely see how it can be a huge improvement in performance :-)

If you can also "fix" the other __hash__ methods in the same way, it would be great.

A bientôt,

Armin.
On Friday, 26 August 2011 06:37:30 Armin Rigo wrote:
> Hi David,
>
> On Thu, Aug 25, 2011 at 9:44 PM, David Naylor <naylor.b.david@gmail.com> wrote:
>> Below is the patch, and results, for my proposed hash methods for datetime.datetime (and easily adaptable to include tzinfo and the other datetime objects). I tried to make the hash safe for both 32bit and 64bit systems, and beyond.
>
> Yes, the patch looks good to me. I can definitely see how it can be a huge improvement in performance :-)
>
> If you can also "fix" the other __hash__ methods in the same way, it would be great.
To follow up on a very old email, the latest results are:

# python2.7 iforkey.py
ifdict: [2.110611915588379, 2.12678599357605, 2.1126320362091064]
keydict: [2.1322460174560547, 2.098900079727173, 2.0998198986053467]
defaultdict: [3.184070110321045, 3.2007319927215576, 3.188380002975464]

# pypy2.2 iforkey.py
ifdict: [0.510915994644165, 0.23750996589660645, 0.2241990566253662]
keydict: [0.23270201683044434, 0.18279695510864258, 0.18002104759216309]
defaultdict: [3.4535930156707764, 3.3697848320007324, 3.392897129058838]

And using the latest datetime.py:

# pypy iforkey.py
ifdict: [0.2814958095550537, 0.23425602912902832, 0.22999906539916992]
keydict: [0.23637700080871582, 0.18506789207458496, 0.1831810474395752]
defaultdict: [2.8174121379852295, 2.74626088142395, 2.7308008670806885]

Excellent, thank you :-)

Regards
participants (4)

- Alex Gaynor
- Antonio Cuni
- Armin Rigo
- David Naylor