[New-bugs-announce] [issue14478] Decimal hashing very slow, could be cached
James Hutchison
report at bugs.python.org
Mon Apr 2 19:16:26 CEST 2012
New submission from James Hutchison <jamesghutchison at gmail.com>:
Tested on 3.2
Note that I noticed that Decimal is supposed to be faster in 3.3 but I thought I would bring this to light just in case its still relevant
Decimal hashing is very slow, even for simple numbers. I found by caching the hash result, there is significant speed up whenever a Decimal value is reused.
I created a class that inherits from Decimal and changed the __hash__ function to try/except a stored hash value as such:
def __hash__(self):
try: return self.myHash;
except:
self.myHash = super().__hash__();
return self.myHash;
Example code:
d = dict();
start = time.time();
i = int(1);
j = int(2);
k = int(3);
for i in range(100000):
d[i,j,k] = 5;
print(time.time() - start);
d = dict();
start = time.time();
i = Decimal(1);
j = Decimal(2);
k = Decimal(3);
for i in range(100000):
d[i,j,k] = 5;
print(time.time() - start);
d = dict();
start = time.time();
i = CachingDecimal(1);
j = CachingDecimal(2);
k = CachingDecimal(3);
for i in range(100000):
d[i,j,k] = 5;
print(time.time() - start);
Output
int: 0.04699993133544922
Decimal: 2.312999963760376
CachingDecimal: 0.1100001335144043
In a real life example, I changed some of the values in my code from int to Decimal because non-whole numbers needed to be supported (and this seemed like the easiest way to do so without having to worry about my == comparisons breaking) and it slowed my code down massively. Changing to a CachingDecimal type sped up one code block from 92 seconds with Decimal to 7 seconds, which was much closer to the original int speed.
Note that all of the relevant operations have to be overloaded to return the CachingDecimal type, which is a pain, so this makes a lot of sense to implement into the Decimal module. My understanding is that altering a Decimal value will always create a new Decimal object, so there shouldn't be any issues with the cached hash desyncing with the correct hash. Someone may want to check that though.
Thanks,
James
----------
components: Library (Lib)
messages: 157369
nosy: Jimbofbx
priority: normal
severity: normal
status: open
title: Decimal hashing very slow, could be cached
type: performance
versions: Python 3.2
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14478>
_______________________________________
More information about the New-bugs-announce
mailing list