[issue14478] Decimal hashing very slow, could be cached

James Hutchison report at bugs.python.org
Mon Apr 2 19:16:26 CEST 2012


New submission from James Hutchison <jamesghutchison at gmail.com>:

Tested on 3.2

Note that I noticed that Decimal is supposed to be faster in 3.3 but I thought I would bring this to light just in case its still relevant

Decimal hashing is very slow, even for simple numbers. I found by caching the hash result, there is significant speed up whenever a Decimal value is reused.

I created a class that inherits from Decimal and changed the __hash__ function to try/except a stored hash value as such:

    def __hash__(self):
        try: return self.myHash;
        except:
            self.myHash = super().__hash__();
            return self.myHash;

Example code:

d = dict();
start = time.time();
i = int(1);
j = int(2);
k = int(3);
for i in range(100000):
    d[i,j,k] = 5;
print(time.time() - start);

d = dict();
start = time.time();
i = Decimal(1);
j = Decimal(2);
k = Decimal(3);
for i in range(100000):
    d[i,j,k] = 5;
print(time.time() - start);

d = dict();
start = time.time();
i = CachingDecimal(1);
j = CachingDecimal(2);
k = CachingDecimal(3);
for i in range(100000):
    d[i,j,k] = 5;
print(time.time() - start);

Output
int:  0.04699993133544922
Decimal:  2.312999963760376
CachingDecimal:  0.1100001335144043

In a real life example, I changed some of the values in my code from int to Decimal because non-whole numbers needed to be supported (and this seemed like the easiest way to do so without having to worry about my == comparisons breaking) and it slowed my code down massively. Changing to a CachingDecimal type sped up one code block from 92 seconds with Decimal to 7 seconds, which was much closer to the original int speed.

Note that all of the relevant operations have to be overloaded to return the CachingDecimal type, which is a pain, so this makes a lot of sense to implement into the Decimal module. My understanding is that altering a Decimal value will always create a new Decimal object, so there shouldn't be any issues with the cached hash desyncing with the correct hash. Someone may want to check that though.

Thanks,

James

----------
components: Library (Lib)
messages: 157369
nosy: Jimbofbx
priority: normal
severity: normal
status: open
title: Decimal hashing very slow, could be cached
type: performance
versions: Python 3.2

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14478>
_______________________________________


More information about the Python-bugs-list mailing list