[Tutor] check against multiple variables
Steven D'Aprano
steve at pearwood.info
Fri Jul 20 00:20:32 CEST 2012
Selby Rowley-Cannon wrote:
> I am using a hash table in a small randomization program. I know that
> some hash functions can be prone to collisions, so I need a way to
> detect collisions.
I doubt that very much.
This entire question seems like a remarkable case of premature optimization.
Start by demonstrating that collisions are an actual problem that needs fixing.
Unless you have profiled your application and proven that hash collisions are a
real problem (and unless you are hashing thousands of float NaNs, that is
almost certainly not the case), you are just wasting your time and making
your code slower rather than faster: a pessimization, not an optimization.
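
To see why detecting collisions yourself is usually unnecessary, here is a
minimal sketch. The Colliding class is a made-up name for illustration, not
anything from your program. It deliberately gives every key the same hash
value, and the dict still keeps the keys apart, because it falls back on
__eq__ when two hashes match:

```python
class Colliding:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Deliberately terrible: every instance collides with every other.
        return 1

    def __eq__(self, other):
        return isinstance(other, Colliding) and self.name == other.name


# Build a dict with 100 distinct keys that all hash to the same value.
table = {Colliding(str(i)): i for i in range(100)}

# Each key is still stored and retrieved separately.
lookups_ok = all(table[Colliding(str(i))] == i for i in range(100))
print(len(table), lookups_ok)
```

Collisions like these cost speed, not correctness, so the only question
worth asking is whether profiling shows the slowdown matters.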
And if it *is* a problem, then the solution is to fix your data so that its
__hash__ method is less likely to collide. If you are rolling your own hash
method, instead of using one of Python's, that's your first problem.
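
As a rough sketch of what "using one of Python's" can look like, assuming a
simple value class (Point and its x and y attributes are invented for this
example), __hash__ can hand the work to the built-in tuple hash rather than
doing its own arithmetic:

```python
class Point:
    """Hypothetical value type, used only to illustrate the idea."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Delegate to the built-in tuple hash instead of a hand-rolled formula.
        # Hash exactly the fields that __eq__ compares, so equal objects
        # always get equal hashes.
        return hash((self.x, self.y))


seen = {Point(1, 2), Point(1, 2), Point(2, 1)}
print(len(seen))
```

The two equal points collapse into one set member, while Point(2, 1) stays
separate.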
Python's hash implementation is one of the most finely tuned in the world.
Many, many years of effort have gone into making it stand up to real-world
data. You aren't going to beat it with some half-planned pure-Python work-around.
--
Steven