Thu Jun 28 07:02:30 CEST 2007
"Martin v. Löwis" <martin at v.loewis.de> writes:
> So: what are your input data, and what is the
> distribution among them?
With good enough hash functions one shouldn't need to care about
the input distribution. Basically functions like SHA can be
used as extractors:
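As a rough illustration of that point (my own sketch, not from any reference; the helper name `sha_bucket` and the choice of SHA-256 are assumptions), here is a deliberately skewed input set that defeats a naive modulo hash but spreads out well once it goes through SHA:

```python
import hashlib
from collections import Counter

def sha_bucket(obj, bits=16):
    """Map repr(obj) to an integer in [0, 2**bits) using the top bits of SHA-256."""
    digest = hashlib.sha256(repr(obj).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

# Heavily structured input: every value is a multiple of 65536, so a naive
# "value mod 2**16" hash sends all of them to bucket 0.
inputs = [i * 65536 for i in range(100000)]
naive = Counter(x % 2**16 for x in inputs)
hashed = Counter(sha_bucket(x) for x in inputs)

print(len(naive))   # distinct buckets used by the naive hash (just one)
print(len(hashed))  # distinct buckets used by the SHA-based hash
```

The second count should land in the tens of thousands, about what you'd expect from throwing 100000 items uniformly at random into 65536 buckets.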
If there's a concern that the input distribution is specially
concocted to give nonuniform results with some known hash function,
then use one unknown to the input provider, e.g.
    import hmac
    def hash(obj, key='some string unknown to the input source'):
        # keep the first 4 hex digits of the HMAC, i.e. a 16-bit value
        return int(hmac.HMAC(key, repr(obj)).hexdigest()[:4], 16)
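On Python 3 the same idea needs bytes for the key and message plus an explicit digest, so a possible adaptation looks like the following. This is only a sketch: the name `keyed_hash`, the `hex_digits` parameter and the choice of SHA-256 are mine, not part of the snippet above.

```python
import hashlib
import hmac

def keyed_hash(obj, key=b"some string unknown to the input source", hex_digits=4):
    """Return the first `hex_digits` hex digits of HMAC-SHA256(key, repr(obj)) as an int."""
    mac = hmac.new(key, repr(obj).encode("utf-8"), hashlib.sha256)
    return int(mac.hexdigest()[:hex_digits], 16)

# Same object, different secret keys: the input provider cannot predict
# which bucket an object lands in without knowing the key.
print(keyed_hash(("spam", 42), key=b"key one"))
print(keyed_hash(("spam", 42), key=b"key two"))
```

With the default `hex_digits=4` the result is a 16-bit value, as in the original one-liner.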
Anyway I don't have the impression that the OP is concerned with this
type of issue. Otherwise s/he'd want much longer hashes than 16 bits.
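To put a number on why 16 bits is short, here is a back-of-the-envelope check using the usual birthday approximation, which assumes the hash outputs are uniform and independent (the function name is just for illustration):

```python
import math

def collision_probability(n, bits):
    """Approximate chance that n uniform random `bits`-bit hashes contain a collision.

    Uses the birthday approximation 1 - exp(-n*(n-1) / 2**(bits+1)).
    """
    x = n * (n - 1) / 2.0 ** (bits + 1)
    return -math.expm1(-x)  # 1 - exp(-x), computed accurately for tiny x

for bits in (16, 64, 128):
    print(bits, collision_probability(300, bits))
```

With 16 bits, about 300 items already give roughly even odds of at least one collision, while at 64 or 128 bits the probability for the same 300 items is negligible.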
More information about the Python-list