[docs] [issue32554] random.seed(tuple) uses the randomized hash function and so is not reproductible
Raymond Hettinger
report at bugs.python.org
Mon Jan 15 13:41:43 EST 2018
Raymond Hettinger <raymond.hettinger at gmail.com> added the comment:
I'm getting a nice improvement in dispersion statistics by shuffling in higher bits right at the end:
     /* Disperse patterns arising in nested frozensets */
  +  hash ^= (hash >> 11) ^ (~hash >> 25);
     hash = hash * 69069U + 907133923UL;
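
To illustrate why the added line can matter, here is a minimal Python sketch of the two final steps. It assumes the hash variable is a 64-bit unsigned integer (true on typical 64-bit builds, but an assumption here), and the names finalize_baseline and finalize_new are made up for this illustration. With only the multiply-add, the low bits of the result depend solely on the low bits of the input, so two inputs that differ only in a higher bit land in the same small-table slot; the extra XOR folds higher bits down first.

```python
MASK = (1 << 64) - 1  # assumption: the C hash variable is 64 bits wide

def finalize_baseline(h):
    """Final multiply-add step only (the unchanged context line in the diff)."""
    return (h * 69069 + 907133923) & MASK

def finalize_new(h):
    """Fold higher bits into the low bits, then apply the same multiply-add."""
    h &= MASK
    # In C, ~hash is an unsigned complement; emulate that with a mask.
    h ^= (h >> 11) ^ ((~h & MASK) >> 25)
    return (h * 69069 + 907133923) & MASK

# Two inputs that differ only in bit 20, which lies above a 10-bit table mask.
a, b = 0x12345, 0x12345 ^ (1 << 20)
low = (1 << 10) - 1
print(finalize_baseline(a) & low == finalize_baseline(b) & low)  # True: collision
print(finalize_new(a) & low == finalize_new(b) & low)            # False: separated
```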
Results for range() check:

                    range       range
                    baseline    new
  1st percentile    35.06%      40.63%
  1st decile        48.03%      51.34%
  mean              61.47%      63.24%
  median            63.24%      65.58%

Results for letter_range() check:

                    letter      letter
                    baseline    new
  1st percentile    39.59%      40.14%
  1st decile        50.90%      51.07%
  mean              63.02%      63.04%
  median            65.21%      65.23%
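
The message does not show how the summary rows were derived from the printed percentages. One plausible way, sketched here with the standard statistics module, is to take quantiles over all collected values. The helper name summarize, the choice of method="inclusive", and the sample data are assumptions made for this illustration, not the author's procedure or results.

```python
import statistics

def summarize(percentages):
    """Return the four summary rows for a list of fill percentages."""
    return {
        # First cut point of 100 (or 10) equal-probability groups.
        "1st percentile": statistics.quantiles(percentages, n=100, method="inclusive")[0],
        "1st decile": statistics.quantiles(percentages, n=10, method="inclusive")[0],
        "mean": statistics.mean(percentages),
        "median": statistics.median(percentages),
    }

# Made-up sample values, for illustration only (not data from this message).
sample = [35.0, 48.0, 55.0, 60.0, 62.0, 64.0, 66.0, 70.0, 75.0, 80.0]
for name, value in summarize(sample).items():
    print(f"{name:<16}{value:6.2f}%")
```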

Test code for both checks:

    import itertools
    import string

    def letter_range(n):
        return string.ascii_letters[:n]

    def powerset(s):
        # Yield every subset of s as a frozenset.
        for i in range(len(s)+1):
            yield from map(frozenset, itertools.combinations(s, i))

    # range() check
    for i in range(10000):
        for n in range(5, 19):
            t = 2 ** n
            mask = t - 1
            # Count the distinct slots hit in a table of size t.
            u = len({h & mask for h in map(hash, powerset(range(i, i+n)))})
            print(u/t*100)

    # letter_range() check needs to be restarted (reseeded on every run)
    for n in range(5, 19):
        t = 2 ** n
        mask = t - 1
        u = len({h & mask for h in map(hash, powerset(letter_range(n)))})
        print(u/t)    # fraction; multiply by 100 for a percentage
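
The letter_range() check has to be restarted because string hashes are randomized once per interpreter process. A minimal sketch of automating those restarts is to launch child interpreters under different PYTHONHASHSEED values. The helper name run_with_seed, the seed values, and the reduced size range are assumptions chosen to keep the example quick; this is not the harness used for the figures above.

```python
import os
import subprocess
import sys

# Child program: prints one fill percentage per table size for string elements.
CHILD = """
import itertools, string
def powerset(s):
    for i in range(len(s) + 1):
        yield from map(frozenset, itertools.combinations(s, i))
for n in range(5, 11):          # smaller than range(5, 19) to keep the sketch fast
    t = 2 ** n
    mask = t - 1
    u = len({hash(fs) & mask for fs in powerset(string.ascii_letters[:n])})
    print(u / t * 100)
"""

def run_with_seed(seed):
    """Run the child once under a fixed PYTHONHASHSEED and return its percentages."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run([sys.executable, "-c", CHILD], env=env,
                         capture_output=True, text=True, check=True).stdout
    return [float(line) for line in out.split()]

# Collect samples across several string-hash seeds.
results = [p for seed in range(1, 6) for p in run_with_seed(seed)]
print(len(results), "samples")
```

Fixing the seed per child also makes each individual run repeatable, which helps when comparing a baseline build against a patched one.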
