Is this secure?
Steven D'Aprano
steven at REMOVE.THIS.cybersource.com.au
Tue Feb 23 21:40:13 EST 2010
On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
> The question is: is this secure? That is, can the string generated this
> way be considered truly random?
Putting aside the philosophical question of what "truly random" means, I
presume you mean that the letters are uniformly distributed. The answer
to that is, they don't look uniformly distributed.
This isn't a sophisticated statistical test, it's the equivalent of a
back-of-the-envelope calculation: I generated 100,000 random strings with
your code, and counted how often each letter appeared.
If the letters are uniformly distributed, you would expect all the
numbers to be quite close, but instead they range from 15063 to 25679:
{'a': 15063, 'c': 20105, 'b': 15100, 'e': 25465, 'd': 25458, 'g': 25597,
'f': 25589, 'i': 25045, 'h': 25679, 'k': 22945, 'j': 25531, 'm': 16187,
'l': 16252, 'o': 16076, 'n': 16012, 'q': 16069, 'p': 16119, 's': 16088,
'r': 16087, 'u': 15951, 't': 16081, 'w': 16236, 'v': 15893, 'y': 15834,
'x': 15956}
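A tally like this takes only a few lines. The following is a minimal sketch, not the code from the original post: tally_letters and uniform_string are hypothetical names, and uniform_string is only a stand-in generator (uniform by construction) to be replaced with the function under test.

```python
import random
import string
from collections import Counter

def tally_letters(make_string, trials=100_000):
    """Count how often each character appears across `trials` generated strings."""
    counts = Counter()
    for _ in range(trials):
        # Counter.update() on a string adds one per character.
        counts.update(make_string())
    return counts

# Hypothetical stand-in: five letters drawn uniformly from 'a'..'y'.
# Pass your own generator to tally_letters() instead.
def uniform_string(length=5, alphabet=string.ascii_lowercase[:25]):
    return "".join(random.choice(alphabet) for _ in range(length))

counts = tally_letters(uniform_string, trials=10_000)
for letter in sorted(counts):
    print(letter, counts[letter])
```

With a generator that really is uniform, the printed counts should all sit close together rather than splitting into two groups.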
Eye-balling it, it looks vaguely two-humped, one hump around 15-16K, the
second around 22-25K. Sure enough, here's a quick-and-dirty graph:
a | ***********************************
b | ***********************************
c | ***********************************************
d | ***********************************************************
e | ***********************************************************
f | ************************************************************
g | ************************************************************
h | ************************************************************
i | ***********************************************************
j | ************************************************************
k | ******************************************************
l | **************************************
m | **************************************
n | *************************************
o | **************************************
p | **************************************
q | **************************************
r | **************************************
s | **************************************
t | **************************************
u | *************************************
v | *************************************
w | **************************************
x | *************************************
y | *************************************
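A graph of this kind needs nothing more than string multiplication. This is a sketch only: the function name text_histogram and the scale of 428 counts per star are assumptions chosen for illustration.

```python
def text_histogram(counts, scale=428):
    """Return one 'letter | ****' row per key, one star per `scale` counts."""
    rows = []
    for letter in sorted(counts):
        stars = "*" * round(counts[letter] / scale)
        rows.append("%s | %s" % (letter, stars))
    return "\n".join(rows)

# Three of the counts reported above, as a small demonstration.
print(text_histogram({'a': 15063, 'c': 20105, 'h': 25679}))
```

Pick the scale so the longest bar fits comfortably within the line width.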
The mean of the counts is 19056.72, and the mean deviation is 3992.28.
While none of this is statistically sophisticated, it does indicate to me
that your function is nowhere even close to uniform. It has a very strong
bias.
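The two summary figures can be checked directly from the counts listed above. This sketch reads "mean deviation" as the mean absolute deviation from the mean, which is the reading that reproduces the number quoted; the variable names are illustrative.

```python
counts = {'a': 15063, 'c': 20105, 'b': 15100, 'e': 25465, 'd': 25458,
          'g': 25597, 'f': 25589, 'i': 25045, 'h': 25679, 'k': 22945,
          'j': 25531, 'm': 16187, 'l': 16252, 'o': 16076, 'n': 16012,
          'q': 16069, 'p': 16119, 's': 16088, 'r': 16087, 'u': 15951,
          't': 16081, 'w': 16236, 'v': 15893, 'y': 15834, 'x': 15956}

values = list(counts.values())
mean = sum(values) / len(values)
# Mean absolute deviation: average distance of each count from the mean.
mean_dev = sum(abs(v - mean) for v in values) / len(values)

print(round(mean, 2), round(mean_dev, 2))  # 19056.72 3992.28
```

A typical count therefore sits about 21% away from the mean, far more spread than you would expect from 100,000 uniformly generated strings.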
--
Steven