SHA-based subclass for random module

Paul Rubin http
Sun Mar 21 18:16:02 EST 2004


python at rcn.com (Raymond Hettinger) writes:
> >  If we're selecting a PRNG for a third
> > time, let's stop cutting corners and take care of the problem once and
> > for all.
> 
> No corners were cut.  MT is a state-of-the-art psuedo random number
> generator.  The problem is that you want encryption.

This sha prng module was motivated partly by your own remark in sf bug
#901285 that MT output may not be sufficiently correlation-free if you
take pairs of 16-bit integers from it, in a context having nothing to
do with encryption.  Anyway, despite how it may sound, I don't have
any desire to use the random module for encryption (I'd use different
API's for that).  I just want to be able to write applications that
use random numbers without having any doubts about whether the random
numbers are good enough.  Teukolsky et al in "Numerical Recipes",
which is not a cryptography book, explain that a good way to do that
is to use cryptographic primitives, so I'm proposing a module that
does it that way.  

The application in numerics could be something like: I run some
half-hour simulation using MT, and get results that subjectively seem
a bit peculiar.  So I run it again using the crypto-based generator
and maybe that takes three hours, but if I still see similarly
peculiar-looking results, I now can be pretty confident that it's not
some consequence of the RNG internals.  It subjectively seems to me
that the Gnome version of Freecell is easier to play than the Windows
version (i.e. it deals easier hands), so that's a semi-real-world
example of this situation.  (In this instance I suspect Freecell is
using some pathetic linear PRNG or something like that, but I haven't
yet bothered to look at the code).

There's also some applications (like online games) that need secure
random numbers, as mentioned in 917055.  The people implementing these
things usually aren't even thinking about cryptography.

> > No, I wanted cryptographic strength to both the left and right,
> 
> That makes sense.  Still, the hashing step protects you.  Given a
> series of 10,000 bits from the generator, do you have any means of
> deducing either a) the state of MT, b) what came before, or c) what
> comes after?

Cryptographic strength to the left means I shouldn't be able to
recover what came before, even given the current output of getstate()
on the generator (see the Yarrow paper for some rationale for wanting
this protection).  I haven't seen any claim that MT was designed to
resist that.

I'm not adamant that this is a crucial property of a deterministic
PRNG; I implemented it because I saw it as straightforward to do and
figured it might as well be included, so that we can say that we did
the most thorough possible job, followed the Yarrow paper's
prescriptions, etc.  If there's a deliberate decision to abandon the
property, that can make the generator a bit simpler and faster, and
I'm willing to code one that way for purposes of comparison.  I just
don't think such decisions should be driven by implementation
accidents.  (They should also be documented).

> I'm keeping an open mind but will abandon this patch unless the tone
> improves.

I don't mean to bash anyone and I'm sorry if it came across that way.
As I see it, the Random module provides an extensible API so that it's
easy to add different kinds of generators that the user can select
from depending on his or her needs.  It then supplies the WH generator
(I guess for backwards compatibility) and the MT generator (for
reasonably good output and very high performance).  The philosophy of
this sha module (as I see it) is to supply a third alternative in that
same flexible spirit: one that goes all-out to make the best possible
output, even if that results in only reasonable performance, while
also having increased portability through use of a standard underlying
primitive.  Do you also see that as an appropriate goal?  If not, we
should try to identify exactly what we're trying to do instead.



More information about the Python-list mailing list