encryption with python

Steve Holden steve at holdenweb.com
Thu Sep 8 01:30:13 CEST 2005

```
Paul Rubin wrote:
> James Stroud <jstroud at mbi.ucla.edu> writes:
>
>>Then your best bet is to take a reasonable number of bits from an sha hash.
>>But you do not need pycrypto for this. The previous answer by "ncf" is good,
>>but use the standard library and take 9 digits to lessen probability for
>>clashes
>>
>>import sha
>>def encrypt(x,y):
>>    def _dosha(v): return sha.new(str(v)).hexdigest()
>>    return int(_dosha(_dosha(x)+_dosha(y))[5:13],16)
>>...
>>Each student ID should be unique until you get a really big class. If your
>>class might grow to several million, consider taking more bits of the hash.
>
>
> Please don't give advice like this unless you know what you're doing.
> You're taking 8 hex digits and turning them into an integer.  That
> means you'll probably have a collision after around 65,000 id's, not
> several million.  "Probably" means > 50%.  You'll have a significant
> chance (say more than 1%) of collision after maybe 10,000.
>
> Also, if you know the student's graduation year, in most cases there
> are just a few hundred likely birthdates for that student, so by brute
> force search you can crunch the output of your function to a fairly
> small number of DOB/SSN combinations.
>
> The only approach that makes sense is for the secure database to
> assign arbitrary numbers that aren't algorithmically related to any
> sensitive data.  Answers involving encryption will need to use either
> large ID numbers or secret keys, both of which will cause hassles.

This is indubitably true. There's absolutely no excuse for making the
primary key a function of the data that the record contains, as doing so
will assist any cryptanalytic attack.

regards
Steve
--
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/

```
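
The collision figures quoted in the thread follow from the birthday bound, and the "arbitrary numbers" approach is easy to sketch. The following is an illustration only, assuming Python 3 and the standard library; the function names (`collision_probability`, `ids_for_probability`, `assign_random_id`) and the 64-bit default are hypothetical choices, not anything from the original posts. The estimate uses the usual approximation p = 1 - exp(-n(n-1)/(2N)) for n values drawn uniformly from N possibilities, which treats the truncated hash as if it were uniformly random.

```python
import math
import secrets


def collision_probability(n_ids, bits):
    """Approximate chance that n_ids uniform random `bits`-bit values
    contain at least one duplicate (birthday-bound approximation)."""
    space = 2 ** bits
    # 1 - exp(-x), computed via expm1 for accuracy when x is small
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


def ids_for_probability(p, bits):
    """Approximate number of IDs at which the collision chance reaches p."""
    space = 2 ** bits
    return math.ceil(math.sqrt(2 * space * math.log(1 / (1 - p))))


def assign_random_id(used, bits=64):
    """Return an ID unrelated to any record data (hypothetical helper).

    `used` is the set of IDs already issued; a real database would
    enforce this with a unique constraint rather than an in-memory set.
    """
    while True:
        candidate = secrets.randbits(bits)
        if candidate not in used:
            used.add(candidate)
            return candidate


if __name__ == "__main__":
    # 8 hex digits = 32 bits, as in the quoted encrypt() function
    for n in (10_000, 65_536):
        print(n, round(collision_probability(n, 32), 4))
    print("IDs for 50% collision chance:", ids_for_probability(0.5, 32))

    issued = set()
    print("random id:", assign_random_id(issued))
```

Under this approximation a 32-bit value gives a bit over 1% collision chance at 10,000 IDs and roughly 39% at 65,536, with the 50% mark nearer 77,000, which is the same order of magnitude as the numbers given in the thread. Because `assign_random_id` draws from `secrets` rather than hashing the DOB or SSN, the resulting ID cannot be brute-forced back to the sensitive fields.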