[Tutor] Randomize SSN value in field

taserian taserian at gmail.com
Thu May 22 19:17:58 CEST 2008


On Thu, May 22, 2008 at 12:14 PM, GTXY20 <gtxy20 at gmail.com> wrote:
> Hello all,
>
> I will be dealing with an address list where I might have the following:
>
> Name SSN
> John 111111111
> John 111111111
> Jane 222222222
> Jill 333333333
>
> What I need to do is parse the address list and then create a unique random
> unidentifiable value for the SSN field like so:
>
> Name SSNrandomvalue
> John 1a1b1c1d1
> John 1a1b1c1d1
> Jane 2a2b2c2d2
> Jill 3a3b3c3d3
>
> The unique random value does not have to follow this convention but it needs
> to be unique so that I can relate it back to the original SSN when needed.
> As opposed to using the random module I was thinking that it would be better
> to use either sha or md5. Just curious as to thoughts on the correct
> approach.
>
> Thank you in advance.
>
> G.

Both SHA and MD5 are one-way functions by design: given only the
digest, you can't recover the string that was hashed. For example
(taken from http://www.python.org/doc/current/lib/module-hashlib.html) :

>>> import hashlib
>>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

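There's no general way to take the value 'a4337...' and return "Nobody
insp..", because infinitely many possible strings have to map into the
224-bit output space that sha224 provides, so the digest alone doesn't
identify its input. One caveat for your case: a 9-digit SSN has only a
billion possible values, so a plain hash of one could be matched by
hashing every candidate SSN and comparing.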
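
If you still like the hashing idea, here is a rough sketch of how it
could look. It assumes a keyed hash (hmac with sha256) rather than plain
sha or md5, so the tokens can't be reproduced without the key. The names
SECRET_KEY, pseudonymize and token_to_ssn are made up for illustration,
and the key shown is only a placeholder. Since the hash is one-way, the
sketch keeps a private dictionary to relate a token back to its SSN.

```python
import hashlib
import hmac

# Hypothetical placeholder key; in practice keep a long random secret
# somewhere safe, not in the source file.
SECRET_KEY = b"replace-with-a-long-random-secret"


def pseudonymize(ssn, key=SECRET_KEY):
    """Return a stable token for an SSN string (same input, same token)."""
    return hmac.new(key, ssn.encode("ascii"), hashlib.sha256).hexdigest()


rows = [
    ("John", "111111111"),
    ("John", "111111111"),
    ("Jane", "222222222"),
    ("Jill", "333333333"),
]

token_to_ssn = {}  # private lookup table: token -> original SSN
masked_rows = []   # what you would hand out: (name, token)
for name, ssn in rows:
    token = pseudonymize(ssn)
    token_to_ssn[token] = ssn
    masked_rows.append((name, token))
```

Both John rows end up with the same token, and token_to_ssn is the only
way back to the SSN, so it has to be protected as carefully as the
original list.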

If you want to be able to recover the SSN, you should probably look at
cryptography. Here's a link that might interest you:
http://www.amk.ca/python/code/crypto.html
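
If pulling in a crypto library is more than you need, a simpler
alternative (not encryption, just a lookup table) is to hand out a
random token per distinct SSN and remember the mapping in both
directions. This is only a sketch using the standard secrets module; the
class name SsnTokenTable and its methods are invented for the example,
and the table itself would need to be stored securely.

```python
import secrets


class SsnTokenTable:
    """Hypothetical two-way table between SSNs and random tokens."""

    def __init__(self):
        self._ssn_to_token = {}
        self._token_to_ssn = {}

    def token_for(self, ssn):
        """Return the token for an SSN, creating one on first sight."""
        if ssn not in self._ssn_to_token:
            token = secrets.token_hex(8)  # 16 random hex characters
            while token in self._token_to_ssn:  # retry on the rare collision
                token = secrets.token_hex(8)
            self._ssn_to_token[ssn] = token
            self._token_to_ssn[token] = ssn
        return self._ssn_to_token[ssn]

    def ssn_for(self, token):
        """Look up the original SSN for a token handed out earlier."""
        return self._token_to_ssn[token]


table = SsnTokenTable()
john_first = table.token_for("111111111")
john_second = table.token_for("111111111")
jane = table.token_for("222222222")
```

Here john_first and john_second are the same token, jane gets a
different one, and table.ssn_for(jane) gives back "222222222".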

Tony R.
aka Taser

