[Edu-sig] Cryptonomicon (group theory for youngsters)

Kirby Urner pdx4d@teleport.com
Thu, 15 Mar 2001 22:39:22 -0800


Some of the recent tricks with Cards (as in deck of) fed
my renewed look at ciphers and cryptology, as a good segue
to/from group theory, with Python providing the playground
equipment and other fun apparatus.

Like, in earlier posts I'd been randomly choosing letters
to pair with other letters (to set up a 'clubhouse code' 
or simple letter substitution cipher), but now I've learned
about the 'shuffle' method in SLM (Standard Library Module)
random.py -- so the code gets simpler still:

 _domain = list(uppercase)

 def mkcode():
     """
     Uniquely pair each element in _domain with another
     """
     uc = _domain+[]
     shuffle(uc)
     dict ={}
     for i,j in zip(_domain,uc):
        dict[i]=j
     return dict

_domain+[] is another way of making sure we assign uc a 
*copy* of _domain, and not a pointer to the same memory
object.  Then we shuffle the copy, create a blank dictionary,
and fill it with unshuffled->shuffled pairs, e.g. A->C, 
B->Q and so on.

Usage:

 >>> mycode = ciphers.mkcode()
 >>> mycode
 {'Z': 'L', 'X': 'J', 'Y': 'S', 'V': 'U', 'W': 'H', 'T': 'Q', ... etc.}

So what's with _domain?  Well, I might want to use integers
instead of letters, which isn't as nice for letter substitution,
but is what the group theory people use when presenting a 
permutation (same as a substitution dictionary).

Alternative usage:

 >>> ciphers._domain = range(1,11)
 >>> mycode = ciphers.mkcode()
 >>> mycode
 {10: 3, 9: 10, 8: 9, 7: 6, 6: 2, 5: 8, 4: 4, 3: 1, 2: 7, 1: 5}

New the group theory folks have an alternative notation for such
as the above.  Pick a number, any number, then worm your way
through the dictionary, chaining each item to the one in points
to, e.g. 10->3->1->5->8->9->10.  Then organize this into a tuple:
(10,3,1,5,8,9), with the understanding that each member is 
followed by its pair, with the last wrapping around to the first.
But notice:  we didn't get them all.  6->2->7->6 is another 
cycle, with no members in common with the first (i.e. they're 
'disjoint').

So it's a fun exercise in Python to write a routine that takes
a dictionary of unshuffled,shuffled pairs, as up top, and 
converts same to a list of cyclic tuples, as per this example.

Usage:

 >>> ciphers.mkcycles(mycode)
 [(10, 3, 1, 5, 8, 9), (7, 6, 2)]

Yep (doesn't matter that here it's (7,6,2) instead of (6,2,7)
-- same info).

And you'd want to make it bidirectional:

 >>> p = ciphers.mkcycles(mycode)
 >>> ciphers.mkdict(p)
 {10: 3, 9: 10, 8: 9, 7: 6, 6: 2, 5: 8, 4: 4, 3: 1, 2: 7, 1: 5}

The thing to realize is that when a code pairs a letter with
itself (or number with itself), you simply don't mention it
in the cycles notation.  Like, if it's the 'identity dictionary'
(*every* letter is paired with itself), then the list of 
corresponding cyclic tuples is empty, and vice versa:

 >>> ciphers.mkcycles({1:1,2:2,3:3,4:4})
 [ ]
 >>> ciphers._domain = list(ciphers.uppercase)
 >>> ciphers.mkdict([])
 {'Z': 'Z', 'X': 'X', 'Y': 'Y', 'V': 'V', 'W': 'W', 'T': 'T' ... etc.}

Back to cryptology:  if we use simple letter substitution, with
the same new letter always taking the place of the same old letter,
then certain patterns will clue the would-be code cracker.  
Palindromes, which spell out the same sentence forward and 
backward, will still be palindromes, even if nonsensical.

Consider the 'encrypt' method:  it simply takes one of these 
substition dictionaries (a.k.a. permutations) and does the 
swap, ignoring any characters/punctuation/other symbols 
that aren't being converted:

 def encrypt(plaintext,secretkey):
     """
     substitute shuffled elements as per secretkey, or
     leave as is if not in the key -- designed to work
     with _domain = list(uppercase)
     """
     ciphertext = ""
     keys = secretkey.keys()
     for i in plaintext.upper(): 
  	 if not i in keys:
	    ciphertext += i
         else:
   	    ciphertext += secretkey[i]
     return ciphertext

So you can go:

 >>> mycode = ciphers.mkcode() # make a random permutation
 >>> ciphers.mkcycles(mycode)  # view permutation as list of cycles
 [('Z', 'R', 'H', 'A', 'L', 'X', 'N', 'M', 'Y', 'F', 'C', 'J', 'T', 
   'U', 'G', 'P', 'I',  'Q', 'O', 'B', 'V', 'W', 'E'), ('S', 'D', 'K')]
 >>> encrypted = ciphers.encrypt("Able was I ere I saw Elba",mycode)
 >>> encrypted
 'LVXZ ELD Q ZHZ Q DLE ZXVL'
 >>> decrypted = ciphers.decrypt(encrypted,mycode)
 >>> decrypted
 'ABLE WAS I ERE I SAW ELBA'

Notice that the encrypted palindrome is still a palindrome.  That
makes is a lot easier to crack.

Back to group theory:  we can multiply permutations together, 
meaning if you scramble a scramble, you get another scramble.
For example if p1 is [('A','C','D')] and p2 is [('A','D','C')]
then p1*p2 means we apply p1 to p2 i.e. A->D->A, so A->A,
and D->C->D so D->D, and yes, C->C.  p1 and p2 are inverses
of one another, since their product is the identity permutation
(that's the definition of inverse pairs -- their product is
the identity e, where p1*e = e*p1 = p1).

So it makes sense to design a Python class that makes these 
permutations into objects, which we can then multiply together
using an overridden * symbol (we'll define both __mul__ and
__pow__ -- the latter because a permutation can also permute
itself, as many times as we like).

Here's what this looks like:

 >>> ciphers._domain = ['A','D','C']   # a restricted domain
 >>> dict1 = ciphers.mkdict(['A','C','D']) 
 >>> dict2 = ciphers.mkdict(['A','D','C'])
 >>> dict1
 {'C': 'C', 'D': 'D', 'A': 'A'}
 >>> dict2
 {'C': 'C', 'D': 'D', 'A': 'A'}
 >>> dict1 = ciphers.mkdict([('A','C','D')])
 >>> dict1
 {'C': 'D', 'D': 'A', 'A': 'C'}
 >>> dict2 = ciphers.mkdict([('A','D','C')])
 >>> dict2
 {'C': 'A', 'D': 'C', 'A': 'D'}
 >>> p1 = ciphers.P(dict1)    # create permutation objects...
 >>> p2 = ciphers.P(dict2)    # using code dictionaries
 >>> p1*p2       
 Permutation: []

Note:  Permutation: [] is the empty list, i.e. the identity 
permutation -- the expected result.

This idea of multiplying permutations, raising them to 
powers and taking their inverses suggests we might create
a more elaborate enciphering strategy wherein the permutations
multiply to give new dictionaries with each letter to be
enciphered -- a reversible process (one would hope).  

                   [  to be continued  ]

Kirby