[Edu-sig] Cryptonomicon (group theory for youngsters) #2
Kirby Urner
pdx4d@teleport.com
Sat, 17 Mar 2001 11:06:44 -0800
For a set of elements to comprise a group vis-a-vis some
definition of the binary operator * (multiplication), they
need to combine (multiply) associatively, but not necessarily
commutatively. If * turns out to be commutative, the group
is said to be Abelian.
The complete set of properties you need to be a group are:
* Closure: the product of two elements in the set is
always itself in the set
* Associativity: p1*p2*p3 has an unambiguous meaning i.e.
(p1*p2)*p3 and p1*(p2*p3) give the same result
* Inverse: every element has an inverse such that the
product of the two is the identity element
* Neutral: some element e, called the identity element,
has a "neutral effect" when multiplying other
elements i.e. p*e=e*p=p.
These four properties spell CAIN, an easy mnemonic, especially
in light of Abelian groups, named for the mathematician
Niels Henrik Abel (1802-29), but also the name of Cain's brother
in the Genesis story.
You'll see below that our domain of alphabet or integer permutations
under multiplication is a non-Abelian group, i.e. * is a non-
commutative, but associative operation.
But first, we need some permutation objects to play with. Let's
start with some simple integer-based ones.
>>> from ciphers import * # asterisk means "all" here
>>> ciphers._domain = range(1,11)
>>> p1=P()
>>> p2=P()
>>> p1
Permutation: [(8, 5, 2, 6, 7, 4, 1)]
>>> p2
Permutation: [(10, 7, 6, 8, 2, 5, 1), (9, 4, 3)]
When objects of class P are initialized with no arguments,
we simply invoke mkcode() to shuffle the domain and create
a substition dictionary of unshuffled:shuffled key:value
pairs:
class P:
"""
Permutations: these objects multiply with
each other, return an inverse, may be raised
to an integer power
"""
def __init__(self,dict=None):
if dict==None:
self.dict = mkcode()
else:
self.dict = dict
Multiplication is defined as follows:
class P:
...
def __mul__(self,other):
newdict = {}
for i in other.dict.keys():
newdict[i] = self.dict[other.dict[i]]
return P(newdict)
Now we're ready to demonstrate non-commutativity:
>>> p1*p2
Permutation: [(10, 4, 3, 9, 1), (8, 6, 5)]
>>> p2*p1
Permutation: [(10, 7, 3, 9, 4), (8, 1, 2)]
Remember, these cyclic expressions are another way of
presenting the same info as held by a substitution dictionary.
In the previous post, the mkcycles(dict) method was introduced,
for going from a dictionary to a list of tuples. This method
is built in to the P class, as it's method for self-representation:
class P:
...
def __repr__(self):
return "Permutation: " + str(mkcycles(self.dict))
p1*p2 applies p1 to p2, so after 10->7, we take 7->4, so in the
product, 10->4.
Demonstrating associativity:
>>> p3
Permutation: [(9, 6, 7), (8, 5), (4, 2, 1, 3)]
>>> p1*(p2*p3)
Permutation: [(10, 4, 2), (9, 5, 6, 7, 1)]
>>> (p1*p2)*p3
Permutation: [(10, 4, 2), (9, 5, 6, 7, 1)]
Or we could have done:
>>> p1*p2==p2*p1
0
>>> p1*(p2*p3)==(p1*p2)*p3
1
The method for determining equality simply compares the
internal dictionary of two objects:
class P:
...
def __eq__(self,other):
return self.dict==other.dict
Note: some specific p1,p2 may commute, e.g. if p1 and p2 are
inverses -- it's just that they need not. The operation * is not
in general a commutative operation in this realm of permutations.
Using a P-class dictionary, we can encipher a message. Let's
make p1 the result of 4 random permutations, intermultiplied
(itself a permutation -- by virtue of Closure):
>>> ciphers._domain = list(uppercase)
>>> p1 = P()*P()*P()*P()
>>> p1
Permutation: [('Z', 'K', 'C', 'N', 'U', 'M', 'B', 'Y', 'H', 'R',
'I', 'P', 'E', 'L', 'W', 'O', 'D', 'Q', 'T', 'J', 'X', 'G', 'F',
'S'), ('V', 'A')]
Let's make p2 be the inverse of p1 (every p has an inverse), i.e.
its internal dictionary is a reverse version of p1's:
>>> p2 = p1.inv()
>>> p2
Permutation: [('Z', 'S', 'F', 'G', 'X', 'J', 'T', 'Q', 'D', 'O',
'W', 'L', 'E', 'P', 'I', 'R', 'H', 'Y', 'B', 'M', 'U', 'N', 'C',
'K'), ('V', 'A')]
We see that to decrypt is to reverse the process of encryption,
which in this case means encrypting the ciphertext with the
reversed dictionary, to get the plaintext back out (except
converted to uppercase as a side-effect):
>>> encrypt("Able was I ere I saw Elba",p1.dict)
'VYWL OVZ P LIL P ZVO LWYV'
>>> encrypt("VYWL OVZ P LIL P ZVO LWYV",p2.dict)
'ABLE WAS I ERE I SAW ELBA'
When we multiply a permutation by itself, that's like raising a
number to a power, e.g. p1*p1 may be written as p1**2 (** is
Python's exponentiation syntax) or, alternatively, as pow(p1,2).
Likewise, p1.inv() may be expressed as p1**(-1) or pow(p1,-1),
since p1*p1.inv() = [] or the identity element. Raising p1 to
the -3 power is equivalent to raising it to the 3rd power and
then taking the inverse of the result, i.e. pow(p1,-3) =
(p1*p1*p1).inv():
>>> pow(p1,-3)
Permutation: [('Z', 'G', 'T', 'O', 'E', 'R', 'B', 'N'), ('X',
'Q', 'W', 'P', 'H', 'M', 'C', 'S'), ('Y', 'U', 'K', 'F', 'J',
'D', 'L', 'I'), ('V', 'A')]
>>> (p1*p1*p1).inv()
Permutation: [('Z', 'G', 'T', 'O', 'E', 'R', 'B', 'N'), ('X',
'Q', 'W', 'P', 'H', 'M', 'C', 'S'), ('Y', 'U', 'K', 'F', 'J',
'D', 'L', 'I'), ('V', 'A')]
The internal P-class __pow__ method simply invokes the
already-define __mul__ repeatedly, inverting as necessary
(i.e. if the exponent is a negative number). Also p1**0 is
defined to be the identity element, whereas p1**1 is just
p1 itself -- definition analogous to the behavior of * when
used for the "regular" multiplication of integers or reals.
class P:
'''
def __pow__(self,n):
new = P(mkdict([]))
if n==0: return new
for i in range(abs(n)):
new = self * new
if n<0:
new = new.inv()
return new
When a permutation permutes itself, as in p1*p1, the tuple
elements skip ahead one notch, to pair with elements that
had been 2 ahead:
>>> p1
Permutation: [('Z', 'K', 'C', 'N', 'U', 'M', 'B', 'Y', 'H',
'R', 'I', 'P', 'E', 'L', 'W', 'O', 'D', 'Q', 'T', 'J', 'X',
'G', 'F', 'S'), ('V', 'A')]
>>> p1**2
Permutation: [('Z', 'C', 'U', 'B', 'H', 'I', 'E', 'W', 'D',
'T', 'X', 'F'), ('Y', 'R', 'P', 'L', 'O', 'Q', 'J', 'G', 'S',
'K', 'N', 'M')]
Whereas in p1, Z->K, in p1**2, Z->C, which was 2 ahead in p1.
What this suggests is that any given tuple will eventually
point all the way around the cycle back to the start, such
that Z->Z, K->K and so on.
But because the different tuples may have different lengths,
they may not all reach this "identity position" at the same
power. But is their some power where all the tuples are
_guarenteed_ to become the identity mapping, all at the same
time? Yes! That would be the lowest common multiple of the
lengths of all the tuples.
With p1 above, the tuples have lengths 24, and 2. The lcm
is 24, so p1**24 should be the identity element...
>>> p1**24
Permutation: []
Let's try another one:
>>> newp = P()
>>> newp
Permutation: [('Z', 'E', 'G', 'N', 'W', 'Y', 'J', 'Q',
'X', 'F', 'M', 'L', 'T', 'A', 'K', 'H'), ('V', 'B', 'D',
'P', 'R', 'C', 'O', 'U', 'S')]
>>> map(len,mkcycles(newp.dict))
[16, 9]
>>> lcm(16,9)
144
>>> pow(newp,144)
Permutation: []
The power to which you need to raise permutation p to get
the identity element is what we call the 'order of p'. We
can build this into the P class as another method, ord():
class P:
...
def ord(self):
"""
returns the exponent n of p such that p**n = []
(identity element)
"""
return reduce(lcm, map(len, mkcycles(self.dict)))
Usage:
>>> newp.ord()
144
Given this definition, pow(p1,p1.ord()) should always return
the identity permutation, which we may initialize with the
identity dictioinary:
>>> ident = P(mkdict([]))
>>> anyp = P()
>>> pow(anyp,anyp.ord())==ident
1
Once anyp reaches the identity state, the next multiplication
will get us back to the original anyp, and the cycle begins
again, i.e. pow(anyp,1), pow(anyp,2)... pow(anyp,anyp.ord())
is a complete cycle of unique permutations based on a base
of anyp.
Our strategy for building a more sophisticated enciphering engine
will be to have a set of permutations, which we will call
"rotors", organized "odometer fashion" such that when the
first powers itself around a complete cycle, it'll kick the
second to the next power, and the second, when "once around
the clock" will kick the third and so on.
This is like having a mixed-base number (presuming the rotors
have different orders, different periods), and counting up
from 0 in that system. If the order of the first rotor is
7, then we would go 001, 002 ... 006, 010 -- p1 becomes the
identity permutation just as p2 kicks up to its first power.
The engine will run the counter up one click for each letter
that gets enciphered, so the substitution dictionary will
be constantly changing. This will remove the property of
always having the same letter stand for some other letter.
And, to make things even more difficult, we'll throw in
the space character as one more "letter" to be enciphered,
thereby obscuring word boundaries.
Because this enciphering strategy is somewhat akin to that
used during WWII by a number of encyption machines, including
the famous Enigma, I'll call this my Enigma class.
>>> ciphers._domain = ciphers._domain + [' '] # add space to domain
>>> enigma = Enigma([P(),P(),P()]) # use 3 random rotors
>>> enigma.counter.bases # the orders of these rotors
[50, 84, 18]
>>> enigma.encrypt("Able was I ere I saw Elba")
'BFTMSWZZF ESKBMAKXDVSYTJJ'
>>> enigma.reset() # return counter to [0,0,0], self.dict to []
>>> enigma.decrypt("BFTMSWZZF ESKBMAKXDVSYTJJ")
'ABLE WAS I ERE I SAW ELBA'
Notice the ciphertext is no longer palindromic -- isn't the same
forwards and backwards, unlike the original. Plus word spacing
has been effectively masked. Suddenly, our code is much harder
to crack.
The decryption method simply runs the rotors in reverse, and
associatively "undoes" whatever was done by the encryption method,
returning our ciphertext back to the plaintext original (except
for the upper-casing side effect).
[ to be continued ]