[Tutor] My shot at the partitioning problem

Roeland Rengelink r.b.rigilink@chello.nl
Sat, 18 Aug 2001 09:49:55 +0200


Hi Danny,

I'm also curious about GA (Genetic Algorithm) approaches to searching.
However, thinking this problem over a bit, I concluded that this problem
is unsuitable for GA algorithms.

Here's my thinking on the suitability.

GA relies on three mechanisms to produce potentially better offspring
from the most suitable parents: inheritance, crossover and mutation,
with inheritance and crossover usually combined in one function.

I think the basic assumption in GA algorithms is that the phenotypic
distance to an optimal solution is proportional to the genotypic
distance to an optimal solution. That is, if a particular string of DNA
encodes a close to optimal solution, then that string will be very
similar to the string of DNA encoding _the_ optimal solution.

Looking at mutations we see that in this case that assumption doesn't
hold.

We have:
Goal:
  Divide [sqrt(i) for i in range(1,51)] in two sets, S1 and S2, such
that the sum
  of the values in each set are as close to each other as possible.

Phenotypic Distance:
  distance = abs(sum(S1)-sum(S2))

A one-point mutation moves one element from S1 to S2, or vv. Suppose
that the current distance is <0.5, then any one-point mutation will
increase the distance. Best case is moving the smallest element from the
set with the largest sum. But that element is always greater than 1,
hence, the distance will increase. So, the minimum distance change for a
one point mutation is 1. Similarly the minimum distance change for a
two-point mutation is sqrt(50)-srtq(49) = 0.071 (exchange of two
elements).  The minimum distance change for a three element mutation is
0.0014 (sqrt(50)-sqrt(12)-sqrt(13)) (There are three-point mutations
that don't change the distance). Finding the minimum distance change for
a four-point mutation is left as an exercise for the reader.

The conclusion is that the closer you are to an optimal solution, the
more mutations you need to improve that solution -> Not suitable for GA.

I find it difficult to see exactly what inheritance does here.
Given two parents with small distances, then there are two
possibilities:

1. Parents are very similar -> child is very similar to both parents
which  doesn't help here.

2. Parents are very differents -> child may be very different from both
parents, which previous analysis showed may be exactly what you want.
But I can't show that given the fitness of both parents this child will
be fit. I think that this may in fact be a requirement of the
reproduction function in GA.

One note on your code. You have:

def stringCrossover(s1, s2, probability=0.5):
    """A simple crossover operator on strings s1 and s2."""
    for i in range(len(s1)):
        if random.random() < probability:
            return s1[:i] + s2[i:]
    return s1

ie.

50 % s1[:0]+s2[0:]
25 % s1[:1]+s2[1:]
12.5% s1[:2]+s2[2:]
etc.

Is that really what you want?

I thought more something like:

def stringCrossover(s1, s2):
    """A simple crossover operator on strings s1 and s2."""
    i = random.randrange(len(s1))
    return s1[:i]+s2[i:]

Any thoughts?

Roeland

Danny Yoo wrote:
> 
> Ok, here's my contribution to the partitioning problem.  I can only get it
> to 2 decimal places too, so I'm sure that someone can do much better!
> 
> Sample output:
> 
> ###
> Our best choice (00001010010001100011101001110110000101010110110101)
> has a value of 119.516797799
> Difference between it and our target: 0.00110250314594
> Ok, done computing.  Our best choice,
> sqrt(5) + sqrt(7) + sqrt(10) + sqrt(14) + sqrt(15) + sqrt(19) + sqrt(20) +
> sqrt(21) + sqrt(23) + sqrt(26) + sqrt(27) + sqrt(28) + sqrt(30) +
> sqrt(31) + sqrt(36) + sqrt(38) + sqrt(40) + sqrt(42) + sqrt(43) +
> sqrt(45) + sqrt(46) + sqrt(48) +
> sqrt(50)
> is THIS close to our target: 0.00110250314594
> ###
> 
> Actually, the program is a bit long.  I'll attach it to this message
> instead, and send it off to Useless Python.  I'd have to say that this
> approach was much more Useless than I had anticipated... *grin*.
> 
>   ------------------------------------------------------------------------
>                  Name: genetic.py
>    genetic.py    Type: Plain Text (TEXT/PLAIN)
>              Encoding: BASE64
> 
>                      Name: partition50.py
>    partition50.py    Type: Plain Text (TEXT/PLAIN)
>                  Encoding: BASE64

-- 
r.b.rigilink@chello.nl

"Half of what I say is nonsense. Unfortunately I don't know which half"