A few years ago I saw a BBC documentary called ‘The Secret Life of Chaos’; it’s worth watching. At the end there is a section where a computer designed a sort of walking stick figure and taught it how to balance. At the time I was amazed that a computer could design something by itself. It used something called a ‘genetic algorithm’. After a few years of programming experience, it turns out it is a lot easier than it looks!
I always hesitate to use the word ‘easy’ when talking about programming. Nothing used to frustrate me more than people saying ‘oh it’s easy’. Learning to code is really difficult – nothing is easy. But, relatively speaking, I was surprised how quickly I figured out genetic algorithms.
Defining a problem
As we’ll see later, genetic algorithms aren’t a magic solution to every problem. Simple problems such as ‘which route to drive to work’ can be solved with this type of algorithm but that’s boring. So, I opted for a slightly more complex problem but one that’s easy to understand and makes some nice graphics to look at.
This video is from 1994. It shows creatures designed by a series of genetic algorithms:
1994? Really!? I suspect this required a supercomputer at the time. Anyway, I decided to go for something similar. Rather than designing an actual creature, I gave the computer a ‘pre-set’ creature and sought the best combination of moves to make it travel the fastest.
I went for a caterpillar because it is easy to define the motion but sufficiently complex that I couldn’t confidently figure out the best combination of moves myself. I also chose this because Hannah recently knitted a caterpillar and it seemed a good idea to ‘bring it to life’.
A real life caterpillar
Real life caterpillars consist of a series of segments with feet. They can essentially do two things to move forwards:
- They can ‘squeeze’ any two segments together. Any segments in between will then rise up above the ground.
- They can grip the floor with their feet.
Here are the rules for my caterpillar. This is what the algorithm has to work with:
- The caterpillar consists of 10 connected circular segments.
- Segments 1 and 10 must always be touching the floor.
- Any segment from 2 to 9 can raise itself up by a small amount each move. When a segment raises, the two adjacent segments are pulled inwards.
- At any time any single segment can ‘lock’ in place – this is the same as a real caterpillar gripping the floor with that segment’s feet.
Note that in my model, segments are told to ‘raise and lower’ themselves, and the adjacent segments are then pulled inwards. This is the opposite of a real-life caterpillar, where the segments push in and the central segment(s) rise up as a result. The movement, however, is identical and it is far easier to program this way. The relationship, found by simple trigonometry, is:
distance a segment moves inwards when the adjacent segment is raised by h = h / tan(asin(h / 0.4))
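As a quick check, the relationship can be sketched in Python. This assumes the 0.4 in the formula is the segment spacing, and notes that the expression algebraically simplifies to √(0.4² − h²):

```python
import math

SPACING = 0.4  # the 0.4 in the formula above (assumed segment spacing)

def inward_distance(h):
    """The relationship as given: h / tan(asin(h / SPACING)).
    Valid for 0 < h < SPACING; this equals sqrt(SPACING**2 - h**2)."""
    return h / math.tan(math.asin(h / SPACING))
```

For example, raising a segment by 0.2 gives `inward_distance(0.2) ≈ 0.346`, the same as `sqrt(0.16 - 0.04)`.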
Actually modelling it
Believe it or not, programming in the above rules was more complicated than the genetic algorithm.
In the program, a ‘caterpillar’ was a list of 100 moves. Each move was itself a list of ‘up’, ‘down’ or ‘stay’ commands for each segment. An ‘up’ or ‘down’ command moves the segment up or down by 1/10 of the segment’s radius. Each move also included a value between 2 and 8: the number of the locked segment, which does not move. Only segments on the ground could be locked (since the feet need to be on the ground to grip and hold it in place).
In essence, that’s it. A caterpillar is just a list of moves. So we are trying to find the list of moves that will move the caterpillar the farthest – it’s an optimisation problem.
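The representation can be sketched in a few lines of Python. The function names here are mine, not from the original code, and the locked segment is drawn from 2–8 as described above:

```python
import random

N_MOVES = 100         # moves per caterpillar
MOVABLE_SEGMENTS = 8  # segments that can move up/down each turn

def random_move():
    # one 'up'/'down'/'stay' command per movable segment...
    commands = [random.choice(['up', 'down', 'stay'])
                for _ in range(MOVABLE_SEGMENTS)]
    # ...plus the number of the segment locked for this move
    commands.append(random.randint(2, 8))
    return commands

def random_caterpillar():
    # a caterpillar is just a list of 100 moves
    return [random_move() for _ in range(N_MOVES)]
```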
Finding an optimum solution
Let’s start thinking like a mathematician. The problem is ‘which combination of 100 moves will make a caterpillar travel furthest?’ So, every possible combination is a solution, but some solutions are better than others. Randomly telling segments to jiggle up and down is going to be a poorer solution than one which is more ordered.
Let’s imagine plotting every solution on a graph, where each solution sits on the x axis. We move through the solutions by gradually changing the list of moves each one represents. The y axis gives us the result, i.e. the distance travelled.
Our graph may look something like this:
We can’t plot this graph in real life because we don’t know the result of each solution. So how do we find the solution which gives the peak?
There are various ways to do this. Let’s go through some:
Brute force

I’ve mentioned this in plenty of previous posts. Given any problem, we always ask ourselves, ‘can we solve it by brute force?’

A brute force attack simply involves checking every possible solution in turn and choosing the best once we have checked them all. This approach would actually plot the whole graph – providing you can wait beyond the heat death of the universe. So how many solutions would we have to check? Let’s do a rough calculation:
Every segment can move in one of 3 ways (ignoring locked segments). There are 8 moveable segments and they each can move 100 times for each solution. So that gives (3^8)^100 = 6561^100 possible solutions. Add in the options for locking segments and that’d be a few more orders of magnitude.
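The arithmetic is easy to confirm; the digit count is just for a sense of scale:

```python
# 3 commands per segment, 8 movable segments, 100 moves per solution
solutions = (3 ** 8) ** 100

assert solutions == 6561 ** 100
print(len(str(solutions)))  # a 382-digit number of possibilities
```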
Now we can see the place for an optimisation algorithm. The ‘caterpillar problem’ seems pretty simple at first but in reality, the number of possible solutions is mind-boggling. We can’t brute force this one.
Hill climbing

Imagine our graph of solutions looked something like this:
We could pick a random point to start at and then change our solution a little bit. If the new solution is better, it becomes our current solution. We then change this a little bit and repeat. By iterating over and over again, we’d eventually climb up the hill and reach the best solution.
That is if we start in the right place. Imagine starting at the marked point on the plot above.
With a hill climbing algorithm, once we reach a peak, we are stuck there forever. Since the algorithm can only ‘climb upwards’, it will not necessarily find the best solution.
A small peak that is not the highest one is called a LOCAL OPTIMUM. The highest peak of all is the GLOBAL OPTIMUM.
We need an algorithm which somehow avoids getting stuck in local optima.
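A hill climber only takes a few lines. This is a generic sketch rather than the caterpillar code; `score` and `tweak` stand in for the problem-specific pieces:

```python
import random

def hill_climb(solution, score, tweak, iterations=10_000):
    """Keep a tweaked solution only if it scores better."""
    best = score(solution)
    for _ in range(iterations):
        candidate = tweak(solution)
        candidate_score = score(candidate)
        if candidate_score > best:      # only ever climbs upwards...
            solution, best = candidate, candidate_score
    return solution                     # ...so it can strand on a local optimum
```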
Simulated annealing

I’m not going to go into much detail on this method because it would need a separate post.
The hill climbing method fails because of its tendency to get stuck in local optima. If, instead of moving up the slope every time, we sometimes moved down, we could escape them. This is how simulated annealing works: the probability of accepting a worse solution and moving down the slope decreases over time, modelled on how quickly a heated object cools down.
This sounds bizarre but it is essentially replicating the natural process of annealing, where atoms in a metal tend to arrange themselves in the lowest possible energy state.
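The acceptance rule at the heart of simulated annealing is short. Here is a sketch of the standard Metropolis criterion (not taken from the original code):

```python
import math
import random

def accept(current_score, candidate_score, temperature):
    """Always accept improvements; accept worse solutions with a
    probability that shrinks as the temperature 'cools'."""
    if candidate_score >= current_score:
        return True
    return random.random() < math.exp((candidate_score - current_score) / temperature)
```

At a high temperature the algorithm wanders downhill freely; as the temperature is lowered, it behaves more and more like a pure hill climber.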
Without further investigation, there is no reason to believe this would be more or less effective than a genetic algorithm. But this post is about genetic algorithms, so we’re not using it here.
The Genetic Algorithm
We have the problem defined – we are trying to find the list of 100 moves that will make the caterpillar travel furthest. Let’s look at the algorithm I wrote to find this list:
1. Generate a list of 100 random potential solutions – this is our ‘population’. Since we now essentially have a ‘list of lists of lists…’, it’s a good idea to represent this in a diagram:
2. Unsurprisingly, none of the individuals in this initial population perform well, since they were randomly generated.
3. Pick the top 20% of individuals from the population. These are carried forward into the next generation.
4. From these ‘elite’ individuals, mutate some and add them to the new population. The number of mutated individuals is 20% of the total population size, but can be changed if desired.
5. The rest of the new population consists of ‘descendants’ of the ‘elite’ 20%. To create a descendant, we take two random ‘elite’ individuals and combine their characteristics.
6. We now have a new population of 100 individuals.
7. With the new population, repeat steps 3–6. Print out the distance travelled by the top individual each time.
8. Keep repeating until the distance stops changing – i.e. the algorithm has found its optimum.
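The whole loop above can be sketched generically. This is not my caterpillar code – `random_individual`, `score`, `mutate` and `cross` are placeholders for the problem-specific functions:

```python
import random

def evolve(random_individual, score, mutate, cross,
           pop_size=100, generations=1000, elite_frac=0.2, mutant_frac=0.2):
    # step 1: a population of random potential solutions
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)                # rank by fitness
        elite = population[: int(pop_size * elite_frac)]        # top 20% carried forward
        mutants = [mutate(random.choice(elite))                 # mutated elites
                   for _ in range(int(pop_size * mutant_frac))]
        children = [cross(random.choice(elite), random.choice(elite))
                    for _ in range(pop_size - len(elite) - len(mutants))]
        population = elite + mutants + children                 # the new population
    return max(population, key=score)
```

Running it on a toy problem (maximise the number of 1s in a 10-element list) converges in a few dozen generations, which is a handy sanity check before pointing it at something expensive like the caterpillar.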
In my opinion, this is surprisingly simple and intuitive. We need to look at a few details though:
Recall that each individual consists of a list of 100 moves, and each move consists of a list of 8 ‘segment moves’ plus a ‘locked segment allocation’. How do we cross these lists of lists?
I decided to make two types of ‘cross functions’:
- In most cases, the two ‘parents’ produce a ‘simply crossed offspring’. This simply involves picking a random proportion of moves to cross; call the proportion ‘k’. k percent of the first parent’s moves are added to 100-k percent of the other parent’s moves to get a new list of moves. This is the offspring.
- In some cases, a ‘complex cross’ occurs. This follows the same principle as the simple cross, but the individual segment moves within each move are also crossed. This ensures more diversity is constantly added to the model.
It is worth noting I decided to add the ‘complex cross’ function empirically. After a few trial runs, it became apparent that the algorithm wasn’t finding a solution particularly quickly.
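One plausible reading of the two cross functions in Python – a contiguous split stands in for ‘k percent of one parent plus the rest of the other’:

```python
import random

def simple_cross(parent_a, parent_b):
    """'Simply crossed offspring': a random proportion of moves from
    one parent, the rest from the other (here as a contiguous split)."""
    k = random.randrange(len(parent_a))
    return parent_a[:k] + parent_b[k:]

def complex_cross(parent_a, parent_b):
    """'Complex cross': the same principle, but the segment commands
    inside each move are crossed between the parents as well."""
    return [simple_cross(move_a, move_b)
            for move_a, move_b in zip(parent_a, parent_b)]
```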
Writing the mutate function proved slightly challenging, but it only needed a bit of common sense.
Examples of genetic algorithms that I’d studied would mutate an individual by slightly changing a single element of a list. Since, in our case, an individual contained 900 elements (8 segment moves and a locked segment allocation for each of the 100 moves), changing a single element was unlikely to introduce diversity to the population at a fast enough rate.
Instead, I wrote the mutate function to apply small mutations most of the time and large mutations occasionally.
A mutation consists of changing an element of a move for an individual. The function goes over every move and generates a random number. If this number is below a predetermined value, this move is chosen for mutation. This is then repeated with a second randomly generated number for each segment in the move (including the locked segment allocation). If the second random number is lower than a predetermined value, the segment is mutated.
A ‘mutation’ simply consists of randomly changing the segment move to another value.
I settled on non-exceedance values of 2% and 10% respectively for the random numbers. That way, we’d expect 2% of moves for an individual to be mutated, and approximately 1 segment to change with each mutated move.
Of course, this occasionally gives far larger mutations which will occasionally cause the algorithm to ‘escape’ from a local optimum.
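A sketch of this two-level mutation, using the 2% and 10% rates settled on above. For brevity it redraws every element from the same ‘up’/‘down’/‘stay’ set; the real locked-segment entry would be redrawn from 2–8 instead:

```python
import random

MOVE_RATE = 0.02     # ~2% of moves selected for mutation
ELEMENT_RATE = 0.10  # ~10% of commands within a selected move change

def mutate(individual):
    mutant = []
    for move in individual:
        if random.random() < MOVE_RATE:
            # redraw roughly 10% of this move's commands at random
            move = [random.choice(['up', 'down', 'stay'])
                    if random.random() < ELEMENT_RATE else command
                    for command in move]
        mutant.append(move)
    return mutant
```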
On the face of it, the caterpillar problem doesn’t look too difficult. However, I did briefly have a go at working out the best solution myself (I failed miserably).
The bottom line is there are more options than it first appears.
Is it better to move like this:
A series of these (like a spring):
Some combination of the above? Some other option that I haven’t thought of?
In the end, I decided that 7.6 was the maximum possible distance the caterpillar could travel in 100 moves.
Let’s look at how the caterpillar evolved with each generation. I started it off with the head at position 10 on a number line.
The first generation, being randomly generated, just looked tragic:
By 100 generations, it was starting to go somewhere:
By 1000 generations, there was a definite regime. It was now clear the ‘spring-type’ motion was probably going to be the solution.
At this point, it is worth noting that the improvement per generation diminishes as we add more generations. Occasionally there was a jump (probably when a mutation carried the algorithm out of a local optimum), but generally it could go on for hundreds of iterations without improvement. Watching the new values printing to the screen almost drove me crazy: ‘do I stop it now? Is it going to improve any more?’
I decided to let it run for 100000 iterations. It took 2 days but that’s ok, I wasn’t in a rush.
By 20000 generations it looked like there wasn’t going to be much more improvement:
And by the end – 100000 generations, we have a really nice solution:
The genetic algorithm gave a travel distance of 8.1.
Here are my thoughts on the solution:
- After viewing the result of the 1000th iteration, I realised my prediction of 7.6 was wrong. I’d assumed the caterpillar was going to move each segment fully up and down for each ‘flex of the spring’. Of course, anyone with a basic understanding of a circle would realise it’s more efficient to raise alternating segments up to their full height and just move the others the minimum amount.
- It is interesting to note the ‘raise up’ and ‘lower down’ at the start and end to maximise the distance covered over the limited number of moves.
- I also realised two segments would have to stay at the same level throughout, due to there being an even number of segments and the constraint that the first and last segments stay grounded. The algorithm chose the first two possible segments to ‘lock together’.
- I also never thought of raising the grounded segments slightly on each move to maximise the distance.
- The solution doesn’t look ‘elegant’. I expected a nice wavy motion, or a smooth ‘springy’ motion. This is the nature of this type of algorithm – they give the best not the prettiest solution. Humans are often terrible at finding the best solution.
This algorithm was written in an evening; imagine what could be done by a team of programmers in a month. My example certainly doesn’t show the full potential of genetic algorithms. A YouTube search will throw up some fascinating cases where it seems an algorithm has failed, but when the solution is tried out, it turns out to be far smarter than anything a human could devise.
It is easy to get carried away about optimisation algorithms. On the face of things, they can appear frighteningly lifelike. They are used widely in the real world and will continue to improve, finding solutions that humans would never think of.
On the other hand, goo is pretty good at optimisation too.
This is the code for the genetic algorithm. It is run using the ‘evolve’ function. The code is object-oriented to a degree, which makes it a little easier to follow. At present, it saves one in every ten generations to be animated later. This uses up a lot of memory and can be turned off by commenting out the log.append function on line 315:
To animate caterpillars from the log of moves from the above code, use the ‘animatecaterpillar’ function here. The position of the caterpillar to animate in the list needs to be changed (see the comments):