Home > Genetic Programming, Software Development > Watchmaker: A First and Second Example

Watchmaker: A First and Second Example


Time to look at the framework that I was sure I was going to like more than JGAP and ECJ.

I do like it; more than JGAP, but not as much as ECJ. I’ll send flowers later.

Since both of the examples, Hello World (genetic algorithm) and Simple Math Test (a genetic program), are already done in Watchmaker all I will be doing is explaining how they were done using the Watchmaker framework.

Hello World – A Genetic Algorithm

This is straight out of the Watchmaker Framework example code. Only two classes are needed:

  • StringEvaluator, the fitness function
  • StringsExample, the class that starts the evolution

The StringEvaluator class does three things:

  • hold onto the string the chromosomes are supposed to evolve into
  • check every character of the candidate string to determine how close it is to the desired string
  • return false when asked if a valid fitness value increases or decreases.

The StringsExample class creates the evolutionary environment in which the strings will evolve. The interesting stuff happens in here (reformatted to fit on this page).

This first section creates crossover and mutation objects specific to strings.

  • StringMutation mutates the evolving string using the alphabet as mutation values with a probability value of 0.02.
  • StringCrossover handles the crossover of strings based on the defined number of crossover points. The default is 1 crossover point.

The pipeline contains the list of operators to be applied to the population; in this case just the two we just discussed.

StringsExample.java
...
public static String evolveString(String target)
{
    List<EvolutionaryOperator> operators
                       = new ArrayList<EvolutionaryOperator>(2);
    operators.add(new StringMutation(ALPHABET, new Probability(0.02d)));
    operators.add(new StringCrossover());
    EvolutionaryOperator pipeline
                      = new EvolutionPipeline(operators);

The EvolutionEngine is the petri dish where everything happens. The configuration objects are as follows:
StringFactory, creates the initial chromosomes/individuals for the population
– pipeline, defined above
StringEvaluator, the fitness function also discussed above
RouletteWheelSelection, selects candidates at random where the probability is proportional to the candidates fitness score
MersenneTwisterRNG, a faster more reliable random number generator

StringsExample.java
    ...
    EvolutionEngine engine
          = new ConcurrentEvolutionEngine(
                      new StringFactory(ALPHABET, target.length()),
                      pipeline,
                      new StringEvaluator(target),
                      new RouletteWheelSelection(),
                      new MersenneTwisterRNG());

The EvolutionObserver is an interesting construct. This visitor gives you a front row seat of things as they happen. It prints out a basic string, but has the potential for coolness.

StringsExample.java
    ...
    engine.addEvolutionObserver(new EvolutionLogger());

Finally, turn on the engine for a population of 100 with a 5% elitism rate. Stop evolving when the fitness score equals 0.

StringsExample.java
    ...
    return engine.evolve(100, // 100 individuals in the population.
                           5, // 5% elitism.
                         new TargetFitness(0, false));
}

The code for the above is part of the Watchmaker Framework so download and poke it to your heart’s content.

Simple Math Test – A Genetic Program

The Watchmaker documentation states that it is a framework for genetic algorithms. That made me quite concerned as Toby Segaran’s Simple Math Test is not an example of a genetic algorithm. However, a quick look at the examples in the WF download revealed a package named org.uncommons.watchmaker.example.geneticprogramming. In that package there is a file named GeneticProgrammingExample.java and it states:

GeneticProgrammingExample.java
...
/**
 * Simple tree-based genetic programming application based on the first example
 * in Chapter 11 of Toby Segaran's Progamming Collective Intelligence.
 * @author Daniel Dyer
 */

Gotta love it. Less work for me and I don’t have to fight my way into the framework to figure out how to use WF in a genetic programming context. It seems unfair that both of the examples I chose are already done in WF, but that’s life. However, I will come up with a GP example that WF, JGAP, and ECJ have not done and see what it takes to implement them. One day. Soon.

[Perhaps Daniel Dyer will include the GP-related bits in an upcoming version of Watchmaker. Code reuse is a nice thing. I am hoping ECJ does that as well.]

The example geneticprogramming package contains the following files:
Node.java
BinaryNode.java
LeafNode.java

IfThenElse.java
IsGreater.java

Addition.java
Subtraction.java
Multiplication.java
Constant.java
Parameter.java

Simplification.java

TreeCrossover.java
TreeEvaluator.java
TreeFactory.java
TreeMutation.java

GeneticProgrammingExample.java

That’s a lot of files. Not having many of those classes in the framework shows.

In GP, the solution is an executable tree. During the evolution of the solution there will be trees that are not executable as well as trees that execute, but produce incorrect solutions.

In Watchmaker, the EvolutionEngine controls the process of creating a population, evolving them using various operators like crossover and mutation, checking their fitness and selecting the survivors for the next generation. This entails configuring it with:

In this case, the EvolutionEngine, an instance of ConcurrentEvolutionEngine, is configured with:

  • a TreeFactory which produces a tree of Nodes
  • an EvolutionPipeline which contains the crossover, mutation and simplification operators
  • a TreeEvaluator which will evaluate the fitness of the current tree/chromosome (given 2 input values does the tree produce the desired output value?)
  • a RouletteWheelSelection strategy
  • a MersenneTwisterRNG random number generator

You can read more about the RouletteWheelSelection in the Watchmaker Javadocs and at the Newcastle University Engineering Design Center, and about the MersenneTwisterRNG in the Uncommons Math Package Javadocs and at the university where it was developed. The Reader’s Digest version:

  • RouletteWheelSelection – selects candidates by giving them a higher chance of being randomly selected based on their fitness score. The higher the score the higher ther probability of their being chosen. They might be chosen more than once.
  • MersenneTwisterRNG – a random number generator that is very random, very fast and very predictable. It is used by various pieces of Watchmaker to guarantee randomness where it is needed.

Let’s look at the fitness function next. The TreeEvaluator is initialized with an array of two input values and an output value. The code, while different than that found in Collective Intelligence, heads for the same target: if the difference between the result from the evolved formula and the supplied output value is zero then we have a winner.

TreeEvaluator.java
  ...
  double actualValue = candidate.evaluate(entry.getKey());
  double diff = actualValue - entry.getValue();
  error += (diff * diff);

The Watchmaker code takes the difference between the actual and the expected values, squares it and adds it to the error value for each iteration of the input values. When error equals zero then the formula works. Same goal, different path.

The EvolutionPipeline contains three operators that are executed in the order in which they are given:

GeneticProgrammingExample.java
...
    List<EvolutionaryOperator<Node>> operators = new ArrayList<EvolutionaryOperator<Node>>(3);
    operators.add(new TreeMutation(factory, new Probability(0.4d)));
    operators.add(new TreeCrossover());
    operators.add(new Simplification());
...
    EvolutionEngine<Node> engine
        = new ConcurrentEvolutionEngine<Node>(...
                                        new EvolutionPipeline<Node>(operators),
                                        ...);

When a new population is created the EvolutionaryOperators are called in order:

  1. TreeMutation, with a mutation probability of 40%
  2. TreeCrossover, which is a single-point crossover
  3. Simplification, which asks each Node to simplify itself; for example, if the Node contains 3 + 5 then simplify it to 8, but if the Node contains x + 5 (a variable and a constant), then leave it alone.

All that is left is the TreeFactory. The TreeFactory needs to know 4 things to do its job:

  • the parameter count for the nodes that are not constants (for example, Addition gets 2 parameters)
  • the maximum depth of any given tree (too shallow doesn’t allow for a rich enough execution tree, too deep might generate a bunch of useless code)
  • the probability to use in the creation of functions
  • the probability to use in the creation of parameters

The EvolutionEngine will call TreeFactory.generateRandomCandidate(), which will call TreeFactory.makeNode(), to randomly create Nodes of functions, parameters or constants:

TreeFactory.java
  ...
private Node makeNode(Random rng, int maxDepth)
{
  if (functionProbability.nextEvent(rng) && maxDepth > 1)
  {
      // Max depth for sub-trees is one less than max depth for this node.
      int depth = maxDepth - 1;
      switch (rng.nextInt(5))
      {
        case 0: return new Addition(makeNode(rng, depth), makeNode(rng, depth));
        case 1: return new Subtraction(makeNode(rng, depth), makeNode(rng, depth));
        case 2: return new Multiplication(makeNode(rng, depth), makeNode(rng, depth));
        case 3: return new IfThenElse(makeNode(rng, depth), makeNode(rng, depth), makeNode(rng, depth));
        default: return new IsGreater(makeNode(rng, depth), makeNode(rng, depth));
      }
  }
  ...

When the TreeFactory creates a function it is only creating one of five possibilities. Adding additional baseline functions is pretty trivial once you see it presented like this. Daniel Dyer really has done a great job.

The last thing the TreeFactory does is either create a parameter with a parameter count of 1 or 0, or return a constant with a value of 0-10. All too easy.

TreeFactory.java
  ...
  else if (parameterProbability.nextEvent(rng))
  {
    return new Parameter(rng.nextInt(parameterCount));
  }
  else
  {
    return new Constant(rng.nextInt(11));
  }
}

Just for fun I have included an example of the code from two of the Nodes: Addition and IfThenElse.

Addition.java
  ...
  public Addition(Node left, Node right)
  {
      super(left, right, '+');
  }

  ...
  public double evaluate(double[] programParameters)
  {
    return left.evaluate(programParameters) + right.evaluate(programParameters);
  }
  ...

The constructor for Addition takes in two Nodes and a printable version of the operation. There is no check to see if the Nodes can evaluate to something useable by the Addition object; this is part of the evolution process. The trees that don’t work will be disposed of.

The evaluate() method calls evaluate() on the left and right Nodes and adds them together. ‘Nuff said.

The next example is the IfThenElse Node. It contains three nodes: a condition, code to execute if the condition is true, or code to execute if the condition is false.

IfThenElse.java
  ...
  public IfThenElse(Node condition, Node then, Node otherwise)
  {
    this.condition = condition;
    this.then = then;
    this.otherwise = otherwise;
  }
  ...

Again, notice no check on the actual capabilities of each Node.

IfThenElse.java
  ...
  public double evaluate(double[] programParameters)
  {
    return condition.evaluate(programParameters) > 0 // If...
           ? then.evaluate(programParameters)   // Then...
           : otherwise.evaluate(programParameters);  // Else...
  }
  ...

The evaluate() method resolves to the Java ternary operator and returns the result of either the then Node or otherwise Node. Yes, there is more code in the IfThenElse class (about 200 lines including comments), but the concepts are approachable and the code is quite clean/clear.

There. The Watchmaker framework and the Hello World and Simple Math Test examples.

I feel better now.

The Output

...
Generation 28: 493.0
Generation 29: 233.0
Generation 30: 233.0
Generation 31: 45.0
Generation 32: 5.0
Generation 33: 0.0
(((arg0 + 4.0) * arg0) + ((arg1 + -5.0) + (((6.0 + arg1) + 4.0) - arg0)))

The above simplified becomes:

(((arg0 + 4.0) * arg0) + ((arg1 + -5.0) + (((6.0 + arg1) + 4.0) - arg0)))
arg0^2 + 4*arg0 + arg1 - 5 + 6 + arg1 + 4 - arg0
arg0^2 + 3*arg0 + 2*arg1 + 5

If we make arg0 –> x and arg1 –> y then the above becomes x^2 + 3*x + 2*y + 5 which matches Toby Segaran’s formula. Go team!

And finally, in my never ending attempt to compare apples-to-apples, and failing miserably I might add, I present the same problem run by each of the three frameworks:

JGAP: 25 generations @ 1000 individuals/generation;
ECJ: 7 generations @ 1024 individuals/generation;
Watchmaker: 33 generations @ 1000 individuals/generation;

ECJ wins again.

The Bad

I was kinda disappointed when I realized that GP was not supported out of the box; the example above was an easy way to discover how to implement GP in Watchmaker, but it felt strange that such a good framework would leave it out. However, Watchmaker is at version 0.6.2 so I have high hopes for the future. Keep going!

The Good

Watchmaker is so easy to use! It uses all kinds of great design ideas, naming conventions, examples, etc. C’mon! You’re using generics! You rock!

The Code

Download the Watchmaker framework and go to town.

Advertisements
  1. Dan
    September 14, 2009 at 6:59 pm

    This is an interesting series of articles. I’d been meaning to investigate JGAP and ECJ more closely, so I’m glad that somebody impartial has compared them against Watchmaker.

    ECJ and JGAP are definitely more established and more comprehensive than Watchmaker. Where I hope Watchmaker has an advantage is the different design approach it takes. The initial focus has been on trying to get the core abstractions right and to get good multi-threaded performance.

    You’re right to highlight the lack of explicit support for GP, and that’s something that I’m open to addressing. I haven’t added any GP classes to the framework yet because the code for the tree-based GP example in Watchmaker feels a bit cumbersome to me. That might just be because Java’s not so suited to this kind of dynamic programming (compared to functional languages like Haskell and Lisp), or perhaps I just haven’t found the best approach yet.

    Putting a positive spin on the lack of explicit GP support, in some ways it’s reflective of the “Watchmaker philosophy”. The key idea is that it doesn’t matter what type of entity you evolve. The framework provides the machinery for type-safe, generational evolutionary algorithms, which you can then apply to any type of object. A GP tree is just one of the unlimited number of types that you can evolve. This also means you aren’t restricted to tree-based GP, you can do linear GP and probably graph-based GP too.

    In response to your figures on the relative performance of the 3 frameworks, I would expect that the same evolutionary algorithms with the same parameters (population size, elitism, mutation/cross-over probabilities) should, when averaged over several runs, complete in the same number of generations regardless of the framework used. Isolating why they don’t would be useful, though probably not so easy as there may be subtle differences in how the different frameworks implement certain concepts.

    What would be more interesting (to me at least) from a performance perspective would be to measure the throughput of each framework. Given the same problem and using the same population size, how many generations per second can each framework process?

    Dan.

    • tr8dr
      November 19, 2009 at 2:49 pm

      I’m a bit late to this discussion, however would like to point out that the performance difference is less about the framework and rather more about algorithms. For instance ECJ supports Differential Evolution and Swarming approaches, both of which are often superior to straight GA.

      I’ve been a long time user of JGAP and now looking to switch due to some inadequacies in the framework, but also because its algorithms are too basic (no DE for instance).

      Watchmaker looks interesting in that it is clean. Any plans to add Differential Evaluation?

  2. cvalcarcel
    September 16, 2009 at 6:57 pm

    Dan,

    I almost don’t know what to say…and that doesn’t happen often.

    Thank you for your comments! And now about your comments:
    – yes, your abstractions are quite cool. I really do like all three frameworks, but I think of them as ECJ, Watchmaker and JGAP (in that order). It has been a great experience working with all three.

    – I understand that creating the GP portion of the framework in Java can be challenging. GA/GP programming is not what Java was designed for; I wonder what a GA/GP oriented language would be like?

    – What is interesting about what is missing from Watchmaker (and ECJ and JGAP) is what makes all gaps interesting: why is it missing? How can it be added? Is it worth adding? ECJ admits to avoiding generics due to overhead. I wonder if that is still true given the proliferation of dual- and quad-core processors? Given the GP example in Watchmaker writing entry-level GP should be much simpler. Would it have been nice to have them built-in? Well, yes. Is it an impediment to learning GP? No.

    – I must admit to feeling comfortable in tree-based GP, but linear and graph-based GP looks interesting as well. Could be a future post.

    – About the relative performance: while it may not be as bad as comparing apples-to-compilers the fact is that the frameworks kinda sorta do the same thing. Yes, it would be great to create three equally configured versions and see what happens; it would be just as interesting to see what kinds of optimizations could be done to all 3 frameworks to show them at their best; speed is not the only consideration (being the root of all evil and all). In the case of the examples I chose there were very black-and-white answers to the problems. GP is about finding optimal solutions to gray problems. Would all three frameworks give the same optimal solution? If yes, why? If no, why not? Did the best solution take the longest? The shortest? Those are interesting questions! I am looking forward to finding out.

    BTW, in case no one has told you this today: thank you for taking the time to implement Watchmaker (and to the implementers of JGAP and ECJ: thank you! Thank you!). It is a great example of OO, Java, Java features and design, and an interesting way to learn/use GA.

    Thank you! I look forward to seeing it grow! I have a couple of examples from Toby Segaran’s book that I want to implement in each of the frameworks, but this time I will do a different one per framework.

  3. blissweb
    September 16, 2016 at 3:07 am

    This is awesome work. Can’t believe I’m only just finding this now. I can’t wait to try out Watchmaker after realizing that JGAP is pretty slow and clunky. ECJ looks way too complex for my little brain for the moment. Totally agree about the Tree stuff really should be in the main Watchmaker package. If you hadn’t have pointed out those classes buried in the examples I would have probably have moved on to another package.

  4. blissweb
    September 18, 2016 at 1:36 am

    Just started fiddling with Watchmaker. I have a question. In the theory of Genetic Programming, some of the Terminals can be zero returning functions such as Move or DidMove etc. So how would I encode these functions as part of my parameters. In the above example they are all simple values, and I can’t see any examples. In JGAP it seems straightforward.

  1. September 17, 2009 at 9:18 am

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: