
JGAP: A First/Simple Tutorial


(This is going to be a long one.)

Genetic Programming. The only thing that strikes more fear in my heart is Lisp. Another blogger whose post I can’t find described learning Lisp as being similar to scaling a vertical wall. I have spent quite a bit of time reading about Lisp and sadly have to agree.

However, I have hope. I am reading Peter Norvig’s Paradigms of Artificial Intelligence Programming and, while it is endlessly fascinating, the Lisp itself is only marginally more appealing. He does a great job of describing certain Lisp concepts in a way that even I can understand. The chapter on Eliza is one of the best of the book.

Another interesting but daunting software concept is Genetic Programming (GP). After reading the chapter on genetic programming (Evolving Intelligence) in Toby Segaran’s book Programming Collective Intelligence I found my interest piqued and started to look deeper and deeper into GP. I follow a lot of topics related to software, but most I only follow enough to know they exist and may or may not be interesting. GP has always been on the interesting side, but with the potential to be quite involved.

After reading the chapter on GP in Programming Collective Intelligence, and forcing a number of my colleagues to read it as well, I came away thinking that perhaps it was time to start delving into it. After all, it looks hard, and if it looks hard it must be worth trying.

<MyThoughts>

Hmm.

Python. Looks a lot like Lisp.

Uses lambda functions. Looks a lot like Java inner classes without all the syntactic sugar.

Cool. Small functions. Maybe I should learn more Python after all…

Running it from the Python command line interpreter. Interesting. I wonder how hard that would be to do in Eclipse? I have only used Pydev for small Python objects at work.

Hmm. The code creates a number of wrappers to functions.

Primitive tests.

Wouldn’t I have to write a lot of Java to duplicate this? Might be an interesting project, but looks tedious.

</MyThoughts>

Over time I had collected links to various open source projects about GP, but it was not until I read the Segaran chapter that I decided to look at any of them.

I settled on JGAP. Why? Laziness…I mean…less work. That’s it; less work. Also, it is written in Java so I feel comforted.

JGAP purports to do quite a bit of work for the developer, but allows you to extend things if you so choose. There are a number of examples shipped with the framework so how hard can it be?

The example from the GP chapter in Collective Intelligence is a list of inputs and their associated outputs (Programming Collective Intelligence, page 259):

Input 1   Input 2   Output
26        35        829
8         24        141
20        1         467
33        11        1215
37        16        1517

The point of the exercise is to discover what equation will return the result given the 2 input values. The answer is x^2+2y+3x+5 where x is input 1 and y is input 2 (Programming Collective Intelligence, page 259).
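Just to convince myself (this check is mine, not from the book), a few lines of Java confirm that the formula reproduces every row of the table:

```java
public class FormulaCheck {
    public static void main(String[] args) {
        // Rows from the table above: input 1, input 2, expected output
        int[][] rows = {
            { 26, 35, 829 },
            { 8, 24, 141 },
            { 20, 1, 467 },
            { 33, 11, 1215 },
            { 37, 16, 1517 }
        };
        for (int[] row : rows) {
            int x = row[0], y = row[1];
            // x^2 + 2y + 3x + 5
            int result = x * x + 2 * y + 3 * x + 5;
            System.out.println(result + " (expected " + row[2] + ")");
        }
    }
}
```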

Eventually, the Python code breeds a solution that looks like this (Programming Collective Intelligence, page 267):

(X*(2+X))+X+4+Y+Y+(10>5)

If you simplify the above you get:

2x+x^2+x+4+2y+1

x^2+2x+x+2y+4+1

x^2+3x+2y+5

Which is pretty much what Toby Segaran was looking for. He briefly talks about how inefficient the code out of a genetically created program can be, but it is pretty awesome in any case.
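As a sanity check of my own (the book’s boolean node makes (10>5) contribute 1 in the integer tree), the bred expression really does agree with the target formula on every row:

```java
public class BredProgramCheck {
    public static void main(String[] args) {
        int[][] rows = { { 26, 35 }, { 8, 24 }, { 20, 1 }, { 33, 11 }, { 37, 16 } };
        for (int[] row : rows) {
            int x = row[0], y = row[1];
            // The bred program: (X*(2+X))+X+4+Y+Y+(10>5), with (10>5) contributing 1
            int bred = (x * (2 + x)) + x + 4 + y + y + 1;
            // The target: x^2 + 2y + 3x + 5
            int target = x * x + 2 * y + 3 * x + 5;
            System.out.println(bred + " == " + target + " : " + (bred == target));
        }
    }
}
```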

If you want to take a look at the Python code that was used then read the sample chapter on Safari Books Online.

So today’s goal is: duplicate the Python result with JGAP using the least amount of code possible. How to do that?

I am not going to discuss GP in any real depth. I want to get JGAP up and running and see what I can do with a simple example.

The JGAP documentation lists 4 steps involved in creating a GP using JGAP:

  1. Create a GP Configuration Object
  2. Create an initial Genotype
  3. Evolve the population
  4. Optionally implement custom functions and terminals (a terminal being something like a mutable static number)

The JGAP documentation leaves out the use of a fitness function in its GP section, even though you have to use one and one is already implemented for you. I would list that as the first thing you should do.

In Test-driven Development it would be:

  1. Write a test
  2. Write whatever code it takes to pass the test
  3. Refactor the code so you won’t be embarrassed when your mom sees it
  4. Repeat

If we think of the fitness function as a test program then we can think of GP as a type of programming that can make use of TDD. In fact, the folks at NeoCoreTechs are working on extreme genetic programming (XGP) as a way of growing applications rather than writing them. Sounds like a noble goal, but a hard one. In other words, sounds like it’s worth doing.

Let’s port the example from the Programming Collective Intelligence book to JGAP.

Implement a Fitness Function

The fitness function (unit test for you TDD’ers) needs to score the incoming chromosomes/solutions. The original Python fitness function is:

def scorefunction(tree,s):
  dif=0
  for data in s:
    v=tree.evaluate([data[0],data[1]])
    dif+=abs(v-data[2])
  return dif

As the value returned by the chromosome gets closer to the desired output the result of the subtraction gets closer to 0.

The Java evaluate() method, which returns a double, looks like this:

    protected double evaluate(final IGPProgram program) {
        double result = 0.0;

        long longResult = 0;
        for (int i = 0; i < _input1.length; i++) {
            // Set the input values
            _xVariable.set(_input1[i]);
            _yVariable.set(_input2[i]);
            // Execute the genetically engineered algorithm
            long value = program.execute_int(0, NO_ARGS);

            // The closer longResult gets to 0 the better the algorithm.
            longResult += Math.abs(value - _output[i]);
        }

        result = longResult;

        return result;
    }

(I know: I could have returned longResult directly and let the implicit widening conversion from long to double do the work, but I didn’t want to.)

It looks rather similar to the original Python code, only with more type ceremony. The _xVariable and _yVariable are references to objects under the control of the chromosomes in the population; holding those references means we can give the chromosomes new values to help them do their job: figuring out the formula that produces the desired output.

Create a GP Configuration Object

I have the SimpleMathTest class initialize itself in its constructor so this is where I create the GPConfiguration object. Values for the maximum initialization depth, population size and maximum crossover depth are arbitrary for now. These are numbers I found used in some of the other JGAP examples and figured they were good enough.

In addition, I assigned the fitness function to GPConfiguration using setFitnessFunction().

    public SimpleMathTest() throws InvalidConfigurationException {
        super(new GPConfiguration());

        GPConfiguration config = getGPConfiguration();

        _xVariable = Variable.create(config, "X", CommandGene.IntegerClass);
        _yVariable = Variable.create(config, "Y", CommandGene.IntegerClass);

        config.setGPFitnessEvaluator(new DeltaGPFitnessEvaluator());
        config.setMaxInitDepth(4);
        config.setPopulationSize(1000);
        config.setMaxCrossoverDepth(8);
        config.setFitnessFunction(new SimpleMathTestFitnessFunction(INPUT_1, INPUT_2, OUTPUT, _xVariable, _yVariable));
        config.setStrictProgramCreation(true);
    }

Create an initial genotype

The genotype represents a configured GP environment. This is where the _xVariable and _yVariable references, the same ones the fitness function uses, get handed to the GP as part of its node set.

    public GPGenotype create() throws InvalidConfigurationException {
        GPConfiguration config = getGPConfiguration();

        // The return type of the GP program.
        Class[] types = { CommandGene.IntegerClass };

        // Arguments of result-producing chromosome: none
        Class[][] argTypes = { {} };

        // Next, we define the set of available GP commands and terminals to
        // use.
        CommandGene[][] nodeSets = {
            {
                _xVariable,
                _yVariable,
                new Add(config, CommandGene.IntegerClass),
                new Multiply(config, CommandGene.IntegerClass),
                new Terminal(config, CommandGene.IntegerClass, 0.0, 10.0, true)
            }
        };

        GPGenotype result = GPGenotype.randomInitialGenotype(config, types, argTypes,
                nodeSets, 20, true);

        return result;
    }

Evolve the population

At this stage all we are doing is creating the population and evolving it, which invokes the fitness function, then checking whether the best program matches our test values.

        GPGenotype gp = problem.create();
        gp.setVerboseOutput(true);
        gp.evolve(30);

        System.out.println("Formula to discover: x^2 + 2y + 3x + 5");
        gp.outputSolution(gp.getAllTimeBest());

The final output from all the above work is (and this may vary based on the created population):

(Y + Y) + ((5 + X) + ((X * X) + (X + X)))

which, when simplified, becomes the desired formula. Not bad for a few hours’ work. Of course, I did not come up with this on my first pass. I hit a dead end at one point and refactored like mad once I had it working, but the results are impressive given the short amount of time involved.
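One last check of my own: evaluating JGAP’s winner, exactly as printed, against every row of the original table shows it reproduces the expected outputs:

```java
public class EvolvedSolutionCheck {
    public static void main(String[] args) {
        int[] input1 = { 26, 8, 20, 33, 37 };
        int[] input2 = { 35, 24, 1, 11, 16 };
        int[] output = { 829, 141, 467, 1215, 1517 };
        for (int i = 0; i < input1.length; i++) {
            int x = input1[i], y = input2[i];
            // The evolved program as JGAP printed it
            int evolved = (y + y) + ((5 + x) + ((x * x) + (x + x)));
            System.out.println(evolved + " (expected " + output[i] + ")");
        }
    }
}
```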

The Actual Pieces

The fitness function in class SimpleMathTestFitnessFunction:

package hiddenclause.example;

import org.jgap.gp.GPFitnessFunction;
import org.jgap.gp.IGPProgram;
import org.jgap.gp.terminal.Variable;

public class SimpleMathTestFitnessFunction extends GPFitnessFunction {

    private Integer[] _input1;
    private Integer[] _input2;
    private int[] _output;
    private Variable _xVariable;
    private Variable _yVariable;

    private static Object[] NO_ARGS = new Object[0];

    public SimpleMathTestFitnessFunction(Integer input1[], Integer input2[],
            int output[], Variable x, Variable y) {
        _input1 = input1;
        _input2 = input2;
        _output = output;
        _xVariable = x;
        _yVariable = y;
    }

    @Override
    protected double evaluate(final IGPProgram program) {
        double result = 0.0;

        long longResult = 0;
        for (int i = 0; i < _input1.length; i++) {
            // Set the input values
            _xVariable.set(_input1[i]);
            _yVariable.set(_input2[i]);
            // Execute the genetically engineered algorithm
            long value = program.execute_int(0, NO_ARGS);

            // The closer longResult gets to 0 the better the algorithm.
            longResult += Math.abs(value - _output[i]);
        }

        result = longResult;

        return result;
    }

}

The actual GP program to find the secret formula in class SimpleMathTest:

package hiddenclause.example;

import org.jgap.InvalidConfigurationException;
import org.jgap.gp.CommandGene;
import org.jgap.gp.GPProblem;
import org.jgap.gp.function.Add;
import org.jgap.gp.function.Multiply;
import org.jgap.gp.impl.DeltaGPFitnessEvaluator;
import org.jgap.gp.impl.GPConfiguration;
import org.jgap.gp.impl.GPGenotype;
import org.jgap.gp.terminal.Terminal;
import org.jgap.gp.terminal.Variable;

/**
 * @author carlos
 *
 */
public class SimpleMathTest extends GPProblem {
    @SuppressWarnings("boxing")
    private static Integer[] INPUT_1 = { 26, 8, 20, 33, 37 };

    @SuppressWarnings("boxing")
    private static Integer[] INPUT_2 = { 35, 24, 1, 11, 16 };

    private static int[] OUTPUT = { 829, 141, 467, 1215, 1517 };

    private Variable _xVariable;
    private Variable _yVariable;

    public SimpleMathTest() throws InvalidConfigurationException {
        super(new GPConfiguration());

        GPConfiguration config = getGPConfiguration();

        _xVariable = Variable.create(config, "X", CommandGene.IntegerClass);
        _yVariable = Variable.create(config, "Y", CommandGene.IntegerClass);

        config.setGPFitnessEvaluator(new DeltaGPFitnessEvaluator());
        config.setMaxInitDepth(4);
        config.setPopulationSize(1000);
        config.setMaxCrossoverDepth(8);
        config.setFitnessFunction(new SimpleMathTestFitnessFunction(INPUT_1, INPUT_2, OUTPUT, _xVariable, _yVariable));
        config.setStrictProgramCreation(true);
    }

    @Override
    public GPGenotype create() throws InvalidConfigurationException {
        GPConfiguration config = getGPConfiguration();

        // The return type of the GP program.
        Class[] types = { CommandGene.IntegerClass };

        // Arguments of result-producing chromosome: none
        Class[][] argTypes = { {} };

        // Next, we define the set of available GP commands and terminals to
        // use.
        CommandGene[][] nodeSets = {
            {
                _xVariable,
                _yVariable,
                new Add(config, CommandGene.IntegerClass),
                new Multiply(config, CommandGene.IntegerClass),
                new Terminal(config, CommandGene.IntegerClass, 0.0, 10.0, true)
            }
        };

        GPGenotype result = GPGenotype.randomInitialGenotype(config, types, argTypes,
                nodeSets, 20, true);

        return result;
    }

    public static void main(String[] args) throws Exception {
        GPProblem problem = new SimpleMathTest();

        GPGenotype gp = problem.create();
        gp.setVerboseOutput(true);
        gp.evolve(30);

        System.out.println("Formula to discover: x^2 + 2y + 3x + 5");
        gp.outputSolution(gp.getAllTimeBest());
    }

}

(And remember: I don’t warrant any of the above code to do anything but crash. Please do not use it for anything at all except as examples of using the JGAP framework. Especially don’t use the above code in nuclear submarines, nuclear power plants or Cylons. Okay, maybe Cylons.)

Yeah, Java is more verbose. We really need to do something about that.

Much thanks to Toby Segaran for his lucid explanation of GP principles and to the JGAP team for what appears to be an awesome framework.
