Archive for September, 2009

Writing an Eclipse Plug-in: Upgrading to Eclipse 3.5

September 26, 2009

This is inconvenient, but only just so.

In moving from Eclipse 3.4 to 3.5 I found that the launch configurations created under Eclipse 3.4, but run under 3.5, decided not to work anymore. How to fix them? Sadly, the quickest way was to just delete the old ones and create new ones.

So here are instructions on how to create the launch configurations for the past and future examples of this ongoing project if you want to use Eclipse 3.5:

  • Open the Run Configuration window.
  • Delete the launch configuration named customplugin. (If you have been slowly building this new project then you probably have a launch configuration named customplugin. It is safe to delete it. If you have been slowly building your own application and including all of the things I have been describing for the last few months then I am not sure how to help except to recommend copying down all of your configuration information, deleting the launch configuration and creating a new one.)
  • In the list to the left right click on Eclipse Application and select New.
  • Enter the following in the Main tab:
    • Name: customplugin
    • Location: ${workspace_loc}/../runtime-customplugin
    • Clear: check. Leave Workspace selected
    • Ask for Confirmation Before Clearing: uncheck, but if you want to be asked each time before clearing out your current runtime workbench environment you can certainly leave this checked.
    • Run a Product: org.eclipse.platform.ide
    • Runtime JRE: [whatever version of Java 6 you have been using]
  • Enter the following in the Plug-ins tab:
    • Launch With: Plug-ins Selected Below Only
    • In the Workspace node put a checkmark next to the customplugin project. Uncheck the Target Platform node. This should unselect all of the default plug-ins.
    • To the right click Add Required Plug-ins
  • Click Apply.

Go to town.

Help! My Eclipse 3.5 Plug-in Editor has Stopped Working!

September 26, 2009

The other day I went back to work on a plug-in and found that the plug-in editor was misbehaving. I have multiple versions of Eclipse (3.4 and 3.5) installed on (K)ubuntu and I suspected that might be the problem, even though I had never had a problem with multiple installations of Eclipse before. Heavens! What’s a blogger to do?

In fact I did what most people would do: I went to Google. What I found was Saminda Wijeratne’s blog, where the problem was the same but the solution described was quite different from the one I eventually discovered. However, I discovered the solution because the blog encouraged me to go to the Eclipse Preferences window and look at the Plug-in Development –> Target Definitions.

At that point I realized that the ${eclipse_home} variable was pointing to my 3.4 installation instead of 3.5! A little bit more digging also led me to discover that I could not change ${eclipse_home} explicitly within Eclipse. The solution is probably specific to Kubuntu: when I added Eclipse 3.4 and 3.5 to KDE using the Menu Editor I did not set the Advanced –> Work Path. I thought Eclipse would be smart enough to get it right, but I turned out to be wrong (not a good habit to get into).

In the image below notice how the Work Path for the highlighted Eclipse 3.5 entry specifically references my 3.5 path.

[Image: kdw-menu-editor-eclipse-3.5]

Once that was done, saved and tested there was much joy in Wonderland.

The cat was alive (though it took a while to find it).

Help! I Can’t Find An Executable Version of XML Copy Editor for 64-bit Kubuntu!

September 18, 2009

[I hate when things change; like when Dr. Who died and became someone else. Very disturbing. Anyway, the links to XML Copy Editor listed below don’t work as the maintainer of the getdeb site is no longer updating software for anything prior to Ubuntu 9.04. The getdeb XML Copy Editor drop was part of the Jaunty Jackalope distro which I guess was before 9.04. Any software for prior versions is still available on the getdeb archive site at http://old.getdeb.net/ and the 64-bit version of XML Copy Editor can be found at http://old.getdeb.net/release/4263. Sorry for any confusion this might have caused even though I didn’t cause it and the maintainer of getdeb didn’t mean to. So get over it. Download XML Copy Editor and be done with it.]

[Another update: 3/12/10: Going to http://old.getdeb.net/ does not appear to work. Apparently they are working on the legacy side of their site. Be patient.]

And now for something completely different. By different I mean short.

If you have been looking for a great XML editor I would recommend XML Copy Editor. The original goal of this post was to show you how to download the code and all of the dependencies needed to run XML Copy Editor. The last time I downloaded XML Copy Editor it did not come in a ready-to-run form for Kubuntu. I had to download all sorts of dependencies and cut the heads off of a few chickens to get it to work, but work it did and I was happy.

After installing Kubuntu again I found that I needed to revisit the installation of XML Copy Editor. Wouldn’t you know? I forgot to take notes on how I did it. And it was not trivial the last time I did it.

As it turns out there is a web site that contains ready-to-run, pre-baked versions of all sorts of programs including XML Copy Editor. The site is getdeb and I also recommend it as a place to find software you might have thought was unavailable on Ubuntu.

I happen to run Kubuntu 64-bit so a search for XML Copy Editor returned XML Copy Editor for Ubuntu Jaunty 32 bit. Scrolling down the page revealed a version available for Ubuntu Jaunty 64 bit which exactly matched my need.

The cat was alive.

These are a few of my favorite things

XML Copy Editor does a number of things rather well:

  • Checks well-formedness
  • Validates the XML using DTDs, XSDs, RELAX NG and others
  • Applies XSL transforms so you can test what it is you are transforming
  • Executes arbitrary XPath on the current file
  • Supports the creation of 26 XML-related file types

Do I have to mention that it also offers syntax coloring, element folding and a built-in web browser?

XML Copy Editor: download it from getdeb. Install it. Use it.

Watchmaker: A First and Second Example

September 13, 2009

Time to look at the framework that I was sure I was going to like more than JGAP and ECJ.

I do like it; more than JGAP, but not as much as ECJ. I’ll send flowers later.

Since both of the examples, Hello World (genetic algorithm) and Simple Math Test (a genetic program), are already done in Watchmaker all I will be doing is explaining how they were done using the Watchmaker framework.

Hello World – A Genetic Algorithm

This is straight out of the Watchmaker Framework example code. Only two classes are needed:

  • StringEvaluator, the fitness function
  • StringsExample, the class that starts the evolution

The StringEvaluator class does three things:

  • hold onto the string the chromosomes are supposed to evolve into
  • check every character of the candidate string to determine how close it is to the desired string
  • return false when asked (via isNatural()) whether higher fitness values are better; here a lower score is fitter and 0 is a perfect match.
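
Those three responsibilities are small enough to sketch in plain Java without the framework. This is only an illustration, not Watchmaker's StringEvaluator: the class name MismatchFitness and its method names are made up, and it assumes the candidate and the target always have the same length.

```java
/** Hypothetical stand-in for a string fitness function; not Watchmaker code. */
final class MismatchFitness {
    private final String target;

    MismatchFitness(String target) {
        this.target = target;
    }

    /** Counts the positions where the candidate differs from the target (0 = perfect match). */
    int errors(String candidate) {
        if (candidate.length() != target.length()) {
            throw new IllegalArgumentException("candidate and target must have the same length");
        }
        int errors = 0;
        for (int i = 0; i < target.length(); i++) {
            if (candidate.charAt(i) != target.charAt(i)) {
                errors++;
            }
        }
        return errors;
    }

    /** False means lower scores are fitter, which is why evolution can stop at 0. */
    boolean higherIsBetter() {
        return false;
    }
}
```

Calling new MismatchFitness("HELLO WORLD").errors("HELLO WORLX") counts one mismatch, the final character.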

The StringsExample class creates the evolutionary environment in which the strings will evolve. The interesting stuff happens in here (reformatted to fit on this page).

This first section creates crossover and mutation objects specific to strings.

  • StringMutation mutates the evolving string using the alphabet as mutation values with a probability value of 0.02.
  • StringCrossover handles the crossover of strings based on the defined number of crossover points. The default is 1 crossover point.
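
If you have never seen these two operators spelled out, here is roughly what they do. This is a minimal sketch in plain Java, not the Watchmaker classes: StringOperators, mutate() and crossover() are hypothetical names, the crossover point is passed in rather than chosen at random, and both parents are assumed to be the same length.

```java
import java.util.Random;

/** Hypothetical string operators for illustration; not the Watchmaker classes. */
final class StringOperators {
    private StringOperators() { }

    /** Replaces each character with a random alphabet character with the given probability. */
    static String mutate(String s, char[] alphabet, double probability, Random rng) {
        StringBuilder sb = new StringBuilder(s);
        for (int i = 0; i < sb.length(); i++) {
            if (rng.nextDouble() < probability) {
                sb.setCharAt(i, alphabet[rng.nextInt(alphabet.length)]);
            }
        }
        return sb.toString();
    }

    /** Single-point crossover: the head of one parent joined to the tail of the other. */
    static String[] crossover(String a, String b, int point) {
        return new String[] {
            a.substring(0, point) + b.substring(point),
            b.substring(0, point) + a.substring(point)
        };
    }
}
```

With a probability of 0.02, as in the example, a string of 11 characters will usually pass through mutate() unchanged, so most of the movement in any one generation comes from crossover.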

The pipeline contains the list of operators to be applied to the population; in this case just the two we just discussed.

StringsExample.java
...
public static String evolveString(String target)
{
    // The operators are typed on String, the kind of candidate being evolved.
    List<EvolutionaryOperator<String>> operators
                       = new ArrayList<EvolutionaryOperator<String>>(2);
    operators.add(new StringMutation(ALPHABET, new Probability(0.02d)));
    operators.add(new StringCrossover());
    EvolutionaryOperator<String> pipeline
                      = new EvolutionPipeline<String>(operators);

The EvolutionEngine is the petri dish where everything happens. The configuration objects are as follows:

  • StringFactory, creates the initial chromosomes/individuals for the population
  • pipeline, defined above
  • StringEvaluator, the fitness function also discussed above
  • RouletteWheelSelection, selects candidates at random where the probability is proportional to the candidate's fitness score
  • MersenneTwisterRNG, a faster, more reliable random number generator

StringsExample.java
    ...
    EvolutionEngine<String> engine
          = new ConcurrentEvolutionEngine<String>(
                      new StringFactory(ALPHABET, target.length()),
                      pipeline,
                      new StringEvaluator(target),
                      new RouletteWheelSelection(),
                      new MersenneTwisterRNG());

The EvolutionObserver is an interesting construct. This observer gives you a front-row seat to things as they happen. The EvolutionLogger used here prints out a basic string, but the idea has the potential for coolness.

StringsExample.java
    ...
    engine.addEvolutionObserver(new EvolutionLogger());

Finally, turn on the engine for a population of 100 with an elite count of 5 (5% of the population). Stop evolving when the fitness score equals 0.

StringsExample.java
    ...
    return engine.evolve(100, // 100 individuals in the population.
                           5, // 5% elitism.
                         new TargetFitness(0, false));
}
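
Elitism and the stop condition are easy to picture outside the framework. The sketch below assumes elitism means "copy the best few candidates into the next generation untouched" and that the run ends once any candidate has zero errors. Elitism, selectElite() and targetReached() are hypothetical names, not Watchmaker API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical elitism helper for illustration; not part of Watchmaker. */
final class Elitism {
    private Elitism() { }

    /** Number of characters that differ from the target (0 = perfect match). */
    static int errors(String candidate, String target) {
        int errors = 0;
        for (int i = 0; i < target.length(); i++) {
            if (candidate.charAt(i) != target.charAt(i)) {
                errors++;
            }
        }
        return errors;
    }

    /** Returns the eliteCount candidates with the fewest errors, best first. */
    static List<String> selectElite(List<String> population, String target, int eliteCount) {
        List<String> sorted = new ArrayList<>(population);
        sorted.sort(Comparator.comparingInt((String s) -> errors(s, target)));
        return new ArrayList<>(sorted.subList(0, Math.min(eliteCount, sorted.size())));
    }

    /** Termination check: true once any candidate has zero errors. */
    static boolean targetReached(List<String> population, String target) {
        for (String candidate : population) {
            if (errors(candidate, target) == 0) {
                return true;
            }
        }
        return false;
    }
}
```

With a population of 100 and an elite count of 5, the five closest strings survive every generation no matter what mutation and crossover do to everyone else.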

The code for the above is part of the Watchmaker Framework so download and poke it to your heart’s content.

Simple Math Test – A Genetic Program

The Watchmaker documentation states that it is a framework for genetic algorithms. That made me quite concerned as Toby Segaran’s Simple Math Test is not an example of a genetic algorithm. However, a quick look at the examples in the WF download revealed a package named org.uncommons.watchmaker.example.geneticprogramming. In that package there is a file named GeneticProgrammingExample.java and it states:

GeneticProgrammingExample.java
...
/**
 * Simple tree-based genetic programming application based on the first example
 * in Chapter 11 of Toby Segaran's Progamming Collective Intelligence.
 * @author Daniel Dyer
 */

Gotta love it. Less work for me and I don’t have to fight my way into the framework to figure out how to use WF in a genetic programming context. It seems unfair that both of the examples I chose are already done in WF, but that’s life. However, I will come up with a GP example that WF, JGAP, and ECJ have not done and see what it takes to implement it in each of them. One day. Soon.

[Perhaps Daniel Dyer will include the GP-related bits in an upcoming version of Watchmaker. Code reuse is a nice thing. I am hoping ECJ does that as well.]

The example geneticprogramming package contains the following files:
Node.java
BinaryNode.java
LeafNode.java

IfThenElse.java
IsGreater.java

Addition.java
Subtraction.java
Multiplication.java
Constant.java
Parameter.java

Simplification.java

TreeCrossover.java
TreeEvaluator.java
TreeFactory.java
TreeMutation.java

GeneticProgrammingExample.java

That’s a lot of files, and it shows that many of those classes are not part of the framework itself.

In GP, the solution is an executable tree. During the evolution of the solution there will be trees that are not executable as well as trees that execute, but produce incorrect solutions.

In Watchmaker, the EvolutionEngine controls the process of creating a population, evolving them using various operators like crossover and mutation, checking their fitness and selecting the survivors for the next generation.

In this case, the EvolutionEngine, an instance of ConcurrentEvolutionEngine, is configured with:

  • a TreeFactory which produces a tree of Nodes
  • an EvolutionPipeline which contains the crossover, mutation and simplification operators
  • a TreeEvaluator which will evaluate the fitness of the current tree/chromosome (given 2 input values does the tree produce the desired output value?)
  • a RouletteWheelSelection strategy
  • a MersenneTwisterRNG random number generator

You can read more about the RouletteWheelSelection in the Watchmaker Javadocs and at the Newcastle University Engineering Design Center, and about the MersenneTwisterRNG in the Uncommons Math Package Javadocs and at the university where it was developed. The Reader’s Digest version:

  • RouletteWheelSelection – selects candidates by giving them a higher chance of being randomly selected based on their fitness score. The higher the score the higher the probability of their being chosen. They might be chosen more than once.
  • MersenneTwisterRNG – a pseudorandom number generator that is very fast, has very good statistical randomness and is repeatable when given the same seed. It is used by various pieces of Watchmaker to supply randomness where it is needed.
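
For the curious, the roulette wheel idea fits in a dozen lines. This is a sketch under a few assumptions, not the Watchmaker implementation: RouletteWheel and select() are hypothetical names, the scores are non-negative with at least one above zero, and a higher score means a fitter candidate (when lower is better the scores would have to be inverted first).

```java
import java.util.Random;

/** Hypothetical fitness-proportionate selection; not the Watchmaker implementation. */
final class RouletteWheel {
    private RouletteWheel() { }

    /** Returns the index of the chosen candidate; bigger scores own a bigger slice of the wheel. */
    static int select(double[] fitness, Random rng) {
        double total = 0;
        for (double f : fitness) {
            total += f;
        }
        double spin = rng.nextDouble() * total; // where the ball lands on the wheel
        double running = 0;
        for (int i = 0; i < fitness.length; i++) {
            running += fitness[i];
            if (spin < running) {
                return i;
            }
        }
        return fitness.length - 1; // guard against floating-point rounding
    }
}
```

Call select() once per slot you need to fill. Nothing stops the same index from coming back several times, which is the "chosen more than once" behavior described above.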

Let’s look at the fitness function next. The TreeEvaluator is initialized with an array of two input values and an output value. The code, while different from that found in Collective Intelligence, heads for the same target: if the difference between the result from the evolved formula and the supplied output value is zero then we have a winner.

TreeEvaluator.java
  ...
  double actualValue = candidate.evaluate(entry.getKey());
  double diff = actualValue - entry.getValue();
  error += (diff * diff);

The Watchmaker code takes the difference between the actual and the expected values, squares it and adds it to the error value for each iteration of the input values. When error equals zero then the formula works. Same goal, different path.
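
The same sum-of-squared-differences idea can be tried without any trees at all. In this sketch the candidate is just a two-argument function and the expected outputs come from the Simple Math Test formula x^2 + 3*x + 2*y + 5. SquaredError, target() and error() are hypothetical names, not part of TreeEvaluator.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical sum-of-squared-errors fitness; not Watchmaker's TreeEvaluator. */
final class SquaredError {
    private SquaredError() { }

    /** The formula the population is trying to rediscover: x^2 + 3x + 2y + 5. */
    static double target(double x, double y) {
        return x * x + 3 * x + 2 * y + 5;
    }

    /** Sums the squared difference between candidate and target over every input pair. */
    static double error(DoubleBinaryOperator candidate, double[][] inputs) {
        double error = 0;
        for (double[] in : inputs) {
            double diff = candidate.applyAsDouble(in[0], in[1]) - target(in[0], in[1]);
            error += diff * diff;
        }
        return error;
    }
}
```

A candidate that is off by exactly 1 on each of three input pairs scores 3.0; only a candidate that matches on every pair scores 0.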

The EvolutionPipeline contains three operators that are executed in the order in which they are given:

GeneticProgrammingExample.java
...
    List<EvolutionaryOperator<Node>> operators = new ArrayList<EvolutionaryOperator<Node>>(3);
    operators.add(new TreeMutation(factory, new Probability(0.4d)));
    operators.add(new TreeCrossover());
    operators.add(new Simplification());
...
    EvolutionEngine<Node> engine
        = new ConcurrentEvolutionEngine<Node>(...
                                        new EvolutionPipeline<Node>(operators),
                                        ...);

When a new population is created the EvolutionaryOperators are called in order:

  1. TreeMutation, with a mutation probability of 40%
  2. TreeCrossover, which is a single-point crossover
  3. Simplification, which asks each Node to simplify itself; for example, if the Node contains 3 + 5 then simplify it to 8, but if the Node contains x + 5 (a variable and a constant), then leave it alone.
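
The simplification step is the least familiar of the three, so here is a toy version of it. The types below (Expr, Const, Arg, Add) are hypothetical and much smaller than the Node classes in the example package; they only show how a node can fold two constants into one and leave anything involving a variable alone.

```java
/** Hypothetical expression tree for illustration; not the Watchmaker Node classes. */
interface Expr {
    double eval(double[] args);
    Expr simplify();
}

final class Const implements Expr {
    final double value;
    Const(double value) { this.value = value; }
    public double eval(double[] args) { return value; }
    public Expr simplify() { return this; }
    @Override public String toString() { return String.valueOf(value); }
}

final class Arg implements Expr {
    final int index;
    Arg(int index) { this.index = index; }
    public double eval(double[] args) { return args[index]; }
    public Expr simplify() { return this; }
    @Override public String toString() { return "arg" + index; }
}

final class Add implements Expr {
    final Expr left;
    final Expr right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public double eval(double[] args) { return left.eval(args) + right.eval(args); }

    /** Simplifies the children first, then folds constant + constant into one constant. */
    public Expr simplify() {
        Expr l = left.simplify();
        Expr r = right.simplify();
        if (l instanceof Const && r instanceof Const) {
            return new Const(((Const) l).value + ((Const) r).value);
        }
        return new Add(l, r);
    }
    @Override public String toString() { return "(" + left + " + " + right + ")"; }
}
```

Simplifying new Add(new Arg(0), new Add(new Const(3), new Const(5))) folds the inner 3 + 5 but keeps the outer addition, because arg0 is not known until the tree is evaluated.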

All that is left is the TreeFactory. The TreeFactory needs to know 4 things to do its job:

  • the parameter count, which is the number of program inputs a tree can refer to (the Simple Math Test has 2, which show up as arg0 and arg1 in the output)
  • the maximum depth of any given tree (too shallow doesn’t allow for a rich enough execution tree, too deep might generate a bunch of useless code)
  • the probability to use in the creation of functions
  • the probability to use in the creation of parameters

The EvolutionEngine will call TreeFactory.generateRandomCandidate(), which will call TreeFactory.makeNode(), to randomly create Nodes of functions, parameters or constants:

TreeFactory.java
  ...
private Node makeNode(Random rng, int maxDepth)
{
  if (functionProbability.nextEvent(rng) && maxDepth > 1)
  {
      // Max depth for sub-trees is one less than max depth for this node.
      int depth = maxDepth - 1;
      switch (rng.nextInt(5))
      {
        case 0: return new Addition(makeNode(rng, depth), makeNode(rng, depth));
        case 1: return new Subtraction(makeNode(rng, depth), makeNode(rng, depth));
        case 2: return new Multiplication(makeNode(rng, depth), makeNode(rng, depth));
        case 3: return new IfThenElse(makeNode(rng, depth), makeNode(rng, depth), makeNode(rng, depth));
        default: return new IsGreater(makeNode(rng, depth), makeNode(rng, depth));
      }
  }
  ...

When the TreeFactory creates a function it is only creating one of five possibilities. Adding additional baseline functions is pretty trivial once you see it presented like this. Daniel Dyer really has done a great job.

The last thing the TreeFactory does is either create a Parameter that refers to input 0 or 1, or return a Constant with a value from 0 to 10. All too easy.

TreeFactory.java
  ...
  else if (parameterProbability.nextEvent(rng))
  {
    return new Parameter(rng.nextInt(parameterCount));
  }
  else
  {
    return new Constant(rng.nextInt(11));
  }
}

Just for fun I have included an example of the code from two of the Nodes: Addition and IfThenElse.

Addition.java
  ...
  public Addition(Node left, Node right)
  {
      super(left, right, '+');
  }

  ...
  public double evaluate(double[] programParameters)
  {
    return left.evaluate(programParameters) + right.evaluate(programParameters);
  }
  ...

The constructor for Addition takes in two Nodes and a printable version of the operation. There is no check to see if the Nodes can evaluate to something useable by the Addition object; this is part of the evolution process. The trees that don’t work will be disposed of.

The evaluate() method calls evaluate() on the left and right Nodes and adds them together. ‘Nuff said.

The next example is the IfThenElse Node. It contains three nodes: a condition, code to execute if the condition is true, and code to execute if the condition is false.

IfThenElse.java
  ...
  public IfThenElse(Node condition, Node then, Node otherwise)
  {
    this.condition = condition;
    this.then = then;
    this.otherwise = otherwise;
  }
  ...

Again, notice no check on the actual capabilities of each Node.

IfThenElse.java
  ...
  public double evaluate(double[] programParameters)
  {
    return condition.evaluate(programParameters) > 0 // If...
           ? then.evaluate(programParameters)   // Then...
           : otherwise.evaluate(programParameters);  // Else...
  }
  ...

The evaluate() method resolves to the Java ternary operator and returns the result of either the then Node or otherwise Node. Yes, there is more code in the IfThenElse class (about 200 lines including comments), but the concepts are approachable and the code is quite clean/clear.

There. The Watchmaker framework and the Hello World and Simple Math Test examples.

I feel better now.

The Output

...
Generation 28: 493.0
Generation 29: 233.0
Generation 30: 233.0
Generation 31: 45.0
Generation 32: 5.0
Generation 33: 0.0
(((arg0 + 4.0) * arg0) + ((arg1 + -5.0) + (((6.0 + arg1) + 4.0) - arg0)))

The above simplified becomes:

(((arg0 + 4.0) * arg0) + ((arg1 + -5.0) + (((6.0 + arg1) + 4.0) - arg0)))
arg0^2 + 4*arg0 + arg1 - 5 + 6 + arg1 + 4 - arg0
arg0^2 + 3*arg0 + 2*arg1 + 5

If we make arg0 –> x and arg1 –> y then the above becomes x^2 + 3*x + 2*y + 5 which matches Toby Segaran’s formula. Go team!
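
If you do not trust my algebra, a few lines of Java will check it. The sketch copies the evolved expression from the output above and compares it with x^2 + 3*x + 2*y + 5 over a grid of integer inputs. FormulaCheck and its method names are made up for this check.

```java
/** Hypothetical checker comparing the evolved expression with the target formula. */
final class FormulaCheck {
    private FormulaCheck() { }

    /** The expression printed at generation 33, copied as-is. */
    static double evolved(double arg0, double arg1) {
        return ((arg0 + 4.0) * arg0) + ((arg1 + -5.0) + (((6.0 + arg1) + 4.0) - arg0));
    }

    /** The formula from the Simple Math Test. */
    static double target(double x, double y) {
        return x * x + 3 * x + 2 * y + 5;
    }

    /** True if both formulas agree for every integer pair in [-limit, limit]. */
    static boolean agreeOnGrid(int limit) {
        for (int x = -limit; x <= limit; x++) {
            for (int y = -limit; y <= limit; y++) {
                if (evolved(x, y) != target(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Integer inputs keep the floating-point arithmetic exact, so a plain equality comparison is safe here.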

And finally, in my never ending attempt to compare apples-to-apples, and failing miserably I might add, I present the same problem run by each of the three frameworks:

JGAP: 25 generations @ 1000 individuals/generation;
ECJ: 7 generations @ 1024 individuals/generation;
Watchmaker: 33 generations @ 1000 individuals/generation;

ECJ wins again.

The Bad

I was kinda disappointed when I realized that GP was not supported out of the box; the example above was an easy way to discover how to implement GP in Watchmaker, but it felt strange that such a good framework would leave it out. However, Watchmaker is at version 0.6.2 so I have high hopes for the future. Keep going!

The Good

Watchmaker is so easy to use! It uses all kinds of great design ideas, naming conventions, examples, etc. C’mon! You’re using generics! You rock!

The Code

Download the Watchmaker framework and go to town.

DBUnit in Eclipse

September 5, 2009

Just the other day I was wondering how DBUnit was doing. As a former consultant I used to use DBUnit along with various JUnit extensions on a regular basis.

Given that Eclipse has moved on, JUnit has moved on and DBUnit has moved on I thought I would present a straightforward example of how to use DBUnit with JUnit 4.0 and Eclipse.

Not that much has changed, so there is not going to be a lot of hand-holding here.

Assumptions

Eclipse 3.5
JUnit 4.0 – included with Eclipse
DBUnit 2.4.5
SLF4J 1.5.8 – DBUnit needs this
HSQL DB 1.8.0

I implemented this example on Kubuntu 9.10, if that makes any difference.

If you are new to Eclipse then just download any version that seems reasonable as long as it includes a Java development environment.

The Easy Part

Make sure all of the above software is available somewhere on your machine. If not, install all the software in your favorite places.

Start Eclipse.

The Short Version

  1. Start your database
  2. Create a Java Project
  3. Add DBUnit to your classpath
  4. Write and run a database test
    • Create initial and expected dataset files
    • Extend DBTestCase (inheritance) or use a JUnit class (composition)
    • Implement your test methods

The Longer Version

Start your database

I don’t have a database to run so I downloaded and installed HSQL. To run the HSQL server, which I prefer in examples, open a command window, go to the HSQL folder and run:

java -cp lib/hsqldb.jar org.hsqldb.Server -database.0 file:hiddenclause -dbname.0 xdb

In this case the database name is xdb with the database files named hiddenclause.*. Call your files whatever. I will add test data later.

My Eclipse default configuration includes:
Source folder name: src
Output folder name: classes

Default execution environment: JavaSE-1.6

Create a Java Project

Create a Java Project named DBUnitExample. ‘Nuff said.

Add DBUnit to your classpath

Once the project appears in the Package Explorer, right click on the project name and select Properties –> Java Build Path –> Libraries. Click Add External JARs and add the DBUnit JAR file, in this case dbunit-2.4.5.jar, to the list of libraries in the classpath. Yes, you could also have done this when you first created the project.

Add:
slf4j-api-1.5.8.jar
slf4j-simple-1.5.8.jar
hsqldb.jar
to the classpath as well.

Click OK to close the Properties window.

Write and run a database test

Add Test Data

As running a test on a fresh database is a little difficult, start the HSQL Database Manager from another shell (in the HSQL directory):

java -cp lib/hsqldb.jar org.hsqldb.util.DatabaseManager

In the Connect window enter:
Setting Name: hiddenclause example
Type: HSQL Database Engine Server
Driver: org.hsqldb.jdbcDriver
URL: jdbc:hsqldb:hsql://localhost/xdb
User: sa
Password: [leave blank]

Click OK.

Almost done. Select Options –> Insert Test Data. Now we have 4 tables worth of data to test with. Run a delete on the CUSTOMER table so that it is empty.

Close the Database Manager.

Write a Database Test

The steps for writing a DBUnit test are:
1. Create initial and expected dataset files
2. Extend DBTestCase (inheritance) or use a JUnit class (composition)
3. Implement your test methods

Once you get comfortable with that the additional steps are:
1. Create initial and expected dataset files
2. Extend DBTestCase (inheritance) or use a JUnit class (composition)
3. Implement getSetUpOperation() and getTearDownOperation() (optional)
4. Override setUpDatabaseConfig() (optional)
5. Implement your test methods

We’ll just do the first one using the test data created by HSQL.

Create initial and expected dataset files

The DBUnit dataset can come from anywhere (files, databases, spreadsheets, etc.). Where the data comes from is hidden behind the class that implements IDataSet. For this example, we will use XML datasets.

Here is the initial dataset file:
customer-init.xml

<?xml version="1.0" encoding="UTF-8"?>
<dataset>
    <CUSTOMER />
</dataset>

Here is the expected dataset file (what we expect to find in the database after executing some code):
customer-expected.xml

<?xml version="1.0" encoding="UTF-8"?>
<dataset>
     <CUSTOMER ID="1"
               FIRSTNAME="John"
               LASTNAME="Smith"
               STREET="1 Main Street"
               CITY="Anycity" />
</dataset>
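
The flat XML format is simple: each child element of dataset is one row, the element name is the table name and the attributes are the column values, while an element with no attributes stands for an empty table. The sketch below reads such a file with nothing but the JDK's DOM parser to make that mapping visible. It is not how FlatXmlDataSet is implemented, and FlatDatasetReader and rows() are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical reader for the flat XML layout; not DBUnit's FlatXmlDataSet. */
final class FlatDatasetReader {
    private FlatDatasetReader() { }

    /** Returns one column-to-value map per row element named tableName. */
    static List<Map<String, String>> rows(String xml, String tableName) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE
                        || !child.getNodeName().equals(tableName)) {
                    continue; // skip whitespace text nodes and other tables
                }
                NamedNodeMap attrs = child.getAttributes();
                if (attrs.getLength() == 0) {
                    continue; // an element without attributes only declares an empty table
                }
                Map<String, String> row = new HashMap<>();
                for (int a = 0; a < attrs.getLength(); a++) {
                    row.put(attrs.item(a).getNodeName(), attrs.item(a).getNodeValue());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse dataset", e);
        }
    }
}
```

Fed the initial dataset this returns no rows for CUSTOMER; fed the expected dataset it returns a single row whose FIRSTNAME is John.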

Extend DBTestCase (inheritance) or use a JUnit class (composition)

The first version of CustomerTest will inherit from the DBUnit class DBTestCase. That is the recommended way of creating a DBUnit test. It uses the JUnit 3.8.2 classes, which still work even with the JUnit 4.0 JAR file.

public class CustomerTest extends DBTestCase {
...
    @Override
    protected IDataSet getDataSet() throws Exception {
        ...
    }
}

The getDataSet() method is called to initialize the database before the test. Consider it part of your setup logic. Let’s load the initialization dataset.

    @Override
    protected IDataSet getDataSet() throws Exception {
        return new FlatXmlDataSet(
                 new FileInputStream("customer-init.xml"));
    }

There are a number of properties that need to be set prior to DBUnit doing its magic. You can set those properties in the constructor:

    public CustomerTest(String name) {
        super(name);
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_DRIVER_CLASS,
          "org.hsqldb.jdbcDriver");
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_CONNECTION_URL,
          "jdbc:hsqldb:hsql://localhost/xdb");
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_USERNAME,
          "sa");
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_PASSWORD,
          "");

        _customerFactory = CustomerFactory.getInstance();
    }

In real life you would set the driver name, connection URL and username and password to their appropriate values.

Implement your test methods

For this example, we are going to test an insert into the db.

    public void testInsert() throws Exception {
        // insert a customer into the database
        Customer customer = _customerFactory.create("John", "Smith");
        customer.setStreet("1 Main Street");
        customer.setCity("Anycity");
        _customerFactory.update(customer);
...

The code for CustomerFactory and Customer is at the end of this post.

The data that has just been entered into the database becomes your actual assertable values. Go get them.

        // get the actual table values
        IDatabaseConnection connection = getConnection();
        IDataSet databaseDataSet = connection.createDataSet();
        ITable actualTable = databaseDataSet.getTable("CUSTOMER");

The values defined in customer-expected.xml are what you expect the values to be. Go get them.

        // get the expected table values
        IDataSet expectedDataSet = new FlatXmlDataSet(
                                          new FileInputStream("customer-expected.xml"));
        ITable expectedTable = expectedDataSet.getTable("CUSTOMER");

Check the actual against the expected and complain or not as the case may be.

        Assertion.assertEquals(expectedTable, actualTable);
    }

A version that uses a JUnit class as a wrapper around the DBUnit code looks like this:

/**
 * This is an example only! Use it for anything else at your own risk!
 * You have been warned! Coder/user beware!
 *
 * copyright 2009 Carlos Valcarcel
 */
package hiddenclause.example.dbunit;

import java.io.FileInputStream;

import org.dbunit.Assertion;
import org.dbunit.IDatabaseTester;
import org.dbunit.JdbcDatabaseTester;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.ITable;
import org.dbunit.dataset.xml.FlatXmlDataSet;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

/**
 * @author carlos
 */
public class CustomerJunitTest {

    private CustomerFactory _customerFactory;

    private IDatabaseTester databaseTester;

    @Before
    public void setUp() throws Exception {
        databaseTester = new JdbcDatabaseTester("org.hsqldb.jdbcDriver",
                                                "jdbc:hsqldb:hsql://localhost/xdb",
                                                "sa", "");
        // initialize your dataset here
        IDataSet dataSet = new FlatXmlDataSet(new FileInputStream("customer-init.xml"));

        databaseTester.setDataSet(dataSet);

        // will call default setUpOperation
        databaseTester.onSetup();

        _customerFactory = CustomerFactory.getInstance();
    }

    @Test
    public void testInsert() throws Exception {
        // insert a customer into the database
        Customer customer = _customerFactory.create("John", "Smith");
        customer.setStreet("1 Main Street");
        customer.setCity("Anycity");
        _customerFactory.update(customer);

        // get the actual table values
        IDatabaseConnection connection = databaseTester.getConnection();
        IDataSet databaseDataSet = connection.createDataSet();
        ITable actualTable = databaseDataSet.getTable("CUSTOMER");

        // get the expected table values
        IDataSet expectedDataSet = new FlatXmlDataSet(
                                          new FileInputStream("customer-expected.xml"));
        ITable expectedTable = expectedDataSet.getTable("CUSTOMER");

        Assertion.assertEquals(expectedTable, actualTable);

    }

    @After
    public void tearDown() throws Exception {
        databaseTester.onTearDown();
    }
}

Things to notice:

  • less configuration (the System.setProperty() calls are gone)
  • explicit creation of an IDatabaseTester object
  • explicit call to databaseTester.onSetup()
  • explicit call to databaseTester.onTearDown()

Run the Database Test

With all the pieces in place it is now safe to run the CustomerTest DBUnit class. You will probably see some warning messages in the Console view about the data type factory being incorrect. You can safely ignore that warning for this example. In real life you probably want to instantiate a new DataTypeFactory based on the database you are using.

If any of the above does not quite work as described let me know and I will update the above explanation.

The Code

customer-init.xml

<?xml version="1.0" encoding="UTF-8"?>
<dataset>
    <CUSTOMER />
</dataset>

customer-expected.xml

<?xml version="1.0" encoding="UTF-8"?>
<dataset>
    <CUSTOMER ID="1"
              FIRSTNAME="John"
              LASTNAME="Smith"
              STREET="1 Main Street"
              CITY="Anycity" />
</dataset>

CustomerTest.java

/**
 * This is an example only! Use it for anything else at your own risk!
 * You have been warned! Coder/user beware!
 *
 * copyright 2009 Carlos Valcarcel
 */
package hiddenclause.example.dbunit;

import java.io.FileInputStream;

import org.dbunit.Assertion;
import org.dbunit.DBTestCase;
import org.dbunit.PropertiesBasedJdbcDatabaseTester;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.ITable;
import org.dbunit.dataset.xml.FlatXmlDataSet;

/**
 * @author carlos
 */
public class CustomerTest extends DBTestCase {

    private CustomerFactory _customerFactory;

    public CustomerTest(String name) {
        super(name);
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_DRIVER_CLASS,
          "org.hsqldb.jdbcDriver");
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_CONNECTION_URL,
          "jdbc:hsqldb:hsql://localhost/xdb");
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_USERNAME,
          "sa");
        System.setProperty(
          PropertiesBasedJdbcDatabaseTester.DBUNIT_PASSWORD,
          "");

        _customerFactory = CustomerFactory.getInstance();
    }

    public void testInsert() throws Exception {
        // insert a customer into the database
        Customer customer = _customerFactory.create("John", "Smith");
        customer.setStreet("1 Main Street");
        customer.setCity("Anycity");
        _customerFactory.update(customer);

        // get the actual table values
        IDatabaseConnection connection = getConnection();
        IDataSet databaseDataSet = connection.createDataSet();
        ITable actualTable = databaseDataSet.getTable("CUSTOMER");

        // get the expected table values
        IDataSet expectedDataSet = new FlatXmlDataSet(
                                          new FileInputStream("customer-expected.xml"));
        ITable expectedTable = expectedDataSet.getTable("CUSTOMER");

        Assertion.assertEquals(expectedTable, actualTable);

    }
    /*
     * (non-Javadoc)
     * @see org.dbunit.DatabaseTestCase#getDataSet()
     */
    @Override
    protected IDataSet getDataSet() throws Exception {
        return new FlatXmlDataSet(
                 new FileInputStream("customer-init.xml"));
    }

}

CustomerFactory.java

/**
 * This is an example only! Use it for anything else at your own risk!
 * You have been warned! Coder/user beware!
 *
 * copyright 2009 Carlos Valcarcel
 */
package hiddenclause.example.dbunit;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * @author carlos
 *
 */
public class CustomerFactory {

    static {
        try {
            Class.forName("org.hsqldb.jdbcDriver");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }

    public static CustomerFactory getInstance()
    {
        return new CustomerFactory();
    }

    public Customer create(String firstName, String lastName) {
        return new Customer(1, firstName, lastName);
    }

    public void update(Customer customer) throws SQLException {
        Connection connection = DriverManager.getConnection("jdbc:hsqldb:hsql://localhost/xdb");
        try {
            // Example only: the values are concatenated straight into the SQL string.
            String sql = "insert into customer (id, firstname, lastname, street, city) values ("
                       + customer.getId() + ", "
                       + "'" + customer.getFirstName() + "', "
                       + "'" + customer.getLastName() + "', "
                       + "'" + customer.getStreet() + "', "
                       + "'" + customer.getCity() + "'"
                       + ")";

            Statement stmt = connection.createStatement();
            try {
                stmt.execute(sql);
                if (stmt.getUpdateCount() != 1) {
                    throw new SQLException("Insert failed!");
                }
            } finally {
                // release the statement even if the insert fails
                stmt.close();
            }
        } finally {
            // release the connection so repeated test runs do not leak it
            connection.close();
        }
    }

}

Customer.java

/**
 * This is an example only! Use it for anything else at your own risk!
 * You have been warned! Coder/user beware!
 *
 * copyright 2009 Carlos Valcarcel
 */
package hiddenclause.example.dbunit;

/**
 * @author carlos
 *
 */
public class Customer {

    private int _id;
    private String _firstName;
    private String _lastName;
    private String _street;
    private String _city;

    public Customer(int id, String firstName, String lastName) {
        _id = id;
        _firstName = firstName;
        _lastName = lastName;
    }

    public int getId() {
        return _id;
    }

    public String getFirstName() {
        return _firstName;
    }

    public String getLastName() {
        return _lastName;
    }

    public String getStreet() {
        return _street;
    }

    public String getCity() {
        return _city;
    }

    public void setStreet(String street) {
        _street = street;
    }

    public void setCity(String city) {
        _city = city;
    }

}

ECJ: A Second Tutorial

September 1, 2009 6 comments

This is the fifth of my postings on introductory examples in genetic algorithms and genetic programming. If you have been following these posts you already know that I do not go into what GA or GP is or how you would go about implementing your own GA/GP systems. If you want an introduction, please read Chapter 11, Evolving Intelligence, from Toby Segaran’s Programming Collective Intelligence. It is a great introduction to GA/GP and I highly recommend it.

Time to look at the ECJ version of the GP example. Let me warn you: there are a lot of steps I will be skipping; look at the code I modified and at the code from the ECJ code drop. This framework isn’t as straightforward as JGAP or Watchmaker, but I am coming to believe it is the most powerful of the three.

Simple Math Test – A Genetic Program

Input: a series of number pairs

Output: the formula that transforms the pair of numbers to a desired output value.

As it turns out ECJ Tutorial 4 is an example of the above. In a testament to the use of highly accurate names the example is called MultiValuedRegression. Toby Segaran calls his multi-valued regression example Simple Math Test.

(Hmm. Which example would you rather try? I am pretty certain that those of us (well, me) who fit in the novice category of the Dreyfus model of skill acquisition would prefer the simpler name…unless you are into math in which case you are already insulted that we would grow code to figure out the solution instead of figuring it out ourselves. But I digress.)

I thought this post was going to be a short one. ECJ is both simple enough and complex enough that I will refer you to the full code from the ECJ download to view all the pieces involved in running this example and to the tutorial documentation to understand how this example works. However, looking at the fitness function and at the custom class I had to write will help in understanding how the ECJ pieces fit.

The following gratuitous diagram is from the ECJ documentation.

High-level Diagram of the ECJ GP Framework from the ECJ Tutorial 4 Documentation

While I would have preferred a UML diagram, this will do. It is all about aggregation relationships anyway:

  • Individuals have a Species
  • Individuals contain a GPTree (solution tree)
  • A GPTree has GPTreeConstraints
  • A GPTree has Nodes which have NodeConstraints
  • Nodes may have child nodes. The child nodes in turn have NodeConstraints
  • NodeConstraints contain the child type and the Node return type

Here is my gratuitous UML diagram.

ECJ GP UML Diagram

I admit it is not as explicit as ECJ’s diagram and it doesn’t use color (I’m not good with color. Except for purple. And maybe salmon. And bone. Or is it eggshell?).

The original ECJ MultiValuedRegression example does not use constants. The fitness function used the formula x^2 * y + x*y + y, but if you recall the Toby Segaran formula was x^2 + 2y + 3x + 5. What’s different? The use of actual numbers: 2, 3 and 5, to be exact.

In order to create the parse tree of code to be executed we have to use operators for addition and multiplication, variable wrappers for x and y, and numeric wrappers for integers (in the ECJ example, the subtraction operator is also used so I left it in for old times’ sake). In a situation where the formula to be evolved is unknown you may find yourself throwing in trig functions, power notation, the division operator and any other functions you think will help your population evolve in the proper direction.
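To make the idea of a parse tree concrete, here is a minimal sketch in plain Java. It does not use any ECJ classes: ParseTreeSketch, Node, X, Y, Const, Add and Mul are names made up for this illustration. It builds the target formula x^2 + 2y + 3x + 5 by hand out of operators, variable wrappers and numeric wrappers, and then evaluates it for one pair of inputs.

```java
// Hypothetical names throughout; none of these are ECJ classes.
final class ParseTreeSketch {

    /** A node in the parse tree: a terminal (variable, constant) or an operator. */
    interface Node {
        int eval(int x, int y);
    }

    /** Variable wrapper for x. */
    static final class X implements Node {
        public int eval(int x, int y) { return x; }
    }

    /** Variable wrapper for y. */
    static final class Y implements Node {
        public int eval(int x, int y) { return y; }
    }

    /** Numeric wrapper for an integer constant. */
    static final class Const implements Node {
        private final int value;
        Const(int value) { this.value = value; }
        public int eval(int x, int y) { return value; }
    }

    /** Addition operator with two children. */
    static final class Add implements Node {
        private final Node left;
        private final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public int eval(int x, int y) { return left.eval(x, y) + right.eval(x, y); }
    }

    /** Multiplication operator with two children. */
    static final class Mul implements Node {
        private final Node left;
        private final Node right;
        Mul(Node left, Node right) { this.left = left; this.right = right; }
        public int eval(int x, int y) { return left.eval(x, y) * right.eval(x, y); }
    }

    /** Builds x*x + 2*y + 3*x + 5 by hand, the way an evolved tree might look. */
    static Node targetFormula() {
        Node xSquared = new Mul(new X(), new X());
        Node twoY = new Mul(new Const(2), new Y());
        Node threeX = new Mul(new Const(3), new X());
        return new Add(new Add(xSquared, twoY), new Add(threeX, new Const(5)));
    }

    public static void main(String[] args) {
        // 2*2 + 2*3 + 3*2 + 5 = 21
        System.out.println(targetFormula().eval(2, 3));
    }
}
```

A GP framework grows and recombines trees like this one at random instead of building them by hand, which is why the set of available node types matters so much.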

I created a new Eclipse project and called it SimpleMathTest. I added the ECJ installation to the project’s classpath and in addition I copied the following into the src folder from the ECJ code drop:

  • ec.app.tutorial4.*.java. Rename the package to hiddenclause.ec.app.tutorial4 or whatever you like, but any place in the instructions below where I reference the package name you need to substitute the proper name.
  • ec.params
  • simple.params
  • koza.params
  • tutorial4.params

The various param files refer to each other. In order to make them visible to each other I had to change the paths declared using the parent.0 property.

Within tutorial4.params I changed the value of parent.0 to:

parent.0 = koza.params

Within koza.params I changed the value of parent.0 to:

parent.0 = simple.params

Within simple.params I changed the value of parent.0 to:

parent.0 = ec.params
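The effect of the parent.0 chain can be pictured as a lookup that falls back from one file to the next. The following is only a sketch of that idea in plain Java, not ECJ's own parameter code: ParamChainSketch is a made-up class, and the keys and values (example.a, example.b) are invented for illustration. It assumes a single parent per file and that a value set in a child file wins over the same key in its parent.

```java
import java.util.HashMap;
import java.util.Map;

// A sketch of the lookup idea only; hypothetical class, not part of ECJ.
final class ParamChainSketch {
    private final Map<String, String> values = new HashMap<String, String>();
    private final ParamChainSketch parent; // null for the root file

    ParamChainSketch(ParamChainSketch parent) {
        this.parent = parent;
    }

    /** Stores a value in this file and returns this for chaining. */
    ParamChainSketch set(String key, String value) {
        values.put(key, value);
        return this;
    }

    /** Returns the value from this file, or from the nearest ancestor that defines it. */
    String get(String key) {
        if (values.containsKey(key)) {
            return values.get(key);
        }
        return parent == null ? null : parent.get(key);
    }

    /** Mirrors the chain in the post: tutorial4 -> koza -> simple -> ec. */
    static ParamChainSketch demoChain() {
        ParamChainSketch ec = new ParamChainSketch(null).set("example.a", "from ec");
        ParamChainSketch simple = new ParamChainSketch(ec);
        ParamChainSketch koza = new ParamChainSketch(simple).set("example.b", "from koza");
        return new ParamChainSketch(koza).set("example.b", "from tutorial4");
    }

    public static void main(String[] args) {
        ParamChainSketch tutorial4 = demoChain();
        System.out.println(tutorial4.get("example.a")); // found three files up
        System.out.println(tutorial4.get("example.b")); // the child value wins
    }
}
```

This is why a wrong parent.0 path matters: if one link in the chain cannot be found, everything defined further up is invisible to the file you launch with.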

In order to run the example within Eclipse create a run configuration for the project. The Run Configuration tabs for SimpleMathTest should be set as:

  • Main
    • Project: SimpleMathTest
    • Main class: ec.Evolve
  • Arguments
    • Program Arguments: -file tutorial4.params -p gp.tree.c=true
    • Working Directory: ${workspace_loc:SimpleMathTest/src/hiddenclause/ec/app/tutorial4}

No other tabs needed to be changed.

The Fitness Code

I changed the fitness code from:

MultiValuedRegression.java

...
    expectedResult = currentX * currentX * currentY
                   + currentX * currentY
                   + currentY;
...

to this:

...
    expectedResult = currentX * currentX
                   + 2 * currentY
                   + 3 * currentX
                   + 5;
...

The MultiValuedRegression class has a small local API, but a rather large one if you look at its inheritance tree. The three methods I care about within MultiValuedRegression are:

  • setup() – this is where the object gets information from the properties database. It is only called once.
  • clone() – creates a deep copy of the MultiValuedRegression object.
  • evaluate() – the fitness function. Well, technically it only computes the score: the result is stored in a KozaFitness object, which ECJ then uses when picking who goes into the next generation.
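As a rough picture of what an evaluate()-style method computes, here is a sketch in plain Java. It is not the ECJ code: FitnessSketch, Formula and Score are made-up names, the number of sample points and the use of java.util.Random are assumptions, and a "hit" is assumed to mean an exact match since everything here is an integer. The raw score is the sum of absolute errors against the target formula, and the adjusted score is 1 / (1 + raw), so a perfect candidate scores raw 0.0 and adjusted 1.0.

```java
import java.util.Random;

// Hypothetical names; a sketch of Koza-style scoring, not ECJ's implementation.
final class FitnessSketch {

    /** A candidate formula, standing in for an evolved tree. */
    interface Formula {
        int apply(int x, int y);
    }

    /** Scores for one candidate. */
    static final class Score {
        final double raw;      // sum of absolute errors, 0.0 is perfect
        final double adjusted; // 1 / (1 + raw), 1.0 is perfect
        final int hits;        // number of sample points matched exactly
        Score(double raw, int hits) {
            this.raw = raw;
            this.adjusted = 1.0 / (1.0 + raw);
            this.hits = hits;
        }
    }

    /** The formula we want the population to discover. */
    static int expected(int x, int y) {
        return x * x + 2 * y + 3 * x + 5;
    }

    /** A candidate that matches the target exactly. */
    static final Formula EXACT = new Formula() {
        public int apply(int x, int y) { return expected(x, y); }
    };

    /** A candidate that is always one short of the target. */
    static final Formula OFF_BY_ONE = new Formula() {
        public int apply(int x, int y) { return expected(x, y) - 1; }
    };

    /** Compares the candidate with the target on randomly chosen points. */
    static Score score(Formula candidate, int samples, long seed) {
        Random random = new Random(seed);
        double raw = 0.0;
        int hits = 0;
        for (int i = 0; i < samples; i++) {
            int x = random.nextInt(10);
            int y = random.nextInt(10);
            int error = Math.abs(expected(x, y) - candidate.apply(x, y));
            if (error == 0) {
                hits++;
            }
            raw += error;
        }
        return new Score(raw, hits);
    }

    public static void main(String[] args) {
        Score s = score(EXACT, 10, 42L);
        System.out.println("Raw=" + s.raw + " Adjusted=" + s.adjusted + " Hits=" + s.hits);
    }
}
```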

Running the modified example code in ECJ did not find the formula. From a testing perspective the failure was to be expected. Now I could implement a class to handle constants and update the configuration file.

The Custom Data Classes

I implemented two classes using ECJ naming conventions: Int and IntData. Int is a wrapper for an integer value; it has to inherit from the ERC (Ephemeral Random Constant) class, which holds a constant value. IntData is a wrapper for the result of the calculation and is checked in the fitness function; its value changes with each individual checked. Int is created and populated in the GPTree as it tries to reverse engineer the formula; IntData is passed in to each Individual as a place to store the result of the code execution (in this case a calculation).

The Int class has 6 local methods. They are all quite shallow so take a look at the code for a peek into what they do (the code is located below). The method I care about is eval(): it takes an incoming GPData object, downcasts it to an object of type IntData, and stores the Int object’s value in it.

    @Override
    public void eval(final EvolutionState state, final int thread,
                     final GPData input,
                     final ADFStack stack,
                     final GPIndividual individual,
                     final Problem problem) {
        IntData rd = ((IntData) (input));
        rd.x = _val;
    }

The IntData class has one local method named copyTo(). All it does is take its current value and assign it to an incoming GPData object.

public class IntData extends GPData {
    public int x; // return value

    @Override
    public GPData copyTo(final GPData gpd)
    {
        ((IntData) gpd).x = x;
        return gpd;
    }
}
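The pattern behind Int.eval() and IntData is that nodes do not return values; they write their result into a shared, mutable holder that is passed down the tree. The following plain-Java sketch shows how a constant node and a two-child operator might cooperate under that pattern. ResultHolderSketch, Data, Node, Const and Add are hypothetical stand-ins, not the ECJ classes, and the real eval() signature carries more arguments than shown here.

```java
// Hypothetical stand-ins for illustration; not the ECJ classes.
final class ResultHolderSketch {

    /** Plays the role of IntData: a mutable slot that carries a node's result. */
    static final class Data {
        int x;
    }

    /** A node writes its result into the holder instead of returning it. */
    interface Node {
        void eval(Data data);
    }

    /** Plays the role of Int: stores its constant in the holder. */
    static final class Const implements Node {
        private final int value;
        Const(int value) { this.value = value; }
        public void eval(Data data) {
            data.x = value;
        }
    }

    /** A two-child operator that reuses the same holder for both children. */
    static final class Add implements Node {
        private final Node left;
        private final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public void eval(Data data) {
            left.eval(data);
            int leftResult = data.x; // save it before the right child overwrites the slot
            right.eval(data);
            data.x = leftResult + data.x;
        }
    }

    /** Evaluates a tree with a fresh holder and reads the final result out of it. */
    static int run(Node root) {
        Data data = new Data();
        root.eval(data);
        return data.x;
    }

    public static void main(String[] args) {
        // 3 + (4 + 5) = 12
        System.out.println(run(new Add(new Const(3), new Add(new Const(4), new Const(5)))));
    }
}
```

Reusing one holder avoids allocating a result object per node, at the cost of having to copy the left-hand result out before evaluating the right-hand child.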

The Configuration File

The changes in here were pretty easy: Add the new wrapper as a function, add the result wrapper, and declare the use of the fitness function.

The new wrapper is defined in tutorial4.params as:

...
gp.fs.0.size = 6
...
gp.fs.0.func.2 = hiddenclause.ec.app.tutorial4.Int
gp.fs.0.func.2.nc = nc0
...

The fitness function is declared as:

eval.problem = hiddenclause.ec.app.tutorial4.MultiValuedRegression

The result wrapper is declared twice: once for external use (the value checked within MultiValuedRegression) and once for internal use.

eval.problem.data = hiddenclause.ec.app.tutorial4.IntData
eval.problem.stack.context.data = hiddenclause.ec.app.tutorial4.IntData

The Output

Once all that was done, I was able to run the example and see if I could evolve the result I was looking for. The out.stat file had this to say:

...
Final Statistics
================
Total Individuals Evaluated: 7168

Best Individual of Run:
Evaluated: true
Fitness: Raw=0.0 Adjusted=1.0 Hits=10
Tree 0:
((x - (3 - x)) + (y + y)) + ((8 + x) + (x * x))

The above simplifies to:

x - 3 + x + y + y + 8 + x + x * x

2x - 3 + 2y + 8 + x + x^2

3x + 5 + 2y + x^2

x^2 + 2y + 3x + 5

The cat was alive.
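The simplification above can also be checked mechanically. This small sketch transcribes the evolved tree and the target formula as two Java methods and compares them over a grid of integer inputs; the class name SimplificationCheck and the grid bounds are arbitrary choices for illustration.

```java
// A quick numeric check that the evolved tree and the target formula agree.
final class SimplificationCheck {

    /** The evolved tree, transcribed from the out.stat output. */
    static int evolved(int x, int y) {
        return ((x - (3 - x)) + (y + y)) + ((8 + x) + (x * x));
    }

    /** The target formula: x^2 + 2y + 3x + 5. */
    static int target(int x, int y) {
        return x * x + 2 * y + 3 * x + 5;
    }

    /** Returns true if both formulas agree for every x and y in [min, max]. */
    static boolean agreeOnGrid(int min, int max) {
        for (int x = min; x <= max; x++) {
            for (int y = min; y <= max; y++) {
                if (evolved(x, y) != target(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnGrid(-50, 50)); // prints true
    }
}
```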

The Code

tutorial4.params

# Copyright 2006 by Sean Luke and George Mason University
# Licensed under the Academic Free License version 3.0
# See the file "LICENSE" for more information

# Modified by Carlos Valcarcel for use as an example on the Hidden Clause blog.

parent.0 = koza.params

# We have one function set, of class GPFunctionSet
gp.fs.size = 1
gp.fs.0 = ec.gp.GPFunctionSet
# We'll call the function set "f0".  It uses the default GPFuncInfo class
#gp.fs.0.name = f0
#gp.fs.0.info = ec.gp.GPFuncInfo

# The function set.
gp.fs.0.size = 6
gp.fs.0.func.0 = hiddenclause.ec.app.tutorial4.X
gp.fs.0.func.0.nc = nc0
gp.fs.0.func.1 = hiddenclause.ec.app.tutorial4.Y
gp.fs.0.func.1.nc = nc0
gp.fs.0.func.2 = hiddenclause.ec.app.tutorial4.Int
gp.fs.0.func.2.nc = nc0
gp.fs.0.func.3 = hiddenclause.ec.app.tutorial4.Add
gp.fs.0.func.3.nc = nc2
gp.fs.0.func.4 = hiddenclause.ec.app.tutorial4.Sub
gp.fs.0.func.4.nc = nc2
gp.fs.0.func.5 = hiddenclause.ec.app.tutorial4.Mul
gp.fs.0.func.5.nc = nc2

eval.problem = hiddenclause.ec.app.tutorial4.MultiValuedRegression
eval.problem.data = hiddenclause.ec.app.tutorial4.IntData
# The following should almost *always* be the same as eval.problem.data
# For those who are interested, it defines the data object used internally
# inside ADF stack contexts
eval.problem.stack.context.data = hiddenclause.ec.app.tutorial4.IntData

Int.java

/*
 * This is a version of the MultiValuedRegression code from the ECJ code drop to
 * present an implementation of Toby Segaran's SimpleMathTest example.
 *
 * This is an example only! Use it for anything else at your own risk!
 * You have been warned! Coder/user beware!
 */

package hiddenclause.ec.app.tutorial4;

import ec.EvolutionState;
import ec.Problem;
import ec.gp.ADFStack;
import ec.gp.ERC;
import ec.gp.GPData;
import ec.gp.GPIndividual;
import ec.gp.GPNode;
import ec.util.Code;
import ec.util.Parameter;

public class Int extends ERC {
    private int _val;

    @Override
    public void checkConstraints(final EvolutionState state,
                                 final int tree,
                                 final GPIndividual typicalIndividual,
                                 final Parameter individualBase) {
        super.checkConstraints(state, tree, typicalIndividual, individualBase);
        if (children.length != 0)
            state.output.error("Incorrect number of children for node "
                    + toStringForError() + " at " + individualBase);
    }

    @Override
    public void eval(final EvolutionState state,
                     final int thread,
                     final GPData input,
                     final ADFStack stack,
                     final GPIndividual individual,
                     final Problem problem) {
        IntData rd = ((IntData) (input));
        rd.x = _val;
    }

    @Override
    public void resetNode(EvolutionState state, int thread) {
        _val = Math.abs(state.random[thread].nextInt() % 10);
    }

    @Override
    public String encode() {
        return Code.encode(_val);
    }

    @Override
    public boolean nodeEquals(GPNode node) {
        if (this.getClass() != node.getClass())
            return false;
        return (((Int) node)._val == _val);
    }

    @Override
    public String toString() {
        return Integer.toString(_val);
    }
}

IntData.java

/*
 * This is a version of the MultiValuedRegression code from
 * the ECJ code drop to present an implementation of Toby
 * Segaran's SimpleMathTest example.
 *
 * This is an example only! Use it for anything else at your own risk!
 * You have been warned! Coder/user beware!
 */

package hiddenclause.ec.app.tutorial4;

import ec.gp.GPData;

public class IntData extends GPData {
    public int x; // return value

    @Override
    public GPData copyTo(final GPData gpd)
    {
        ((IntData) gpd).x = x;
        return gpd;
    }
}