
Tutorial Lesson 5: using your own genotype

In this lesson, you will learn how to design and evolve your own genotype structure. Note that at the moment, only algorithms involving a scalar fitness (double) are implemented. See the test dir for Pareto optimization of multiple-objective fitness - or be patient :-)

The minimum code you'll have to write is first, of course, the code for the genotype structure. Then comes the representation-dependent code: initialization procedure(s), variation operators (quadratic crossover, mutation operator), ... and the evaluation function - and that's it: we have prepared some template files and the script create.sh that will take care of generating a few other files to make your application complete.

In what follows, we will suppose that you want to evolve some data structure, and that you have enough programming skills to be able to write C code for its random initialization, its crossover, its mutation and the computation of its fitness.
The examples will be described supposing you want to evolve ... bitstrings to solve the OneMax problem (oh no!!!).

New May 2004: A second script, createSimple, was added some time ago; it generates a much simpler set of files, and the stat.tmpl file is now used to allow you to compute, print, save to disk and plot on-line your own statistics. But you'll have to find out by yourself how those work, sorry, no time. It should be easy by just looking at the code (in the main file, and in OneMaxEA.cpp and the newly created eoOneMaxStat.h).

New May 2004 : In the same simplified main file (e.g. OneMaxEA.cpp after running ./createsimple OneMax in dir .../eo/tutorial/Templates), you will also be able to use fitness sharing (together with roulette) as a possible selector.


Using template files
Follow this very simple procedure:

Smooth application building: we shall now take a look in turn at the 4 files mentioned above, then briefly describe the other files, especially the main files.

Genotype - and its pre-requisites: eoOneMax.h

First thing is to write the code for the structure of the genotype. This is done by filling in the template file eoOneMax.h. There are 4 places that you should consider filling in:

You can of course also add a destructor if needed, and any other helper method. For instance, you will probably consider adding accessors and setters for the private data - unless you prefer to make everything public :-(
See now an example of a complete eoOneMax.h file. Note that this is the only "colored" completed file we will show; you will have to go to the .../tutorial/OneMax dir to browse all files at once.
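
To give a rough idea of what such a genotype has to carry, here is a minimal standalone sketch. It does not use the EO headers and is not the eoOneMax.h template: the class name OneMaxSketch and its members are hypothetical, and only the invalid()/fitness() pair mimics the calls mentioned in this lesson.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an EO genotype: NOT the real eoOneMax template.
class OneMaxSketch {
public:
    // Default constructor: an empty genotype whose fitness is not yet known.
    OneMaxSketch() : fitness_(0.0), invalid_(true) {}

    // Accessor and setter for the private bitstring.
    const std::vector<bool>& bits() const { return bits_; }
    void setBits(const std::vector<bool>& b) {
        bits_ = b;
        invalid_ = true;  // any change makes the stored fitness stale
    }

    // Fitness handling: invalid() tells whether the fitness must be recomputed.
    bool invalid() const { return invalid_; }
    void fitness(double f) { fitness_ = f; invalid_ = false; }
    double fitness() const {
        if (invalid_) throw std::runtime_error("fitness is invalid");
        return fitness_;
    }

private:
    std::vector<bool> bits_;
    double fitness_;
    bool invalid_;
};
```

Keeping the bitstring private and invalidating the fitness in the setter is one possible design choice; it makes it hard to modify a genotype while leaving a stale fitness behind.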

Initialization:

Initializer: eoOneMaxInit.h
You must provide an eoInit object for your genotype, that is, an object that will randomize a genotype built by the default constructor of the EO class (here, an eoOneMax object). Here you must at least fill in the code for such randomization.
But you might also need some parameters (e.g. the size of the bitstring in eoOneMax): you should then pass them through the constructor of the eoOneMaxInit class, and store them in its private data.
And of course you might need to add a destructor (no example here) if you use complex data types in the object.
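
The pattern described above (parameters received by the constructor, randomization done in operator()) can be sketched without the EO headers as follows. BitGenotype and BitGenotypeInit are hypothetical names, and the explicit seed is an assumption made here only to keep the example reproducible.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical genotype stand-in (not the real eoOneMax class).
struct BitGenotype {
    std::vector<bool> bits;
};

// Sketch of an initializer functor: the bitstring size is a constructor
// parameter stored in private data.
class BitGenotypeInit {
public:
    BitGenotypeInit(std::size_t size, unsigned seed) : size_(size), rng_(seed) {}

    // Randomize a genotype that was built by its default constructor.
    void operator()(BitGenotype& g) {
        std::bernoulli_distribution coin(0.5);
        g.bits.resize(size_);
        for (std::size_t i = 0; i < size_; ++i) g.bits[i] = coin(rng_);
    }

private:
    std::size_t size_;
    std::mt19937 rng_;
};
```

Calling the functor on a default-constructed BitGenotype fills it with size random bits; two initializers built with the same seed produce the same bitstring.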

Parameters: make_genotype_OneMax.h
There is another file you will probably want to modify as far as initialization is concerned, that is make_genotype_OneMax.h. Such helper files are used for all components of advanced EO programs (see Lesson 4). The main reason for that is to allow separate compilation of such sub-components for known EO types, as is done in the directories src/ga and src/es. But a useful consequence is to make your code modular. For instance, the make_genotype_OneMax.h file takes care of preparing all data regarding the eoInit object - and returns to the main function a reference to an eoInit that will later be used to initialize the whole population in a representation-independent way.
What you have to do in that file is to set values for all parameters needed by the eoOneMaxInit class, for instance by reading them from the parser: this lets you modify them easily later from the command-line. Note however that an alternative could be to pass the parser to the constructor of the eoOneMaxInit class, and to read all parameters there...
Note: Remember that the make_xxx files were first introduced to allow the user to compile separately most of the code of the standard Evolutionary Algorithms written in EO for bitstring and real vector representations.


Evaluation: eoOneMaxEvalFunc.h

The eoOneMaxEvalFunc is the object that will compute the fitness of an eoOneMax object. You have to fill in the code for the computation of the fitness value (there is no way that this can be done automatically :-). Note that this code must be run only if the _eo.invalid() test returns true, to avoid recomputing the fitness of an object that has not been modified and thus still holds a valid fitness value. After computing the fitness, store it into the object by calling the fitness(FitnessType) method.
Should you need some specific data for that, the constructor of the object is the place to get such data, and you should, again, store it in some private data of the class.
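
The "evaluate only if invalid" rule can be illustrated by the following standalone sketch. It does not include any EO header: EvalGenotype and OneMaxEvalSketch are hypothetical stand-ins, and the calls counter exists only to make the skipped evaluation visible.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical genotype stand-in exposing invalid() and fitness(value).
struct EvalGenotype {
    std::vector<bool> bits;
    double fitnessValue = 0.0;
    bool isInvalid = true;

    bool invalid() const { return isInvalid; }
    void fitness(double f) { fitnessValue = f; isInvalid = false; }
};

// Sketch of an evaluation functor for OneMax: the fitness is the number of 1s.
class OneMaxEvalSketch {
public:
    int calls = 0;  // number of fitness computations actually performed

    void operator()(EvalGenotype& g) {
        if (!g.invalid()) return;  // fitness still valid: nothing to do
        ++calls;
        const double ones =
            static_cast<double>(std::count(g.bits.begin(), g.bits.end(), true));
        g.fitness(ones);  // store the result, which also marks it valid
    }
};
```

Evaluating the same unmodified genotype twice performs the computation only once, which is the whole point of the invalid() test when the fitness is costly.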


Variation Operators
You can write as many crossover and mutation operators as you like. We only provide the template files for quadratic crossover and mutation, but you could then easily write the equivalent code for a binary crossover, or for a general variation operator.

Crossover: eoOneMaxQuadCrossover.h
As usual, you must go and write the code for the operator() that will perform the crossover, possibly modifying both arguments. Don't forget to update the boolean that reports whether the genotypes have been modified - this allows the fitness to be recomputed only for the ones that have actually been modified. You can also give parameters to the crossover by passing them to the constructor, and storing them in the private data of the crossover object.
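
As an illustration, here is a minimal standalone sketch of a quadratic (two parents in, two offspring out) one-point crossover on bitstrings. It is not the eoOneMaxQuadCrossover.h template: the class name is hypothetical, the genotypes are plain std::vector<bool>, and reporting the modification through the return value is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

typedef std::vector<bool> Bits;

// Hypothetical one-point crossover functor (standalone, no EO headers).
class OnePointCrossoverSketch {
public:
    explicit OnePointCrossoverSketch(unsigned seed) : rng_(seed) {}

    // Swap the tails of a and b after a random cut point.
    // Returns true only if at least one bit actually changed.
    bool operator()(Bits& a, Bits& b) {
        const std::size_t n = std::min(a.size(), b.size());
        if (n < 2) return false;
        std::uniform_int_distribution<std::size_t> cut(1, n - 1);
        const std::size_t site = cut(rng_);
        bool changed = false;
        for (std::size_t i = site; i < n; ++i) {
            if (a[i] != b[i]) {
                const bool tmp = a[i];
                a[i] = b[i];
                b[i] = tmp;
                changed = true;
            }
        }
        return changed;
    }

private:
    std::mt19937 rng_;
};
```

Note that crossing two identical parents reports false, so neither of them would need a new fitness evaluation.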

Mutation: eoOneMaxMutation.h
Here again, you must go and write the code for the operator() that will perform the mutation, possibly modifying its argument. Don't forget to update the boolean that reports whether the genotype has been modified - this allows the fitness to be recomputed only for the ones that have actually been modified. You can also give parameters to the mutation by passing them to the constructor, and storing them in the private data of the mutation object.
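
A bit-flip mutation following the same pattern could look like the sketch below. Again this is standalone code with hypothetical names, not the eoOneMaxMutation.h template; the per-bit rate is passed to the constructor and kept in private data, and the return value reports whether anything changed (an assumption of this sketch).

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical bit-flip mutation functor (standalone, no EO headers).
class BitFlipMutationSketch {
public:
    // rate is the probability of flipping each bit independently.
    BitFlipMutationSketch(double rate, unsigned seed) : rate_(rate), rng_(seed) {}

    // Returns true only if at least one bit was flipped.
    bool operator()(std::vector<bool>& bits) {
        std::bernoulli_distribution flip(rate_);
        bool changed = false;
        for (std::size_t i = 0; i < bits.size(); ++i) {
            if (flip(rng_)) {
                bits[i] = !bits[i];
                changed = true;
            }
        }
        return changed;
    }

private:
    double rate_;
    std::mt19937 rng_;
};
```

With a rate of 0 the genotype is left untouched and the functor reports false; with a rate of 1 every bit is flipped.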

Parameters: make_op_OneMax.h
First of all, if you intend to use only one crossover operator and one mutation operator, you have nothing to modify in the make_op_OneMax function, except maybe reading user-defined parameters (copy the code from make_genotype_OneMax) and passing them to the appropriate operator constructor.
As it is written now, it lets you enter a crossover probability Pcross and a mutation probability Pmut, and builds an eoGeneralOp that will call in sequence the eoOneMaxQuadCrossover defined above with probability Pcross and the eoOneMaxMutation also defined above with probability Pmut. Beware that all allocated objects must be stored in the eoState, otherwise you will run into trouble (see EO Memory Management explanations).
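
The "crossover with probability Pcross, then mutation with probability Pmut" sequence can be pictured with the standalone sketch below. It is only an illustration of the idea, not the way EO implements its general operator: the class name is hypothetical, the operators are passed as std::function objects, and applying the mutation test independently to each of the two genotypes is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <random>
#include <vector>

typedef std::vector<bool> Bits;

// Hypothetical stand-in, NOT the EO general operator.
class SequentialVariationSketch {
public:
    SequentialVariationSketch(std::function<bool(Bits&, Bits&)> cross, double pCross,
                              std::function<bool(Bits&)> mut, double pMut,
                              unsigned seed)
        : cross_(cross), mut_(mut), pCross_(pCross), pMut_(pMut), rng_(seed) {}

    // Apply crossover with probability pCross, then mutation with
    // probability pMut to each genotype. Returns true if anything changed.
    bool operator()(Bits& a, Bits& b) {
        bool changed = false;
        if (flip(pCross_)) changed = cross_(a, b) || changed;
        if (flip(pMut_)) changed = mut_(a) || changed;
        if (flip(pMut_)) changed = mut_(b) || changed;
        return changed;
    }

private:
    // Biased coin: true with probability p.
    bool flip(double p) { return std::bernoulli_distribution(p)(rng_); }

    std::function<bool(Bits&, Bits&)> cross_;
    std::function<bool(Bits&)> mut_;
    double pCross_, pMut_;
    std::mt19937 rng_;
};
```

Any callable with the right signature can be plugged in, for instance a lambda that swaps the two bitstrings as a dummy crossover.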

However, everything is there (commented out) to allow you to use more than one crossover and one mutation - provided you write the code for them, of course.

The code starts by defining an eoOneMaxQuadCrossover object, then reads some application rate that is totally useless if you have only one crossover, then creates an eoPropCombinedQuadOp with this simple operator. The same story repeats for the mutation. Finally, the eoGeneralOp is created from those combined operators and the individual-level probabilities Pcross and Pmut.

In order to add a second crossover operator, for instance (called eoOneMaxSecondCrossover in the commented code), all you need to do is

In case you have more than one operator of a kind, then of course the relative weights of application do make sense, allowing you to tune with command-line parameters the proportion with which each operator will be applied.
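
Choosing among several operators in proportion to their relative weights can be sketched with the standard library alone. The class below is hypothetical and is not the EO combined operator; it only returns the index of the operator to apply, which is the part that the weights control.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: picks an operator index with probability
// proportional to its weight (weights need not sum to 1).
class ProportionalChooserSketch {
public:
    ProportionalChooserSketch(const std::vector<double>& weights, unsigned seed)
        : dist_(weights.begin(), weights.end()), rng_(seed) {}

    // Index of the operator to apply this time.
    std::size_t operator()() { return dist_(rng_); }

private:
    std::discrete_distribution<std::size_t> dist_;
    std::mt19937 rng_;
};
```

With weights 1 and 3, for example, the second operator should be chosen about three times as often as the first; with a single operator the weight has no effect, which is why the rate is useless in that case.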


Main files: OneMaxEA.cpp and OneMaxLibEA.cpp

As a start, the only thing you might have to modify in OneMaxEA.cpp is the type of fitness you will be handling, namely double if you are maximizing, or eoMinimizingFitness if you are minimizing. Then running make will result in a perfectly valid executable named OneMaxEA.

The skeleton of the main file here mimics that of the main file in Lesson 4, and uses the make_xxx separate files construct: the part of an Evolutionary Algorithm related to the evolution engine is independent of the representation, and can be used directly ... provided it is compiled with the right template (remember that everything in EO is templatized over the EO objects it handles). The main file OneMaxEA.cpp is written so that it includes everything - and thus every time you run make (or make OneMaxEA), you compile the code for make_pop, make_continue and make_checkpoint that is defined in the .../src/do directory.

The basic construct is (for instance to build the evolution engine)
 
#include <do/make_algo_scalar.h>
eoAlgo<Indi>& make_algo_scalar(eoParser& _parser, eoState& _state,
                               eoEvalFunc<Indi>& _eval, eoContinue<Indi>& _continue,
                               eoGenOp<Indi>& _op)
{
    return do_make_algo_scalar(_parser, _state, _eval, _continue, _op);
}
First, include the code (from the do directory). Then define the make_xxx function from the do_make_xxx function.
Of course, such a construct is pointless here, as you could perfectly well call the do_make_xxx function directly in the main. However, if you ever want to do separate compilation of some parts, you will need such a construct (see Lesson 4), so we have kept it here for consistency.
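
The idea behind this construct (a template implementation wrapped by an ordinary function for one known type) can be shown on a toy example that has nothing to do with EO. The names do_make_total and make_total are hypothetical and only mirror the do_make_xxx / make_xxx naming.

```cpp
#include <cassert>
#include <vector>

// Generic implementation, normally kept in a header
// (plays the role of do_make_xxx).
template <class T>
T do_make_total(const std::vector<T>& values) {
    T total = T();
    for (typename std::vector<T>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}

// Non-template wrapper for one known type (plays the role of make_xxx).
// Its body could live in a separately compiled .cpp file, so that code
// calling it only needs the declaration and does not recompile the template.
double make_total(const std::vector<double>& values) {
    return do_make_total(values);
}
```

In a single file the wrapper adds nothing, as noted above; it pays off only once the wrapper is moved to its own object file.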

Go to your application directory and look at the differences between the two main files: you'll see how this is handled in each case.

Reducing compilation time:
However, we also provide another main file (OneMaxLibEA.cpp) that only includes the code that is specific to your application, and is supposed to be linked with another object file that will contain the code that is representation-independent (make_OneMax.cpp). This is done by running make OneMaxLibEA on the command-line.
For example, on a PentiumIII 400MHz with g++ 2.96, compiling OneMaxEA takes about 33s, and compiling both make_OneMax.o and OneMaxLibEA takes about 54s, but compiling only OneMaxLibEA takes only 14s when make_OneMax.o is up-to-date ...

Hints:
While developing the genotype structure itself in file eoOneMax.h, you should use the OneMaxEA.cpp file. But after the eoOneMax.h file is frozen, you should use the OneMaxLibEA.cpp file. Of course, both resulting programs are strictly identical!



Marc Schoenauer