depouille_et_compare Software


The depouille_et_compare software post-processes the files produced by computer experiments. An experiment here means a series of « runs », each of which has produced one text file made of columns of real values. One of the columns generally represents some cost (e.g. time, number of function evaluations, etc.), and another one represents some quantity of interest to the programmer that varies during one run. Both column numbers can be given as parameters to depouille.

Running depouille on the files of several runs generates a series of new files containing simple statistics over these runs. For each value of the cost, the corresponding values of the quantity of interest in the different runs form a statistical sample, for which depouille computes the average, the median, the min and max values, as well as the standard deviation. All these files are ready to be plotted using any reasonable plotter (e.g. gnuplot). Note that the run files do not need to share the same values for the cost: depouille assumes that the quantity of interest stays constant between two consecutive recorded values of the cost, and hence reconstructs its value at regular cost steps (the step is a parameter of depouille).
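The reconstruction at regular cost steps can be illustrated with a short Python sketch. It is not the code of depouille: the function names, the grid starting at cost 0, the use of None before the first recorded cost, and the choice of the sample (n-1) standard deviation are all assumptions made for this illustration.

```python
from bisect import bisect_right
from statistics import mean, median, stdev


def resample_step(costs, values, step, max_cost):
    """Rebuild one run on a regular cost grid (hypothetical helper).

    The quantity is held constant between two recorded costs, so the value
    at grid cost c is the value of the last record whose cost is <= c.
    Assumption: the grid starts at 0 and `costs` is sorted increasingly.
    """
    grid = list(range(0, max_cost + 1, step))
    resampled = []
    for c in grid:
        i = bisect_right(costs, c) - 1  # index of last record with cost <= c
        resampled.append(values[i] if i >= 0 else None)
    return grid, resampled


def summarize(sample):
    """Statistics of one sample, i.e. the values of all runs at one cost step."""
    return {
        "mean": mean(sample),
        "median": median(sample),
        "min": min(sample),
        "max": max(sample),
        "stdev": stdev(sample),  # assumption: n-1 denominator
    }


# Two made-up runs recorded at different cost values.
run_a = ([0, 30, 120], [5.0, 3.0, 1.0])
run_b = ([0, 60, 90], [6.0, 4.0, 2.0])

grid, a = resample_step(*run_a, step=50, max_cost=100)
_, b = resample_step(*run_b, step=50, max_cost=100)

# One sample per cost step, gathering the values of all runs at that step.
stats = [summarize(sample) for sample in zip(a, b)]
for cost, s in zip(grid, stats):
    print(cost, s)
```

Although the two runs were never recorded at the same costs, each grid cost gets one value per run, which is what makes the per-step statistics possible.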

After running depouille on 2 different experiments (involving of course the same quantity of interest), the compare program runs different statistical tests on the 2 sets of files (using the results of depouille). These tests show, for each value of the cost, whether one experiment is significantly better than the other (at the 1%, 5% and 10% levels). Of course, both experiments need to have used the same cost step within depouille. The available statistical tests are the Wilcoxon, the signed-Wilcoxon and the Kolmogorov tests.

Each test tests the hypothesis
H0 : both samples come from the same distribution
against the hypothesis
H1 : both samples do not come from the same distribution.
Each test returns 0 (H0 is accepted) or 1 (H0 is rejected).
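To make the 0/1 convention concrete, here is a minimal Python sketch of a two-sided Wilcoxon rank-sum test using the normal approximation. It is a textbook version written for this illustration, not the implementation used by compare: the function name, the alpha parameter, and the absence of a tie correction in the variance are assumptions.

```python
import math


def rank_sum_decision(x, y, alpha=0.05):
    """Return 0 if H0 is accepted, 1 if H0 is rejected (hypothetical helper).

    Two-sided Wilcoxon rank-sum test with the normal approximation,
    reasonable for samples of about 10 values or more.
    """
    # Pool both samples, remembering which sample each value comes from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign ranks 1..N, giving tied values the average of their ranks.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)

    # Mean and standard deviation of W under H0 (no tie correction).
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return 1 if p_value < alpha else 0


interleaved_a = list(range(1, 40, 2))      # 1, 3, ..., 39
interleaved_b = list(range(2, 41, 2))      # 2, 4, ..., 40
shifted = [v + 100 for v in interleaved_a]

print(rank_sum_decision(interleaved_a, interleaved_b))  # similar samples
print(rank_sum_decision(interleaved_a, shifted))        # clearly different samples
```

In compare, such a decision is taken independently at each cost step, on the two samples built by depouille for that step.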

In what follows, the lines starting with bash$ are commands to be typed at your terminal (without the bash$ prompt).


Installation

Get the gzipped tar file containing the source files: depouille_et_compare.tgz.
Extract everything from it with the command:
bash$ tar xfvz depouille_et_compare.tgz

Once done, go into the directory depouille_et_compare and type:
bash$ make

This should compile the programs and create the 2 executables described below.


Usage

Before launching depouille, you need to create a file containing the list of the files of all « runs » of the experiment you wish to analyze. For instance, if all these files are in the directory RESULTS with suffix .xg, you can create the experiment file by running:

bash$ ls RESULTS/*.xg > listRESULTS
You can then run depouille with:

bash$ path_to_depouille_et_compare/bin/depouille listRESULTS cost_interval cost_column quantity_column

Here cost_interval is the step for the cost, and cost_column (resp. quantity_column) is the number of the column where the cost (resp. the quantity of interest) is written in the run files. The last 3 parameters are optional; their default values are 50, 1 and 2.

Example:
bash$ /home/smith/depouille_et_compare/bin/depouille listRESULTS 50 2 4
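
The role of the two column parameters can be sketched in Python as follows. This is only an illustration, not the parser of depouille: the function name is hypothetical, and the sketch assumes that fields are separated by whitespace, that columns are numbered from 1 (as the default values 1 and 2 suggest), and that lines which cannot be parsed are skipped.

```python
def read_columns(lines, cost_column=1, quantity_column=2):
    """Extract (cost, quantity) pairs from the lines of one run file.

    Assumptions: whitespace-separated fields, columns numbered from 1,
    unparsable lines (blank lines, comments, headers) silently skipped.
    """
    pairs = []
    for line in lines:
        fields = line.split()
        try:
            cost = float(fields[cost_column - 1])
            quantity = float(fields[quantity_column - 1])
        except (IndexError, ValueError):
            continue  # not a data line
        pairs.append((cost, quantity))
    return pairs


# Made-up content of a run file with 4 columns.
run = [
    "0 0 0.00 12.5",
    "1 50 0.02 9.75",
    "# comment",
    "2 100 0.05 7.0",
]

# Same column choice as in the example above: cost in column 2, quantity in column 4.
pairs = read_columns(run, cost_column=2, quantity_column=4)
print(pairs)
```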

This is to be done for each experiment you wish to analyze.

A directory named Resultats_depouille is created in the current directory, and all results of depouille are stored there, i.e.:
- listRESULTS.echantillons : a CSV file with all samples at the given cost steps; first column is the cost, other columns the values at this cost for all runs;
- listRESULTS.moy : at each cost step (first column), the average value of the quantity of interest (second column);
- listRESULTS.med : same, with the median value;
- listRESULTS.max : same, with the max value;
- listRESULTS.min : same, with the min value;
- listRESULTS.sig : same, second column is standard deviation of the samples at this cost step;
- listRESULTS.conf : gives the confidence interval (one sigma).

The second part is the comparison of 2 experiments. Both experiments must have been analyzed using depouille with the same cost interval. If this is the case, type in:

bash$ path_to_depouille_et_compare/bin/compare fichier1 fichier2 test

Here fichier1 and fichier2 are the two files containing the lists of run files that have been handled by depouille earlier, and test is a single character indicating which statistical test you wish to run: w for Wilcoxon, s for signed Wilcoxon, and k for Kolmogorov. If no test argument is given, all 3 tests are run. In all cases, a 2-tailed Student test is also performed on the averages of both samples.

Example:
bash$ /home/durand/depouille_et_compare/bin/compare listRESULTS1 listRESULTS2 s
for a signed Wilcoxon test.
A directory named Resultats_compare is created in the current directory, where the results of compare are stored, i.e.:
- listRESULTS1_listRESULTS2.student : gives the results of the Student tests, as well as the threshold values for the 1%, 5% and 10% levels; H0 is accepted if the value of the test is inside the corresponding interval (of course, the largest interval corresponds to the 1% level, and the smallest to the 10% level);
- listRESULTS1_listRESULTS2_test.r01 : gives the results of the corresponding statistical test at the 1% level, where 0 means H0 is accepted, 1 that it is rejected;
- listRESULTS1_listRESULTS2_test.r05 : same at the 5% level;
- listRESULTS1_listRESULTS2_test.r10 : same at the 10% level.


Example

In the depouille_et_compare 1.0 distribution, there is a directory named exemple (WARNING, French spelling :-)
In this directory, the subdirectory res1 contains the results of some Genetic Algorithm runs: 50 runs with mutation probability 0.1, and 50 runs with mutation probability 0.6. Each run file contains the output of the GA program, with several columns containing in turn the generation counter, the evaluation counter, the CPU-time counter, the best fitness in the population, the average fitness in the population, and the standard deviation of the fitness in the population. The files are named best_average_{PPP}_{RR}.xg where PPP is the value of the mutation probability that was used for this run, and RR the number of the run. The files pm{PPP} contain the corresponding lists of run files.
Running, in directory exemple/res1,

../../bin/depouille pm{PPP}
will create the corresponding files in the subdirectory Resultats_depouille. After both the 0.6 and 0.1 experiments have been processed by depouille, compare can be run with

../../bin/compare pm{0.6} pm{0.1}
This will create the directory Resultats_compare.
Some calls to gnuplot will then generate the figures that are stored as jpg files in directory exemple.