Jcf's entry to the ICFP 2001 Programming Contest (entry 080)

This page quickly describes my entry to the ICFP Programming Contest 2001.

Overview

As I was on vacations by the time of the contest, I entered the contest as a one-man-team. For the same reason, I only worked at night (roughly 3 hours each night), not to bother my wife and daugther. Thus my program is surely not going to compete with codes written by teams having worked full time on the subject; but I had fun participating, which is the most important.

My code is entirely written in OCaml and is 500 lines long (including some additional utility programs). The code is documented with a literate programming tool for Objective Caml, ocamlweb (see the PostScript file below).

How it works

The initial document is read with a lex+yacc parser (a full lex parsing is possible, and probably faster, but this was easier). This document is kept and will be output as the solution if the timeout is reached before the optimization is computed, or if the optimization is not better.

Spaces are compressed by the lexer (outside of TT environments). Thus the document kept as the ``original'' one is already smaller (in most cases) than the given one. If we are forced to return this one, at least spaces have been compressed.

Then the document is interpreted as a stream of characters decorated with contexts, following the specification given in the task. Originally, I chose to proceed by successive transformations of the document, but it appeared to be really easier (and probably faster) to work on the interpretation.

Finally, a new document is re-built from this stream, using this simple algorithm:

In particular, my code never looks ahead in the interpretation; it is only performing local operations. It neither uses the PL environment.

Defensive programming

Afraid of being eliminated for Stupidity or Incorrectness, I spent quite a lot of time having a robust and correct code.

Regarding robustness, the program keeps the original document. When the program is launched, an alarm is set 5 seconds before the end of the timeout, with Unix.alarm; if triggered, it will output the original document. Similarly, if the optimization produces some too big document, the original one is given back.

Regarding Correctness, I setup an automatic benchmark of my program. For this purpose, I first wrote a validator (using the jury's one would have necessitated too many dialup connections :-), a program to dump the interpretation of a document and a program to generate random documents of arbitrary size. Using these programs, I could test my program heavily, and I did find bugs (which I could fix easily using the dumping program).

I also insisted on the running time, although it is not the first criterion which will be used by the jury. In particular, I tried to write only tail-recursive functions (and also because I was afraid begin eliminated for a Stack Overflow uncaught exception :-) For better performances, files manipulating streams are pre-processed with Camlp4. It really improved the running time. Thank you, Daniel!

The code

The code is available, as a compressed tarball icftp-2001.tar.gz. Once decompressed, go in directory source, configure with ./configure and compile with make. You can run make check to see the program running on my own benchmark files.

You can also read the sources in a nicer way, looking at the PostScript document generated from the sources (with ocamlweb).


Jean-Christophe.Filliatre[at]lri.fr (formatted with yamlpp).