
ALGORITHMIC COMPLEXITY AND COMMUNICATION PROBLEMS

INTRODUCTION

The theory of complexity enables problems to be classified according to the difficulty with which they can be solved by the use of algorithms. Let us begin with a few definitions.

An algorithm is a finite representation of a computational method of solving a problem. It has the form of a finite length text written in an ad hoc language, where each "phrase" has an unambiguous meaning. This text describes computations that can be carried out in a finite time, using a mechanical computing procedure with a memory, on some data: the input or instances of the problem, properly coded. Such a coded input is a finite sequence of symbols from a given alphabet (i.e., a word).

The difficulty of a problem P can be measured by the time taken by an algorithm to solve it, or by the amount of memory space this algorithm requires. There are several ways to measure this difficulty. The worst-case complexity corresponds to the maximum time (or the maximum memory space) needed by a "best algorithm" (according to this criterion) to solve P. The average complexity is the "average" time taken by a best (in terms of average time) algorithm to solve P.
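
To make the distinction concrete, here is a small illustrative example (ours, not the book's) in Python: sequential search in an unsorted list of length n makes n comparisons in the worst case, but about n/2 on average when the item sought is equally likely to occupy any position.

    def sequential_search(items, target):
        """Return the index of target in items, or -1 if it is absent."""
        for i, x in enumerate(items):
            # Worst case: the loop runs len(items) times (target absent or last).
            # Average case: about half the list is examined before the early exit.
            if x == target:
                return i
        return -1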

The problems considered here come from communication theory, specifically from error-correcting codes, cryptography and vector quantization. They illustrate two aspects of complexity: for codes - compact representations of messages to be transmitted over a channel - the theory makes it possible to discuss the efficiency of coding and decoding methods; in cryptography - the art of enciphering messages for security reasons - the theory provides methods that are strongest when they rest on difficult problems.

Many models of mechanical computing procedure have been designed. Surprisingly, they are all equivalent, since whatever is computable in one of them can also be computed in the others. This is Church's very well-known thesis: whatever is intuitively computable by an algorithm is computable in any formal computation model. In studying complexity theory, we have followed the classical path, taking the Turing machine as a model. This machine has one advantage: it is both a model for a computer program (though far from an actual program) and a clock that measures the execution time of such a program (through the notion of universal machine, it can also serve as a model of a computer, but that is another story ...). It leads to a simple definition of the complexity C(A,n) of an algorithm A: for any integer n, C(A,n) is the longest execution time of A on a word of length n.
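
In symbols, writing $T_A(w)$ for the execution time of A on the input word w (the notation is ours), this definition reads $C(A,n) = \max \{ T_A(w) : |w| = n \}$.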

In 1964, Cobham introduced the class P of the problems that are solvable in polynomial time, i.e., problems solvable by an algorithm A satisfying: for any n, C(A,n) is less than p(n), where p is a given polynomial (A is then said to run in polynomial time). This is the class of "easy" problems. There are difficult problems (also called "exponential"), and, of course, problems that are not solvable by any algorithm (improperly called undecidable). This simplicity did not last very long. In 1965, alongside the class P, Edmonds introduced the class NP, and he was the first to conjecture that P is not equal to NP. Intuitively, the class NP corresponds to those problems that admit an algorithmic "proof" of reasonable length whose validity can be verified in polynomial time. The subtlety here is that, unlike the case of class P, the proof is assumed to be given. For instance, given an integer m and an integer q, it can be verified in polynomial time whether q does - or does not - divide m (such a q is a "proof" of the non-primality of m). However, no algorithm is known that runs in time polynomial in log(m) (the length of the binary word representing m) and establishes the non-primality of m. Edmonds' conjecture was then formalised by Cook [1971], and more thoroughly studied by Karp [1972]. Cook's idea was that in the class NP there exist problems that are "more difficult" than the others, the NP-complete problems. It would be sufficient for a single NP-complete problem to be in P to conclude that P = NP. Edmonds' conjecture was the first of many concerning a whole hierarchy of complexity classes.
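
The divisor example can be rephrased as a verification procedure: given m together with a candidate "proof" q, checking the proof is easy even though finding one may not be. The following sketch (ours, purely illustrative) is such a polynomial-time verifier, written in Python.

    def verify_composite(m, q):
        """Check that q is a witness of the non-primality of m.
        The test takes time polynomial in the length of m written in binary."""
        return 1 < q < m and m % q == 0

    # Example: q = 3 certifies that 15 is not prime; q = 4 does not.
    assert verify_composite(15, 3)
    assert not verify_composite(15, 4)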

The main aim of this book is to study the NP-complete problems that occur in coding, cryptography and vector quantization. The first three chapters deal fairly completely with the theory of algorithmic complexity.

In chapter I we present concepts that are independent of a particular model of computation. We describe some problems of which we shall make much use later on. We discuss the method for coding instances of a problem (problems and languages) and we give structural properties for classes of languages, such as completeness. We set out the concept of transformation (how to transform a problem P into a problem P' whose solution is known). We then show, in an intuitive way, some concepts that will be formalised later on (decidable versus undecidable problems, polynomial problems, nondeterministic polynomial problems). In this book, we have decided to precede or follow strict descriptions by informally presented information to help the reader to build up his intuition.

The computation model is set out in chapter II (Turing machines). We explain the notion of recursive language (which corresponds to decidable problems), we define complexity, and we classify languages and problems according to their complexity. Rather than the classical way of defining the class NP (nondeterministic Turing machines), we have preferred the more intuitive idea mentioned above: such a problem admits a "proof" that can be written - as a function of the size of the statement - with only a polynomial number of symbols and that can be validated by a machine in polynomial time. We then study the structure of the class NP and give the proofs of NP-completeness for most of the problems presented at the beginning of chapter I. Next we discuss the existence, among the difficult problems, of slightly easier ones (pseudo-polynomial, such as the knapsack problem) and slightly harder ones (strongly NP-complete, such as the travelling salesman problem).
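
To illustrate what pseudo-polynomial means, here is a sketch (ours, using the knapsack in its simplest subset-sum form) of the classical dynamic programme: it takes a number of steps proportional to n times the target value, which is polynomial in the numeric value of the target but exponential in the length of its binary representation.

    def subset_sum(weights, target):
        """Decide whether some subset of weights sums exactly to target.
        Runs in O(len(weights) * target) steps: pseudo-polynomial time."""
        reachable = {0}
        for w in weights:
            reachable |= {s + w for s in reachable if s + w <= target}
        return target in reachable

    print(subset_sum([3, 34, 4, 12, 5, 2], 9))  # True, since 4 + 5 = 9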

While the first two chapters deal with decision problems (problems whose solution is the answer 'Yes' or the answer 'No', e.g., "Is 4578911 prime?"), chapter III sharpens the notion of difficulty by extending it to search and optimization problems. It also widens what has been constructed with P and NP by introducing the polynomial hierarchy. Finally, we discuss a few variations on the concept of complexity: complexity in space, complexity of enumeration problems, probabilistic complexity and relativisation (concepts such as communication complexity, the class NC or Kolmogorov complexity are not considered).

Chapter IV deals with error-correcting codes, which were created in 1948 (the famous "Second Theorem" of Shannon). The first work in this field was of a combinatorial nature (Hamming codes in 1949); the Golay codes, for instance, were discovered after a remarkable arithmetic identity had been observed. The theory took an algebraic turn in 1958 with the BCH codes, and an algorithmic one with their decoding (Berlekamp-Massey, 1966). These codes are frequently used (Reed-Muller codes for the Mariner spacecraft, Reed-Solomon codes for digital discs, etc.). Research in coding uses sophisticated tools: Galois fields, number theory and, more recently, algebraic geometry.

In this chapter, we have departed from our general rule and given the algorithms; otherwise we would have said all there is to say by stating that the decoding of linear codes is NP-complete. The appendix contains a short presentation of the algebraic decoding of binary cyclic BCH codes, a special case of linear codes.

The amount of space devoted to the covering radius, the fourth "fundamental parameter" of a code after length, dimension and minimum distance, is greater than is usual in books on coding. There are several reasons for this:
- this book is not a book on coding,
- it enables us to exhibit a $\Pi_2$-complete problem,
- it is a favourite subject of most of the authors.

In chapter V we demonstrate the links between complexity and cryptography. Since this is not a book on cryptography either, the authors favour public-key cryptography (created around 1976). Thus, after a general introduction to cryptography and a more specific introduction to public-key cryptography, the reader will find descriptions of three famous cryptosystems of this type. Their mechanisms are based on three problems introduced in the first chapter: factorization, linear decoding and the knapsack problem. Each description contains a statement of the enciphering/deciphering scheme, a discussion of the security of the system and some specific remarks. The reader familiar with works on cryptography will notice that we have greatly emphasised the knapsack problem, which lends itself to several, mainly combinatorial, manipulations. Finally, we mention the recent (around 1985) zero-knowledge authentication protocols. Here again we use a problem from the first chapter - the k-colouring of a graph - together with the problem of factorization mentioned previously.

The aim in chapter VI is to use the tools described in the first three chapters to deal with the problem of vector quantization; in doing this, we somewhat distort the original setting of the problem. Thus, after recalling a few basic statements of information theory (i.e., source coding), we arrive at a very simple situation: we wish to represent a binary source in a compact way, subject to a Hamming distortion criterion. We can then see a parallel between vector quantization and the decoding of an error-correcting code. The chapter ends by showing that the general problem of quantization is NP-complete: bad news which, as for coding, has mainly theoretical consequences.
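
As a purely illustrative sketch (our example, not taken from the book), quantizing a binary word with a given codebook amounts to replacing it by the nearest codeword in the Hamming sense, exactly as a decoder would; the distortion is the number of positions in which the two words disagree.

    def hamming(u, v):
        """Hamming distance between two binary words of equal length."""
        return sum(a != b for a, b in zip(u, v))

    def quantize(word, codebook):
        """Return the codeword that minimises the Hamming distortion of word."""
        return min(codebook, key=lambda c: hamming(word, c))

    # Toy codebook of two length-4 words.
    print(quantize("0111", ["0000", "1111"]))  # -> "1111", distortion 1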

Readers are assumed to have an undergraduate knowledge of algebra (groups, rings, fields, vector spaces, linear algebra). Some basics of graph theory would also be useful, but in any case the fundamental concepts are given at the beginning.
