Adaline, Perceptron and Backpropagation

Introduction

Single-layer neural networks can be trained with several supervised learning algorithms. The best known are Adaline, Perceptron and Backpropagation. The first two apply only to single-layer networks, while the third generalizes to multi-layer perceptrons.

Credits

The applet was written by Olivier Michel. This page was written by Alix Herrmann.

Presentation

Let's consider a single-layer neural network with b inputs and c outputs:

W_ij = weight from input i to unit j in the output layer; W_j is the vector of all the weights of the j-th neuron in the output layer.
I^p = input vector (pattern p) = (I₁^p, I₂^p, ..., I_b^p).
T^p = target output vector (pattern p) = (T₁^p, T₂^p, ..., T_c^p).
A^p = actual output vector (pattern p) = (A₁^p, A₂^p, ..., A_c^p).
g() = sigmoid activation function: g(a) = [1 + exp(-a)]^(-1)
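
As a rough illustration of this notation, the sketch below computes the actual output vector A^p from an input vector I^p. It assumes that every output unit applies g to the weighted sum of its inputs and that there is no bias term, since none is defined above. The function names sigmoid and forward are made up for this example and do not come from the applet.

```python
import math


def sigmoid(a):
    """Sigmoid activation g(a) = [1 + exp(-a)]^(-1)."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(W, I):
    """Return the actual output vector A for one input pattern I.

    W[i][j] is the weight from input i to output unit j, so W has
    b rows (one per input) and c columns (one per output unit).
    Assumption: no bias term, output j is g(sum_i W[i][j] * I[i]).
    """
    b = len(W)
    c = len(W[0])
    return [sigmoid(sum(W[i][j] * I[i] for i in range(b))) for j in range(c)]


# Example with b = 2 inputs and c = 2 outputs.
W = [[0.0, 1.0],
     [0.0, 1.0]]
A = forward(W, [1.0, 1.0])
```

With all weights into a unit set to zero, its output is g(0) = 0.5 whatever the input, which is why the first component of A in the example is 0.5.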

Theory

Click on each topic to learn more. Then scroll down to the applet.

Applet

This applet allows you to compare the different learning algorithms. The network implemented here has two inputs and a single output neuron. In this tutorial, you will train it to classify 2-dimensional data points into two categories.
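
To make the comparison more concrete, here is a minimal sketch of one training step for three of the algorithms on a network with two inputs and one output. These are the usual textbook forms of the rules, not the applet's source code, which is not shown on this page. The bias weight w[0], the hard threshold at zero for the perceptron, the use of 0/1 targets with a linear Adaline output, and all function names are assumptions made for illustration.

```python
import math


def sigmoid(a):
    """Sigmoid activation g(a) = [1 + exp(-a)]^(-1)."""
    return 1.0 / (1.0 + math.exp(-a))


def net_input(w, x):
    """Weighted sum for weights w = [bias, w1, w2] and point x = (x1, x2)."""
    return w[0] + w[1] * x[0] + w[2] * x[1]


def _apply(w, x, step):
    """Return new weights after adding step * (1, x1, x2) to w."""
    return [w[0] + step, w[1] + step * x[0], w[2] + step * x[1]]


def perceptron_step(w, x, t, eta):
    """Perceptron rule: update only when the thresholded output is wrong."""
    y = 1 if net_input(w, x) > 0 else 0
    return _apply(w, x, eta * (t - y))


def adaline_step(w, x, t, eta):
    """Adaline (delta) rule: the error is measured on the linear output."""
    return _apply(w, x, eta * (t - net_input(w, x)))


def backprop_step(w, x, t, eta):
    """Gradient step on squared error for a single sigmoid unit."""
    a = sigmoid(net_input(w, x))
    # a * (1 - a) is the derivative of the sigmoid at the net input.
    return _apply(w, x, eta * (t - a) * a * (1.0 - a))
```

In this sketch the three rules differ only in how the error term is computed: from the thresholded output, from the raw weighted sum, or from the sigmoid output scaled by its slope. The learning rate eta multiplies the correction in every case.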

Click here to see the instructions. You may find it helpful to open a separate browser window for the instructions, so you can view them at the same time as the applet window.

Questions

Ideal case: Place 10 red points (class 1) and 10 blue points (class 0) in two similar, distinct, and linearly separable clusters.

Compare the speed of convergence of the four algorithms. Which one is the fastest?
Which values of the learning rate provide the best results?

Different cluster dispersions: Place 20 red points (class 1) in a very narrow cluster (strongly correlated points) and 5 blue points (class 0) in a very wide cluster, in such a way that the classes are linearly separable.

Compare the performance of the four algorithms on this problem. Which one is the best?
Which values of the learning rate provide the best results?

Imperfectly separable case: Place 10 red points (class 1) and 10 blue points (class 0) in two similar, linearly separable clusters. Then place an additional blue point inside the red cluster.

Compare the behavior of the perceptron with the behavior of the pocket algorithm.
Which values of the learning rate give the best results?
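
For reference when answering, the sketch below shows one common simplified form of the pocket algorithm: ordinary perceptron updates, with a copy of the weights that made the fewest training errors so far kept "in the pocket". It is not the applet's implementation. The bias weight, the random choice of training point, the error-count criterion, and the names predict, count_errors and pocket_train are assumptions made for illustration.

```python
import random


def predict(w, x):
    """Thresholded output for weights w = [bias, w1, w2] and point x."""
    return 1 if w[0] + w[1] * x[0] + w[2] * x[1] > 0 else 0


def count_errors(w, data):
    """Number of (point, target) pairs in data that w misclassifies."""
    return sum(1 for x, t in data if predict(w, x) != t)


def pocket_train(data, eta=0.1, steps=200, seed=0):
    """Run perceptron updates and return the best weights seen.

    Returns (pocket, pocket_errors), where pocket is the weight vector
    with the lowest training error count encountered during the run.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]
    pocket, pocket_errors = list(w), count_errors(w, data)
    for _ in range(steps):
        x, t = rng.choice(data)
        err = t - predict(w, x)
        if err != 0:
            # Standard perceptron update on a misclassified point.
            w = [w[0] + eta * err,
                 w[1] + eta * err * x[0],
                 w[2] + eta * err * x[1]]
            e = count_errors(w, data)
            if e < pocket_errors:
                # Better than anything seen so far: put it in the pocket.
                pocket, pocket_errors = list(w), e
    return pocket, pocket_errors


# Two clusters plus one class 0 point placed inside the class 1 cluster.
data = [((2.0, 2.0), 1), ((2.5, 1.5), 1), ((1.5, 2.5), 1),
        ((-2.0, -2.0), 0), ((-2.5, -1.5), 0), ((-1.5, -2.5), 0),
        ((2.0, 1.8), 0)]
pocket, pocket_errors = pocket_train(data)
```

Because the stray point makes the data not linearly separable, the plain perceptron weights w keep changing, while the pocket weights can only be replaced by a set that makes fewer errors on the training points.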

For which kind of problem is the Adaline algorithm the best?
For which kind of problem is the Backpropagation algorithm the best?
For which kind of problem is the Perceptron algorithm the best?
For which kind of problem is the Pocket algorithm the best?