Lecture 6

Neural Networks

Consider the plane shown above; this can be used to make decisions about patterns in the x-y plane. For every point (x, y) in the plane there is a corresponding value of z such that (x, y, z) lies on the angled plane. If z is negative then the point lies above the dotted line; otherwise it lies below it.

The equation for z is linear in x and y:

z = w1 x + w2 y + θ

We can generalise this to more dimensions:

o = Σj wj ij + θ

Here o is the output, the wj are the weights and the ij are the inputs.
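A minimal sketch of this weighted-sum decision rule; the weights, bias and test point below are made-up values, not from the lecture:

```python
# Weighted sum o = sum_j w_j * i_j + theta, used as a linear decision rule.
def linear_output(weights, inputs, theta):
    return sum(w * x for w, x in zip(weights, inputs)) + theta

# Hypothetical plane z = 1.0*x - 1.0*y + 0.5 over the x-y plane.
w = [1.0, -1.0]
theta = 0.5

# The sign of z decides which side of the line a point falls on.
point = (2.0, 1.0)
z = linear_output(w, point, theta)
print("above" if z < 0 else "below")  # negative z means "above" the line
```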

This can be drawn as shown below.

Generally we don't care about the size of the output, since it will be a yes/no answer for a class; so instead we use

o = σ(a), where a = Σj wj ij + θ

The function in the box is called the sigmoid function and is:

σ(a) = 1 / (1 + e^(−a))
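The sigmoid can be sketched in a couple of lines; it squashes any real-valued weighted sum into the range (0, 1):

```python
import math

def sigmoid(a):
    # Squashes any real number into (0, 1); 0.5 at a = 0.
    return 1.0 / (1.0 + math.exp(-a))

print(sigmoid(0.0))  # 0.5: the decision boundary
```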

If t is the expected output then there will be an error between t and o; a convenient measure is e = ½(t − o)². To train the system we need to minimise this error. For a number of these units the total error is the sum of the individual errors.

To train the classifier, adjust each weight wj so that e becomes smaller.
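A sketch of the error measure, using ½(t − o)² per unit and summing over units; the target and output values here are made up for illustration:

```python
# Squared error for one unit, and the summed error over several units.
def unit_error(t, o):
    return 0.5 * (t - o) ** 2

targets = [1.0, 0.0, 1.0]   # hypothetical expected outputs t
outputs = [0.9, 0.2, 0.6]   # hypothetical actual outputs o
total = sum(unit_error(t, o) for t, o in zip(targets, outputs))
print(total)  # sum of the individual errors
```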

Gradient Descent

Now, with e = ½(t − o)² and a = Σj wj ij + θ, the chain rule gives

∂e/∂wj = (∂e/∂o)(∂o/∂a)(∂a/∂wj)

where

∂e/∂o = −(t − o),   ∂o/∂a = o(1 − o),   ∂a/∂wj = ij

therefore

∂e/∂wj = −(t − o) o(1 − o) ij

Weight updating rule:

Δwj = η (t − o) o(1 − o) ij

where η is a parameter used to change the speed of gradient descent (the learning rate). Note that because we want to reduce the error we step against the gradient, so the minus sign disappears.

For theta the input is effectively fixed at 1, so ∂a/∂θ = 1 and therefore

Δθ = η (t − o) o(1 − o)
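The weight and theta updates can be applied together in one training step. A sketch under the definitions above; the learning rate and data values in the usage are illustrative, not from the lecture:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_step(w, theta, inputs, t, eta):
    # Forward pass: o = sigmoid(sum_j w_j i_j + theta)
    o = sigmoid(sum(wj * ij for wj, ij in zip(w, inputs)) + theta)
    delta = (t - o) * o * (1 - o)          # the "delta" of the delta rule
    w = [wj + eta * delta * ij for wj, ij in zip(w, inputs)]
    theta = theta + eta * delta            # theta's input is fixed at 1
    return w, theta, o
```

One step with zero weights, inputs (1, 1) and target 1 nudges the output above 0.5, i.e. towards the target.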

This device is sometimes called a perceptron. The training rule is called the delta rule.

Perceptron Learning Algorithm
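The algorithm simply repeats the delta-rule update over the training set. A hedged sketch of such a training loop; the AND-gate data, learning rate and epoch count are my own illustrative choices:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_perceptron(data, eta=0.5, epochs=10000):
    # data: list of (inputs, target) pairs
    n = len(data[0][0])
    w = [0.0] * n
    theta = 0.0
    for _ in range(epochs):
        for inputs, t in data:
            o = sigmoid(sum(wj * ij for wj, ij in zip(w, inputs)) + theta)
            delta = (t - o) * o * (1 - o)      # delta rule
            w = [wj + eta * delta * ij for wj, ij in zip(w, inputs)]
            theta += eta * delta
    return w, theta

# Logical AND is linearly separable, so a single unit can learn it.
and_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0),
            ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]
w, theta = train_perceptron(and_data)
```

Thresholding the trained unit's output at 0.5 reproduces the AND truth table.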

The Multilayer Perceptron

This perceptron can only learn simple problems: those where the classes are linearly separable. A problem such as XOR cannot be solved by a single unit, which motivates networks with more than one layer.
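XOR is the classic example of a non-linearly-separable problem: no single line splits its two classes. A quick brute-force check (my own illustration) confirms that no linear unit on a grid of weights reproduces the XOR truth table:

```python
import itertools

def predicts_xor(w1, w2, theta):
    # Does the linear rule "w1*x + w2*y + theta > 0" match XOR on all inputs?
    table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
    return all((w1 * x + w2 * y + theta > 0) == bool(t)
               for (x, y), t in table.items())

# Search a grid of weight/threshold values: none works, because XOR
# is not linearly separable (this holds for all real weights, not just the grid).
grid = [i / 4 for i in range(-8, 9)]
found = any(predicts_xor(w1, w2, th)
            for w1, w2, th in itertools.product(grid, repeat=3))
print(found)  # False
```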