Neural Computing
You don't need brains, but you must be well connected.
--------------------------------------------------------------------------------
------------------------------------------------
A neural network is an interconnected system of layers of nodes -- or neurons, as the nodes would be called in brain science. It has an input layer, an output layer, and intervening layers, possibly with feedback and feed-forward loops as well. Its function is to receive a collection of inputs (an input vector) and transform them into an output, which can be either a vector or a scalar.
The network transforms the inputs into intermediate values and projects them to the output. The inputs are connected to each node of the intermediate layer of nodes. The tiny circles in this schematic identify connecting points between connecting lines. Each of the four inputs therefore distributes to all of the intermediate nodes.
Each connection is weighted according to its importance in the transformation. The weightings are the products of prior training of the network and are presumed fixed for the transformation. The weights can be updated at any time to meet changing conditions of the problem for which the net was designed.
----------------------------------------------
The neural network computes by transforming inputs to outputs. The conversion is a general function that can be written in vector/matrix notation, which makes it a tensor operation.
You should know that a vector is an n-tuple of values, each of which defines a component of the vector. It is usually shown in the form:
In = (a1, a2, ..., an)
The n-tuple defines a point in an n-space. If n equals 2, for example, we have a vector in a plane surface. The vector is a point in the surface and has two components, one in each dimension of the plane, and its value is defined by the values of the two components. In the above neural network, there are four input values, so the input vector is a 4-vector and would be represented spatially as a point in a 4-space.
The four input components are connected to the three intermediate components by means of the weighted connecting lines, and the full set of these weights is identified mathematically as a matrix. In fact, it is a 4x3 matrix, with the four input components indexing the rows and the three intermediate components indexing the columns.
Each combination of weights -- which generates its own matrix -- is one member of a huge population of combinations, and a training exercise draws on a large sample of that population to establish the transformation.
This matrix is multiplied on the left by the input vector, I4, to yield:
I4*M = O3
where I4 = (a, b, c, d) and O3 = (x, y, z).
The final output is then usually the sum of the x, y, and z outputs:
Of = x + y + z.
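The computation above can be sketched in a few lines of Python. This is only an illustration: the weight values and the input values are made up, and the helper name forward is hypothetical. The weights are stored as four rows of three, which is the shape the product I4*M needs when the input vector multiplies the matrix on the left.

```python
def forward(inputs, weights):
    """Row vector times matrix: one output component per matrix column."""
    n_out = len(weights[0])
    return [
        sum(a * row[j] for a, row in zip(inputs, weights))
        for j in range(n_out)
    ]

# Hypothetical weights: row i holds the three weights leaving input i.
M = [
    [0.5, -1.0, 0.0],
    [1.0, 0.5, 2.0],
    [0.0, 1.0, -0.5],
    [2.0, 0.0, 1.0],
]
I4 = [1.0, 2.0, 3.0, 4.0]   # (a, b, c, d), made-up values

O3 = forward(I4, M)         # (x, y, z)
Of = sum(O3)                # final output: x + y + z

print(O3, Of)               # [10.5, 3.0, 6.5] 20.0
```

Each entry of O3 is the weighted sum of all four inputs along one column of M, which is what it means for every input to distribute to every intermediate node.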
----------------------------------------------
The optimum is in the training. No matter what structure you design, no matter what inputs you formulate, it's the training that establishes the best output for the given mode. When you train a net, your objective is to select, from the many possible combinations of weights for your network configuration, the one that gives you the best result. The optimum is usually obtained by minimizing a mean squared error function. However, this optimum may not be the best overall result. The product of training is largely conditioned by the starting weights and the approach taken from there to vary the weights to obtain the minimum. You may not be starting from the most advantageous point or choosing the best variations for the changes. Typically the process only determines a relative optimum, or more correctly a local optimum.
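To make the error function concrete, here is a minimal sketch of a mean squared error over a small training set, assuming a network whose final output is the sum of its three intermediate values. The training samples, both weight sets, and the function names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def net_output(inputs, weights):
    """Sum of the intermediate values produced by the weighted connections."""
    n_out = len(weights[0])
    intermediate = [
        sum(a * row[j] for a, row in zip(inputs, weights))
        for j in range(n_out)
    ]
    return sum(intermediate)

def mean_squared_error(samples, weights):
    """Average of the squared differences between net output and target."""
    errors = [(net_output(x, weights) - target) ** 2 for x, target in samples]
    return sum(errors) / len(errors)

# Hypothetical training samples: (input vector, desired output).
samples = [
    ([1.0, 0.0, 0.0, 0.0], 1.0),
    ([0.0, 1.0, 0.0, 0.0], 2.0),
]

# Two candidate weight combinations for the same network configuration.
weights_a = [
    [0.5, 0.25, 0.25],
    [1.0, 0.5, 0.5],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
weights_b = [[0.0, 0.0, 0.0] for _ in range(4)]

print(mean_squared_error(samples, weights_a))  # 0.0
print(mean_squared_error(samples, weights_b))  # 2.5
```

Training amounts to searching among such weight combinations for one with a small error; the sketch only scores two candidates and does not perform the search itself.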
To get the picture, imagine you're in Afghanistan and you have to find the lowest altitude of the country. This would be easy enough, of course, if you had a contour map of the country and could read every hill and valley. You would then simply look for the smallest altitude reading. But you don't have that advantage with a new network construction. You have essentially no information. So you have to start from scratch. And so you have to select your starting weights at random.
Building the network for Afghanistan, the chance selection could in fact put you at one of the topmost points in the country. From there, depending on the path you take, you might drop precipitously, but the valley you reach could still be miles high and well above the lowest point of the country. But of course you wouldn't know that. All you would know at this time is that you have found a lower altitude than the starting altitude. Remember that you don't have a map.
To try to reach an even lower altitude -- a better training result -- you can now start over, with a new set of weights, and repeat the training procedure. This time your starting point would likely be different than before, and your training path would likely also lead to a different result. But you won't know this until the error function has again been minimized.
This procedure can be repeated again and again until you are satisfied with the results or tired of the effort. But it should be clear that a global best result is not only difficult to reach but also not to be expected, because the very next trial could improve the result.
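The restart procedure can be sketched with a one-dimensional stand-in for the landscape. Everything here is an assumption made for illustration: the altitude function is an invented bumpy curve, a single number x plays the role of the whole set of weights, and simple gradient descent stands in for whatever training rule is actually used.

```python
import math
import random

def altitude(x):
    """Hypothetical bumpy landscape with several valleys of different depth."""
    return 0.1 * x * x + math.sin(3.0 * x)

def slope(x, h=1e-6):
    """Numerical estimate of the local slope (no map, only local readings)."""
    return (altitude(x + h) - altitude(x - h)) / (2.0 * h)

def descend(x, step=0.01, iterations=2000):
    """Walk downhill in small steps from the starting point."""
    for _ in range(iterations):
        x -= step * slope(x)
    return x

def random_restarts(trials, rng):
    """Repeat the descent from random starting points; record each result."""
    results = []
    for _ in range(trials):
        start = rng.uniform(-5.0, 5.0)
        end = descend(start)
        results.append((altitude(end), start, end))
    return results

rng = random.Random(0)          # fixed seed so the run is repeatable
results = random_restarts(10, rng)
best = min(results)             # lowest altitude found so far

for alt, start, end in results:
    print(f"start {start:+.3f} -> valley at {end:+.3f}, altitude {alt:+.4f}")
print("best so far:", best)
```

Each trial ends in whichever valley lies downhill from its starting point, so different starts can settle at different altitudes. The value in best is only the lowest valley among the trials run, not a guaranteed global minimum.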
----------------------------------------------
It's easy enough to do a computation once the computing procedure has been established. But that's only a small part of the operation. More difficult is the formulation of the problem. Here you have to select the kind of information you need to provide as input -- i.e., your independent variables -- and you have to design the computing procedure itself to handle the information.
With neural networks, you not only have to decide what to incorporate as your input vector, but you also have to construct the computational procedure. This is like developing equations or writing a computer program to handle the processing. But there is the added difficulty of data preparation. Even given the structural design of a neural network, the final transformation still depends on the choice of inputs and the nature of the training to which the network is subjected.
-------------------------------------------
Neural networks have been likened to statistics, mainly due to the similarity between sets of possible weights over the nets and the probability distribution over a set of samples. Each combination of weights of a network represents one possible configuration of weights in a class of possible weights for that network, and each combination can also be thought to be one observation of a sample of such values. You get one sample for the learning phase of the network, and another sample for the test phase.
Similarly, neural networks can be likened to fuzzy systems in that the weights of a network can represent degrees of relevance, truth, or membership in one class or another. Keep in mind that truth values in fuzzy logic can take any value from 0 to 1 and are not confined to 0 and 1 alone.
------------------------------------------