Clever Algorithms: Nature-Inspired Programming Recipes

By Jason Brownlee PhD

Perceptron

Taxonomy

The Perceptron algorithm belongs to the field of Artificial Neural Networks and, more broadly, Computational Intelligence. It is a single-layer feedforward neural network (a single-cell network) that inspired many extensions and variants, including ADALINE and the Widrow-Hoff learning rule.

Inspiration

The Perceptron is inspired by the information processing of a single neural cell (called a neuron). A neuron accepts input signals via its dendrites, which pass the electrical signal down to the cell body. The axon carries the signal out to synapses, which are the connections of a cell's axon to other cells' dendrites. In a synapse, the electrical activity is converted into molecular activity (neurotransmitter molecules crossing the synaptic cleft and binding with receptors). The molecular binding develops an electrical signal which is passed on to the connected cell's dendrites.

Strategy

The information processing objective of the technique is to model a given function by modifying internal weightings of input signals to produce an expected output signal. The system is trained using a supervised learning method, where the error between the system's output and a known expected output is presented to the system and used to modify its internal state. State is maintained in a set of weightings on the input signals. The weights are used to represent an abstraction of the mapping of input vectors to the output signal for the examples that the system was exposed to during training.

Procedure

The Perceptron is composed of a data structure (weights) and separate procedures for training and applying the structure. The structure is really just a vector of weights (one for each expected input) and a bias term.

The algorithm (below) provides pseudocode for training the Perceptron. A weight is initialized for each input, plus an additional weight for a fixed bias constant input that is almost always set to 1.0. The activation of the network for a given input pattern is calculated as follows:

$activation \leftarrow \sum_{k=1}^{n}\big( w_{k} \times x_{ki}\big) + w_{bias} \times 1.0$

where $n$ is the number of weights and inputs, $x_{ki}$ is the $k^{th}$ attribute on the $i^{th}$ input pattern, and $w_{bias}$ is the bias weight. The weights are updated as follows:

$w_{i}(t+1) = w_{i}(t) + \alpha \times (e(t)-a(t)) \times x_{i}(t)$

where $w_i$ is the $i^{th}$ weight at time $t$ and $t+1$, $\alpha$ is the learning rate, $e(t)$ and $a(t)$ are the expected and actual output at time $t$, and $x_i$ is the $i^{th}$ input. This update process is applied to each weight in turn (as well as the bias weight with its constant input).
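As a concrete illustration (the values here are chosen only for this example and are not taken from the code listing below), consider a two-input network with weights $w = (0.5, -0.4)$, bias weight $w_{bias} = 0.1$, learning rate $\alpha = 0.1$, and input pattern $x = (1, 1)$ with expected output $e(t) = 0$. The activation is

$activation = 0.5 \times 1 + (-0.4) \times 1 + 0.1 \times 1.0 = 0.2$

and the step transfer of this activation yields an actual output of $a(t) = 1$, so the first weight is updated to

$w_{1}(t+1) = 0.5 + 0.1 \times (0 - 1) \times 1 = 0.4$

reducing the weight and pushing future activations for this pattern toward the expected output of 0.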

Input: ProblemSize, InputPatterns, $iterations_{max}$, $learn_{rate}$
Output: Weights
Weights $\leftarrow$ InitializeWeights(ProblemSize)
For ($i=1$ To $iterations_{max}$)
    $Pattern_i$ $\leftarrow$ SelectInputPattern(InputPatterns)
    $Activation_i$ $\leftarrow$ ActivateNetwork($Pattern_i$, Weights)
    $Output_i$ $\leftarrow$ TransferActivation($Activation_i$)
    UpdateWeights($Pattern_i$, $Output_i$, $learn_{rate}$)
End
Return (Weights)
Pseudocode for the Perceptron.
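For readers who prefer code to pseudocode, the following is a minimal sketch (not part of the book's code listing) of a single pass through the loop body. It assumes the activate, transfer, and update_weights functions defined in the Code Listing section below; the helper name train_step is illustrative.

def train_step(weights, pattern, num_inputs, learn_rate)
  # prepare the selected input pattern's attributes
  input = Array.new(num_inputs) {|k| pattern[k].to_f}
  # ActivateNetwork
  activation = activate(weights, input)
  # TransferActivation
  output = transfer(activation)
  # UpdateWeights, using the pattern's last value as the expected output
  update_weights(num_inputs, weights, input, pattern.last.to_f, output, learn_rate)
  return output
end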

Heuristics

  • The Perceptron can be used to approximate arbitrary linear functions and can be used for regression or classification problems.
  • The Perceptron cannot learn a non-linear mapping between the input and output attributes. The XOR problem is a classical example of a problem that the Perceptron cannot learn.
  • Input and output values should be normalized such that $x \in [0,1)$ (a minimal normalization sketch is shown after this list).
  • The learning rate ($\alpha \in [0,1]$) controls the amount of change each error has on the system; lower learning rates are common, such as 0.1.
  • The weights can be updated in an online manner (after the exposure to each input pattern) or in batch (after a fixed number of patterns have been observed).
  • Batch updates are expected to be more stable than online updates for some complex problems.
  • A bias weight is used with a constant input signal to provide stability to the learning process.
  • A step transfer function is commonly used to transfer the activation to a binary output value $1 \leftarrow activation \geq 0$, otherwise $0$.
  • It is good practice to expose the system to input patterns in a different random order each enumeration through the input set.
  • The initial weights are typically small random values, e.g. $\in [0, 0.5]$.
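The normalization heuristic above can be implemented with simple min-max scaling. The following is a minimal sketch (not part of the book's code listing); the helper names column_minmax and normalize_row and the [[min, max], ...] column layout are illustrative assumptions.

def column_minmax(data)
  # compute a [min, max] pair for each column of a 2D array of numbers
  data.first.size.times.map do |j|
    column = data.map {|row| row[j]}
    [column.min, column.max]
  end
end

def normalize_row(row, minmax)
  # scale each value into [0,1]; guard against a zero-range column
  row.each_with_index.map do |value, j|
    min, max = minmax[j]
    range = (max - min).to_f
    range.zero? ? 0.0 : (value - min) / range
  end
end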

Code Listing

Listing (below) provides an example of the Perceptron algorithm implemented in the Ruby Programming Language. The problem is the classical OR boolean problem, where the inputs of the boolean truth table are provided as the two inputs and the result of the boolean OR operation is expected as output.

The algorithm was implemented using an online learning method, meaning the weights are updated after each input pattern is observed. A step transfer function is used to convert the activation into a binary output $\in\{0,1\}$. The network is trained by repeatedly passing over the four patterns of the OR truth table, and the same patterns are then used to demonstrate what the network has learned. A bias weight is used for stability with a constant input of 1.0.

def random_vector(minmax)
  # generate a vector of random values, one per [min,max] pair in minmax
  return Array.new(minmax.size) do |i|
    minmax[i][0] + ((minmax[i][1] - minmax[i][0]) * rand())
  end
end

def initialize_weights(problem_size)
  # one weight per input plus the bias weight, initialized in [-1,1]
  minmax = Array.new(problem_size + 1) {[-1.0, 1.0]}
  return random_vector(minmax)
end

def update_weights(num_inputs, weights, input, out_exp, out_act, l_rate)
  # apply the perceptron update rule to each input weight and the bias weight
  num_inputs.times do |i|
    weights[i] += l_rate * (out_exp - out_act) * input[i]
  end
  weights[num_inputs] += l_rate * (out_exp - out_act) * 1.0
end

def activate(weights, vector)
  # weighted sum of the inputs plus the bias weight (constant input of 1.0)
  sum = weights[weights.size-1] * 1.0
  vector.each_with_index do |input, i|
    sum += weights[i] * input
  end
  return sum
end

def transfer(activation)
  # step transfer function: binary output in {0,1}
  return (activation >= 0) ? 1.0 : 0.0
end

def get_output(weights, vector)
  activation = activate(weights, vector)
  return transfer(activation)
end

def train_weights(weights, domain, num_inputs, iterations, lrate)
  # online learning: weights are updated after each observed pattern
  iterations.times do |epoch|
    error = 0.0
    domain.each do |pattern|
      input = Array.new(num_inputs) {|k| pattern[k].to_f}
      output = get_output(weights, input)
      expected = pattern.last.to_f
      error += (output - expected).abs
      update_weights(num_inputs, weights, input, expected, output, lrate)
    end
    puts "> epoch=#{epoch}, error=#{error}"
  end
end

def test_weights(weights, domain, num_inputs)
  # score the trained weights against every pattern in the domain
  correct = 0
  domain.each do |pattern|
    input_vector = Array.new(num_inputs) {|k| pattern[k].to_f}
    output = get_output(weights, input_vector)
    correct += 1 if output.round == pattern.last
  end
  puts "Finished test with a score of #{correct}/#{domain.size}"
  return correct
end

def execute(domain, num_inputs, iterations, learning_rate)
  weights = initialize_weights(num_inputs)
  train_weights(weights, domain, num_inputs, iterations, learning_rate)
  test_weights(weights, domain, num_inputs)
  return weights
end

if __FILE__ == $0
  # problem configuration: the boolean OR truth table (two inputs, expected output)
  or_problem = [[0,0,0], [0,1,1], [1,0,1], [1,1,1]]
  inputs = 2
  # algorithm configuration
  iterations = 20
  learning_rate = 0.1
  # execute the algorithm
  execute(or_problem, inputs, iterations, learning_rate)
end
Perceptron in Ruby
Download: perceptron.rb.
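The script can be run directly with the Ruby interpreter (for example, ruby perceptron.rb). It prints the accumulated error for each training epoch, followed by the number of the four truth-table patterns that the trained network classifies correctly.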

References

Primary Sources

The Perceptron algorithm was proposed by Rosenblatt in 1958 [Rosenblatt1958]. Rosenblatt proposed a range of neural network structures and methods. The 'Perceptron' as it is known is in fact a simplification of Rosenblatt's models by Minsky and Papert for the purposes of analysis [Minsky1969]. An early proof of convergence was provided by Novikoff [Novikoff1962].

Learn More

Minsky and Papert wrote the classical text titled "Perceptrons" in 1969 that is known to have discredited the approach, suggesting it was limited to linear discrimination, which reduced research in the area for decades afterward [Minsky1969].

Bibliography

[Minsky1969] M. L. Minsky and S. A. Papert, "Perceptrons", MIT Press, 1969.
[Novikoff1962] A. B. Novikoff, "On convergence proofs on perceptrons", Symposium on the Mathematical Theory of Automata, 1962.
[Rosenblatt1958] F. Rosenblatt, "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain", Cornell Aeronautical Laboratory, Psychological Review, 1958.