Clever Algorithms: Nature-Inspired Programming Recipes

By Jason Brownlee PhD

Perceptron

Taxonomy

The Perceptron algorithm belongs to the field of Artificial Neural Networks and, more broadly, Computational Intelligence. It is a single-layer feedforward neural network (a single-cell network) that inspired many extensions and variants, including ADALINE and the Widrow-Hoff learning rule.

Inspiration

The Perceptron is inspired by the information processing of a single neural cell (called a neuron). A neuron accepts input signals via its dendrites, which pass the electrical signal down to the cell body. The axon carries the signal out to synapses, which are the connections of a cell's axon to other cells' dendrites. In a synapse, the electrical activity is converted into molecular activity (neurotransmitter molecules crossing the synaptic cleft and binding with receptors). The molecular binding develops an electrical signal which is passed on to the connected cell's dendrites.

Strategy

The information processing objective of the technique is to model a given function by modifying internal weightings of input signals to produce an expected output signal. The system is trained using a supervised learning method, where the error between the system's output and a known expected output is presented to the system and used to modify its internal state. State is maintained in a set of weightings on the input signals. The weights are used to represent an abstraction of the mapping of input vectors to the output signal for the examples that the system was exposed to during training.

Procedure

The Perceptron is composed of a data structure (weights) and separate procedures for training and applying the structure. The structure is really just a vector of weights (one for each expected input) and a bias term.

The algorithm (below) provides pseudocode for training the Perceptron. A weight is initialized for each input, plus an additional weight for a fixed bias constant input that is almost always set to 1.0. The activation of the network for a given input pattern is calculated as follows:

$activation \leftarrow \sum_{k=1}^{n}\big( w_{k} \times x_{ki}\big) + w_{bias} \times 1.0$

where $n$ is the number of weights and inputs, $x_{ki}$ is the $k^{th}$ attribute on the $i^{th}$ input pattern, and $w_{bias}$ is the bias weight. The weights are updated as follows:

$w_{i}(t+1) = w_{i}(t) + \alpha \times (e(t)-a(t)) \times x_{i}(t)$

where $w_i$ is the $i^{th}$ weight at time $t$ and $t+1$, $\alpha$ is the learning rate, $e(t)$ and $a(t)$ are the expected and actual output at time $t$, and $x_i$ is the $i^{th}$ input. This update process is applied to each weight in turn (as well as the bias weight with its constant input).
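As a concrete illustration (the values here are chosen only for this example and are not taken from the code listing below), consider a two-input network with weights $w = (0.5, -0.4)$, bias weight $w_{bias} = 0.1$, learning rate $\alpha = 0.1$, and input pattern $x = (1, 1)$ with expected output $e(t) = 0$. The activation is

$activation = 0.5 \times 1 + (-0.4) \times 1 + 0.1 \times 1.0 = 0.2$

and the step transfer of this activation yields an actual output of $a(t) = 1$, so the first weight is updated to

$w_{1}(t+1) = 0.5 + 0.1 \times (0 - 1) \times 1 = 0.4$

reducing the weight and pushing future activations for this pattern toward the expected output of 0.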

Input: ProblemSize, InputPatterns, $iterations_{max}$, $learn_{rate}$
Output: Weights
Weights $\leftarrow$ InitializeWeights(ProblemSize)
For ($i=1$ To $iterations_{max}$)
    $Pattern_i$ $\leftarrow$ SelectInputPattern(InputPatterns)
    $Activation_i$ $\leftarrow$ ActivateNetwork($Pattern_i$, Weights)
    $Output_i$ $\leftarrow$ TransferActivation($Activation_i$)
    UpdateWeights($Pattern_i$, $Output_i$, $learn_{rate}$)
End
Return (Weights)
Pseudocode for the Perceptron.
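For readers who prefer code to pseudocode, the following is a minimal sketch (not part of the book's code listing) of a single pass through the loop body. It assumes the activate, transfer, and update_weights functions defined in the Code Listing section below; the helper name train_step is illustrative.

def train_step(weights, pattern, num_inputs, learn_rate)
  # prepare the selected input pattern's attributes
  input = Array.new(num_inputs) {|k| pattern[k].to_f}
  # ActivateNetwork
  activation = activate(weights, input)
  # TransferActivation
  output = transfer(activation)
  # UpdateWeights, using the pattern's last value as the expected output
  update_weights(num_inputs, weights, input, pattern.last.to_f, output, learn_rate)
  return output
end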

Heuristics

  • The Perceptron can be used to approximate arbitrary linear functions and can be used for regression or classification problems.
  • The Perceptron cannot learn a non-linear mapping between the input and output attributes. The XOR problem is a classical example of a problem that the Perceptron cannot learn.
  • Input and output values should be normalized such that $x \in [0,1)$ (a minimal normalization sketch is shown after this list).
  • The learning rate ($\alpha \in [0,1]$) controls the amount of change each error has on the system; lower learning rates are common, such as 0.1.
  • The weights can be updated in an online manner (after the exposure to each input pattern) or in batch (after a fixed number of patterns have been observed).
  • Batch updates are expected to be more stable than online updates for some complex problems.
  • A bias weight is used with a constant input signal to provide stability to the learning process.
  • A step transfer function is commonly used to transfer the activation to a binary output value $1 \leftarrow activation \geq 0$, otherwise $0$.
  • It is good practice to expose the system to input patterns in a different random order each enumeration through the input set.
  • The initial weights are typically small random values, e.g. $\in [0, 0.5]$.
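The normalization heuristic above can be implemented with simple min-max scaling. The following is a minimal sketch (not part of the book's code listing); the helper names column_minmax and normalize_row and the [[min, max], ...] column layout are illustrative assumptions.

def column_minmax(data)
  # compute a [min, max] pair for each column of a 2D array of numbers
  data.first.size.times.map do |j|
    column = data.map {|row| row[j]}
    [column.min, column.max]
  end
end

def normalize_row(row, minmax)
  # scale each value into [0,1]; guard against a zero-range column
  row.each_with_index.map do |value, j|
    min, max = minmax[j]
    range = (max - min).to_f
    range.zero? ? 0.0 : (value - min) / range
  end
end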

Code Listing

Listing (below) provides an example of the Perceptron algorithm implemented in the Ruby Programming Language. The problem is the classical OR boolean problem, where the inputs of the boolean truth table are provided as the two inputs and the result of the boolean OR operation is expected as output.

The algorithm was implemented using an online learning method, meaning the weights are updated after each input pattern is observed. A step transfer function is used to convert the activation into a binary output $\in\{0,1\}$. The network is trained by repeatedly passing over the four patterns of the OR truth table, and the same patterns are then used to demonstrate what the network has learned. A bias weight is used for stability with a constant input of 1.0.

def random_vector(minmax)
  # generate a vector of random values, one per [min,max] pair in minmax
  return Array.new(minmax.size) do |i|
    minmax[i][0] + ((minmax[i][1] - minmax[i][0]) * rand())
  end
end

def initialize_weights(problem_size)
  # one weight per input plus the bias weight, initialized in [-1,1]
  minmax = Array.new(problem_size + 1) {[-1.0, 1.0]}
  return random_vector(minmax)
end

def update_weights(num_inputs, weights, input, out_exp, out_act, l_rate)
  # apply the perceptron update rule to each input weight and the bias weight
  num_inputs.times do |i|
    weights[i] += l_rate * (out_exp - out_act) * input[i]
  end
  weights[num_inputs] += l_rate * (out_exp - out_act) * 1.0
end

def activate(weights, vector)
  # weighted sum of the inputs plus the bias weight (constant input of 1.0)
  sum = weights[weights.size-1] * 1.0
  vector.each_with_index do |input, i|
    sum += weights[i] * input
  end
  return sum
end

def transfer(activation)
  # step transfer function: binary output in {0,1}
  return (activation >= 0) ? 1.0 : 0.0
end

def get_output(weights, vector)
  activation = activate(weights, vector)
  return transfer(activation)
end

def train_weights(weights, domain, num_inputs, iterations, lrate)
  # online learning: weights are updated after each observed pattern
  iterations.times do |epoch|
    error = 0.0
    domain.each do |pattern|
      input = Array.new(num_inputs) {|k| pattern[k].to_f}
      output = get_output(weights, input)
      expected = pattern.last.to_f
      error += (output - expected).abs
      update_weights(num_inputs, weights, input, expected, output, lrate)
    end
    puts "> epoch=#{epoch}, error=#{error}"
  end
end

def test_weights(weights, domain, num_inputs)
  # score the trained weights against every pattern in the domain
  correct = 0
  domain.each do |pattern|
    input_vector = Array.new(num_inputs) {|k| pattern[k].to_f}
    output = get_output(weights, input_vector)
    correct += 1 if output.round == pattern.last
  end
  puts "Finished test with a score of #{correct}/#{domain.size}"
  return correct
end

def execute(domain, num_inputs, iterations, learning_rate)
  weights = initialize_weights(num_inputs)
  train_weights(weights, domain, num_inputs, iterations, learning_rate)
  test_weights(weights, domain, num_inputs)
  return weights
end

if __FILE__ == $0
  # problem configuration: the boolean OR truth table (two inputs, expected output)
  or_problem = [[0,0,0], [0,1,1], [1,0,1], [1,1,1]]
  inputs = 2
  # algorithm configuration
  iterations = 20
  learning_rate = 0.1
  # execute the algorithm
  execute(or_problem, inputs, iterations, learning_rate)
end
Perceptron in Ruby
Download: perceptron.rb.
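The script can be run directly with the Ruby interpreter (for example, ruby perceptron.rb). It prints the accumulated error for each training epoch, followed by the number of the four truth-table patterns that the trained network classifies correctly.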

References

Primary Sources

The Perceptron algorithm was proposed by Rosenblatt in 1958 [Rosenblatt1958]. Rosenblatt proposed a range of neural network structures and methods. The 'Perceptron' as it is known is in fact a simplification of Rosenblatt's models by Minsky and Papert for the purposes of analysis [Minsky1969]. An early proof of convergence was provided by Novikoff [Novikoff1962].

Learn More

Minsky and Papert wrote the classical text titled "Perceptrons" in 1969 that is known to have discredited the approach, suggesting it was limited to linear discrimination, which reduced research in the area for decades afterward [Minsky1969].

Bibliography

[Minsky1969] M. L. Minsky and S. A. Papert, "Perceptrons", MIT Press, 1969.
[Novikoff1962] A. B. Novikoff, "On convergence proofs on perceptrons", Symposium on the Mathematical Theory of Automata, 1962.
[Rosenblatt1958] F. Rosenblatt, "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain", Cornell Aeronautical Laboratory, Psychological Review, 1958.