# Clever Algorithms: Nature-Inspired Programming Recipes

By Jason Brownlee PhD


# Negative Selection Algorithm

Negative Selection Algorithm, NSA.

## Taxonomy

The Negative Selection Algorithm belongs to the field of Artificial Immune Systems. The algorithm is related to other Artificial Immune Systems such as the Clonal Selection Algorithm, and the Immune Network Algorithm.

## Inspiration

The Negative Selection algorithm is inspired by the self-nonself discrimination behavior observed in the mammalian acquired immune system. The clonal selection theory of acquired immunity accounts for the adaptive behavior of the immune system, including the ongoing selection and proliferation of cells that select-for potentially harmful (and typically foreign) material in the body. An interesting aspect of this process is that it is responsible for managing a population of immune cells that do not select-for the tissues of the body; specifically, it does not create self-reactive immune cells (a condition known as auto-immunity). This problem is known as 'self-nonself discrimination' and it involves the preparation and ongoing maintenance of a repertoire of immune cells such that none are auto-immune. This is achieved by a negative selection process that selects-for and removes those cells that are self-reactive during cell creation and cell proliferation. This process has been observed in the preparation of T-lymphocytes, naive versions of which are matured using both a positive and negative selection process in the thymus.

## Metaphor

The self-nonself discrimination principle suggests that the anticipatory guesses made in clonal selection are filtered by regions of infeasibility (protein conformations that bind to self-tissues). Further, the self-nonself immunological paradigm proposes the modeling of the unknown domain (encountered pathogen) by modeling the complement of what is known. This is unintuitive as the natural inclination is to categorize unknown information by what is different from that which is known, rather than guessing at the unknown information and filtering those guesses by what is known.

## Strategy

The information processing principles of the self-nonself discrimination process via negative selection are those of anomaly and change detection systems that model the anticipation of variation from what is known. The principle is achieved by building a model of changes, anomalies, or unknown (non-normal or non-self) data by generating patterns that do not match an existing corpus of available (self or normal) patterns. The prepared non-normal model is then used to either monitor the existing normal data or streams of new data by seeking matches to the non-normal patterns.
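This generate-and-filter strategy can be sketched on a toy domain of 4-bit binary strings. The self set below is hypothetical example data, and an exact-match rule is used for brevity (real implementations use approximate matching such as a distance threshold); because the space is tiny, all candidates are enumerated rather than sampled randomly.

```ruby
# Toy self set: the known "normal" patterns (hypothetical data).
self_set = ["0000", "0001", "0011"]

# Build the non-self model: keep only candidates that match no self pattern.
all_patterns = (0...16).map { |n| format("%04b", n) }
detectors = all_patterns.reject { |p| self_set.include?(p) }

# Monitoring: an input that matches a detector is flagged as non-self.
classify = ->(input) { detectors.include?(input) ? "non-self" : "self" }

puts classify.call("0001")  # a known self pattern
puts classify.call("1111")  # a pattern outside the self set
```

The detectors collectively model the complement of the self set, so unseen inputs are classified without the self data ever being consulted at monitoring time.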

## Procedure

The first algorithm below provides a pseudocode listing of the detector generation procedure for the Negative Selection Algorithm; the second provides a pseudocode listing of the detector application procedure.

Input: SelfData
Output: Repertoire
Repertoire $\leftarrow \emptyset$
While ($\neg$StopCondition())
    Detectors $\leftarrow$ GenerateRandomDetectors()
    For ($Detector_{i}$ $\in$ Detectors)
        If ($\neg$Matches($Detector_{i}$, SelfData))
            Repertoire $\leftarrow$ Repertoire $\cup$ $Detector_{i}$
        End
    End
End
Return (Repertoire)
Pseudocode for detector generation.
Input: InputSamples, Repertoire
For ($Input_{i}$ $\in$ InputSamples)
    $Input_{i}^{class}$ $\leftarrow$ "self"
    For ($Detector_{j}$ $\in$ Repertoire)
        If (Matches($Input_{i}$, $Detector_{j}$))
            $Input_{i}^{class}$ $\leftarrow$ "non-self"
            Break
        End
    End
End
Pseudocode for detector application.

## Heuristics

• The Negative Selection Algorithm was designed for change detection, novelty detection, intrusion detection and similar pattern recognition and two-class classification problem domains.
• Traditional negative selection algorithms used binary representations and binary matching rules such as Hamming distance, and $r$-contiguous bits.
• A data representation should be selected that is most suitable for a given problem domain, and a matching rule is in turn selected or tailored to the data representation.
• Detectors can be prepared with no prior knowledge of the problem domain other than the known (normal or self) dataset.
• The algorithm can be configured to balance between detector convergence (quality of the matches) and the space complexity (number of detectors).
• The lack of dependence between detectors means that detector preparation and application are inherently parallel, making the algorithm well suited to distributed and parallel implementations.
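The binary matching rules mentioned above can be sketched as follows. These are generic illustrative implementations (the helper names and thresholds are my own choices, not from the listing in this chapter): Hamming distance counts differing bit positions, and the $r$-contiguous bits rule declares a match when two strings agree on at least $r$ consecutive positions.

```ruby
# Hamming distance: number of positions at which two bit strings differ.
def hamming_distance(a, b)
  a.chars.zip(b.chars).count { |x, y| x != y }
end

# r-contiguous bits rule: strings match if they agree on at least r
# consecutive positions.
def r_contiguous_match?(a, b, r)
  run = 0
  a.chars.zip(b.chars).each do |x, y|
    run = (x == y) ? run + 1 : 0
    return true if run >= r
  end
  false
end

puts hamming_distance("10110", "10011")        # => 2
puts r_contiguous_match?("10110", "10011", 2)  # => true (first two bits agree)
```

With either rule, a detector "matches" self data when the distance is below (or the contiguous run is above) a user-chosen threshold, mirroring the role played by the Euclidean `min_dist` parameter in the real-valued listing below.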

## Code Listing

Listing (below) provides an example of the Negative Selection Algorithm implemented in the Ruby Programming Language. The demonstration problem is a two-class classification problem where samples are drawn from a two-dimensional domain with $x_i \in [0,1]$. Samples where each $x_i \in [0.5, 1.0]$ are classified as self, and the rest of the space belongs to the non-self class. Samples are drawn from the self class and presented to the algorithm for the preparation of pattern detectors, which are used to classify unobserved samples from the non-self class. The algorithm creates a set of detectors that do not match the self data, and these are then applied to a set of randomly generated samples from the domain. The algorithm uses a real-valued representation. The Euclidean distance function is used during matching, and a minimum distance value is specified as a user parameter for approximate matches between patterns. The implementation includes an additional, computationally expensive check for duplicates in the preparation of the self dataset and the detector set.

def random_vector(minmax)
  return Array.new(minmax.length) do |i|
    minmax[i][0] + ((minmax[i][1] - minmax[i][0]) * rand())
  end
end

def euclidean_distance(c1, c2)
  sum = 0.0
  c1.each_index {|i| sum += (c1[i]-c2[i])**2.0}
  return Math.sqrt(sum)
end

def contains?(vector, space)
  vector.each_with_index do |v,i|
    return false if v<space[i][0] or v>space[i][1]
  end
  return true
end

def matches?(vector, dataset, min_dist)
  dataset.each do |pattern|
    dist = euclidean_distance(vector, pattern[:vector])
    return true if dist <= min_dist
  end
  return false
end

def generate_detectors(max_detectors, search_space, self_dataset, min_dist)
  detectors = []
  begin
    detector = {:vector=>random_vector(search_space)}
    if !matches?(detector[:vector], self_dataset, min_dist)
      # reject exact duplicates of existing detectors
      detectors << detector if !matches?(detector[:vector], detectors, 0.0)
    end
  end while detectors.size < max_detectors
  return detectors
end

def generate_self_dataset(num_records, self_space, search_space)
  self_dataset = []
  begin
    pattern = {}
    pattern[:vector] = random_vector(search_space)
    # reject exact duplicates of existing self patterns
    next if matches?(pattern[:vector], self_dataset, 0.0)
    if contains?(pattern[:vector], self_space)
      self_dataset << pattern
    end
  end while self_dataset.length < num_records
  return self_dataset
end

def apply_detectors(detectors, bounds, self_dataset, min_dist, trials=50)
  correct = 0
  trials.times do |i|
    input = {:vector=>random_vector(bounds)}
    # "N"=non-self, "S"=self
    actual = matches?(input[:vector], detectors, min_dist) ? "N" : "S"
    expected = matches?(input[:vector], self_dataset, min_dist) ? "S" : "N"
    correct += 1 if actual==expected
    puts "#{i+1}/#{trials}: predicted=#{actual}, expected=#{expected}"
  end
  puts "Done. Result: #{correct}/#{trials}"
  return correct
end

def execute(bounds, self_space, max_detect, max_self, min_dist)
  self_dataset = generate_self_dataset(max_self, self_space, bounds)
  puts "Done: prepared #{self_dataset.size} self patterns."
  detectors = generate_detectors(max_detect, bounds, self_dataset, min_dist)
  puts "Done: prepared #{detectors.size} detectors."
  apply_detectors(detectors, bounds, self_dataset, min_dist)
  return detectors
end

if __FILE__ == $0
  # problem configuration
  problem_size = 2
  search_space = Array.new(problem_size) {[0.0, 1.0]}
  self_space = Array.new(problem_size) {[0.5, 1.0]}
  max_self = 150
  # algorithm configuration
  max_detectors = 300
  min_dist = 0.05
  # execute the algorithm
  execute(search_space, self_space, max_detectors, max_self, min_dist)
end

Negative Selection Algorithm in Ruby
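One rough way to explore the balance noted in the Heuristics between the number of detectors and the quality of the matches is to estimate, by Monte Carlo sampling, what fraction of the domain a detector set covers. The self-contained sketch below uses arbitrary parameter values and a hypothetical helper (`covered_fraction`) that is not part of the listing above; the detector centres are simply random points standing in for a prepared repertoire.

```ruby
def euclidean_distance(c1, c2)
  Math.sqrt(c1.each_index.sum { |i| (c1[i] - c2[i])**2 })
end

# Estimate the fraction of the unit square within min_dist of any detector
# by sampling random points and counting how many fall inside a detector.
def covered_fraction(detectors, min_dist, samples = 2000)
  hits = samples.times.count do
    point = [rand, rand]
    detectors.any? { |d| euclidean_distance(point, d) <= min_dist }
  end
  hits.to_f / samples
end

detectors = Array.new(300) { [rand, rand] }  # stand-in detector centres
[0.05, 0.15].each do |min_dist|
  pct = (covered_fraction(detectors, min_dist) * 100).round
  puts "min_dist=#{min_dist}: about #{pct}% of the domain covered"
end
```

A larger `min_dist` (or more detectors) covers more of the domain but makes each detector less precise near the self region, which is the convergence-versus-space trade-off described above.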

## References

### Primary Sources

The seminal negative selection algorithm was proposed by Forrest, et al. [Forrest1994], in which a population of detectors is prepared in the presence of known information, and those randomly generated detectors that match against known data are discarded. The population of pattern guesses in the unknown space then monitors the corpus of known information for changes. The algorithm was applied to the monitoring of files for changes (corruptions and infections by computer viruses), and later formalized as a change detection algorithm [Dhaeseleer1996a] [Dhaeseleer1996].

The Negative Selection algorithm has been applied to the monitoring of changes in the execution behavior of Unix processes [Forrest1996] [Hofmeyr1998], and to the monitoring of changes in the remote connections of a networked computer (intrusion detection) [Hofmeyr1999] [Hofmeyr1999a]. The algorithm has been applied predominantly to host and network intrusion detection and the abstracted problems of two-class classification and anomaly detection. Esponda provides some interesting work showing compression and privacy benefits of maintaining a negative model (non-self) [Darlington2005]. Ji and Dasgupta provide a contemporary and detailed review of Negative Selection Algorithms covering topics such as data representations, matching rules, detector generation procedures, computational complexity, hybridization, and theoretical frameworks [Ji2007]. Recently, the validity of applying negative selection algorithms in high-dimensional spaces has been questioned, specifically the scalability of the approach in the face of the exponential increase in the volume of the problem space [Stibor2006].

## Bibliography

[Darlington2005] C. F. Esponda Darlington, "Negative Representations of Information", [PhD Thesis] The University of New Mexico, 2005.

[Dhaeseleer1996] P. D'haeseleer, "An immunological approach to change detection: theoretical results", in Proceedings of the 9th IEEE Computer Security Foundations Workshop, 1996.

[Dhaeseleer1996a] P. D'haeseleer and S. Forrest and P. Helman, "An immunological approach to change detection: algorithms, analysis and implications", in Proceedings of the IEEE Symposium on Security and Privacy, 1996.

[Forrest1994] S. Forrest and A. S. Perelson and L. Allen and R. Cherukuri, "Self-Nonself Discrimination in a Computer", in Proceedings of the 1992 IEEE Symposium on Security and Privacy, 1994.

[Forrest1996] S. Forrest and S. A. Hofmeyr and A. Somayaji and T. A. Longstaff, "A Sense of Self for Unix Processes", in Proceedings of the 1996 IEEE Symposium on Security and Privacy, 1996.

[Hofmeyr1998] S. A. Hofmeyr and S. Forrest and A. Somayaji, "Intrusion Detection using Sequences of System Calls", Journal of Computer Security, 1998.

[Hofmeyr1999] S. Hofmeyr and S. Forrest, "Immunity by Design: An Artificial Immune System", in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 1999.

[Hofmeyr1999a] S. A. Hofmeyr, "An Immunological Model of Distributed Detection and its Application to Computer Security", [PhD Thesis] Department of Computer Sciences, University of New Mexico, 1999.

[Ji2007] Z. Ji and D. Dasgupta, "Revisiting Negative Selection Algorithms", Evolutionary Computation, 2007.

[Stibor2006] T. Stibor, "On the Appropriateness of Negative Selection for Anomaly Detection and Network Intrusion Detection", [PhD Thesis] Darmstadt University of Technology, 2006.
