SelfOrganizing MapSelfOrganizing Map, SOM, SelfOrganizing Feature Map, SOFM, Kohonen Map, Kohonen Network. TaxonomyThe SelfOrganizing Map algorithm belongs to the field of Artificial Neural Networks and Neural Computation. More broadly it belongs to the field of Computational Intelligence. The SelfOrganizing Map is an unsupervised neural network that uses a competitive (winnertakeall) learning strategy. It is related to other unsupervised neural networks such as the Adaptive Resonance Theory (ART) method. It is related to other competitive learning neural networks such as the the Neural Gas Algorithm, and the Learning Vector Quantization algorithm, which is a similar algorithm for classification without connections between the neurons. Additionally, SOM is a baseline technique that has inspired many variations and extensions, not limited to the AdaptiveSubspace SelfOrganizing Map (ASSOM). InspirationThe SelfOrganizing Map is inspired by postulated feature maps of neurons in the brain comprised of featuresensitive cells that provide ordered projections between neuronal layers, such as those that may exist in the retina and cochlea. For example, there are acoustic feature maps that respond to sounds to which an animal is most frequently exposed, and tonotopic maps that may be responsible for the order preservation of acoustic resonances. StrategyThe information processing objective of the algorithm is to optimally place a topology (grid or lattice) of codebook or prototype vectors in the domain of the observed input data samples. An initially random pool of vectors is prepared which are then exposed to training samples. A winnertakeall strategy is employed where the most similar vector to a given input pattern is selected, then the selected vector and neighbors of the selected vector are updated to closer resemble the input pattern. The repetition of this process results in the distribution of codebook vectors in the input space which approximate the underlying distribution of samples from the test dataset. The result is the mapping of the topology of codebook vectors to the underlying structure in the input samples which may be summarized or visualized to reveal topologically preserved features from the input space in a lowdimensional projection. ProcedureThe SelfOrganizing map is comprised of a collection of codebook vectors connected together in a topological arrangement, typically a one dimensional line or a two dimensional grid. The codebook vectors themselves represent prototypes (points) within the domain, whereas the topological structure imposes an ordering between the vectors during the training process. The result is a low dimensional projection or approximation of the problem domain which may be visualized, or from which clusters may be extracted. Algorithm (below) provides a highlevel pseudocode for preparing codebook vectors using the SelfOrganizing Map method. Codebook vectors are initialized to small floating point values, or sampled from the domain. The Best Matching Unit (BMU) is the codebook vector from the pool that has the minimum distance to an input vector. A distance measure between input patterns must be defined. For realvalued vectors, this is commonly the Euclidean distance: $dist(x,c) = \sum_{i=1}^{n} (x_i  c_i)^2$where $n$ is the number of attributes, $x$ is the input vector and $c$ is a given codebook vector. The neighbors of the BMU in the topological structure of the network are selected using a neighborhood size that is linearly decreased during the training of the network. The BMU and all selected neighbors are then adjusted toward the input vector using a learning rate that too is decreased linearly with the training cycles: $c_i(t+1) = learn_{rate}(t) \times (c_i(t)  x_i)$where $c_i(t)$ is the $i^{th}$ attribute of a codebook vector at time $t$, $learn_{rate}$ is the current learning rate, an $x_i$ is the $i^{th}$ attribute of a input vector. The neighborhood is typically square (called bubble) where all neighborhood nodes are updated using the same learning rate for the iteration, or Gaussian where the learning rate is proportional to the neighborhood distance using a Gaussian distribution (neighbors further away from the BMU are updated less). Input :
InputPatterns , $iterations_{max}$, $learn_{rate}^{init}$, $neighborhood_{size}^{init}$, $Grid_{width}$, $Grid_{height}$
Output :
CodebookVectors
CodebookVectors $\leftarrow$ InitializeCodebookVectors ($Grid_{width}$, $Grid_{height}$, InputPatterns )For ($i=1$ To $iterations_{max}$)$learn_{rate}^{i}$ $\leftarrow$ CalculateLearningRate ($i$, $learn_{rate}^{init}$)$neighborhood_{size}^{i}$ $\leftarrow$ CalculateNeighborhoodSize ($i$, $neighborhood_{size}^{init}$)$Pattern_i$ $\leftarrow$ SelectInputPattern (InputPatterns )$Bmu_i$ $\leftarrow$ SelectBestMatchingUnit ($Pattern_i$, CodebookVectors )Neighborhood $\leftarrow$ $Bmu_i$Neighborhood $\leftarrow$ SelectNeighbors ($Bmu_i$, CodebookVectors , $neighborhood_{size}^{i}$)For ($Vector_i$ $\in $Neighborhood )For ($Vector_{i}^{attribute}$ $\in$ $Vector_i$)$Vector_{i}^{attribute}$ $\leftarrow$ $Vector_{i}^{attribute}$ + $learn_{rate}^{i}$ $\times$ ($Pattern_{i}^{attribute}$ – $Vector_{i}^{attribute}$) End End End Return (CodebookVectors )Heuristics
Code ListingListing (below) provides an example of the SelfOrganizing Map algorithm implemented in the Ruby Programming Language. The problem is a feature detection problem, where the network is expected to learn a predefined shape based on being exposed to samples in the domain. The domain is twodimensional $x,y \in [0,1]$, where a shape is predefined as a square in the middle of the domain $x,y \in [0.3,0.6]$. The system is initialized to vectors within the domain although is only exposed to samples within the predefined shape during training. The expectation is that the system will model the shape based on the observed samples. The algorithm is an implementation of the basic SelfOrganizing Map algorithm based on the description in Chapter 3 of the seminal book on the technique [Kohonen1995]. The implementation is configured with a $4 \times 5$ grid of nodes, the Euclidean distance measure is used to determine the BMU and neighbors, a Bubble neighborhood function is used. Error rates are presented to the console, and the codebook vectors themselves are described before and after training. The learning process is incremental rather than batch, for simplicity. An extension to this implementation would be to visualize the resulting network structure in the domain  shrinking from a mesh that covers the whole domain, down to a mesh that only covers the predefined shape within the domain. def random_vector(minmax) return Array.new(minmax.size) do i minmax[i][0] + ((minmax[i][1]  minmax[i][0]) * rand()) end end def initialize_vectors(domain, width, height) codebook_vectors = [] width.times do x height.times do y codebook = {} codebook[:vector] = random_vector(domain) codebook[:coord] = [x,y] codebook_vectors << codebook end end return codebook_vectors end def euclidean_distance(c1, c2) sum = 0.0 c1.each_index {i sum += (c1[i]c2[i])**2.0} return Math.sqrt(sum) end def get_best_matching_unit(codebook_vectors, pattern) best, b_dist = nil, nil codebook_vectors.each do codebook dist = euclidean_distance(codebook[:vector], pattern) best,b_dist = codebook,dist if b_dist.nil? or dist<b_dist end return [best, b_dist] end def get_vectors_in_neighborhood(bmu, codebook_vectors, neigh_size) neighborhood = [] codebook_vectors.each do other if euclidean_distance(bmu[:coord], other[:coord]) <= neigh_size neighborhood << other end end return neighborhood end def update_codebook_vector(codebook, pattern, lrate) codebook[:vector].each_with_index do v,i error = pattern[i]codebook[:vector][i] codebook[:vector][i] += lrate * error end end def train_network(vectors, shape, iterations, l_rate, neighborhood_size) iterations.times do iter pattern = random_vector(shape) lrate = l_rate * (1.0(iter.to_f/iterations.to_f)) neigh_size = neighborhood_size * (1.0(iter.to_f/iterations.to_f)) bmu,dist = get_best_matching_unit(vectors, pattern) neighbors = get_vectors_in_neighborhood(bmu, vectors, neigh_size) neighbors.each do node update_codebook_vector(node, pattern, lrate) end puts ">training: neighbors=#{neighbors.size}, bmu_dist=#{dist}" end end def summarize_vectors(vectors) minmax = Array.new(vectors.first[:vector].size){[1,0]} vectors.each do c c[:vector].each_with_index do v,i minmax[i][0] = v if v<minmax[i][0] minmax[i][1] = v if v>minmax[i][1] end end s = "" minmax.each_with_index {bounds,i s << "#{i}=#{bounds.inspect} "} puts "Vector details: #{s}" return minmax end def test_network(codebook_vectors, shape, num_trials=100) error = 0.0 num_trials.times do pattern = random_vector(shape) bmu,dist = get_best_matching_unit(codebook_vectors, pattern) error += dist end error /= num_trials.to_f puts "Finished, average error=#{error}" return error end def execute(domain, shape, iterations, l_rate, neigh_size, width, height) vectors = initialize_vectors(domain, width, height) summarize_vectors(vectors) train_network(vectors, shape, iterations, l_rate, neigh_size) test_network(vectors, shape) summarize_vectors(vectors) return vectors end if __FILE__ == $0 # problem configuration domain = [[0.0,1.0],[0.0,1.0]] shape = [[0.3,0.6],[0.3,0.6]] # algorithm configuration iterations = 100 l_rate = 0.3 neigh_size = 5 width, height = 4, 5 # execute the algorithm execute(domain, shape, iterations, l_rate, neigh_size, width, height) end Download: som.rb.
ReferencesPrimary SourcesThe SelfOrganizing Map was proposed by Kohonen in 1982 in a study that included the mathematical basis for the approach, summary of related physiology, and simulation on demonstration problem domains using one and two dimensional topological structures [Kohonen1982]. This work was tightly related two other papers published at close to the same time on topological maps and selforganization [Kohonen1981] [Kohonen1982b]. Learn MoreKohonen provides a detailed introduction and summary of the SelfOrganizing Map in a journal article [Kohonen1990a]. Kohonen et al. provide a practical presentation of the algorithm and heuristics for configuration in the technical report written to accompany the released SOMPAK implementation of the algorithm for academic research [Kohonen1996a]. The seminal book on the technique is "SelfOrganizing Maps" by Kohonen, which includes chapters dedicated to the description of the basic approach, physiological interpretations of the algorithm, variations, and summaries of application areas [Kohonen1995]. Bibliography

Free CourseGet one algorithm per week...
Own A CopyThis 438page ebook has...


Please Note: This content was automatically generated from the book content and may contain minor differences. 

Do you like Clever Algorithms? Buy the book now. 