Clever Algorithms: Nature-Inspired Programming Recipes

By Jason Brownlee PhD


Cross-Entropy Method

Also known as: Cross Entropy Method, CEM.


The Cross-Entropy Method is a probabilistic optimization technique belonging to the field of Stochastic Optimization. It is similar to other Stochastic Optimization algorithms such as Simulated Annealing, and to Estimation of Distribution Algorithms such as the Probabilistic Incremental Learning Algorithm.


The Cross-Entropy Method does not have an inspiration. It was developed as an efficient estimation technique for rare-event probabilities in discrete event simulation systems and was adapted for use in optimization. The name of the technique comes from the Kullback-Leibler cross-entropy, a measure of the amount of information (in bits) needed to identify an event from a set of probabilities.


The information processing strategy of the algorithm is to sample the problem space and approximate the distribution of good solutions. This is achieved by assuming a distribution of the problem space (such as Gaussian), sampling the problem domain by generating candidate solutions using the distribution, and updating the distribution based on the better candidate solutions discovered. Samples are constructed step-wise (one component at a time) based on the summarized distribution of good solutions. As the algorithm progresses, the distribution becomes more refined until it focuses on the area or scope of optimal solutions in the domain.
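As a minimal sketch of this sample-and-update cycle, the following one-dimensional Ruby fragment draws candidates from a Gaussian, keeps an elite subset, and refits the distribution to the elites. It is an illustration only (the names and the crude uniform-sum Gaussian approximation are not from the book's listing):

```ruby
# Crude Gaussian draw: a sum of 12 uniforms has mean 6 and variance 1
# (Irwin-Hall approximation), shifted and scaled to the requested moments.
def gaussian(mean, stdev)
  mean + stdev * ((1..12).inject(0.0) { |s, _| s + rand } - 6.0)
end

mean, stdev = 2.0, 3.0
20.times do
  # sample candidates from the current distribution
  samples = Array.new(50) { gaussian(mean, stdev) }
  # keep the elite fraction under the cost f(x) = x^2 (minimization)
  elites = samples.sort_by { |x| x * x }.first(5)
  # refit the distribution summaries to the elites
  mean  = elites.inject(0.0) { |s, x| s + x } / elites.size
  stdev = Math.sqrt(elites.inject(0.0) { |s, x| s + (x - mean)**2 } / elites.size)
end
puts "converged mean ~ #{mean.round(3)}"
```

The full listing later in this chapter generalizes the same cycle to vectors and adds exponential smoothing of the summaries.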


Algorithm (below) provides a pseudocode listing of the Cross-Entropy Method algorithm for minimizing a cost function.

Input: $Problem_{size}$, $Samples_{num}$, $UpdateSamples_{num}$, $Learn_{rate}$, $Variance_{min}$
Output: $S_{best}$
Means $\leftarrow$ InitializeMeans()
Variances $\leftarrow$ InitializeVariances()
$S_{best}$ $\leftarrow$ $\emptyset$
While (Max(Variances) $>$ $Variance_{min}$)
    Samples $\leftarrow$ $\emptyset$
    For ($i=0$ To $Samples_{num}$)
        Samples $\leftarrow$ Samples $\cup$ GenerateSample(Means, Variances)
    SortSamplesByCost(Samples)
    If (Cost($Samples_{0}$) $\leq$ Cost($S_{best}$))
        $S_{best}$ $\leftarrow$ $Samples_{0}$
    $Samples_{selected}$ $\leftarrow$ SelectBestSamples(Samples, $UpdateSamples_{num}$)
    For ($i=0$ To $Problem_{size}$)
        $Means_i$ $\leftarrow$ $Learn_{rate}$ $\times$ $Means_i$ + $(1 - Learn_{rate})$ $\times$ Mean($Samples_{selected}$, $i$)
        $Variances_i$ $\leftarrow$ $Learn_{rate}$ $\times$ $Variances_i$ + $(1 - Learn_{rate})$ $\times$ Variance($Samples_{selected}$, $i$)
Return ($S_{best}$)
Pseudocode for the Cross-Entropy Method.


  • The Cross-Entropy Method was adapted for combinatorial optimization problems, although it has also been applied to continuous function optimization and noisy simulation problems.
  • An alpha ($\alpha$) parameter, or learning rate $\in [0,1]$, is typically set high, such as 0.7.
  • A smoothing function can be used to further control the updates to the summaries of the distribution(s) of samples from the problem space. For example, in continuous function optimization a $\beta$ parameter may replace $\alpha$ for updating the standard deviation, calculated at time $t$ as $\beta_{t} = \beta - \beta \times (1-\frac{1}{t})^q$, where $\beta$ is initially set high $\in [0.8, 0.99]$ and $q$ is a small integer $\in [5, 10]$.
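The smoothing schedule above can be computed directly. The sketch below is illustrative (the function name is an assumption, and the code listing in this chapter does not use the smoothing variant); it shows that the weight placed on the newly estimated standard deviation shrinks as the iteration count $t$ grows:

```ruby
# beta_t = beta - beta * (1 - 1/t)^q
# At t=1 the full beta is used; as t grows, beta_t decays toward 0,
# slowing updates to the standard deviation in later iterations.
def dynamic_beta(beta, q, t)
  beta - beta * ((1.0 - 1.0 / t) ** q)
end

[1, 2, 10, 100].each do |t|
  puts "t=#{t}, beta_t=#{dynamic_beta(0.9, 7, t).round(4)}"
end
```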

Code Listing

Listing (below) provides an example of the Cross-Entropy Method algorithm implemented in the Ruby Programming Language. The demonstration problem is an instance of a continuous function optimization problem that seeks $\min f(x)$ where $f=\sum_{i=1}^n x_{i}^2$, $-5.0\leq x_i \leq 5.0$ and $n=3$. The optimal solution for this basin function is $(v_0,\ldots,v_{n-1})=0.0$.
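As a quick sanity check of this basin function (the short function name `f` is illustrative, not part of the listing):

```ruby
# f(x) = sum of squared components; minimized at the zero vector
def f(vector)
  vector.inject(0.0) { |sum, x| sum + x**2 }
end

puts f([0.0, 0.0, 0.0])  # the optimum
puts f([1.0, 2.0, 3.0])
```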

The algorithm was implemented based on a description of the Cross-Entropy Method algorithm for continuous function optimization by Rubinstein and Kroese in Chapter 5 and Appendix A of their book on the method [Rubinstein2004]. The algorithm maintains means and standard deviations of the distribution of samples for convenience. The means and standard deviations are initialized based on random positions in the problem space and the bounds of the whole problem space respectively. A smoothing parameter is not used on the standard deviations.

def objective_function(vector)
  return vector.inject(0.0) {|sum, x| sum + (x ** 2.0)}
end

def random_variable(minmax)
  min, max = minmax
  return min + ((max - min) * rand())
end

# draw a Gaussian random number using the Box-Muller polar method
def random_gaussian(mean=0.0, stdev=1.0)
  u1 = u2 = w = 0
  begin
    u1 = 2 * rand() - 1
    u2 = 2 * rand() - 1
    w = u1 * u1 + u2 * u2
  end while w >= 1
  w = Math.sqrt((-2.0 * Math.log(w)) / w)
  return mean + (u2 * w) * stdev
end

# sample a candidate solution from the current distribution, clipped to the bounds
def generate_sample(search_space, means, stdevs)
  vector = Array.new(search_space.size)
  search_space.size.times do |i|
    vector[i] = random_gaussian(means[i], stdevs[i])
    vector[i] = search_space[i][0] if vector[i] < search_space[i][0]
    vector[i] = search_space[i][1] if vector[i] > search_space[i][1]
  end
  return {:vector=>vector}
end

def mean_attr(samples, i)
  sum = samples.inject(0.0) do |s,sample|
    s + sample[:vector][i]
  end
  return (sum / samples.size.to_f)
end

def stdev_attr(samples, mean, i)
  sum = samples.inject(0.0) do |s,sample|
    s + (sample[:vector][i] - mean)**2.0
  end
  return Math.sqrt(sum / samples.size.to_f)
end

# blend the old summaries with those of the selected samples (exponential smoothing)
def update_distribution!(samples, alpha, means, stdevs)
  means.size.times do |i|
    means[i] = alpha*means[i] + ((1.0-alpha)*mean_attr(samples, i))
    stdevs[i] = alpha*stdevs[i] + ((1.0-alpha)*stdev_attr(samples,means[i],i))
  end
end

def search(bounds, max_iter, num_samples, num_update, learning_rate)
  means = Array.new(bounds.size) {|i| random_variable(bounds[i])}
  stdevs = Array.new(bounds.size) {|i| bounds[i][1]-bounds[i][0]}
  best = nil
  max_iter.times do |iter|
    samples = Array.new(num_samples) {generate_sample(bounds, means, stdevs)}
    samples.each {|samp| samp[:cost] = objective_function(samp[:vector])}
    samples.sort!{|x,y| x[:cost]<=>y[:cost]}
    best = samples.first if best.nil? or samples.first[:cost] < best[:cost]
    selected = samples.first(num_update)
    update_distribution!(selected, learning_rate, means, stdevs)
    puts " > iteration=#{iter}, fitness=#{best[:cost]}"
  end
  return best
end

if __FILE__ == $0
  # problem configuration
  problem_size = 3
  search_space = Array.new(problem_size) {|i| [-5, 5]}
  # algorithm configuration
  max_iter = 100
  num_samples = 50
  num_update = 5
  l_rate = 0.7
  # execute the algorithm
  best = search(search_space, max_iter, num_samples, num_update, l_rate)
  puts "done! Solution: f=#{best[:cost]}, s=#{best[:vector].inspect}"
end
Cross-Entropy Method in Ruby


Primary Sources

The Cross-Entropy method was proposed by Rubinstein in 1997 [Rubinstein1997] for use in optimizing discrete event simulation systems. It was later generalized by Rubinstein and proposed as an optimization method for combinatorial function optimization in 1999 [Rubinstein1999]. This work was further elaborated by Rubinstein providing a detailed treatment on the use of the Cross-Entropy method for combinatorial optimization [Rubinstein2001].

Learn More

De Boer et al. provide a detailed presentation of Cross-Entropy method including its application in rare event simulation, its adaptation to combinatorial optimization, and example applications to the max-cut, traveling salesman problem, and a clustering numeric optimization example [DeBoer2005]. Rubinstein and Kroese provide a thorough presentation of the approach in their book, summarizing the relevant theory and the state of the art [Rubinstein2004].


[DeBoer2005] P. T. De Boer, D. P. Kroese, S. Mannor, and R. Y. Rubinstein, "A Tutorial on the Cross-Entropy Method", Annals of Operations Research, 2005.
[Rubinstein1997] R. Y. Rubinstein, "Optimization of Computer Simulation Models with Rare Events", European Journal of Operational Research, 1997.
[Rubinstein1999] R. Y. Rubinstein, "The Cross-Entropy Method for Combinatorial and Continuous Optimization", Methodology and Computing in Applied Probability, 1999.
[Rubinstein2001] R. Y. Rubinstein, "Combinatorial optimization, cross-entropy, ants and rare events", in Stochastic Optimization: Algorithms and Applications, pages 303-364, Springer, 2001.
[Rubinstein2004] R. Y. Rubinstein and D. P. Kroese, "The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning", Springer, 2004.