NAME
AI::Genetic::Pro - Efficient genetic algorithms for professional purpose.
SYNOPSIS
use AI::Genetic::Pro;
sub fitness {
my ($ga, $chromosome) = @_;
return oct('0b' . $ga->as_string($chromosome));
}
sub terminate {
my ($ga) = @_;
my $result = oct('0b' . $ga->as_string($ga->getFittest));
return $result == 4294967295 ? 1 : 0;
}
my $ga = AI::Genetic::Pro->new(
-fitness => \&fitness, # fitness function
-terminate => \&terminate, # terminate function
-type => 'bitvector', # type of chromosomes
-population => 1000, # population
-crossover => 0.9, # probab. of crossover
-mutation => 0.01, # probab. of mutation
-parents => 2, # number of parents
-selection => [ 'Roulette' ], # selection strategy
-strategy => [ 'Points', 2 ], # crossover strategy
-cache => 0, # cache results
-history => 1, # remember best results
-preserve => 3, # remember the bests
-variable_length => 1, # turn variable length ON
);
# init population of 32-bit vectors
$ga->init(32);
# evolve 10 generations
$ga->evolve(10);
# best score
print "SCORE: ", $ga->as_value($ga->getFittest), ".\n";
# save evolution path as a chart
$ga->chart(-filename => 'evolution.png');
# save state of GA
$ga->save('genetic.sga');
# load state of GA
$ga->load('genetic.sga');
DESCRIPTION
This module provides efficient implementation of a genetic algorithm for a professional purpose. It was designed to operate as fast as possible even on very large populations and big individuals/chromosomes. AI::Genetic::Pro
was inspired by AI::Genetic
, so it is in most cases compatible (there are some changes). Additionaly AI::Genetic::Pro
isn't pure Perl solution, so it doesn't have limitations of its ancestor (ie. seriously slow down in case of big populations ( >10000 ) or vectors with size > 33 fields).
If You are looking for pure Perl solution, consider AI::Genetic.
- Speed
-
To increase speed XS code is used, however with portability in mind. This distribution was tested on Windows and Linux platforms (should work on any other).
- Memory
-
This module was designed to use as little memory as possible. Population of size 10000 consist 92-bit vectors uses only ~24MB (in
AI::Genetic
something about ~78MB!!!). - Advanced options
-
To provide more flexibility
AI::Genetic::Pro
supports many statistic distributions, such as:uniform
,natural
,chi_square
and others. This feature can be used in selection or/and crossover. See documentation below.
METHODS
Simply description of available methods. See below.
- $ga->new( %options )
-
Constructor. It accepts options in hash-value style. See options and an example below.
- -fitness
-
This defines a fitness function. It expects a reference to a subroutine.
- -terminate
-
This defines a terminate function. It expects a reference to a subroutine.
- -type
-
This defines the type of chromosomes. Currently,
AI::Genetic::Pro
supports four types:- bitvector
-
Individuals/chromosomes of this type have genes that are bits. Each gene can be in one of two possible states, on or off.
- listvector
-
Each gene of a "listvector" individual/chromosome can assume one string value from a specified list of possible string values.
- rangevector
-
Each gene of a "rangevector" individual/chromosome can assume one integer value from a range of possible integer values. Note that only integers are supported. The user can always transform any desired fractional values by multiplying and dividing by an appropriate power of 10.
- combination
-
Each gene of a "combination" individual/chromosome can assume one string value from a specified list of possible string values. All genes are unique.
- -population
-
This defines the size of the population, i.e. how many chromosomes to simultaneously exist at each generation.
- -crossover
-
This defines the crossover rate. Fairest results are achieved with crossover rate ~0.95.
- -mutation
-
This defines the mutation rate. Fairest results are achieved with mutation rate ~0.01.
- -preserve
-
This defines injection of the bests chromosomes into a next generation. It causes a little slow down, however (very often) much better results are achieved. You can specify, how many chromosomes will be preserved, i.e.
-preserve => 1, # only one chromosome will be preserved # or -preserve => 9, # 9 chromosomes will be preserved # and so on...
Attention! You cannot preserve more chromosomes, than Your population consists.
- -variable_length
-
This defines if variable-length chromosomes are turned on (default off) and a type of allowed mutations. See below.
- level 0
-
Feature is inactive (default). Example:
-variable_length => 0 # chromosomes (i.e. bitvectors) 0 1 0 0 1 1 0 1 1 1 0 1 0 1 0 0 1 1 0 1 1 1 1 0 0 1 1 0 0 1 1 1 0 1 0 0 1 1 0 1 1 1 0 1 0 0 1 1 0 1 1 1 1 0 1 0 # ...and so on
- level 1
-
Feature is active, but chromosomes can varies only on the right side, Example:
-variable_length => 1 # chromosomes (i.e. bitvectors) 0 1 0 0 1 1 0 1 1 1 0 0 1 1 0 1 1 1 1 0 1 1 1 0 1 0 0 1 1 0 1 1 1 0 1 0 0 1 1 0 1 1 1 # ...and so on
- level 2
-
Feature is active and chromosomes can varies on the left side and on the right side; unwanted values/genes on the left side are replaced with
undef
, ie.-variable_length => 2 # chromosomes (i.e. bitvectors) x x x 0 1 1 0 1 1 1 x x x x 0 1 1 1 1 x 1 1 1 0 1 0 0 1 1 0 1 1 1 0 1 0 0 1 1 0 1 1 1 # where 'x' means 'undef' # ...and so on
In this situation returned chromosomes in an array context ($ga->as_array($chromosome)) can have undef values on the left side (only). In a scalar context each undefined value is replaced with a single space. If You don't want to see any
undef
or space, just useas_array_def_only
andas_string_def_only
instead ofas_array
andas_string
.
- -parents
-
This defines how many parents should be used in a corssover.
- -selection
-
This defines how individuals/chromosomes are selected to crossover. It expects an array reference listed below:
-selection => [ $type, @params ]
where type is one of:
- RouletteBasic
-
Each individual/chromosome can be selected with probability poportionaly to its fitness.
- Roulette
-
At the first best individuals/chromosomes are selected. From this collection parents are selected with probability poportionaly to its fitness.
- RouletteDistribution
-
Each individual/chromosome has portion of roulette wheel proportionaly to its fitness. Selection is done with specified distribution. Supported distributions and paremeters are listed below.
-selection => [ 'RouletteDistribution', 'uniform' ]
-
Standard uniform distribution. No additional parameters are needed.
-selection => [ 'RouletteDistribution', 'normal', $av, $sd ]
-
Normal distribution, where
$av
is average (default: size of population /2) and $$sd
is standard deviation (default: size of population). -selection => [ 'RouletteDistribution', 'beta', $aa, $bb ]
-
Beta distribution. The density of the beta is:
X^($aa - 1) * (1 - X)^($bb - 1) / B($aa , $bb) for 0 < X < 1.
$aa
and$bb
are set by default to number of parents.Argument restrictions: Both $aa and $bb must not be less than 1.0E-37.
-selection => [ 'RouletteDistribution', 'binomial' ]
-
Binomial distribution. No additional parameters are needed.
-selection => [ 'RouletteDistribution', 'chi_square', $df ]
-
Chi-square distribution with
$df
degrees of freedom.$df
by default is set to size of population. -selection => [ 'RouletteDistribution', 'exponential', $av ]
-
Exponential distribution, where
$av
is average .$av
by default is set to size of population. -selection => [ 'RouletteDistribution', 'poisson', $mu ]
-
Poisson distribution, where
$mu
is mean.$mu
by default is set to size of population.
- Distribution
-
Chromosomes/individuals are selected with specified distribution. See below.
-selection => [ 'Distribution', 'uniform' ]
-
Standard uniform distribution. No additional parameters are needed.
-selection => [ 'Distribution', 'normal', $av, $sd ]
-
Normal distribution, where
$av
is average (default: size of population /2) and $$sd
is standard deviation (default: size of population). -selection => [ 'Distribution', 'beta', $aa, $bb ]
-
Beta distribution. The density of the beta is:
X^($aa - 1) * (1 - X)^($bb - 1) / B($aa , $bb) for 0 < X < 1.
$aa
and$bb
are set by default to number of parents.Argument restrictions: Both $aa and $bb must not be less than 1.0E-37.
-selection => [ 'Distribution', 'binomial' ]
-
Binomial distribution. No additional parameters are needed.
-selection => [ 'Distribution', 'chi_square', $df ]
-
Chi-square distribution with
$df
degrees of freedom.$df
by default is set to size of population. -selection => [ 'Distribution', 'exponential', $av ]
-
Exponential distribution, where
$av
is average .$av
by default is set to size of population. -selection => [ 'Distribution', 'poisson', $mu ]
-
Poisson distribution, where
$mu
is mean.$mu
by default is set to size of population.
- -strategy
-
This defines strategy of crossover operation. It expects an array reference listed below:
-strategy => [ $type, @params ]
where type is one of:
- PointsSimple
-
Simple crossover in one or many points. Best chromosomes/individuals are selected to new generation. In example:
-strategy => [ 'PointsSimple', $n ]
where
$n
is number of points for crossing. - PointsBasic
-
Crossover in one or many points. In basic crossover selected parents are crossed and one (random) of children is moved to new generation. In example:
-strategy => [ 'PointsBasic', $n ]
where
$n
is number of points for crossing. - Points
-
Crossover in one or many points. In normal crossover selected parents are crossed and the best of child is moved to new generation. In example:
-strategy => [ 'Points', $n ]
where
$n
is number of points for crossing. - PointsAdvenced
-
Crossover in one or many points. After crossover best chromosomes/individuals from all parents and chidren are selected to new generation. In example:
-strategy => [ 'PointsAdvanced', $n ]
where
$n
is number of points for crossing. - Distribution
-
In distribution crossover parents are crossed in points selected with specified distribution. See below.
-strategy => [ 'Distribution', 'uniform' ]
-
Standard uniform distribution. No additional parameters are needed.
-strategy => [ 'Distribution', 'normal', $av, $sd ]
-
Normal distribution, where
$av
is average (default: number of parents/2) and$sd
is standard deviation (default: number of parents). -strategy => [ 'Distribution', 'beta', $aa, $bb ]
-
Beta distribution. The density of the beta is:
X^($aa - 1) * (1 - X)^($bb - 1) / B($aa , $bb) for 0 < X < 1.
$aa
and$bb
are set by default to the number of parents.Argument restrictions: Both $aa and $bb must not be less than 1.0E-37.
-strategy => [ 'Distribution', 'binomial' ]
-
Binomial distribution. No additional parameters are needed.
-strategy => [ 'Distribution', 'chi_square', $df ]
-
Chi-square distribution with
$df
degrees of freedom.$df
by default is set to the number of parents. -strategy => [ 'Distribution', 'exponential', $av ]
-
Exponential distribution, where
$av
is average .$av
by default is set to the number of parents. -strategy => [ 'Distribution', 'poisson', $mu ]
-
Poisson distribution, where
$mu
is mean.$mu
by default is set to the number of parents.
- PMX
-
PMX method defined by Goldberg and Lingle in 1985. Parameters: none.
- OX
-
OX method defined by Davis (?) in 1985. Parameters: none.
- -cache
-
This defines if cache should be used. Correct values are: 1 or 0 (default: 0).
- -history
-
This defines if history should be collected. Correct values are: 1 or 0 (default: 0).
Collect history.
- -strict
-
This defined if check for modifing chromosomes in a fitness (user defined) function is active. Direct modifing chromosomes is not allowed and it is a highway to big troubles. This mode should be used only for testing, becouse it is slow.
- $ga->inject($chromosomes)
-
Inject new, user defined, chromosomes into a current population. See example below:
# example for bitvector my $chromosomes = [ [ 1, 1, 0, 1, 0, 1 ], [ 0, 0, 0, 1, 0, 1 ], [ 0, 1, 0, 1, 0, 0 ], ... ]; # inject $ga->inject($chromosomes);
If You want to delete some chromosomes form population, just
splice
them:my @remove = qw(1 2 3 9 12); for my $idx (@remove){ splice @{$ga->chromosomes}, $idx, 1; }
- $ga->population($population)
-
Set/get size of the population. This defines the size of the population, i.e. how many chromosomes to simultaneously exist at each generation.
- $ga->indType()
-
Get type of individuals/chromosomes. Currently supported types are:
bitvector
-
Chromosomes will be just bitvectors. See documentation of
new
method. listvector
-
Chromosomes will be lists of specified values. See documentation of
new
method. rangevector
-
Chromosomes will be lists of values from specified range. See documentation of
new
method. combination
-
Chromosomes will be uniq lists of specified values. This is used for example in Traveling Salesman Problem. See documentation of
new
method.
In example:
my $type = $ga->type();
- $ga->type()
-
Alias for
indType
. - $ga->crossProb()
-
This method is used to query and set the crossover rate.
- $ga->crossover()
-
Alias for
crossProb
. - $ga->mutProb()
-
This method is used to query and set the mutation rate.
- $ga->mutation()
-
Alias for
mutProb
. - $ga->parents($parents)
-
Set/get number of parents in a crossover.
- $ga->init($args)
-
This method initializes the population with random individuals/chromosomes. It MUST be called before any call to
evolve()
. It expects one argument, which depends on the type of individuals/chromosomes:- bitvector
-
For bitvectors, the argument is simply the length of the bitvector.
$ga->init(10);
This initializes a population where each individual/chromosome has 10 genes.
- listvector
-
For listvectors, the argument is an anonymous list of lists. The number of sub-lists is equal to the number of genes of each individual/chromosome. Each sub-list defines the possible string values that the corresponding gene can assume.
$ga->init([ [qw/red blue green/], [qw/big medium small/], [qw/very_fat fat fit thin very_thin/], ]);
This initializes a population where each individual/chromosome has 3 genes and each gene can assume one of the given values.
- rangevector
-
For rangevectors, the argument is an anonymous list of lists. The number of sub-lists is equal to the number of genes of each individual/chromosome. Each sub-list defines the minimum and maximum integer values that the corresponding gene can assume.
$ga->init([ [1, 5], [0, 20], [4, 9], ]);
This initializes a population where each individual/chromosome has 3 genes and each gene can assume an integer within the corresponding range.
- combination
-
For combination, the argument is an anonymous list of possible values of gene.
$ga->init( [ 'a', 'b', 'c' ] );
This initializes a population where each chromosome has 3 genes and each gene is uniq combination of 'a', 'b' and 'c'. For example genes looks something like that:
[ 'a', 'b', 'c' ] # gene 1 [ 'c', 'a', 'b' ] # gene 2 [ 'b', 'c', 'a' ] # gene 3 # ...and so on...
- $ga->evolve($n)
-
This method causes the GA to evolve the population for the specified number of generations. If its argument is 0 or
undef
GA will evolve the population to infinity unless terminate function is specified. - $ga->getHistory()
-
Get history of the evolution. It is in a format listed below:
[ # gen0 gen1 gen2 ... # generations [ max0, max1, max2, ... ], # max values [ mean, mean1, mean2, ... ], # mean values [ min0, min1, min2, ... ], # min values ]
- $ga->getAvgFitness()
-
Get max, mean and min score of the current generation. In example:
my ($max, $mean, $min) = $ga->getAvgFitness();
- $ga->getFittest($n, $unique)
-
This function returns list of fittests chromosomes from the current population. You can specify: how many chromosomes should be returned and if returned chromosomes should be unique. See example below.
# only one - the best my ($best) = $ga->getFittest; # or 5 bests chromosomes, NOT unique my @bests = $ga->getFittest(5); # or 7 bests and UNIQUE chromosomes my @bests = $ga->getFittest(7, 1);
If You want to get a big number of chromosomes, try to use
getFittest_as_arrayref
function instead (for efficiency). - $ga->getFittest_as_arrayref($n, $unique)
-
This function is very similar to
getFittest
, but it returns reference to an array instead of a list. - $ga->generation()
-
Get number of generation.
- $ga->people()
-
Returns an anonymous list of individuals/chromosomes of the current population.
IMPORTANT: the actual array reference used by the
AI::Genetic::Pro
object is returned, so any changes to it will be reflected in $ga. - $ga->chromosomes()
-
Alias for
people
. - $ga->chart(%options)
-
Generate a chart describing changes of min, mean, max scores in Your population. To satisfy Your needs, You can pass following options:
- -filename
-
File to save a chart in (obligatory).
- -title
-
Title of a chart (default: Evolution).
- -x_label
-
X label (default: Generations).
- -y_label
-
Y label (default: Value).
- -format
-
Format of values, like
sprintf
(default: '%.2f'). - -legend1
-
Description of min line (default: Min value).
- -legend2
-
Description of min line (default: Mean value).
- -legend3
-
Description of min line (default: Max value).
- -width
-
Width of a chart (default: 640).
- -height
-
Height of a chart (default: 480).
- -font
-
Path to font in (*.ttf format) to be used (default: none).
- -logo
-
Path to logo (png/jpg image) to embed in a chart (default: none).
- In example:
-
$ga->chart(-width => 480, height => 320, -filename => 'chart.png');
- $ga->save($file)
-
Save current state of the genetic algorithm to a specified file.
- $ga->load($file)
-
Load a state of the genetic algorithm from a specified file.
- $ga->as_array($chromosome)
-
In an array context return an array representing specified chromosome. In a scalar context return an reference to an array representing specified chromosome. If variable_length is turned on and is set to level 2, an array can have some
undef
values. To get onlynot undef
values useas_array_def_only
instead ofas_array
. - $ga->as_array_def_only($chromosome)
-
In an array context return an array representing specified chromosome. In a scalar context return an reference to an array representing specified chromosome. If variable_length is turned off, this function is just alias for
as_array
. If variable_length is turned on and is set to level 2, this function will return onlynot undef
values from chromosome. See example below:# -variable_length => 2, -type => 'bitvector' my @chromosome = $ga->as_array($chromosome) # @chromosome looks something like that # ( undef, undef, undef, 1, 0, 1, 1, 1, 0 ) @chromosome = $ga->as_array_def_only($chromosome) # @chromosome looks something like that # ( 1, 0, 1, 1, 1, 0 )
- $ga->as_string($chromosome)
-
Return string-representation of specified chromosome. See example below:
# -type => 'bitvector' my $string = $ga->as_string($chromosome); # $string looks something like that # 1___0___1___1___1___0 # or # -type => 'listvector' $string = $ga->as_string($chromosome); # $string looks something like that # element0___element1___element2___element3...
Attention! If variable_length is turned on and is set to level 2, it is possible to get
undef
values on the left side of the vector. In returned stringundef
values will be replaced with spaces. If You don't want to see any spaces, useas_string_def_only
instead ofas_string
. - $ga->as_string_def_only($chromosome)
-
Return string-representation of specified chromosome. If variable_length is turned off, this function is just alias for
as_string
. If variable_length is turned on and is set to level 2, this function will return string withoutundef
values. See example below:# -variable_length => 2, -type => 'bitvector' my $string = $ga->as_string($chromosome); # $string looks something like that # ___ ___ ___1___1___0 $string = $ga->as_string_def_only($chromosome); # $string looks something like that # 1___1___0
- $ga->as_value($chromosome)
-
Return score of specified chromosome. Value of chromosome is calculated by fitness function.
SUPPORT
AI::Genetic::Pro
is still under development, however it is used in many production environments.
TODO
REPORTING BUGS
When reporting bugs/problems please include as much information as possible. It may be difficult for me to reproduce the problem as almost every setup is different.
A small script which yields the problem will probably be of help.
THANKS
LEONID ZAMDBORG for recommending the addition of variable-length chromosomes as well as supplying relevant code samples, for testing and at the end reporting some bugs.
Christoph Meissner for reporting a bug.
Alec Chen for reporting some bugs.
AUTHOR
Strzelecki Lukasz <strzelec@rswsystems.com>
SEE ALSO
COPYRIGHT
Copyright (c) Strzelecki Lukasz. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.