NAME

AI::Gene::Simple

SYNOPSIS

A base class for storing and mutating genetic sequences.

package Somegene;
use AI::Gene::Simple;
our @ISA = qw (AI::Gene::Simple);

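# Return a new token. For a minor mutation the current token is passed in
# and nudged upwards by a random amount of up to 1; otherwise a fresh
# value is generated.
sub generate_token {
  my $self = shift;
  my $prev = $_[0] ? $_[0] + (1-rand(1)) : rand(1)*10;
  return $prev;
}

# Evaluate the gene as a polynomial in $x, treating the token at
# position N as the coefficient of $x ** N.
sub calculate {
  my $self = shift;
  my $x = $_[0];
  my $rt=0;
  for (0..(scalar(@{$self->[0]}) -1)) {
    $rt += $self->[0][$_] * ($x ** $_);
  }
  return $rt;
}

# Fill positions 0 .. $_[0] of the gene with random coefficients.
sub seed {
  my $self = shift;
  $self->[0][$_] = rand(1) * 10 for (0..$_[0]);
  return $self;
}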

# ... then elsewhere

package main;

my $gene = Somegene->new;
$gene->seed(5);
print $gene->calculate(2), "\n";
$gene->mutate_minor;
print $gene->calculate(2), "\n";
$gene->mutate_major;
print $gene->calculate(2), "\n";

DESCRIPTION

This class provides generic methods for creating and mutating genetic sequences. Various mutations are provided, but the resulting genes are not checked for correct syntax. The class is suitable for genes where it is only necessary to know what lies at a given position in the gene. If you need to ensure that a gene maintains a sensible grammar, use the AI::Gene::Sequence class instead. The interfaces are the same, so you will only need to modify your overriding classes if you switch from one to the other.

A suitable use for this module might be a series of coefficients in a polynomial expansion or notes to be played in a musical score.

This module should not be confused with the BioPerl modules, which are used to analyse DNA sequences.

It is intended that the methods in this code are inherited by other modules.

Anatomy of a gene

A gene is a linear sequence of tokens which tell some unknown system how to behave. These methods all expect that a gene is of the form:

[ [ 'token0', 'token1', ...  ], .. other elements ignored ]
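As a minimal sketch of that layout in plain Perl (the 'label' element is a hypothetical extra slot, shown only to illustrate that anything after element 0 is left alone):

```perl
use strict;
use warnings;

# Element 0 holds the tokens; later elements are ignored by the mutations.
# The 'label' string is an illustrative assumption, not something the
# module requires.
my $gene = [ [ 'token0', 'token1', 'token2' ], 'label' ];

my $length = scalar @{ $gene->[0] };   # number of tokens in the gene
my $second = $gene->[0][1];            # token at position 1

print "length=$length second=$second\n";   # prints "length=3 second=token1"
```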

Using the module

To use the genetic sequences, you must write your own implementation of the following method, along with some way of turning your encoded sequence into something useful.

generate_token

You may also want to override the following methods:

new
clone
render_gene

The calling conventions for these methods are outlined below.

Mutation methods

Mutation methods are all named mutate_*. In general, the first argument is the number of mutations required, followed by the positions in the gene which should be affected, followed by the lengths of the sequences within the gene which should be affected. If positions are not defined, random ones are chosen. If lengths are not defined, a length of 1 is assumed (i.e. working on single tokens only). If a length of 0 is requested, a random length is chosen.

If a mutation is attempted which could corrupt your gene (copying from a region beyond the end of the gene for instance) then it will be silently skipped. Mutation methods all return the number of mutations carried out (not the number of tokens affected).
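The defaulting rules above can be sketched in plain Perl. The helper name resolve_span is hypothetical and is not part of AI::Gene::Simple; it only models the conventions described here (undefined position means random, undefined length means 1, length 0 means random, and anything running past the end is skipped).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: resolve optional position
# and length arguments against a gene of $gene_len tokens.
# Returns (pos, len), or an empty list if the mutation should be skipped.
sub resolve_span {
  my ($gene_len, $pos, $len) = @_;
  return if $gene_len < 1;                          # nothing to mutate
  $pos = int rand $gene_len unless defined $pos;    # random position
  $len = 1 unless defined $len;                     # default: one token
  # A length of 0 asks for a random length that still fits in the gene.
  $len = 1 + int rand($gene_len - $pos) if $len == 0 && $pos < $gene_len;
  # Skip anything that would run off the end of the gene.
  return if $pos >= $gene_len || $pos + $len > $gene_len;
  return ($pos, $len);
}

my ($pos, $len) = resolve_span(6, 4, 2);
print "pos=$pos len=$len\n";            # prints "pos=4 len=2"

my @skipped = resolve_span(6, 5, 3);    # would pass the end: empty list
print "skipped\n" unless @skipped;      # prints "skipped"
```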

mutate([num, ref to hash of probs & methods])

This will call at random one of the other mutation methods. It will repeat itself num times. If passed a reference to a hash as its second argument, it will use that to decide which mutation to attempt.

This hash should contain keys which fit $1 in mutate_(.*) and values indicating the weight to be given to that method. The module will normalise this nicely, so you do not have to. This lets you define your own mutation methods in addition to overriding any you do not like in the module.
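One way such a weighted choice could work is sketched below in plain Perl. The function pick_weighted and the particular weights are assumptions for illustration; the module's own selection code may differ.

```perl
use strict;
use warnings;

# Illustrative weights: keys are the part of the method name that
# follows "mutate_". They need not sum to 1.
my %weights = ( insert => 2, remove => 1, minor => 5 );

# Hypothetical helper: pick one key with probability proportional to
# its weight, which amounts to normalising the weights.
sub pick_weighted {
  my ($probs) = @_;
  my $total = 0;
  $total += $_ for values %$probs;
  return unless $total > 0;
  my $roll = rand $total;                 # 0 <= $roll < $total
  for my $name (sort keys %$probs) {
    $roll -= $probs->{$name};
    return $name if $roll < 0;
  }
  return;   # only reachable if weights are negative or rounding misbehaves
}

my $method = 'mutate_' . pick_weighted(\%weights);
print "$method\n";   # e.g. "mutate_minor"; varies from run to run
```

A caller could then dispatch with $gene->$method(), which is also how a user-defined mutate_* method would be reached.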

mutate_insert([num, pos])

Inserts a single token into the gene at position pos. The token is randomly generated by the calling object's generate_token method.

mutate_overwrite([num, pos1, pos2, len])

Copies a section of the gene (starting at pos1, length len) and writes it back into the gene, overwriting current elements, starting at pos2.
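The effect can be sketched on a bare token list in plain Perl. This models the description above and is not the module's own code; the range check is an assumption based on the rule that corrupting mutations are skipped.

```perl
use strict;
use warnings;

my @tokens = qw(a b c d e f);
my ($pos1, $pos2, $len) = (0, 3, 2);

# Only act if both the source and the destination lie inside the gene.
if ($pos1 + $len <= @tokens && $pos2 + $len <= @tokens) {
  my @copy = @tokens[ $pos1 .. $pos1 + $len - 1 ];   # (a, b)
  splice @tokens, $pos2, $len, @copy;                # overwrite (d, e)
}

print "@tokens\n";   # prints "a b c a b f"
```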

mutate_reverse([num, pos, len])

Takes a sequence within the gene and reverses the ordering of the elements within that sequence. Starts at position pos for length len.
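A short sketch of the same operation on a plain array (an illustration of the behaviour described, not the module's implementation):

```perl
use strict;
use warnings;

my @tokens = qw(a b c d e f);
my ($pos, $len) = (1, 3);

# Only act if the section lies wholly inside the gene.
if ($pos + $len <= @tokens) {
  # Replace the section with the same tokens in reverse order.
  splice @tokens, $pos, $len, reverse @tokens[ $pos .. $pos + $len - 1 ];
}

print "@tokens\n";   # prints "a d c b e f"
```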

mutate_shuffle([num, pos1, pos2, len])

This takes a sequence (starting at pos1 length len) from within a gene and moves it to another position (starting at pos2). Odd things might occur if the position to move the sequence into lies within the section to be moved, but the module will try its hardest to cause a mutation.

mutate_duplicate([num, pos1, pos2, length])

This copies a portion of the gene starting at pos1 of length length and then splices it into the gene before pos2.
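Unlike mutate_overwrite, this lengthens the gene. A plain-Perl sketch of the described behaviour, with the bounds check as an assumption:

```perl
use strict;
use warnings;

my @tokens = qw(a b c d);
my ($pos1, $pos2, $len) = (1, 0, 2);

# Only act if the source lies inside the gene and pos2 is a valid
# insertion point.
if ($pos1 + $len <= @tokens && $pos2 <= @tokens) {
  my @copy = @tokens[ $pos1 .. $pos1 + $len - 1 ];   # (b, c)
  splice @tokens, $pos2, 0, @copy;                   # insert before pos2
}

print "@tokens\n";   # prints "b c a b c d"
```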

mutate_remove([num, pos, length])

Deletes length tokens from the gene, starting at pos. Repeats num times.

mutate_minor([num, pos])

This will mutate a single token at position pos in the gene into one of the same type (as decided by the object's generate_token method).

mutate_major([num, pos])

This changes the single token at position pos into a token of any token type. The new token is produced by the object's generate_token method.

mutate_switch([num, pos1, pos2, len1, len2])

This takes two sequences within the gene and swaps them into each other's position. The first starts at pos1 with length len1 and the second at pos2 with length len2. If the two sequences overlap, then no mutation will be attempted.
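The swap, including the overlap check, might look like this on a plain array. It is a sketch of the behaviour described above under the assumption that out-of-range requests are skipped; it is not taken from the module's source.

```perl
use strict;
use warnings;

my @tokens = qw(a b c d e f g);
my ($pos1, $pos2, $len1, $len2) = (1, 4, 2, 3);   # (b c) and (e f g)

# Put the segments in left-to-right order to simplify the checks.
($pos1, $len1, $pos2, $len2) = ($pos2, $len2, $pos1, $len1) if $pos1 > $pos2;

my $overlap  = $pos1 + $len1 > $pos2;
my $in_range = $pos2 + $len2 <= @tokens;

if (!$overlap && $in_range) {
  my @first  = @tokens[ $pos1 .. $pos1 + $len1 - 1 ];
  my @second = @tokens[ $pos2 .. $pos2 + $len2 - 1 ];
  # Replace the later segment first so the earlier indices stay valid.
  splice @tokens, $pos2, $len2, @first;
  splice @tokens, $pos1, $len1, @second;
}

print "@tokens\n";   # prints "a e f g d b c"
```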

The following methods are also provided, but you will probably want to override them for your own genetic sequences.

generate_token([current token])

This is used by the mutation methods when changing tokens or creating new ones. It is expected to return a single token. If a minor mutation is being attempted, then the method will also be passed the current token.

The provided version of this method returns a random character from 'a'..'z' as both the token type and token.
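A sketch of an overriding generate_token for numeric tokens follows. The package name My::Tokens and the ranges used are hypothetical. Testing the argument with defined, rather than for truth, means a current token of 0 is still treated as a minor mutation.

```perl
use strict;
use warnings;

package My::Tokens;   # hypothetical class name, for illustration only

sub generate_token {
  my ($self, $current) = @_;
  # Minor mutation: stay within 0.5 of the current numeric token.
  return $current + rand(1) - 0.5 if defined $current;
  # Otherwise produce a completely fresh token in [0, 10).
  return rand(10);
}

package main;

my $fresh  = My::Tokens->generate_token;      # no current token
my $nudged = My::Tokens->generate_token(5);   # minor change to 5
print "fresh=$fresh nudged=$nudged\n";
```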

clone()

This returns a copy of the gene as a new object. If you are using nested genes, or other references as your tokens, then you may need to produce your own version which will deep copy your structure.
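The difference between a shallow and a deep copy can be seen in the sketch below. It assumes a copy that duplicates only the top-level token list, and uses dclone from the core Storable module as one possible deep copy; the class name My::NestedGene is hypothetical.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A gene whose tokens are themselves array references.
my $gene = bless [ [ [ 1, 2 ], [ 3, 4 ] ] ], 'My::NestedGene';

# Shallow copy: a new token list, but the nested references are shared.
my $shallow = bless [ [ @{ $gene->[0] } ] ], ref $gene;

# Deep copy: every level is duplicated, and the class is preserved.
my $deep = dclone($gene);

$gene->[0][0][0] = 99;   # change the original after copying

print $shallow->[0][0][0], "\n";   # prints 99: the inner array is shared
print $deep->[0][0][0], "\n";      # prints 1: the deep copy is independent
```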

new

This returns an empty gene, into which you can put things. If you want to initialise your gene, or do anything else useful at construction time, you will need to supply your own version of this method.

render_gene

This is useful for debugging: it returns a serialised summary of the gene.

AUTHOR

This module was written by Alex Gough (alex@rcon.org).

SEE ALSO

If you are encoding something which must maintain a correct syntax (executable code, regular expressions, formal poems) then you might be better off using AI::Gene::Sequence.

COPYRIGHT

Copyright (c) 2000 Alex Gough <alex@rcon.org>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

BUGS

Some methods will do odd things if you pass them weird values, so try not to do that. So long as you stick to passing positive integers or undef to the methods then they should recover gracefully.

While it is easy and fun to write genetic and evolutionary algorithms in Perl, for most purposes they will be much slower than if they were implemented in another, more suitable language. Some problems do lend themselves to an approach in Perl: those where the time between mutations is large, for instance when composing music where the selection process is driven by human whims.