NAME

Statistics::Sequences::Runs - The Runs-test (Wald-Wolfowitz or Swed-Eisenhart Test)

SYNOPSIS

use Statistics::Sequences::Runs;
$runs = Statistics::Sequences::Runs->new();
$runs->load(qw/1 0 0 0 1 1 0 1 1 0 0 1 0 0 1 1 1 1 0 1/);
$runs->test()->dump();

DESCRIPTION

The Runs-test assesses the difference between two independent distributions, or a difference within a single distribution of dichotomous observations, in terms of the frequency of the runs of states within them.

A run is a sequence of identical states on 1 or more consecutive trials. For example, a signal-detection test yields a series, over time, of hits (H) and misses (M), which might look like H-H-M-H-M-M-M-M-H. Here, there are 5 runs: 3 of hits, and 2 of misses. This number of runs can be compared with the number expected to occur by chance, given the number of observed hits and misses. More runs than expected generally indicates irregularity, or instability; fewer runs than expected indicates regularity, or stability. Both can indicate a sequential dependency: either negative serial dependence (an extra-chance factor, or bias, to produce too many alternations), or positive serial dependence (an extra-chance factor, or bias, to produce too many repetitions).
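To make the counting concrete, here is a minimal plain-Perl sketch that counts the runs in the hit/miss series above. It does not use this module; count_runs is a hypothetical helper written only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Sequences::Runs):
# a new run starts at the first element and at every change of state.
sub count_runs {
    my @seq = @_;
    return 0 unless @seq;
    my $runs = 1;
    for my $i (1 .. $#seq) {
        $runs++ if $seq[$i] ne $seq[$i - 1];
    }
    return $runs;
}

my @trials = qw/H H M H M M M M H/;
print count_runs(@trials), "\n";    # prints 5
```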

The distribution of the number of runs is asymptotically normal, and approaches normality quickly: probabilities are well estimated by the normal distribution when the numbers of H and M both exceed 10 (e.g., Kelly, 1982). The deviation of the observed number of runs from the expected number is therefore reliably assessed by way of a Z-score.
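For reference, the standard textbook form of this normal approximation is sketched below. The symbols are introduced here for illustration: n_1 and n_2 are the numbers of H and M, N = n_1 + n_2, and R is the observed number of runs. That the module computes exactly these quantities is an assumption, although they agree with the expected value reported in the seating example (7.88 for 5 occupied and 11 empty seats).

```latex
\begin{align}
  E(R) &= \frac{2 n_1 n_2}{N} + 1 \\
  \operatorname{Var}(R) &= \frac{2 n_1 n_2 \,(2 n_1 n_2 - N)}{N^2 (N - 1)} \\
  Z &= \frac{R - E(R)}{\sqrt{\operatorname{Var}(R)}}
\end{align}
```

When a continuity correction is applied, the usual form subtracts 0.5 from |R - E(R)| before dividing.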

METHODS

Methods are essentially as described in Statistics::Sequences. See that manpage for how to handle non-dichotomous data, e.g., numerical data, or data with more than two categories.

new

$runs = Statistics::Sequences::Runs->new();

Returns a new Runs object. Expects/accepts no arguments but the classname.

load

$runs->load(@data);
$runs->load(\@data);
$runs->load('dist1' => \@data1, 'dist2' => \@data2);
$runs->load({'dist1' => \@data1, 'dist2' => \@data2});

Loads data anonymously or by name. See load in the Statistics::Sequences manpage.

test

$runs->test();

Performs the runs test on the named distributions. If only one distribution name is given, the "one-sample" Runs test is performed, cutting the data at the median, or by the value given as cut. Observations that fall above and below the cut-value then constitute the "groups" to be searched for runs. Otherwise, with two named groups, runs are sought on the basis of the observations belonging to one or the other named group.
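The median cut described above can be sketched in plain Perl as follows. The helpers median and dichotomise are hypothetical, not module methods, and the treatment of values equal to the cut (dropped here) is an assumption: the module may handle ties differently.

```perl
use strict;
use warnings;

# Median of a numeric list (mean of the two middle values for even-sized lists).
sub median {
    my @sorted = sort { $a <=> $b } @_;
    my $mid = int(@sorted / 2);
    return @sorted % 2 ? $sorted[$mid] : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

# Hypothetical helper: recode values as 1 (above the cut) or 0 (below it).
# ASSUMPTION: values equal to the cut are dropped; the module may differ.
sub dichotomise {
    my ($cut, @values) = @_;
    return map { $_ > $cut ? 1 : 0 } grep { $_ != $cut } @values;
}

my @scores = (3, 9, 4, 8, 2, 7, 6, 1);
my $cut    = median(@scores);               # 5
my @states = dichotomise($cut, @scores);    # 0 1 0 1 0 1 1 0
print "cut = $cut: @states\n";
```

The resulting 0/1 series is then an ordinary dichotomous sequence whose runs can be counted.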

dump

$runs->dump(flag => '1|0', text => '0|1|2');

Prints Runs-test results to STDOUT. See dump in the Statistics::Sequences manpage for details.

EXAMPLE

Seating at the diner

Swed and Eisenhart (1943) list the occupied (O) and empty (E) seats in a row at a lunch counter. Have people taken up their seats on a random basis? There is no need to dichotomise these data: there is already a single sample, with dichotomous, categorical observations.

use Statistics::Sequences::Runs;
my $runs = Statistics::Sequences::Runs->new();
my @seating = (qw/E O E E O E E E O E E E O E O E/);
$runs->load(\@seating);
$runs->test(ccorr => 1, tails => 1)->dump();

Suggesting some non-random basis for people taking their seats, this outputs:

Runs: observed = 11.00, expected = 7.88, Z = 1.60, 1p = 0.054834
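As a rough cross-check of these figures, the sketch below recomputes them by hand with the standard normal-approximation formulas and a 0.5 continuity correction. It does not call the module: runs_z is a hypothetical helper, and whether the module computes its statistics in exactly this way is an assumption. The erfc function comes from the core POSIX module in Perl 5.22 and later.

```perl
use strict;
use warnings;
use POSIX qw(erfc);    # available from POSIX in Perl 5.22 and later

# Hypothetical helper: normal-approximation runs test for a two-state sequence.
# Returns (observed runs, expected runs, Z, one-tailed p).
sub runs_z {
    my ($ccorr, @seq) = @_;
    my (%count, $observed);
    for my $i (0 .. $#seq) {
        $count{ $seq[$i] }++;
        $observed++ if $i == 0 || $seq[$i] ne $seq[$i - 1];
    }
    die "need exactly two states\n" unless keys %count == 2;
    my ($n1, $n2) = values %count;
    my $n        = $n1 + $n2;
    my $expected = 2 * $n1 * $n2 / $n + 1;
    my $variance = 2 * $n1 * $n2 * (2 * $n1 * $n2 - $n) / ($n**2 * ($n - 1));
    my $diff     = $observed - $expected;
    my $mag      = abs($diff) - ($ccorr ? 0.5 : 0);    # continuity correction
    $mag = 0 if $mag < 0;
    my $z = ($diff <=> 0) * $mag / sqrt($variance);
    my $p = 0.5 * erfc(abs($z) / sqrt(2));             # one-tailed probability
    return ($observed, $expected, $z, $p);
}

my @seating = qw/E O E E O E E E O E E E O E O E/;
my ($obs, $exp, $z, $p) = runs_z(1, @seating);
printf "Runs: observed = %.2f, expected = %.2f, Z = %.2f, 1p = %.6f\n",
    $obs, $exp, $z, $p;
```

With 5 occupied and 11 empty seats, this gives 11 observed runs against 7.875 expected and a Z of about 1.60, in line with the module output shown above.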

These data are also used as examples of the Turns test and the Vnomes test.

ESP runs

In a single run of a classic ESP test, there are 25 trials, each composed of a randomly generated state (typically, one of 5 possible geometric figures), and a human-generated state drawn from the same pool of alternatives. Tests of the synchrony between the random and human data are then made, typically in terms of the number of "hits" observed versus that expected. The runs of hits and misses can also be tested by dichotomising the data on the basis of the match of the random "targets" with the human "responses", like so:

use Statistics::Sequences::Runs;

# Produce pseudo ESP targets and responses:
my @symbols = qw/circle plus square star wave/;
my (@targets, @responses);
for my $i (0 .. 249) {
   $targets[$i]   = $symbols[int(rand(@symbols))];
   $responses[$i] = $symbols[int(rand(@symbols))];
}

# Test for runs of matches between targets and responses:
my $runs = Statistics::Sequences::Runs->new();
$runs->load(targets => \@targets, responses => \@responses);
$runs->match(data => [qw/targets responses/]);
$runs->test();
print "The probability of obtaining these $runs->{'observed'} runs is $runs->{'p_value'}\n";

# Now test (preferably, if predicted) whether the responses match the target on the trial one ahead (as if by "precognition"):
$runs->match(data => [qw/targets responses/], lag => 1)->test();
print "With responses synchronised to targets on the next (+1) sample,\n",
   "$runs->{'observed'} runs in 250 samplings were produced when $runs->{'expected'} were expected,\n",
   "a deviation with an associated probability of $runs->{'p_value'}\n";
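The dichotomising step behind these calls can be pictured with the plain-Perl sketch below, which turns paired targets and responses into a series of hits (1) and misses (0), optionally comparing each response with the target a given number of trials ahead. match_states is a hypothetical helper, and how the module itself aligns lagged series and treats unpaired trials at the end is an assumption here.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 where the response equals the target $lag trials
# ahead, else 0.
# ASSUMPTION: trials with no partner at the end of the series are dropped.
sub match_states {
    my ($targets, $responses, $lag) = @_;
    $lag //= 0;
    my $last = $#{$targets} - $lag;
    return map { $responses->[$_] eq $targets->[ $_ + $lag ] ? 1 : 0 } 0 .. $last;
}

my @targets   = qw/star wave plus star circle/;
my @responses = qw/wave wave star circle plus/;
print join('', match_states(\@targets, \@responses)),    "\n";    # 01000
print join('', match_states(\@targets, \@responses, 1)), "\n";    # 1011
```

The resulting hit/miss series is what the runs are then counted over.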

REFERENCES

Kelly, E. F. (1982). On grouping of hits in some exceptional psi performers. Journal of the American Society for Psychical Research, 76, 101-142.

Swed, F., & Eisenhart, C. (1943). Tables for testing randomness of grouping in a sequence of alternatives. Annals of Mathematical Statistics, 14, 66-87. [Look in ex/checks.pl in the installation dist for a few examples from this paper for testing.]

Wald, A., & Wolfowitz, J. (1940). On a test whether two samples are from the same population. Annals of Mathematical Statistics, 11, 147-162.

Wolfowitz, J. (1943). On the theory of runs with some applications to quality control. Annals of Mathematical Statistics, 14, 280-288. [Suggests some ways in which data may be dichotomised for testing runs.]

TO DO/BUGS

Results are dubious if there are only two observations.

Testing by means other than Z-scores, and/or using the Poisson distribution for low numbers of observations.

Fu's Markovian solution

REVISION HISTORY

See CHANGES in installation dist for revisions.

AUTHOR/LICENSE

rgarton AT cpan DOT org

This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).

DISCLAIMER

To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.

END

This ends documentation of a Perl implementation of the Wald-Wolfowitz Runs test for randomness and group differences within a sequence.
