NAME
Statistics::Sequences::Turns - Kendall's test for turning-points - peaks or troughs - in a numerical sequence
SYNOPSIS
use Statistics::Sequences::Turns;
$turns = Statistics::Sequences::Turns->new();
$turns->load(0, 3, 9, 2, 1, 1, 3, 4, 0, 3, 5, 5, 5, 8, 4, 7, 3, 2, 4, 3, 6);
$turns->test()->dump();
#Z = -0.0982471864864821, 2p = 0.92174
DESCRIPTION
This module implements a test of randomness suited to continuous numerical data: not static categories (like choices between a "banana" and "cheese"), but sequences of numerical values where it is meaningful to say that the value on trial i is higher or lower than the values on trials i - 1 and i + 1, the trial's neighbours. It is particularly recommended for time-series, to test whether a numerical sequence shows systematic rather than random oscillations.
Specifically, the test concerns whether the value on trial i (for 0 < i < n - 1, counting trials from zero), with respect to its neighbours, is a peak (greater than both neighbours) or a trough (less than both neighbours), and whether the frequency of these turns, as peaks and troughs, is commensurate with what is expected for a randomly generated sequence. In this way, the Turns test is always based on three consecutive values in a sequence.
If you have one or two sets of categorical data, or two groups of numerical data, you can first dichotomize them into an array of zeroes and ones (see "Dichotomising data" in Statistics::Sequences), and then perform the Turns test to assess the randomness of their sequential association.
METHODS
new
$turns = Statistics::Sequences::Turns->new();
Returns a new Turns object. Expects/accepts no arguments but the classname.
load
$turns->load(@data);
$turns->load(\@data);
$turns->load('dist1' => \@data1, 'dist2' => \@data2)
$turns->load({'dist1' => \@data1, 'dist2' => \@data2})
Loads data anonymously or by name. See load in the Statistics::Sequences manpage.
test
$turns->test();
Performs Kendall's turning-points test on the given or named distribution, yielding a Z statistic.
The turns, as peaks and troughs, are counted starting from element 1 (the second value), checking whether both its left and right (or past and future) neighbours are less than it (a peak) or greater than it (a trough). Wherever values at successive indices of the list are equal, they are treated as a single observation/datum, so the following:
0 0 1 1 0 1 1 1 0 1
is counted up for turns as
0 1 0 1 0 1
So, e.g., there are four turns in the above example - two peaks (0 1 0) and two troughs (1 0 1). (This would not be picked up as a non-random sequence, but if it were repeated, it would be seen to significantly deviate from expectation, p = .035.)
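The counting rule just described can be sketched in plain Perl. This is only an illustration of the rule, not the module's own code; the helper names collapse_ties and count_turns are hypothetical and are not part of the Statistics::Sequences::Turns API.

```perl
use strict;
use warnings;

# Hypothetical helper: treat runs of equal neighbouring values
# as a single observation.
sub collapse_ties {
    my @out;
    for my $x (@_) {
        push @out, $x if !@out || $out[-1] != $x;
    }
    return @out;
}

# Hypothetical helper: count peaks and troughs among the interior
# elements of the tie-collapsed sequence.
sub count_turns {
    my @seq = collapse_ties(@_);
    my ($peaks, $troughs) = (0, 0);
    for my $i (1 .. $#seq - 1) {
        my ($prev, $cur, $next) = @seq[$i - 1 .. $i + 1];
        if    ($cur > $prev && $cur > $next) { $peaks++ }
        elsif ($cur < $prev && $cur < $next) { $troughs++ }
    }
    return ($peaks, $troughs, scalar @seq);
}

my ($peaks, $troughs, $n) = count_turns(0, 0, 1, 1, 0, 1, 1, 1, 0, 1);
printf "peaks = %d, troughs = %d, n = %d\n", $peaks, $troughs, $n;
# peaks = 2, troughs = 2, n = 6
```

The first and last elements are never counted, since each has only one neighbour.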
The observed number of turns is compared to the number expected, and this deviation is assessed against the expected deviation, i.e., as a Z-value; Kendall (1973) having observed that the statistic shows "a fairly rapid tendency of the distribution to normality" (p. 24).
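As a rough sketch of that calculation, the standard moments of the turning-points statistic are an expected count of 2(n - 2)/3 and a variance of (16n - 29)/90. The code below additionally assumes a continuity correction of 0.5 and takes n as the number of values left after collapsing ties; neither is stated in this document, but together they reproduce the Z value shown in the SYNOPSIS. The function name turns_z is hypothetical, and the handling of deviations smaller than 0.5 is a choice made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: Z-value for an observed number of turns in a
# sequence of $n values (counted after collapsing ties).
sub turns_z {
    my ($observed, $n) = @_;
    my $expected = 2 * ($n - 2) / 3;
    my $variance = (16 * $n - 29) / 90;
    my $dev      = $observed - $expected;
    # Assumed continuity correction: shrink the deviation by 0.5 toward
    # zero (set to zero if it is already within 0.5; a sketch choice).
    $dev = abs($dev) > 0.5 ? $dev - 0.5 * ($dev <=> 0) : 0;
    return $dev / sqrt($variance);
}

# The SYNOPSIS data have 10 turns among the 18 values that remain
# once tied neighbours are collapsed.
printf "Z = %.4f\n", turns_z(10, 18);
# Z = -0.0982
```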
dump
$turns->dump(flag => '1|0', text => '0|1|2');
Print test results to STDOUT. See dump in the Statistics::Sequences manpage for details.
EXAMPLE
Seating at the diner
This is the data from Swed and Eisenhart (1943) also given as an example for the Runs test and Vnomes test. It lists the occupied (O) and empty (E) seats in a row at a lunch counter. Have people taken up their seats on a random basis? The Runs test suggested some non-random basis for people to take their seats, outputting (as per dump):
Runs: observed = 11.00, expected = 7.88, Z = 1.60, 1p = 0.054834
That means there was more serial discontinuity than expected. What does the test of Turns tell us?
use Statistics::Sequences::Turns;
my $turns = Statistics::Sequences::Turns->new();
my @seating = (qw/E O E E O E E E O E E E O E O E/);
$turns->load(\@seating);
$turns->binate(); # transform Es and Os into 1s and 0s
$turns->test(tails => 1)->dump();
This outputs, as returned by string:
Z = 1.95615199108988, 1p = 0.025224
So each seated person is neighboured by empty seats, and/or each empty seat is neighboured by seated persons, more so than would be expected if people were taking their seats randomly.
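The arithmetic behind this result can be checked by hand without the module. The sketch below is an independent illustration, not the module's implementation: it codes E as 1 and O as 0, collapses tied neighbours, counts turns, and assumes the usual expected count 2(n - 2)/3 and variance (16n - 29)/90 with a 0.5 continuity correction (an assumption that happens to reproduce the Z value above).

```perl
use strict;
use warnings;

my @seating = qw/E O E E O E E E O E E E O E O E/;
my @binary  = map { $_ eq 'E' ? 1 : 0 } @seating;

# Collapse runs of equal neighbours into single observations.
my @seq;
for my $x (@binary) {
    push @seq, $x if !@seq || $seq[-1] != $x;
}

# An interior value is a turn when it differs from both neighbours
# in the same direction (product of the two differences is positive).
my $turns = grep {
    ($seq[$_] - $seq[$_ - 1]) * ($seq[$_] - $seq[$_ + 1]) > 0
} 1 .. $#seq - 1;

my $n = @seq;
# The deviation is positive here, so the assumed 0.5 correction
# is subtracted.
my $z = ($turns - 2 * ($n - 2) / 3 - 0.5) / sqrt((16 * $n - 29) / 90);
printf "turns = %d, n = %d, Z = %.4f\n", $turns, $n, $z;
# turns = 9, n = 11, Z = 1.9562
```

Once ties are collapsed the seats alternate strictly, so every interior value is a turn: 9 observed against 6 expected.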
REFERENCES
Kendall, M. G. (1973). Time-series. London, UK: Griffin. [The test is described on pages 22-24. Note that in the Example 2.1 for this test, the variable used in the calculation of the expected number of turns should be 52 (i.e., n - 2), not 54.]
SEE ALSO
Statistics::Sequences for other tests of sequences, and for sharing data between these tests.
TO DO/BUGS
Implementation of the serial test for non-overlapping v-nomes.
REVISION HISTORY
See CHANGES in installation dist for revisions.
AUTHOR/LICENSE
Copyright (c) 2006-2010 Roderick Garton
rgarton AT cpan DOT org
This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).
DISCLAIMER
To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.
END
This ends documentation of the Perl implementation of Kendall's turning-points test for randomness of a numerical sequence.