NAME

Statistics::Sequences::Turns - Kendall's turning-points test - of peaks and troughs in a numerical sequence

SYNOPSIS

use strict;
use Statistics::Sequences::Turns 0.12;
my $turns = Statistics::Sequences::Turns->new();
$turns->load([2, 0, 8.5, 5, 3, 5.01, 2, 2, 3]); # numbers; or send as "data => $aref" with each stat call
my $val = $turns->observed(); # or descriptive methods: expected(), variance(), obsdev() and stdev()
$val = $turns->z_value(); # or in list context get back both z- and p-value
$val = $turns->p_value(); # as above, assume data are loaded
my $href = $turns->stats_hash(values => {observed => 1, p_value => 1}, ccorr => 1); # incl. any other stat-method
$turns->dump(values => {observed => 1, expected => 1, p_value => 1}, ccorr => 1, flag => 1, precision_s => 3, precision_p => 7);
# prints: observed = 11.000, expected = 10.900, p_value = 0.5700167

DESCRIPTION

Implements Kendall's (1973) "turning point test" of sudden changes as peaks and troughs in the values of a numerical sequence. It is sometimes described as a test of "cyclicity", and often used as a test of randomness. However, it simply counts up the number of local maxima and minima within a sequence, regardless of their spacing and magnitude, and so does not indicate if the changes actually cycle between highs and lows, if they are more or less balanced in magnitude, or if any cycling is periodic. Kendall introduced this as a rough test of ups and downs in a sequence ahead of describing more sensitive tests based on autocorrelation and Fourier analysis.

Specifically, for a sequence of numerical data (interval or ordinal) of size N, a count of turns is incremented if the value on trial i, for every i that has a neighbour on both sides (i.e., excluding the first and last trials), is, with respect to its immediate neighbours (the values on i - 1 and i + 1), greater than both (a peak) or less than both (a trough). The difference of this observed number from the mean expected number of turns for a randomly generated sequence, expressed in units of the standard deviation, gives a Z-score for assessing the "randomness" of the sequence, i.e., the absence of a factor systematically affecting the frequency of peaks/troughs, given that, for turns, there is "a fairly rapid tendency of the distribution to normality" (Kendall 1973, p. 24).

METHODS

new

$turns = Statistics::Sequences::Turns->new();

Returns a new Turns object. Expects/accepts no arguments but the classname.

load

$turns->load(@data);
$turns->load(\@data);
$turns->load('sample1' => \@data); # labelled whatever

Loads data anonymously or by name - see load in the Statistics::Data manpage for details on the various ways data can be loaded and then retrieved (more than shown here). Data must be numerical (ordinal, interval type). All elements must be numerical or the method croaks.

add, read, unload

See Statistics::Data for these additional operations on data that have been loaded.

observed

$v = $turns->observed(); # use anonymously loaded data
$v = $turns->observed(index => 1); # ... or give the required "index" for the loaded data
$v = $turns->observed(label => 'mysequence'); # ... or its "label" value
$v = $turns->observed(data => \@data); # ... or just give the data now

Returns the observed number of turns. This is the number of peaks and troughs, starting the count from index 1 of the sequence (a flat array), checking if both its immediate left/right (or past/future) neighbours are less than it (a peak) or greater than it (a trough). Wherever the values in successive indices in the sequence are equal, they are treated as a single observation/datum - so the following:

0 0 1 1 0 1 1 1 0 1

is counted up for turns as

0 1 0 1 0 1
  * * * *

This shows four turns - two peaks (0 1 0) and two troughs (1 0 1).

Returns 0 if the given list is empty, or the number of its elements is less than 3.
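The counting rule described above can be sketched in plain Perl. This is only an illustration of the rule, not the module's own code; the helper name count_turns is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): count peaks and troughs,
# treating runs of equal successive values as a single observation.
sub count_turns {
    my @seq = @_;

    # Collapse runs of equal neighbours into one datum.
    my @flat;
    for my $v (@seq) {
        push @flat, $v if !@flat || $flat[-1] != $v;
    }
    return 0 if @flat < 3;    # too few values to have any turn

    my $turns = 0;
    for my $i (1 .. $#flat - 1) {
        my ($prev, $cur, $next) = @flat[$i - 1, $i, $i + 1];
        $turns++
          if ($cur > $prev && $cur > $next)     # peak
          || ($cur < $prev && $cur < $next);    # trough
    }
    return $turns;
}

print count_turns(qw/0 0 1 1 0 1 1 1 0 1/), "\n";    # prints 4
```

Collapsing ties first means that a plateau such as "1 1 1" between two zeros counts as one peak rather than being skipped.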

expected

$v = $turns->expected(); # use first-loaded data; or specify by "index" or "label", or give it as "data" - see observed()
$v = $turns->expected(data => \@data); # use these data
$v = $turns->expected(trials => 10); # don't use actual data; calculate from this number of trials

Returns the expected number of turns, which depends only on N, the number of trials (observations, sample size):

  E[T] = (2 / 3)(N - 2)

or, equivalently (in some sources),

  E[T] = (2N - 4) / 3
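As a quick arithmetic check, the two forms can be evaluated side by side. This is a minimal sketch; the helper name expected_turns is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: expected number of turns for N observations.
sub expected_turns {
    my ($n) = @_;
    return 2 / 3 * ($n - 2);
}

printf "%.3f\n", expected_turns(10);    # first form:  (2 / 3)(10 - 2), prints 5.333
printf "%.3f\n", (2 * 10 - 4) / 3;      # second form: (20 - 4) / 3,    prints 5.333
```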

variance

$v = $turns->variance(); # use first-loaded data; or specify by "index" or "label", or give it as "data" - see observed()
$v = $turns->variance(data => \@data); # use these data
$v = $turns->variance(trials => number); # don't use actual data; calculate from this number of trials

Returns the expected variance in the number of turns for the given length of data N.

  V[T] = (16N - 29) / 90
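For example, with N = 10 the formula gives 131 / 90. A minimal sketch, using a hypothetical helper name (variance_turns) that is not part of the module:

```perl
use strict;
use warnings;

# Hypothetical helper: expected variance in the number of turns for N observations.
sub variance_turns {
    my ($n) = @_;
    return (16 * $n - 29) / 90;
}

printf "%.4f\n", variance_turns(10);          # (160 - 29) / 90, prints 1.4556
printf "%.4f\n", sqrt(variance_turns(10));    # the corresponding standard deviation
```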

obsdev

$v = $turns->obsdev(); # use data already loaded - anonymously; or specify its "label" or "index" - see observed()
$v = $turns->obsdev(data => \@data); # use these data

Returns the observed deviation from expectation for the loaded/given sequence: observed less expected turn-count (O - E). The alias observed_deviation is supported.

stdev

$v = $turns->stdev(); # use data already loaded - anonymously; or specify its "label" or "index" - see observed()
$v = $turns->stdev(data => \@data);

Returns square-root of the variance. Aliases standard_deviation and stddev (common in other Statistics modules) are supported.

z_value

$z = $turns->z_value(ccorr => 1); # use data already loaded - anonymously; or specify its "label" or "index" - see observed()
$z = $turns->z_value(data => $aref, ccorr => 1);
($z, $p) = $turns->z_value(data => $aref, ccorr => 1, tails => 2); # same but wanting an array, get the p-value too

Returns the deviation ratio, or Z-score: the expected turn-count is subtracted from the observed turn-count, and the difference is divided by the square root of the variance, by default with a continuity correction in the numerator. Called in list context, returns the Z-score together with its two-tailed p-value from the normal distribution.

The data to test can already have been loaded, or sent directly as an aref keyed as data.

Optional named arguments are tails (1 or 2), ccorr (Boolean for the continuity-correction), precision_s (for the statistic, i.e., Z-score) and precision_p (for the p-value).

The method can also be called with "sufficient" data: giving, instead of actual data, the observed number of turns, and the number of trials, the latter being sufficient to compute the expected number of turns and its variance.

Alias z_score is supported.
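The calculation from "sufficient" data can be sketched in plain Perl as follows. The helper name turns_z is hypothetical, and the form of the continuity correction (reducing the absolute deviation by 0.5, but not below zero) is an assumption about the usual convention rather than a statement of how the module implements it.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's code: Z-score from the observed
# turn count and the number of trials.
sub turns_z {
    my ($observed, $n, $ccorr) = @_;
    my $expected = 2 / 3 * ($n - 2);
    my $variance = (16 * $n - 29) / 90;
    my $dev      = $observed - $expected;
    if ($ccorr) {
        # Assumed correction: shrink |O - E| by 0.5, never past zero.
        my $abs = abs($dev) - 0.5;
        $abs = 0 if $abs < 0;
        $dev = $dev < 0 ? -$abs : $abs;
    }
    return $dev / sqrt($variance);
}

printf "%.3f\n", turns_z(9, 11, 1);    # with the continuity correction, prints 1.956
printf "%.3f\n", turns_z(9, 11, 0);    # without it, prints 2.347
```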

p_value

$p = $turns->p_value(); # using loaded data and default args
$p = $turns->p_value(ccorr => 0|1, tails => 1|2); # normal-approximation based on loaded data
$p = $turns->p_value(data => $aref, ccorr => 1, tails => 2); #  using given data (by-passing load and read)

Returns the normal distribution p-value for the deviation ratio (Z-score) of the observed number of turns, 2-tailed and continuity-corrected by default (or set tails => 1 and ccorr => 0, respectively). Other arguments are as for z_value.
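For illustration, a normal-approximation p-value can be obtained from a Z-score with the complementary error function. This sketch assumes Perl 5.22 or later, where the POSIX module provides erfc(). The helper name normal_p is hypothetical, and the one-tailed value here is simply the upper-tail area beyond |z|.

```perl
use strict;
use warnings;
use POSIX qw(erfc);    # erfc() is provided by POSIX from Perl 5.22 onwards

# Hypothetical helper: p-value for a Z-score under the standard normal.
sub normal_p {
    my ($z, $tails) = @_;
    $tails = 2 unless defined $tails;    # two-tailed by default
    my $one_tailed = 0.5 * erfc(abs($z) / sqrt(2));
    return $tails == 1 ? $one_tailed : 2 * $one_tailed;
}

printf "%.3f\n", normal_p(1.96);       # two-tailed, prints 0.050
printf "%.3f\n", normal_p(1.96, 1);    # one-tailed, prints 0.025
```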

stats_hash

$href = $turns->stats_hash(values => {observed => 1, expected => 1, variance => 1, z_value => 1, p_value => 1}, ccorr => 1);

Returns a hashref for the counts and stats as specified in its "values" argument, and with any options for calculating them. See "stats_hash" in Statistics::Sequences for details. If calling via a "turns" object, the option "stat => 'turns'" is not needed (unlike when using the parent "sequences" object).

dump

$turns->dump(flag => '1|0', text => '0|1|2');

Print test results to STDOUT. See dump in the Statistics::Sequences manpage for details.

EXAMPLE

Seating at the diner

This is the data from Swed and Eisenhart (1943) also given as an example for the Runs test, Joins test and Vnomes (serial) test. It lists the occupied (O) and empty (E) seats in a row at a lunch counter. Have people taken up their seats on a random basis - or do they show some social phobia (more sparsely seated than "chance"), or are they trying to pick up (more compactly seated than "chance")? What does Kendall's test of turns reveal?

use Statistics::Sequences::Turns;
my $turns = Statistics::Sequences::Turns->new();
# change the nominal data from Swed & Eisenhart (1943) into numerical values:
my @seating = map { $_ eq 'E' ? 1 : 0 } (qw/E O E E O E E E O E E E O E O E/); 
$turns->load(\@seating); # as per Statistics::Data
$turns->dump_vals(delim => q{,}); # via Statistics::Data - prints the 1s and 0s:
# 1,0,1,1,0,1,1,1,0,1,1,1,0,1,0,1
$turns->dump(
   format => 'labline',
   flag => 1,
   precision_s => 3,
   precision_p => 3,
   verbose => 1,
);

This prints:

Turns: observed = 9.000, p_value = 0.050

So, the observed number of turns in the seating arrangement (9) differed only marginally from that expected, with a p-value sitting right at the .05 level. The Vnomes test for trinomes was similarly marginal (p = .044), as was the result for Runs (p = .055), while the Joins test was clearly non-significant (p = .302). Comparing the observed count with the number of turns expected (= 6) suggests, perhaps, a tendency for people to take their seats further away from each other (leave more unoccupied seats between them) than expected on the basis of chance.
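The figures reported here can be checked by hand in plain Perl, without the module. In this sketch the reported values (observed = 9, expected = 6, p = 0.050) are reproduced when N is taken as the length of the sequence after runs of equal values have been collapsed (11, not 16). That choice of N is an inference from the reported numbers rather than something the documentation states. The form of the continuity correction and the use of POSIX erfc() (Perl 5.22 or later) are likewise assumptions.

```perl
use strict;
use warnings;
use POSIX qw(erfc);    # assumes Perl 5.22 or later

my @seating = map { $_ eq 'E' ? 1 : 0 } qw/E O E E O E E E O E E E O E O E/;

# Collapse runs of equal values, then count peaks and troughs.
my @flat;
for my $v (@seating) {
    push @flat, $v if !@flat || $flat[-1] != $v;
}
my $observed = 0;
for my $i (1 .. $#flat - 1) {
    my ($prev, $cur, $next) = @flat[$i - 1, $i, $i + 1];
    $observed++
      if ($cur > $prev && $cur > $next) || ($cur < $prev && $cur < $next);
}

my $n        = scalar @flat;           # assumption: N is the collapsed length (11)
my $expected = 2 / 3 * ($n - 2);
my $variance = (16 * $n - 29) / 90;
my $z        = (abs($observed - $expected) - 0.5) / sqrt($variance);    # continuity-corrected
my $p_value  = erfc($z / sqrt(2));     # two-tailed normal p-value

printf "observed = %d, expected = %.3f, z = %.3f, p_value = %.3f\n",
  $observed, $expected, $z, $p_value;
# prints: observed = 9, expected = 6.000, z = 1.956, p_value = 0.050
```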

DEPENDENCIES

Statistics::Sequences

Statistics::Zed

REFERENCES

Kendall, M. G. (1973). Time-series. London, UK: Griffin. ISBN 0852642202. [The test is described on pages 22-24 of the 1973 edition; in the Example 2.1 for this test, the expected number of turns should be calculated with the value 52 (i.e., with N - 2), not the misprinted value of 54.]

SEE ALSO

Statistics::Sequences for other tests of sequences, and for sharing data between these tests.

Statistics::Sequences::Joins : another test of consecutive values in a sequence, examining alternations.

Statistics::Sequences::Pot : another trend-type test, examining relatively spaced clustering of particular events.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Statistics::Sequences::Turns

AUTHOR/LICENSE

rgarton AT cpan DOT org

This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).

DISCLAIMER

To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.