NAME
Statistics::ANOVA::JT - Jonckheere-Terpstra statistics and test
VERSION
Version 0.01
SYNOPSIS
use Statistics::ANOVA::JT;
my $jt = Statistics::ANOVA::JT->new();
$jt->load({1 => [2, 4, 6], 2 => [3, 3, 12], 3 => [5, 7, 11, 16]}); # note ordinal datanames
my $j_value = $jt->observed(); # or expected(), variance()
my ($z_value, $p_value) = $jt->zprob_test(ccorr => 2, tails => 1, correct_ties => 1);
# or without pre-loading:
$j_value = $jt->observed(data => {1 => [2, 4, 6], 2 => [5, 3, 12]});
# or for subset of loaded data:
$j_value = $jt->observed(lab => [1, 3]);
DESCRIPTION
Calculates Jonckheere-Terpstra statistics for testing sameness (a common population) across the given order of levels of an independent variable. The statistics are based on a between-groups pooled ranking of the data, as in the Kruskal-Wallis test. Unlike the Kruskal-Wallis test, which returns the same result regardless of the order of the levels, the Jonckheere-Terpstra test takes into account the ordinal value of the named data. Because the names are treated as ordinal values, the numerical intervals between them do not matter.
Data-loading and retrieval are as provided in Statistics::Data, on which the JT object is based, so its other methods are available here.
Return values are tested on installation against published examples: those in Hollander and Wolfe (1999), sample MStat output on mcardle.wisc.edu, and the final Z-value in the Wikipedia example.
SUBROUTINES/METHODS
new
$jt = Statistics::ANOVA::JT->new();
New object for accessing methods and storing results. The object "isa" Statistics::Data object.
observed
$val = $jt->observed(); # data pre-loaded
$val = $jt->observed(data => $hashref_of_arefs);
Returns the statistic J. From between-group rankings of all possible pairwise splits of the data, J is accumulated as the sum of the k(k - 1)/2 Mann-Whitney U counts, where k is the number of groups.
Optionally, if the data have not been pre-loaded, send them as the named argument data.
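The accumulation described above can be sketched in plain Perl without the module. This is only an illustration under stated assumptions, not the module's own code: the helper names mann_whitney_count and jt_observed are hypothetical, the data names are assumed to be numeric so that they can be sorted into their order, and a tie between two samples is assumed to count as 0.5.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Statistics::ANOVA::JT:
# for two samples, count the pairs in which the value from the
# lower-ordered sample is the smaller one (assumption: a tie
# between the two samples counts as 0.5).
sub mann_whitney_count {
    my ($lower, $higher) = @_;
    my $u = 0;
    for my $x (@{$lower}) {
        for my $y (@{$higher}) {
            $u += $x < $y ? 1 : $x == $y ? 0.5 : 0;
        }
    }
    return $u;
}

# Hypothetical helper: sum the counts over all k(k - 1)/2 pairs of
# samples, taking the (numeric) hash keys as the order of the levels.
sub jt_observed {
    my ($data) = @_;
    my @names = sort { $a <=> $b } keys %{$data};
    my $j = 0;
    for my $i (0 .. $#names - 1) {
        for my $k ($i + 1 .. $#names) {
            $j += mann_whitney_count($data->{$names[$i]}, $data->{$names[$k]});
        }
    }
    return $j;
}

my %data = (1 => [2, 4, 6], 2 => [3, 3, 12], 3 => [5, 7, 11, 16]);
print jt_observed(\%data), "\n";
```

For the SYNOPSIS data the three pairwise counts are 5 (levels 1 and 2), 11 (levels 1 and 3) and 9 (levels 2 and 3), so the sketch prints 25.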
expected
$val = $jt->expected(); # data pre-loaded
$val = $jt->expected(data => $hashref_of_arefs);
Returns the expected value of the J statistic for the given data.
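For reference, the textbook form of this expectation under the null hypothesis can be written as follows. The symbols are assumptions introduced here and do not appear in the module: N is the total number of observations, k the number of levels, and n_i the number of observations at level i.

```latex
E(J) = \frac{N^{2} - \sum_{i=1}^{k} n_{i}^{2}}{4}
```

With the SYNOPSIS data (n = 3, 3, 4, so N = 10) this gives (100 - 34) / 4 = 16.5.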
variance
$val = $jt->variance(); # data pre-loaded
$val = $jt->variance(data => $hashref_of_arefs);
Returns the variance expected to occur in the J values for the given data.
By default, the variance is corrected for ties, using the elaborate correction offered by Hollander & Wolfe (1999), Eq 6.19, p. 204, which accounts for the number of groups of tied values and the size of each. If correct_ties => 0, the returned value is the usual "null" distribution variance, uncorrected for ties.
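The usual null-distribution variance, uncorrected for ties, has the following textbook form. The notation is an assumption introduced here (N total observations, k levels, n_i observations at level i); the tie-corrected version is the one given in the Hollander & Wolfe equation cited above and is not reproduced here.

```latex
\mathrm{Var}_{0}(J) = \frac{N^{2}(2N + 3) - \sum_{i=1}^{k} n_{i}^{2}(2 n_{i} + 3)}{72}
```

With the SYNOPSIS data (n = 3, 3, 4, so N = 10) this gives (2300 - 338) / 72 = 27.25.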
zprob_test
$p_val = $jt->zprob_test(); # data pre-loaded
$p_val = $jt->zprob_test(data => $hashref_of_arefs);
($z_val, $p_val) = $jt->zprob_test(); # get z-score too
Performs a z-test on the data and returns the associated probability; or, if called in array context, the z-value itself and then the probability value.
Rather than calculating the exact p-value, the method calculates the expected J value and its variance to provide a normalized J, for which the p-value is read off the normal distribution. This is appropriate for "large" samples, e.g., more than 3 levels, with more than eight observations per level. Otherwise, read the value returned from $jt->observed() and look it up in a table of J-values, such as in Hollander & Wolfe (1999), p. 649ff.
Optional arguments include correct_ties (as above), and tails and ccorr as in Statistics::Zed. For example, to apply a continuity correction by reducing the observed J-value by 1 (recommended in some texts), set ccorr => 2 (for half on either side of the expected value); if ccorr => 1, then 0.5 is taken off the observed deviation, and so on. The default is not to apply a continuity correction.
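One reading of the normalization and continuity correction described above is sketched below. It is an interpretation of the text, not a statement of how Statistics::Zed is implemented: c stands for the ccorr value (0 by default), and the deviation of the observed J from its expected value is assumed to be reduced in absolute size by c/2.

```latex
z = \operatorname{sgn}\bigl(J - E(J)\bigr)\,
    \frac{\lvert J - E(J) \rvert - c/2}{\sqrt{\mathrm{Var}(J)}}
```

As an illustration only, with J = 25, E(J) = 16.5, the untied null variance 27.25 and c = 2, this reading gives z = 7.5 / 5.22, or about 1.44. The value returned by the module for the same data may differ when the tie correction is applied to the variance.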
REFERENCES
Hollander, M., & Wolfe, D. A. (1999). Nonparametric statistical methods. New York, NY, US: Wiley.
DEPENDENCIES
Statistics::Data : used as a base for caching and retrieving data.
Statistics::Data::Rank : used to implement between-sample ranking.
Statistics::Zed : for z-testing with optional continuity correction and tailing.
Algorithm::Combinatorics : provides the combinations algorithm to provide all possible pairs of data-names to loop through in calculating the observed J value.
List::AllUtils : provides the handy sum0() function.
BUGS
Please report any bugs or feature requests to bug-statistics-anova-jt-0.01 at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Statistics-ANOVA-JT-0.01. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Statistics::ANOVA::JT
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Statistics-ANOVA-JT-0.01
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
AUTHOR
Roderick Garton, <rgarton at cpan.org>
LICENSE AND COPYRIGHT
Copyright 2015 Roderick Garton.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.