NAME
Statistics::Descriptive::Discrete - Compute descriptive statistics for discrete data sets.
SYNOPSIS
use Statistics::Descriptive::Discrete;
my $stats = Statistics::Descriptive::Discrete->new;
$stats->add_data(1,10,2,0,1,4,5,1,10,8,7);
print "count = ",$stats->count(),"\n";
print "uniq = ",$stats->uniq(),"\n";
print "sum = ",$stats->sum(),"\n";
print "min = ",$stats->min(),"\n";
print "max = ",$stats->max(),"\n";
print "mean = ",$stats->mean(),"\n";
print "standard_deviation = ",$stats->standard_deviation(),"\n";
print "variance = ",$stats->variance(),"\n";
print "sample_range = ",$stats->sample_range(),"\n";
print "mode = ",$stats->mode(),"\n";
print "median = ",$stats->median(),"\n";
DESCRIPTION
This module provides basic functions used in descriptive statistics. It borrows very heavily from Statistics::Descriptive::Full (which is included with Statistics::Descriptive), with one major difference: this module is optimized for discretized data, such as data from an A/D conversion, that has a discrete set of possible values. For example, if your data is produced by an 8-bit A/D converter, there are only 256 possible values in your data set. Even though you might have a million data points, those million points contain at most 256 different values. Instead of storing the entire data set as Statistics::Descriptive does, this module stores only the values it has seen and the number of times it has seen each value.
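The storage idea described above can be illustrated with a plain hash that maps each distinct value to the number of times it occurred. This is only a minimal sketch of the concept, not the module's actual code: the tally() helper and the simulated readings are hypothetical and exist purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module's API: tally how often each
# distinct value occurs instead of keeping every data point.
sub tally {
    my %count_of;
    $count_of{$_}++ for @_;
    return \%count_of;
}

# Simulate 100,000 readings from an 8-bit A/D converter (values 0..255).
my @readings = map { int(rand(256)) } 1 .. 100_000;
my $count_of = tally(@readings);

printf "data points:            %d\n", scalar @readings;
printf "distinct values stored: %d\n", scalar keys %$count_of;
```

However many readings are tallied, the hash never holds more than 256 entries, which is where the memory saving comes from.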
For very large data sets, this storage method results in significant speed and memory improvements. In a test case with 2.6 million data points from a real-world application, Statistics::Descriptive::Discrete took 40 seconds to calculate a set of statistics instead of the 561 seconds required by Statistics::Descriptive::Full. It also required only 4 MB of RAM instead of the 400 MB used by Statistics::Descriptive::Full for the same data set.
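To show why a table of counts is faster to work with, here is a rough sketch that computes several statistics by looping over the distinct values only, rather than over every data point. It is an illustration under stated assumptions, not the module's implementation: stats_from_counts() is a hypothetical name, the variance uses the n - 1 (sample) denominator, and the median is the conventional middle value (or the average of the two middle values).

```perl
use strict;
use warnings;

# Hypothetical helper: descriptive statistics from a value => count hash.
sub stats_from_counts {
    my ($count_of) = @_;
    my ($n, $sum, $sum_sq) = (0, 0, 0);

    # One pass over the distinct values; each is weighted by its count.
    for my $value (keys %$count_of) {
        my $c = $count_of->{$value};
        $n      += $c;
        $sum    += $value * $c;
        $sum_sq += $value * $value * $c;
    }
    return unless $n;    # nothing to summarise for an empty table

    my $mean = $sum / $n;
    # Sample variance (n - 1 denominator); undefined for a single point.
    my $variance = $n > 1 ? ($sum_sq - $n * $mean**2) / ($n - 1) : undef;

    # Median: walk the sorted distinct values until the middle ranks are covered.
    my ($lo_rank, $hi_rank) = (int(($n + 1) / 2), int($n / 2) + 1);
    my ($seen, $lo, $hi) = (0);
    for my $value (sort { $a <=> $b } keys %$count_of) {
        $seen += $count_of->{$value};
        $lo = $value if !defined $lo && $seen >= $lo_rank;
        if ($seen >= $hi_rank) { $hi = $value; last; }
    }

    return {
        count    => $n,
        sum      => $sum,
        mean     => $mean,
        variance => $variance,
        median   => ($lo + $hi) / 2,
    };
}

# Build the counts for the data used in the SYNOPSIS.
my %count_of;
$count_of{$_}++ for (1, 10, 2, 0, 1, 4, 5, 1, 10, 8, 7);

my $s = stats_from_counts(\%count_of);
print "count = $s->{count}\n";      # 11
print "sum = $s->{sum}\n";          # 49
print "median = $s->{median}\n";    # 4
```

The loops run once per distinct value, so their cost depends on the number of different values rather than on the number of data points. The sum-of-squares formula is used for brevity; it can lose precision on some inputs.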
NOTE
Until I get a chance to add documentation for the method calls, look at the Statistics::Descriptive documentation. The interface for this module is almost identical to that of Statistics::Descriptive. This module is incomplete and not fully tested. It is currently only alpha code, so use it at your own risk.
BUGS
Code for calculating mode is not as robust as it should be.
Other bugs are lurking I'm sure.
TODO
Finish the documentation for each method
Make test suite more robust
Add rest of methods (at least ones that don't depend on original order of data) from Statistics::Descriptive
AUTHOR
Rhet Turnbull, RhetTbull on perlmonks.org, rhettbull at hotmail.com
COPYRIGHT
Copyright (c) 2002 Rhet Turnbull. All rights reserved. This
program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
Portions of this code are from Statistics::Descriptive, which is under
the following copyrights:
Copyright (c) 1997,1998 Colin Kuskie. All rights reserved. This
program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
Copyright (c) 1998 Andrea Spinelli. All rights reserved. This program
is free software; you can redistribute it and/or modify it under the
same terms as Perl itself.
Copyright (c) 1994,1995 Jason Kastner. All rights
reserved. This program is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.
SEE ALSO
Statistics::Descriptive