File Coverage

blib/lib/Statistics/SerialCorrelation.pm

Criterion	Covered	Total	%
statement	20	20	100.0
branch	4	4	100.0
condition			n/a
subroutine	4	4	100.0
pod	1	1	100.0
total	29	29	100.0

line	stmt	bran	sub	pod	time	code
1						package Statistics::SerialCorrelation;
2
3	1		1		10041	use base 'Exporter';
	1				2
	1				142
4						@EXPORT_OK = qw(serialcorrelation);
5
6						$VERSION = '1.1';
7
8	1		1		5	use strict;
	1				1
	1				35
9	1		1		5	use warnings;
	1				7
	1				264
10
11						=head1 NAME
12
13						Statistics::SerialCorrelation - calculate the serial correlation
14						co-efficient for an array
15
16						=head1 SYNOPSIS
17
18						use Statistics::SerialCorrelation;
19
20						print Statistics::SerialCorrelation::serialcorrelation(1..6);
21
22						Or if you don't mind polluting your namespace, you may import the
23						serialcorrelation function like so:
24
25						use Statistics::SerialCorrelation 'serialcorrelation';
26
27						=head1 DESCRIPTION
28
29						This module does just one thing, it calculates Serial Correlation
30						Co-efficients, which are a measure of how predictable a series of
31						values is. For example, the sequence:
32
33						1 2 3 4 5 6 7 8 9 10
34
35						is very predictable, and will have a high serial correlation
36						co-efficient. The sequence
37
38						10 1 3 2 6 7 7 9 2
39
40						is less predictable and so has a correlation co-efficient nearer 0.
41
42						In general, random data has a co-efficient close to zero, highly-ordered
43						data doesn't. Note that the co-efficient may be negative.
44
45						There is just one function.
46
47						=over 4
48
49						=item serialcorrelation
50
51						This takes either a list of numbers or an array reference. If given
52						an array reference, this is first turned into an array. It then
53						calculates the correlation co-efficient and returns it.
54
55						See your copy of Knuth for the formula.
56
57						=back
58
59						=cut
60
61						sub serialcorrelation {
62	4		4	1	113	my @U = @_;
63
64	4	100			17	@U = @{$U[0]} if(ref($U[0]) =~ /^ARRAY/);
	2				8
65	4				7	my $n = $#U + 1;
66
67	4				10	my($sum_of_products_of_pairs, $sum_of_squares, $sum) = (
68						$U[$n - 1] * $U[0],
69						$U[$n - 1] * $U[$n - 1],
70						$U[$n - 1]
71						);
72	4				10	foreach(0 .. $n - 2) {
73	31				37	$sum_of_products_of_pairs += $U[$_] * $U[$_ + 1];
74	31				172	$sum_of_squares += $U[$_] * $U[$_];
75	31				36	$sum += $U[$_]
76						}
77
78	4	100			15	return undef if($n * $sum_of_squares == $sum * $sum);
79	3				16	(($n * $sum_of_products_of_pairs) - ($sum * $sum)) /
80						(($n * $sum_of_squares) - ($sum * $sum));
81						}
82
83						=head1 BUGS
84
85						To avoid divide-by-zero errors, we return undef if the square of the sum of
86						your values is equal to the number of values multiplied by the sum of the
87						squares of the values. undef was chosen because it will never otherwise be
88						returned and so you can easily detected.
89
90						The results are not particularly meaningful for small data sets.
91
92						No other bugs are known, but if you find any please let me know, and send a
93						test case.
94
95						=head1 FEEDBACK
96
97						I welcome feedback about my code, including constructive criticism.
98
99						=head1 AUTHOR
100
101						David Cantrell EFE
102
103						=head1 COPYRIGHT
104
105						Copyright 2003 David Cantrell
106
107						This module is free-as-in-speech software, and may be used, distributed,
108						and modified under the same terms as Perl itself.
109
110						=cut
111
112						1;