File Coverage

blib/lib/Devel/Git/MultiBisect.pm

Criterion	Covered	Total	%
statement	29	127	22.8
branch	0	50	0.0
condition	0	3	0.0
subroutine	10	19	52.6
pod	5	5	100.0
total	44	204	21.5

line	stmt	bran	cond	sub	pod	time	code
1							package Devel::Git::MultiBisect;
2	4			4		2099	use strict;
	4					11
	4					117
3	4			4		22	use warnings;
	4					8
	4					108
4	4			4		42	use v5.14.0;
	4					16
5	4			4		1816	use Devel::Git::MultiBisect::Init;
	4					15
	4					204
6	4					316	use Devel::Git::MultiBisect::Auxiliary qw(
7							clean_outputfile
8							hexdigest_one_file
9							validate_list_sequence
10	4			4		2116	);
	4					14
11	4			4		29	use Carp;
	4					9
	4					235
12	4			4		28	use Cwd;
	4					9
	4					211
13	4			4		24	use File::Spec;
	4					9
	4					77
14	4			4		23	use File::Temp;
	4					7
	4					286
15	4			4		23	use List::Util qw(sum);
	4					6
	4					5857
16
17							our $VERSION = '0.16';
18
19							=head1 NAME
20
21							Devel::Git::MultiBisect - Study build and test output over a range of F<git> commits
22
23							=head1 SYNOPSIS
24
25							You will typically construct an object of a class which is a child of
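26							F<Devel::Git::MultiBisect>, such as F<Devel::Git::MultiBisect::AllCommits> or
27							F<Devel::Git::MultiBisect::Transitions>. All methods documented in this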
28							parent package may be called from either child class.
29
30							use Devel::Git::MultiBisect::AllCommits;
31							$self = Devel::Git::MultiBisect::AllCommits->new(\%parameters);
32
33							... or
34
35							use Devel::Git::MultiBisect::Transitions;
36							$self = Devel::Git::MultiBisect::Transitions->new(\%parameters);
37
38							... and then:
39
40							$commit_range = $self->get_commits_range();
41
42							$full_targets = $self->set_targets(\@target_args);
43
44							$outputs = $self->run_test_files_on_one_commit($commit_range->[0]);
45
46							... followed by methods specific to the child class.
47
48							... and then perhaps also:
49
50							$timings = $self->get_timings();
51
52							=head1 DESCRIPTION
53
54							Given a Perl library or application kept in F<git> for version control, it is
55							often useful to be able to compare the output collected from running one or
56							more test files over a range of F<git> commits. If that range is sufficiently
57							large, a test may fail in B<more than one way> over that range.
58
59							If that is the case, then simply asking, I<"When did this file start to
60							fail?"> -- a question which C<git bisect> is designed to answer -- is
61							insufficient. In order to identify more than one point of failure, we may
62							need to (a) capture the test output for each commit; or, (b) capture the test
63							output only at those commits where the output changed. The output of a run of
64							a test file may change for a variety of reasons: test failures, segfaults,
65							changes in the number or content of tests, etc.
66
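67							F<Devel::Git::MultiBisect> provides methods to achieve that objective. Its
68							child classes, F<Devel::Git::MultiBisect::AllCommits> and
69							F<Devel::Git::MultiBisect::Transitions>, provide different flavors of that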
70							functionality for objectives (a) and (b), respectively. Please refer to their
71							documentation for further discussion.
72
73							=head2 GLOSSARY
74
75							=over 4
76
77							=item * B<commit>
78
79							A source code change set entered ("committed") to a F<git> repository. Each
80							commit is denoted by a SHA. In this library, whenever a commit is called for
81							as the argument to a function, you can also use a F.
82
83							=item * B<commit range>
84
85							The range of sequential commits (determined by F) requested for analysis.
86
87							=item * B<target>
88
89							A test file from the test suite of the application or library under study.
90
91							=item * B<test output>
92
93							What is sent to STDOUT or STDERR as a result of calling a test program such as
94							F or F on an individual target file. Currently we assume
95							that all such test programs are written based on the
96							L.
97
98							=item * B<transitional commit>
99
100							A commit at which the test output for a given target changes from that of the
101							commit immediately preceding.
102
103							=item * B<digest>
104
105							A string holding the output of a cryptographic process run on test output
106							which uniquely identifies that output. (Currently, we use the
107							C algorithm.) We assume that if the test output does
108							not change from one commit to the next, then the later commit is not a transitional
109							commit.
110
111							Note: Before taking a digest on a particular test output, we exclude text
112							such as timings which are highly likely to change from one run to the next and
113							which would introduce spurious variability into the digest calculations.
114
115							=item * B<multisection> or B<multiple bisection>
116
117							A series of configure-build-test process sequences at those commits within the
118							commit range which are selected by a bisection algorithm.
119
120							Normally, when we bisect (via F, F or
121							otherwise), we are seeking a single point where a Boolean result -- yes/no,
122							true/false, pass/fail -- is returned. What the test run outputs to STDOUT or
123							STDERR is a lesser concern.
124
125							In multisection we bisect repeatedly to determine all points where the output
126							of the test command changes -- regardless of whether that change is a C,
127							C or whatever. We capture the output for later human examination.
128
129							=back
130
131							=head1 METHODS
132
133							=head2 C<new()>
134
135							=over 4
136
137							=item * Purpose
138
139							Constructor.
140
141							=item * Arguments
142
143							$self = Devel::Git::MultiBisect::AllCommits->new(\%params);
144
145							or
146
147							$self = Devel::Git::MultiBisect::Transitions->new(\%params);
148
149							Reference to a hash, typically the return value of
150							C.
151
152							The hashref passed as argument must contain key-value pairs for C,
153							C and C. C tests for the existence of each of
154							these directories.
155
156							=item * Return Value
157
158							Object of Devel::Git::MultiBisect child class.
159
160							=back
161
162							=cut
163
164							sub new {
165	0			0	1		my ($class, $params) = @_;
166
167	0						my $data = Devel::Git::MultiBisect::Init::init($params);
168
169	0						return bless $data, $class;
170							}
171
172							=head2 C<get_commits_range()>
173
174							=over 4
175
176							=item * Purpose
177
178							Identify the SHAs of each F commit identified by C.
179
180							=item * Arguments
181
182							$commit_range = $self->get_commits_range();
183
184							None; all data needed is already in the object.
185
186							=item * Return Value
187
188							Array reference, each element of which is a SHA.
189
190							=back
191
192							=cut
193
194							sub get_commits_range {
195	0			0	1		my $self = shift;
196	0						return [ map { $_->{sha} } @{$self->{commits}} ];
	0
	0
197							}
198
199							=head2 C<set_targets()>
200
201							=over 4
202
203							=item * Purpose
204
205							Identify the test files which will be run at different points in the commits
206							range. We shall assume that the test file has existed with its name unchanged
207							over the entire commit range.
208
209							=item * Arguments
210
211							$target_args = [
212							't/44_func_hashes_mult_unsorted.t',
213							't/45_func_hashes_alt_dual_sorted.t',
214							];
215							$full_targets = $self->set_targets($target_args);
216
217							Reference to an array holding the relative paths beneath the C<gitdir> to the
218							test files selected for examination.
219
220							=item * Return Value
221
222							Reference to an array holding hash references with these elements:
223
224							=over 4
225
226							=item * C<path>
227
228							Absolute path to a test file selected for examination. The file is
229							tested for its existence.
230
231							=item * C<stub>
232
233							String composed by taking an element in the array ref passed as argument and
234							substituting underscores (C<_>) for forward slash (C</>) and dot (C<.>)
235							characters. So,
236
237							t/44_func_hashes_mult_unsorted.t
238
239							... becomes:
240
241							t_44_func_hashes_mult_unsorted_t
242
243							=back
244
245							=back
246
247							=cut
248
249							sub set_targets {
250	0			0	1		my ($self, $explicit_targets) = @_;
251
252	0						my @raw_targets = @{$self->{targets}};
	0
253
254							# If set_targets() is provided with an appropriate argument
255							# ($explicit_targets), override whatever may have been stored in the
256							# object by new().
257
258	0	0					if (defined $explicit_targets) {
259	0	0					croak "Explicit targets passed to set_targets() must be in array ref"
260							unless ref($explicit_targets) eq 'ARRAY';
261	0						@raw_targets = @{$explicit_targets};
	0
262							}
263
264	0						my @full_targets = ();
265	0						my @missing_files = ();
266	0						for my $rt (@raw_targets) {
267	0						my $ft = File::Spec->catfile($self->{gitdir}, $rt);
268	0	0					if (! -e $ft) { push @missing_files, $ft; next }
	0
	0
269	0						my $stub;
270	0						($stub = $rt) =~ s{[./]}{_}g;
271	0						push @full_targets, {
272							path => $ft,
273							stub => $stub,
274							};
275							}
276	0	0					if (@missing_files) {
277	0						croak "Cannot find file(s) to be tested: @missing_files";
278							}
279	0						$self->{targets} = [ @full_targets ];
280	0						return \@full_targets;
281							}
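
The stub rewriting performed in set_targets() can be tried in isolation. The following is a minimal sketch; the helper name make_stub() is hypothetical and is not part of the module, though the substitution itself is the one shown in the listing above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Devel::Git::MultiBisect: applies the same
# substitution that set_targets() applies to each relative path.
sub make_stub {
    my $relative_path = shift;
    # Copy the path, then turn every dot and forward slash into an underscore.
    (my $stub = $relative_path) =~ s{[./]}{_}g;
    return $stub;
}

print make_stub('t/44_func_hashes_mult_unsorted.t'), "\n";
# prints: t_44_func_hashes_mult_unsorted_t
```

Because both dots and slashes map to the same character, two different paths could in principle yield the same stub; the sketch does not guard against that.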
282
283							=head2 C
284
285							=over 4
286
287							=item * Purpose
288
289							Capture the output from running the selected test files at one specific F<git> checkout.
290
291							=item * Arguments
292
293							$outputs = $self->run_test_files_on_one_commit("2a2e54a");
294
295							or
296
297							$excluded_targets = [
298							't/45_func_hashes_alt_dual_sorted.t',
299							];
300							$outputs = $self->run_test_files_on_one_commit("2a2e54a", $excluded_targets);
301
302							=over 4
303
304							=item 1
305
306							String holding the SHA from a single commit in the repository. This string
307							would typically be one of the elements in the array reference returned by
308							C<$self->get_commits_range()>. If no argument is provided, the method will
309							default to using the first element in the array reference returned by
310							C<$self->get_commits_range()>.
311
312							=item 2
313
314							Reference to array of target test files to be excluded from a particular
315							invocation of this method. Optional, but the method will die if the argument
316							is provided and is not an array reference.
317
318							=back
319
320							=item * Return Value
321
322							Reference to an array, each element of which is a hash reference with the
323							following elements:
324
325							=over 4
326
327							=item * C<commit>
328
329							String holding the SHA from the commit passed as argument to this method (or
330							the default described above).
331
332							=item * C<commit_short>
333
334							String holding the value of C<commit> (above) truncated to the number of characters
335							specified in the C<short> element passed to the constructor; defaults to 7.
336
337							=item * C<file_stub>
338
339							String holding a rewritten version of the relative path beneath C<gitdir> of
340							the test file being run. In this relative path forward slash (C</>) and dot
341							(C<.>) characters are changed to underscores (C<_>). So,
342
343							t/44_func_hashes_mult_unsorted.t
344
345							... becomes:
346
347							t_44_func_hashes_mult_unsorted_t
348
349							=item * C<file>
350
351							String holding the full path to the file holding the TAP output collected
352							while running one test file at the given commit. The following example shows
353							how that path is calculated. Given:
354
355							output directory (outputdir) => '/tmp/DQBuT_SRAY/'
356							SHA (commit) => '2a2e54af709f17cc6186b42840549c46478b6467'
357							shortened SHA (commit_short) => '2a2e54a'
358							test file (target->[$i]) => 't/44_func_hashes_mult_unsorted.t'
359
360							... the file is placed in the directory specified by C<outputdir>. We then
361							join C<commit_short> (the shortened SHA), C<file_stub> (the rewritten relative
362							path) and the strings C<output> and C<txt> with a dot to yield this value for
363							the C<file> element:
364
365							2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt
366
367							=item * C<md5_hex>
368
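369							String holding the return value of
370							C<Devel::Git::MultiBisect::Auxiliary::hexdigest_one_file()> run with the file
371							designated by the C<file> element as an argument. (More precisely, the file
372							as modified by C<Devel::Git::MultiBisect::Auxiliary::clean_outputfile()>.)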
373
374							=back
375
376							Example:
377
378							[
379							{
380							commit => "2a2e54af709f17cc6186b42840549c46478b6467",
381							commit_short => "2a2e54a",
382							file => "/tmp/1mVnyd59ee/2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt",
383							file_stub => "t_44_func_hashes_mult_unsorted_t",
384							md5_hex => "31b7c93474e15a16d702da31989ab565",
385							},
386							{
387							commit => "2a2e54af709f17cc6186b42840549c46478b6467",
388							commit_short => "2a2e54a",
389							file => "/tmp/1mVnyd59ee/2a2e54a.t_45_func_hashes_alt_dual_sorted_t.output.txt",
390							file_stub => "t_45_func_hashes_alt_dual_sorted_t",
391							md5_hex => "6ee767b9d2838e4bbe83be0749b841c1",
392							},
393							]
394
395							=item * Comment
396
397							In this method's current implementation, we start with a C<git checkout> from
398							the repository at the specified C<commit>. We configure (I<e.g.,> C<perl
399							Makefile.PL>) and build (I<e.g.,> C<make>) the source code. We then test each
400							of the test files we have targeted (I<e.g.,> the configured C<test_command> run on
401							C<relative/path/to/test_file.t>). We redirect both STDOUT and STDERR to
402							C<outputfile>, clean up the outputfile to remove the line containing timings
403							(as that introduces unwanted variability in the C<md5_hex> values) and compute
404							the digest.
405
406							This implementation is very much subject to change.
407
408							If a true value for C<verbose> has been passed to the constructor, the method
409							prints C<Created $outputfile> to STDOUT before returning.
410
411							B While this method is publicly documented, in actual use you probably
412							will not need to call it directly. Instead, you will probably use either
413							C or
414							C.
415
416							=back
417
418							=cut
419
420							sub run_test_files_on_one_commit {
421	0			0	1		my ($self, $commit, $excluded_targets) = @_;
422	0		0				$commit //= $self->{commits}->[0]->{sha};
423	0	0					say "Testing commit: $commit" if ($self->{verbose});
424
425	0	0					if (defined $excluded_targets) {
426	0	0					if (ref($excluded_targets) ne 'ARRAY') {
427	0						croak "excluded_targets, if defined, must be in array reference";
428							}
429							}
430							else {
431	0						$excluded_targets = [];
432							}
433	0						my %excluded_targets;
434	0						for my $t (@{$excluded_targets}) {
	0
435	0						my $ft = File::Spec->catfile($self->{gitdir}, $t);
436	0						$excluded_targets{$ft}++;
437							}
438
439							my $current_targets = [
440	0						grep { ! exists $excluded_targets{$_->{path}} }
441	0						@{$self->{targets}}
	0
442							];
443
444	0						my $starting_branch = $self->_configure_build_one_commit($commit);
445
446	0						my $outputsref = $self->_test_one_commit($commit, $current_targets);
447							say "Tested commit: $commit; returning to: $starting_branch"
448	0	0					if ($self->{verbose});
449
450							# We want to return to our basic branch (e.g., 'master', 'blead')
451							# before checking out a new commit.
452
453	0	0					system(qq|git checkout --quiet $starting_branch|)
454							and croak "Unable to 'git checkout --quiet $starting_branch'";
455
456	0						$self->{commit_counter}++;
457	0	0					say "Commit counter: $self->{commit_counter}" if $self->{verbose};
458
459	0						return $outputsref;
460							}
461
462							sub _configure_one_commit {
463	0			0			my ($self, $commit) = @_;
464	0	0					chdir $self->{gitdir} or croak "Unable to change to $self->{gitdir}";
465	0	0					system(qq|git clean --quiet -dfx|) and croak "Unable to 'git clean --quiet -dfx'";
466	0						my $starting_branch = $self->{branch};
467
468	0	0					system(qq|git checkout --quiet $commit|) and croak "Unable to 'git checkout --quiet $commit'";
469	0	0					say "Running '$self->{configure_command}'" if $self->{verbose};
470	0	0					system($self->{configure_command}) and croak "Unable to run '$self->{configure_command}'";
471	0						return $starting_branch;
472							}
473
474							sub _configure_build_one_commit {
475	0			0			my ($self, $commit) = @_;
476
477	0						my $starting_branch = $self->_configure_one_commit($commit);
478
479	0	0					say "Running '$self->{make_command}'" if $self->{verbose};
480	0	0					system($self->{make_command}) and croak "Unable to run '$self->{make_command}'";
481
482	0						return $starting_branch;
483							}
484
485							sub _test_one_commit {
486	0			0			my ($self, $commit, $current_targets) = @_;
487	0						my $short = substr($commit,0,$self->{short});
488	0						my @outputs;
489	0						for my $target (@{$current_targets}) {
	0
490							my $outputfile = File::Spec->catfile(
491							$self->{outputdir},
492							join('.' => (
493							$short,
494							$target->{stub},
495	0						'output',
496							'txt'
497							)),
498							);
499	0						my $command_raw = $self->{test_command};
500	0						my $cmd;
501	0	0					unless ($command_raw eq 'harness') {
502	0						$cmd = qq|$command_raw $target->{path} >$outputfile 2>&1|;
503							}
504							else {
505	0						$cmd = qq|cd t; ./perl harness -v $target->{path} >$outputfile 2>&1; cd -|;
506							}
507	0	0					say "Running '$cmd'" if $self->{verbose};
508	0	0					system($cmd) and croak "Unable to run test_command";
509	0						$outputfile = clean_outputfile($outputfile);
510							push @outputs, {
511							commit => $commit,
512							commit_short => $short,
513							file => $outputfile,
514							file_stub => $target->{stub},
515	0						md5_hex => hexdigest_one_file($outputfile),
516							};
517	0	0					say "Created $outputfile" if $self->{verbose};
518							}
519	0						return \@outputs;
520							}
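
The way _test_one_commit() names each output file can be reproduced on its own. This is only a sketch under stated assumptions: the helper output_path() is hypothetical, and the sample values are the ones given in the documentation of run_test_files_on_one_commit().

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, not part of the module: joins the shortened SHA, the
# stub and the fixed strings 'output' and 'txt' with dots, then places the
# resulting file name beneath the output directory.
sub output_path {
    my ($outputdir, $commit, $short_length, $stub) = @_;
    my $short = substr($commit, 0, $short_length);
    my $basename = join('.', $short, $stub, 'output', 'txt');
    return File::Spec->catfile($outputdir, $basename);
}

print output_path(
    '/tmp/DQBuT_SRAY',
    '2a2e54af709f17cc6186b42840549c46478b6467',
    7,
    't_44_func_hashes_mult_unsorted_t',
), "\n";
```

On a Unix-like system the printed path ends in 2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt; the directory separator depends on the platform, since File::Spec chooses it.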
521
522							sub _bisection_decision {
523	0			0			my ($self, $target_h_md5_hex, $current_start_md5_hex, $h, $relevant_self,
524							$overall_end_md5_hex, $current_start_idx, $current_end_idx, $max_idx, $n) = @_;
525	0	0					if ($target_h_md5_hex ne $current_start_md5_hex) {
526	0						my $g = $h - 1;
527	0						$self->_run_one_commit_and_assign($g);
528	0						my $target_g_md5_hex = $relevant_self->[$g]->{md5_hex};
529	0	0					if ($target_g_md5_hex eq $current_start_md5_hex) {
530	0	0					if ($target_h_md5_hex eq $overall_end_md5_hex) {
531							}
532							else {
533	0						$current_start_idx = $h;
534	0						$current_end_idx = $max_idx;
535							}
536	0						$n++;
537							}
538							else {
539							# Bisection should continue downwards
540	0						$current_end_idx = $h;
541	0						$n++;
542							}
543							}
544							else {
545							# Bisection should continue upwards
546	0						$current_start_idx = $h;
547	0						$n++;
548							}
549	0						return ($current_start_idx, $current_end_idx, $n);
550							}
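
The glossary's definition of a transitional commit can be illustrated without bisection at all. The sketch below is a plain linear scan over digests that are already known for every commit; it is not the bisection carried out by _bisection_decision(), the helper transitional_indexes() is hypothetical, and the digest values are made up.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: given digests in commit order,
# return the indexes at which the digest differs from that of the commit
# immediately preceding.
sub transitional_indexes {
    my @digests = @_;
    return grep { $digests[$_] ne $digests[$_ - 1] } 1 .. $#digests;
}

my @digests = qw(aaa aaa bbb bbb bbb ccc);   # made-up digest values
print join(',', transitional_indexes(@digests)), "\n";
# prints: 2,5
```

A scan like this needs the test output for every commit in the range. Bisection exists to avoid that cost by testing only selected commits.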
551
552							=head2 C<get_timings()>
553
554							=over 4
555
556							=item * Purpose
557
558							Get information on the time a multisection took to run.
559
560							=item * Arguments
561
562							None; all data needed is already in the object.
563
564							=item * Return Value
565
566							Hash reference. The selection of elements in this hashref will depend on
567							which subclass of F<Devel::Git::MultiBisect> you are using and may differ among
568							subclasses. Example:
569
570							{ elapsed => 4297, mean => 186.83, runs => 23 }
571
572							In this example (taken from a run of one test file over 220 commits in Perl 5
573							blead), 23 runs were needed to achieve a result. These took 4297 seconds
574							(approximately 71 minutes) with a mean run time of approximately 3 minutes
575							each.
576
577							Method will return undefined value if timings are not yet available within the
578							object.
579
580							=back
581
582							=cut
583
584							sub get_timings {
585	0			0	1		my $self = shift;
586	0	0					return unless exists $self->{timings};
587	0						return $self->{timings};
588							}
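
The figures in the get_timings() example are consistent with one another, which a few lines of arithmetic confirm. The formulas here are an assumption about how the numbers relate (mean as elapsed seconds divided by runs); they are not taken from the module's own code.

```perl
use strict;
use warnings;

# Values copied from the documented example. The calculations below are an
# assumed relationship between them, not the module's implementation.
my %timings = (elapsed => 4297, runs => 23);

my $mean    = sprintf('%.2f', $timings{elapsed} / $timings{runs});
my $minutes = int($timings{elapsed} / 60);

print "mean: $mean seconds per run\n";      # mean: 186.83 seconds per run
print "elapsed: about $minutes minutes\n";  # elapsed: about 71 minutes
```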
589
590							=head1 SUPPORT
591
592							Please report any bugs by mail to C
593							or through the web interface at L.
594
595							=head1 AUTHOR
596
597							James E. Keenan (jkeenan at cpan dot org). When sending correspondence, please
598							include 'Devel::Git::MultiBisect' or 'Devel-Git-MultiBisect' in your subject line.
599
600							Creation date: October 12 2016. Last modification date: August 25 2021.
601
602							Development repository: L
603
604							=head1 ACKNOWLEDGEMENTS
605
606							Thanks to the following contributors and reviewers:
607
608							=over 4
609
610							=item * Smylers
611
612							For naming suggestion: L
613
614							=item * Ricardo Signes
615
616							For feedback during initial development.
617
618							=item * Eily and Monk::Thomas
619
620							For diagnosis of regex problems in http://perlmonks.org/?node_id=1175983.
621
622							=back
623
624							=head1 COPYRIGHT
625
626							Copyright (c) 2016-2019 James E. Keenan. United States. All rights reserved.
627							This is free software and may be distributed under the same terms as Perl
628							itself.
629
630							=cut
631
632							1;
633