File Coverage

blib/lib/Devel/Git/MultiBisect.pm
Criterion Covered Total %
statement 29 127 22.8
branch 0 50 0.0
condition 0 3 0.0
subroutine 10 19 52.6
pod 5 5 100.0
total 44 204 21.5


line stmt bran cond sub pod time code
1             package Devel::Git::MultiBisect;
2 4     4   2099 use strict;
  4         11  
  4         117  
3 4     4   22 use warnings;
  4         8  
  4         108  
4 4     4   42 use v5.14.0;
  4         16  
5 4     4   1816 use Devel::Git::MultiBisect::Init;
  4         15  
  4         204  
6 4         316 use Devel::Git::MultiBisect::Auxiliary qw(
7             clean_outputfile
8             hexdigest_one_file
9             validate_list_sequence
10 4     4   2116 );
  4         14  
11 4     4   29 use Carp;
  4         9  
  4         235  
12 4     4   28 use Cwd;
  4         9  
  4         211  
13 4     4   24 use File::Spec;
  4         9  
  4         77  
14 4     4   23 use File::Temp;
  4         7  
  4         286  
15 4     4   23 use List::Util qw(sum);
  4         6  
  4         5857  
16              
17             our $VERSION = '0.16';
18              
19             =head1 NAME
20              
21             Devel::Git::MultiBisect - Study build and test output over a range of F<git> commits
22              
23             =head1 SYNOPSIS
24              
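25             You will typically construct an object of a class which is a child of
26             F<Devel::Git::MultiBisect>, such as F<Devel::Git::MultiBisect::AllCommits> or
27             F<Devel::Git::MultiBisect::Transitions>. All methods documented in this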
28             parent package may be called from either child class.
29              
30             use Devel::Git::MultiBisect::AllCommits;
31             $self = Devel::Git::MultiBisect::AllCommits->new(\%parameters);
32              
33             ... or
34              
35             use Devel::Git::MultiBisect::Transitions;
36             $self = Devel::Git::MultiBisect::Transitions->new(\%parameters);
37              
38             ... and then:
39              
40             $commit_range = $self->get_commits_range();
41              
42             $full_targets = $self->set_targets(\@target_args);
43              
44             $outputs = $self->run_test_files_on_one_commit($commit_range->[0]);
45              
46             ... followed by methods specific to the child class.
47              
48             ... and then perhaps also:
49              
50             $timings = $self->get_timings();
51              
52             =head1 DESCRIPTION
53              
54             Given a Perl library or application kept in F<git> for version control, it is
55             often useful to be able to compare the output collected from running one or
56             more test files over a range of F<git> commits. If that range is sufficiently
57             large, a test may fail in B<more than one way> over that range.
58              
59             If that is the case, then simply asking, I<"When did this file start to
60             fail?"> -- a question which C<git bisect> is designed to answer -- is
61             insufficient. In order to identify more than one point of failure, we may
62             need to (a) capture the test output for each commit; or, (b) capture the test
63             output only at those commits where the output changed. The output of a run of
64             a test file may change for a variety of reasons: test failures, segfaults,
65             changes in the number or content of tests, etc.
66              
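67             F<Devel::Git::MultiBisect> provides methods to achieve that objective. Its
68             child classes, F<Devel::Git::MultiBisect::AllCommits> and
69             F<Devel::Git::MultiBisect::Transitions>, provide different flavors of that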
70             functionality for objectives (a) and (b), respectively. Please refer to their
71             documentation for further discussion.
72              
73             =head2 GLOSSARY
74              
75             =over 4
76              
77             =item * B<commit>
78              
79             A source code change set entered ("committed") to a F<git> repository. Each
80             commit is denoted by a SHA. In this library, whenever a commit is called for
81             as the argument to a function, you can also use a F<git tag>.
82              
83             =item * B<commit range>
84              
85             The range of sequential commits (determined by F<git log>) requested for analysis.
86              
87             =item * B<target>
88              
89             A test file from the test suite of the application or library under study.
90              
91             =item * B<test output>
92              
93             What is sent to STDOUT or STDERR as a result of calling a test program such as
94             F<prove> or F<t/harness> on an individual target file. Currently we assume
95             that all such test programs are written based on the
96             Test Anything Protocol (TAP).
97              
98             =item * B<transitional commit>
99              
100             A commit at which the test output for a given target changes from that of the
101             commit immediately preceding.
102              
103             =item * B<digest>
104              
105             A string holding the output of a cryptographic process run on test output
106             which uniquely identifies that output. (Currently, we use the
107             C<Digest::MD5::md5_hex> algorithm.) We assume that if the test output does
108             not change from one commit to the next, then the later commit is not a
109             transitional commit.
110              
111             Note: Before taking a digest on a particular test output, we exclude text
112             such as timings which are highly likely to change from one run to the next and
113             which would introduce spurious variability into the digest calculations.
114              
115             =item * B<multisection> or B<multiple bisection>
116              
117             A series of configure-build-test process sequences at those commits within the
118             commit range which are selected by a bisection algorithm.
119              
120             Normally, when we bisect (via F<git bisect>, F<Porting/bisect.pl> or
121             otherwise), we are seeking a single point where a Boolean result -- yes/no,
122             true/false, pass/fail -- is returned. What the test run outputs to STDOUT or
123             STDERR is a lesser concern.
124              
125             In multisection we bisect repeatedly to determine all points where the output
126             of the test command changes -- regardless of whether that change is a C<PASS>,
127             C<FAIL> or whatever. We capture the output for later human examination.
128              
129             =back
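
The glossary terms "digest" and "transitional commit" can be illustrated with a minimal sketch. This is not the module's implementation: the helper names C<digest_without_timings> and C<transitional_commits>, the sample SHAs and the timing-line pattern are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: digest test output after dropping lines that look
# like a timing summary. The pattern is an assumption, not the module's own.
sub digest_without_timings {
    my ($output) = @_;
    my @kept = grep { !/^Files=\d+, Tests=\d+/ } split /^/m, $output;
    return md5_hex(join '', @kept);
}

# Hypothetical helper: given [ sha, output ] pairs in commit order, return
# the SHAs whose digest differs from that of the commit immediately preceding.
sub transitional_commits {
    my (@runs) = @_;
    my (@transitions, $previous);
    for my $run (@runs) {
        my ($sha, $output) = @{$run};
        my $digest = digest_without_timings($output);
        push @transitions, $sha if defined $previous && $digest ne $previous;
        $previous = $digest;
    }
    return @transitions;
}

# Made-up sample data: the first two outputs differ only in their timings.
my @runs = (
    [ 'aaa1111', "ok 1\nFiles=1, Tests=1,  0 wallclock secs\n" ],
    [ 'bbb2222', "ok 1\nFiles=1, Tests=1,  2 wallclock secs\n" ],
    [ 'ccc3333', "not ok 1\nFiles=1, Tests=1,  0 wallclock secs\n" ],
);
my @transitions = transitional_commits(@runs);
print "Transitional commits: @transitions\n";   # prints: Transitional commits: ccc3333
```

Because the timing line is excluded before hashing, only the third commit in the sample data counts as transitional. This sketch scans every commit in order; multisection would instead reach the same transitions by repeated bisection.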
130              
131             =head1 METHODS
132              
133             =head2 C<new()>
134              
135             =over 4
136              
137             =item * Purpose
138              
139             Constructor.
140              
141             =item * Arguments
142              
143             $self = Devel::Git::MultiBisect::AllCommits->new(\%params);
144              
145             or
146              
147             $self = Devel::Git::MultiBisect::Transitions->new(\%params);
148              
149             Reference to a hash, typically the return value of
150             C<Devel::Git::MultiBisect::Opts::process_options()>.
151              
152             The hashref passed as argument must contain key-value pairs for C<gitdir>,
153             C<workdir> and C<outputdir>. C<new()> tests for the existence of each of
154             these directories.
155              
156             =item * Return Value
157              
158             Object of Devel::Git::MultiBisect child class.
159              
160             =back
161              
162             =cut
163              
164             sub new {
165 0     0 1   my ($class, $params) = @_;
166              
167 0           my $data = Devel::Git::MultiBisect::Init::init($params);
168              
169 0           return bless $data, $class;
170             }
171              
172             =head2 C<get_commits_range()>
173              
174             =over 4
175              
176             =item * Purpose
177              
178             Identify the SHA of each F<git> commit identified by C<new()>.
179              
180             =item * Arguments
181              
182             $commit_range = $self->get_commits_range();
183              
184             None; all data needed is already in the object.
185              
186             =item * Return Value
187              
188             Array reference, each element of which is a SHA.
189              
190             =back
191              
192             =cut
193              
194             sub get_commits_range {
195 0     0 1   my $self = shift;
196 0           return [ map { $_->{sha} } @{$self->{commits}} ];
  0            
  0            
197             }
198              
199             =head2 C<set_targets()>
200              
201             =over 4
202              
203             =item * Purpose
204              
205             Identify the test files which will be run at different points in the commits
206             range. We shall assume that the test file has existed with its name unchanged
207             over the entire commit range.
208              
209             =item * Arguments
210              
211             $target_args = [
212             't/44_func_hashes_mult_unsorted.t',
213             't/45_func_hashes_alt_dual_sorted.t',
214             ];
215             $full_targets = $self->set_targets($target_args);
216              
217             Reference to an array holding the relative paths beneath the C<gitdir> to the
218             test files selected for examination.
219              
220             =item * Return Value
221              
222             Reference to an array holding hash references with these elements:
223              
224             =over 4
225              
226             =item * C<path>
227              
228             Absolute path to the test file selected for examination. The file is
229             tested for its existence.
230              
231             =item * C<stub>
232              
233             String composed by taking an element in the array ref passed as argument and
234             substituting underscores (C<_>) for forward slash (C</>) and dot (C<.>)
235             characters. So,
236              
237             t/44_func_hashes_mult_unsorted.t
238              
239             ... becomes:
240              
241             t_44_func_hashes_mult_unsorted_t
242              
243             =back
244              
245             =back
246              
247             =cut
248              
249             sub set_targets {
250 0     0 1   my ($self, $explicit_targets) = @_;
251              
252 0           my @raw_targets = @{$self->{targets}};
  0            
253              
254             # If set_targets() is provided with an appropriate argument
255             # ($explicit_targets), override whatever may have been stored in the
256             # object by new().
257              
258 0 0         if (defined $explicit_targets) {
259 0 0         croak "Explicit targets passed to set_targets() must be in array ref"
260             unless ref($explicit_targets) eq 'ARRAY';
261 0           @raw_targets = @{$explicit_targets};
  0            
262             }
263              
264 0           my @full_targets = ();
265 0           my @missing_files = ();
266 0           for my $rt (@raw_targets) {
267 0           my $ft = File::Spec->catfile($self->{gitdir}, $rt);
268 0 0         if (! -e $ft) { push @missing_files, $ft; next }
  0            
  0            
269 0           my $stub;
270 0           ($stub = $rt) =~ s{[./]}{_}g;
271 0           push @full_targets, {
272             path => $ft,
273             stub => $stub,
274             };
275             }
276 0 0         if (@missing_files) {
277 0           croak "Cannot find file(s) to be tested: @missing_files";
278             }
279 0           $self->{targets} = [ @full_targets ];
280 0           return \@full_targets;
281             }
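
The stub rewriting performed in set_targets() can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper named stub_for that is not part of the module; it applies the same substitution as the code above.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the substitution in set_targets():
# replace every dot and forward slash with an underscore.
sub stub_for {
    my ($relative_path) = @_;
    (my $stub = $relative_path) =~ s{[./]}{_}g;
    return $stub;
}

print stub_for('t/44_func_hashes_mult_unsorted.t'), "\n";
# prints: t_44_func_hashes_mult_unsorted_t
```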
282              
283             =head2 C<run_test_files_on_one_commit()>
284              
285             =over 4
286              
287             =item * Purpose
288              
289             Capture the output from running the selected test files at one specific F<git> checkout.
290              
291             =item * Arguments
292              
293             $outputs = $self->run_test_files_on_one_commit("2a2e54a");
294              
295             or
296              
297             $excluded_targets = [
298             't/45_func_hashes_alt_dual_sorted.t',
299             ];
300             $outputs = $self->run_test_files_on_one_commit("2a2e54a", $excluded_targets);
301              
302             =over 4
303              
304             =item 1
305              
306             String holding the SHA from a single commit in the repository. This string
307             would typically be one of the elements in the array reference returned by
308             C<< $self->get_commits_range() >>. If no argument is provided, the method will
309             default to using the first element in the array reference returned by
310             C<< $self->get_commits_range() >>.
311              
312             =item 2
313              
314             Reference to array of target test files to be excluded from a particular
315             invocation of this method. Optional, but will die if argument is not an array
316             reference.
317              
318             =back
319              
320             =item * Return Value
321              
322             Reference to an array, each element of which is a hash reference with the
323             following elements:
324              
325             =over 4
326              
327             =item * C<commit>
328              
329             String holding the SHA from the commit passed as argument to this method (or
330             the default described above).
331              
332             =item * C<commit_short>
333              
334             String holding the value of C<commit> (above) truncated to the number of characters
335             specified in the C<short> element passed to the constructor; defaults to 7.
336              
337             =item * C<file_stub>
338              
339             String holding a rewritten version of the relative path beneath C<gitdir> of
340             the test file being run. In this relative path forward slash (C</>) and dot
341             (C<.>) characters are changed to underscores (C<_>). So,
342              
343             t/44_func_hashes_mult_unsorted.t
344              
345             ... becomes:
346              
347             t_44_func_hashes_mult_unsorted_t
348              
349             =item * C<file>
350              
351             String holding the full path to the file holding the TAP output collected
352             while running one test file at the given commit. The following example shows
353             how that path is calculated. Given:
354              
355             output directory (outputdir) => '/tmp/DQBuT_SRAY/'
356             SHA (commit) => '2a2e54af709f17cc6186b42840549c46478b6467'
357             shortened SHA (commit_short) => '2a2e54a'
358             test file (target->[$i]) => 't/44_func_hashes_mult_unsorted.t'
359              
360             ... the file is placed in the directory specified by C<outputdir>. We then
361             join C<commit_short> (the shortened SHA), C<file_stub> (the rewritten relative
362             path) and the strings C<output> and C<txt> with a dot to yield this value for
363             the C<file> element:
364              
365             2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt
366              
367             =item * C<md5_hex>
368              
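369             String holding the return value of
370             C<Devel::Git::MultiBisect::Auxiliary::hexdigest_one_file()> run with the file
371             designated by the C<file> element as an argument. (More precisely, the file
372             as modified by C<Devel::Git::MultiBisect::Auxiliary::clean_outputfile()>.)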
373              
374             =back
375              
376             Example:
377              
378             [
379             {
380             commit => "2a2e54af709f17cc6186b42840549c46478b6467",
381             commit_short => "2a2e54a",
382             file => "/tmp/1mVnyd59ee/2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt",
383             file_stub => "t_44_func_hashes_mult_unsorted_t",
384             md5_hex => "31b7c93474e15a16d702da31989ab565",
385             },
386             {
387             commit => "2a2e54af709f17cc6186b42840549c46478b6467",
388             commit_short => "2a2e54a",
389             file => "/tmp/1mVnyd59ee/2a2e54a.t_45_func_hashes_alt_dual_sorted_t.output.txt",
390             file_stub => "t_45_func_hashes_alt_dual_sorted_t",
391             md5_hex => "6ee767b9d2838e4bbe83be0749b841c1",
392             },
393             ]
394              
395             =item * Comment
396              
397             In this method's current implementation, we start with a C<git checkout> from
398             the repository at the specified C<commit>. We configure (I<e.g.,> C<perl
399             Makefile.PL>) and build (I<e.g.,> C<make>) the source code. We then test each
400             of the test files we have targeted (I<e.g.,> C<prove -vb
401             relative/path/to/test_file.t>). We redirect both STDOUT and STDERR to
402             C<outputfile>, clean up the outputfile to remove the line containing timings
403             (as that introduces unwanted variability in the C<md5_hex> values) and compute
404             the digest.
405              
406             This implementation is very much subject to change.
407              
408             If a true value for C<verbose> has been passed to the constructor, the method
409             prints C<Created [outputfile]> to STDOUT before returning.
410              
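411             B<Note:> While this method is publicly documented, in actual use you probably
412             will not need to call it directly. Instead, you will probably use either
413             C<Devel::Git::MultiBisect::AllCommits::run_test_files_on_all_commits()> or
414             C<Devel::Git::MultiBisect::Transitions::multisect_all_targets()>.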
415              
416             =back
417              
418             =cut
419              
420             sub run_test_files_on_one_commit {
421 0     0 1   my ($self, $commit, $excluded_targets) = @_;
422 0   0       $commit //= $self->{commits}->[0]->{sha};
423 0 0         say "Testing commit: $commit" if ($self->{verbose});
424              
425 0 0         if (defined $excluded_targets) {
426 0 0         if (ref($excluded_targets) ne 'ARRAY') {
427 0           croak "excluded_targets, if defined, must be in array reference";
428             }
429             }
430             else {
431 0           $excluded_targets = [];
432             }
433 0           my %excluded_targets;
434 0           for my $t (@{$excluded_targets}) {
  0            
435 0           my $ft = File::Spec->catfile($self->{gitdir}, $t);
436 0           $excluded_targets{$ft}++;
437             }
438              
439             my $current_targets = [
440 0           grep { ! exists $excluded_targets{$_->{path}} }
441 0           @{$self->{targets}}
  0            
442             ];
443              
444 0           my $starting_branch = $self->_configure_build_one_commit($commit);
445              
446 0           my $outputsref = $self->_test_one_commit($commit, $current_targets);
447             say "Tested commit: $commit; returning to: $starting_branch"
448 0 0         if ($self->{verbose});
449              
450             # We want to return to our basic branch (e.g., 'master', 'blead')
451             # before checking out a new commit.
452              
453 0 0         system(qq|git checkout --quiet $starting_branch|)
454             and croak "Unable to 'git checkout --quiet $starting_branch'";
455              
456 0           $self->{commit_counter}++;
457 0 0         say "Commit counter: $self->{commit_counter}" if $self->{verbose};
458              
459 0           return $outputsref;
460             }
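
The handling of excluded targets in run_test_files_on_one_commit() can be sketched on its own. The helper name filter_targets and the directory /tmp/repo are assumptions for illustration only; the lookup-hash-plus-grep approach follows the code above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: drop every target whose absolute path matches one of
# the excluded relative paths beneath $gitdir.
sub filter_targets {
    my ($gitdir, $targets, $excluded) = @_;
    my %skip;
    $skip{ File::Spec->catfile($gitdir, $_) } = 1 for @{$excluded};
    return [ grep { !exists $skip{ $_->{path} } } @{$targets} ];
}

my $gitdir = '/tmp/repo';    # illustrative path, not taken from the module
my @names  = (
    't/44_func_hashes_mult_unsorted.t',
    't/45_func_hashes_alt_dual_sorted.t',
);
my $targets = [];
for my $name (@names) {
    push @{$targets}, { path => File::Spec->catfile($gitdir, $name) };
}

# Exclude the second test file; one target should remain.
my $current = filter_targets($gitdir, $targets, [ $names[1] ]);
print scalar(@{$current}), " target(s) remain\n";   # prints: 1 target(s) remain
```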
461              
462             sub _configure_one_commit {
463 0     0     my ($self, $commit) = @_;
464 0 0         chdir $self->{gitdir} or croak "Unable to change to $self->{gitdir}";
465 0 0         system(qq|git clean --quiet -dfx|) and croak "Unable to 'git clean --quiet -dfx'";
466 0           my $starting_branch = $self->{branch};
467              
468 0 0         system(qq|git checkout --quiet $commit|) and croak "Unable to 'git checkout --quiet $commit'";
469 0 0         say "Running '$self->{configure_command}'" if $self->{verbose};
470 0 0         system($self->{configure_command}) and croak "Unable to run '$self->{configure_command}'";
471 0           return $starting_branch;
472             }
473              
474             sub _configure_build_one_commit {
475 0     0     my ($self, $commit) = @_;
476              
477 0           my $starting_branch = $self->_configure_one_commit($commit);
478              
479 0 0         say "Running '$self->{make_command}'" if $self->{verbose};
480 0 0         system($self->{make_command}) and croak "Unable to run '$self->{make_command}'";
481              
482 0           return $starting_branch;
483             }
484              
485             sub _test_one_commit {
486 0     0     my ($self, $commit, $current_targets) = @_;
487 0           my $short = substr($commit,0,$self->{short});
488 0           my @outputs;
489 0           for my $target (@{$current_targets}) {
  0            
490             my $outputfile = File::Spec->catfile(
491             $self->{outputdir},
492             join('.' => (
493             $short,
494             $target->{stub},
495 0           'output',
496             'txt'
497             )),
498             );
499 0           my $command_raw = $self->{test_command};
500 0           my $cmd;
501 0 0         unless ($command_raw eq 'harness') {
502 0           $cmd = qq|$command_raw $target->{path} >$outputfile 2>&1|;
503             }
504             else {
505 0           $cmd = qq|cd t; ./perl harness -v $target->{path} >$outputfile 2>&1; cd -|;
506             }
507 0 0         say "Running '$cmd'" if $self->{verbose};
508 0 0         system($cmd) and croak "Unable to run test_command";
509 0           $outputfile = clean_outputfile($outputfile);
510             push @outputs, {
511             commit => $commit,
512             commit_short => $short,
513             file => $outputfile,
514             file_stub => $target->{stub},
515 0           md5_hex => hexdigest_one_file($outputfile),
516             };
517 0 0         say "Created $outputfile" if $self->{verbose};
518             }
519 0           return \@outputs;
520             }
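
The naming of the output file in _test_one_commit() can be reproduced separately. A minimal sketch follows, assuming a hypothetical helper output_file_for and a default short-SHA length of 7 as described in the documentation above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper following the documented recipe: truncate the SHA, then
# join the short SHA, the stub, 'output' and 'txt' with dots.
sub output_file_for {
    my ($outputdir, $sha, $stub, $short_length) = @_;
    $short_length //= 7;    # documented default length of the short SHA
    my $short = substr($sha, 0, $short_length);
    return File::Spec->catfile(
        $outputdir,
        join('.', $short, $stub, 'output', 'txt'),
    );
}

print output_file_for(
    '/tmp/DQBuT_SRAY',
    '2a2e54af709f17cc6186b42840549c46478b6467',
    't_44_func_hashes_mult_unsorted_t',
), "\n";
```

On a Unix-like system this prints a path ending in 2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt beneath the output directory.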
521              
522             sub _bisection_decision {
523 0     0     my ($self, $target_h_md5_hex, $current_start_md5_hex, $h, $relevant_self,
524             $overall_end_md5_hex, $current_start_idx, $current_end_idx, $max_idx, $n) = @_;
525 0 0         if ($target_h_md5_hex ne $current_start_md5_hex) {
526 0           my $g = $h - 1;
527 0           $self->_run_one_commit_and_assign($g);
528 0           my $target_g_md5_hex = $relevant_self->[$g]->{md5_hex};
529 0 0         if ($target_g_md5_hex eq $current_start_md5_hex) {
530 0 0         if ($target_h_md5_hex eq $overall_end_md5_hex) {
531             }
532             else {
533 0           $current_start_idx = $h;
534 0           $current_end_idx = $max_idx;
535             }
536 0           $n++;
537             }
538             else {
539             # Bisection should continue downwards
540 0           $current_end_idx = $h;
541 0           $n++;
542             }
543             }
544             else {
545             # Bisection should continue upwards
546 0           $current_start_idx = $h;
547 0           $n++;
548             }
549 0           return ($current_start_idx, $current_end_idx, $n);
550             }
551              
552             =head2 C<get_timings()>
553              
554             =over 4
555              
556             =item * Purpose
557              
558             Get information on the time a multisection took to run.
559              
560             =item * Arguments
561              
562             None; all data needed is already in the object.
563              
564             =item * Return Value
565              
566             Hash reference. The selection of elements in this hashref will depend on
567             which subclass of F<Devel::Git::MultiBisect> you are using and may differ among
568             subclasses. Example:
569              
570             { elapsed => 4297, mean => 186.83, runs => 23 }
571              
572             In this example (taken from a run of one test file over 220 commits in Perl 5
573             blead), 23 runs were needed to achieve a result. These took 4297 seconds
574             (approximately 72 minutes) with a mean run time of approximately 3 minutes
575             each.
576              
577             Method will return undefined value if timings are not yet available within the
578             object.
579              
580             =back
581              
582             =cut
583              
584             sub get_timings {
585 0     0 1   my $self = shift;
586 0 0         return unless exists $self->{timings};
587 0           return $self->{timings};
588             }
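
How the child classes populate the timings hashref is not shown in this file. As a minimal sketch under that caveat, a hashref with the same three keys as the documented example could be derived from per-run durations like this; the helper name summarize_timings and the sample durations are assumptions.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: summarize per-run durations (in seconds) into a
# hashref shaped like the documented get_timings() example.
sub summarize_timings {
    my (@seconds) = @_;
    return unless @seconds;    # mirror get_timings(): nothing available yet
    my $elapsed = sum(@seconds);
    return {
        elapsed => $elapsed,
        runs    => scalar @seconds,
        mean    => sprintf('%.2f', $elapsed / @seconds),
    };
}

my $timings = summarize_timings(180, 200, 190);
printf "%d runs, %d s elapsed, mean %s s\n", @{$timings}{qw(runs elapsed mean)};
# prints: 3 runs, 570 s elapsed, mean 190.00 s
```

Applied to the documented figures, 4297 seconds over 23 runs gives the same mean of 186.83.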
589              
590             =head1 SUPPORT
591              
592             Please report any bugs by mail to C
593             or through the web interface at L.
594              
595             =head1 AUTHOR
596              
597             James E. Keenan (jkeenan at cpan dot org). When sending correspondence, please
598             include 'Devel::Git::MultiBisect' or 'Devel-Git-MultiBisect' in your subject line.
599              
600             Creation date: October 12 2016. Last modification date: August 25 2021.
601              
602             Development repository: L
603              
604             =head1 ACKNOWLEDGEMENTS
605              
606             Thanks to the following contributors and reviewers:
607              
608             =over 4
609              
610             =item * Smylers
611              
612             For naming suggestion: L
613              
614             =item * Ricardo Signes
615              
616             For feedback during initial development.
617              
618             =item * Eily and Monk::Thomas
619              
620             For diagnosis of regex problems in http://perlmonks.org/?node_id=1175983.
621              
622             =back
623              
624             =head1 COPYRIGHT
625              
626             Copyright (c) 2016-2019 James E. Keenan. United States. All rights reserved.
627             This is free software and may be distributed under the same terms as Perl
628             itself.
629              
630             =cut
631              
632             1;
633