File Coverage

lib/Log/Unrotate.pm
Criterion Covered Total %
statement 210 217 96.7
branch 102 118 86.4
condition 39 52 75.0
subroutine 27 27 100.0
pod 7 7 100.0
total 385 421 91.4


line stmt bran cond sub pod time code
1             package Log::Unrotate;
2             {
3             $Log::Unrotate::VERSION = '1.32';
4             }
5              
6 1     1   385657 use strict;
  1         3  
  1         42  
7 1     1   8 use warnings;
  1         1  
  1         147  
8              
9             =head1 NAME
10              
11             Log::Unrotate - Incremental log reader with a transparent rotation handling
12              
13             =head1 VERSION
14              
15             version 1.32
16              
17             =head1 SYNOPSIS
18              
19             use Log::Unrotate;
20              
21             my $reader = Log::Unrotate->new({
22             log => 'xxx.log',
23             pos => 'xxx.pos',
24             });
25              
26             my $line = $reader->read();
27             my $another_line = $reader->read();
28              
29             $reader->commit(); # serialize the position on disk into 'pos' file
30              
31             my $position = $reader->position();
32             $reader->read();
33             $reader->read();
34             $reader->commit($position); # rollback the last 2 reads
35              
36             my $lag = $reader->lag();
37              
38             =head1 DESCRIPTION
39              
40             C allows you to read any log file incrementally and transparently.
41              
42             B means that you can store store the reading position to the special file ("pos-file") using C, restart the process, and then continue from where you left.
43              
44             B means that C automatically jumps from one log to the next. For example, if you were reading I, then stored the position and left for a day, and then while you were away, I got renamed to I, while the new I got some new content in it, C will find the right log and give you the remaining lines from I before moving to I.
45              
46             Even better, it will do the right thing even if the log rotation happens while you were reading the log.
47              
48             C tries really hard to never skip any data from logs. If it's not sure about what to do, it throws an exception. This is an extremely rare situation, and it is a good default for building a simple and robust message queue on top of this class, but if you prefer a quick-and-dirty recovering, you can enable I option.
49              
50             =cut
51              
52 1     1   7 use Carp;
  1         2  
  1         68  
53              
54 1     1   5 use IO::Handle;
  1         1  
  1         40  
55 1     1   7 use Digest::MD5 qw(md5_hex);
  1         1  
  1         73  
56              
57 1     1   5 use Fcntl qw(SEEK_SET SEEK_CUR SEEK_END);
  1         2  
  1         59  
58 1     1   7402 use Log::Unrotate::Cursor::File;
  1         4  
  1         312  
59 1     1   7737 use Log::Unrotate::Cursor::Null;
  1         3  
  1         22777  
60              
61             sub _defaults ($) {
62 96     96   196 my ($class) = @_;
63             return {
64 96         2220 start => 'begin',
65             lock => 'none',
66             end => 'fixed',
67             check_inode => 0,
68             check_lastline => 1,
69             check_log => 0,
70             autofix_cursor => 0,
71             rollback_period => 300,
72             };
73             }
74              
75             our %_start_values = map { $_ => 1 } qw(begin end first);
76             our %_end_values = map { $_ => 1 } qw(fixed future);
77              
78             =head1 METHODS
79              
80             =over
81              
82             =cut
83              
84             =item B<< new($params) >>
85              
86             Creates a new unrotate object.
87              
88             =over
89              
90             =item I
91              
92             Name of a file to store log reading position. Will be created automatically if missing.
93              
94             Value '-' means not to use a position file. I.e., pretend it doesn't exist at the start and ignore commit calls.
95              
96             =item I
97              
98             Instead of C file, you can specify any custom cursor. See C for the cursor API details.
99              
100             =item I
101              
102             Recreate a cursor if it's broken.
103              
104             Warning will be printed on recovery.
105              
106             =item I
107              
108             Time period in seconds.
109             If I is greater than 0, C method will save some positions history (at least one previous position older then I would be preserved)
110             to allow recovery when the last position is broken for some reason. (Position may sometimes become invalid because of the host's hard reboot.)
111              
112             The feature is enabled by default (with value 300), set to 0 to disable, or set to some greater value if your heavily-loaded host is not flushing its filesystem buffers on disk this often.
113              
114             =item I
115              
116             Name of a log file. Value C<-> means standard input stream.
117              
118             =item I
119              
120             Describes the initialization behavior of new cursors. Allowed values: C (default), C, C.
121              
122             =over 4
123              
124             =item *
125              
126             When I is C, we'll read current I from the beginning.
127              
128             =item *
129              
130             When I is C, we'll put current position in C at the end (useful for big files when some new script don't need to read everything).
131              
132             =item *
133              
134             When I is C, C will start from the oldest log file available.
135              
136             =back
137              
138             I.e., if there are I, I, and I, C will start from the top of I, C will skip to the bottom of I, while C will start from the top of I.
139              
140             =item I
141              
142             Describes the reading behavior when we reach the end of a log. Allowed values: C (default), C.
143              
144             =over 4
145              
146             =item *
147              
148             When I is C, the log is read up to the position where it ended when the reader object was created. This is the default, so you don't wait in a reading loop indefinitely because somebody keeps adding new lines to the log.
149              
150             =item *
151              
152             When I is C, it allows the reading of the part of the log that was appended after the reader was created (useful for reading from stdin).
153              
154             =back
155              
156             =item I
157              
158             Describes the locking behaviour. Allowed values: C (default), C, C.
159              
160             =over 4
161              
162             =item *
163              
164             When I is C, lock named I.lock will be acquired in the blocking mode.
165              
166             =item *
167              
168             When I is C, lock named I.lock will be acquired in the nonblocking mode; if lock file is already locked, exception will be raised.
169              
170             =back
171              
172             =item I
173              
174             This flag is set by default. It enables content checks when detecting log rotations. There is actually no reason to disable this option.
175              
176             =item I
177              
178             Enable inode checks when detecting log rotations. This option should not be enabled when retrieving logs via rsync or some other way which modifies inodes.
179              
180             This flag is disabled by default, because I is superior and should be enough for finding the right file.
181              
182             =back
183              
184             =cut
185             sub new ($$)
186             {
187 96     96 1 431436 my ($class, $args) = @_;
188 96         386 my $self = {
189 96         373 %{$class->_defaults()},
190             %$args,
191             };
192              
193 96 100       944 croak "unknown start value: '$self->{start}'" unless $_start_values{$self->{start}};
194 95 100       716 croak "unknown end value: '$self->{end}'" unless $_end_values{$self->{end}};
195 94 100 100     2284 croak "either check_inode or check_lastline should be on" unless $self->{check_inode} or $self->{check_lastline};
196              
197 93         264 bless $self => $class;
198              
199 93 50 33     1546 if ($self->{pos} and $self->{cursor}) {
200 0         0 croak "only one of 'pos' and 'cursor' should be specified";
201             }
202 93 50 33     796 unless ($self->{pos} or $self->{cursor}) {
203 0         0 croak "one of 'pos' and 'cursor' should be specified";
204             }
205              
206 93         285 my $posfile = delete $self->{pos};
207 93 50       485 if ($posfile) {
208 93 100       1377 if ($posfile eq '-') {
209 3 100       34 croak "Log not specified and posfile is '-'" if not defined $self->{log};
210 2         19 $self->{cursor} = Log::Unrotate::Cursor::Null->new();
211             }
212             else {
213 90 50 66     16152 croak "Log not specified and posfile is not found" if not defined $self->{log} and not -e $posfile;
214 90         6810 $self->{cursor} = Log::Unrotate::Cursor::File->new($posfile, { lock => $self->{lock}, rollback_period => $self->{rollback_period} });
215             }
216             }
217              
218 87         444 my $pos = $self->{cursor}->read();
219 87 100       568 if ($pos) {
220 40         431 my $logfile = delete $pos->{LogFile};
221 40 100       119 if ($self->{log}) {
222 38 50 66     320 die "logfile mismatch: $logfile ne $self->{log}" if $self->{check_log} and $logfile and $self->{log} ne $logfile;
      66        
223             } else {
224 2 100       91 $self->{log} = $logfile or die "'logfile:' not found in cursor $self->{cursor} and log not specified";
225             }
226             }
227              
228 85         378 $self->_set_last_log_number();
229 85         271 $self->_set_eof();
230              
231 85 100       291 if ($pos) {
232 38         59 my $error;
233 38         71 while () {
234 40         74 eval {
235 40         638 $self->_find_log($pos);
236             };
237 40         77 $error = $@;
238 40 100       101 last unless $error;
239 8 100       44 last unless $self->{cursor}->rollback();
240 2         7 $pos = $self->{cursor}->read();
241             }
242 38 100       110 if ($error) {
243 6 100       16 if ($self->{autofix_cursor}) {
244 1         10 warn $error;
245 1         8 warn "autofix_cursor is enabled, cleaning $self->{cursor}";
246 1         9 $self->{cursor}->clean();
247 1         7 $self->_start();
248             }
249             else {
250 5         104 die $error;
251             }
252             }
253             } else {
254 47         247 $self->_start();
255             }
256              
257 80         599 return $self;
258             }
259              
260             sub _seek_end_pos ($$) {
261 62     62   86 my $self = shift;
262 62         83 my ($handle) = @_;
263              
264 62         359 seek $handle, -1, SEEK_END;
265 62         469 read $handle, my $last_byte, 1;
266 62 100       208 if ($last_byte eq "\n") {
267 54         334 return tell $handle;
268             }
269              
270 8         19 my $position = tell $handle;
271 8         12 while (1) {
272             # we have reached beginning of the file and haven't found "\n"
273 8 100       26 return 0 if $position == 0;
274              
275 4         7 my $read_portion = 1024;
276 4 50       15 $read_portion = $position if ($position < $read_portion);
277 4         23 seek $handle, -$read_portion, SEEK_CUR;
278 4         5 my $data;
279 4         25 read $handle, $data, $read_portion;
280 4 50       25 if ($data =~ /\n(.*)\z/) { # match *last* \n
281 4         11 my $len = length $1;
282 4         21 seek $handle, $position, SEEK_SET;
283 4         14 return $position - $len;
284             }
285 0         0 seek $handle, -$read_portion, SEEK_CUR;
286 0         0 $position -= $read_portion;
287             }
288             }
289              
290             sub _find_end_pos ($$) {
291 61     61   92 my $self = shift;
292 61         78 my ($handle) = @_;
293              
294 61         108 my $tell = tell $handle;
295 61         298 my $end = $self->_seek_end_pos($handle);
296 61         466 seek $handle, $tell, SEEK_SET;
297 61         587 return $end;
298             }
299              
300             sub _get_last_line ($) {
301 79     79   171 my ($self) = @_;
302 79         125 my $handle = $self->{Handle};
303 79         216 my $number = $self->{LogNumber};
304 79 50       227 my $position = tell $handle if $handle;
305              
306 79 100       166 unless ($position) { # 'if' not 'while'!
307 2         2 $number++;
308 2         6 my $log = $self->_log_file($number);
309 2         4 undef $handle; # need this to keep $self->{Handle} unmodified!
310 2 50       62 open $handle, '<', $log or return ""; # missing prev log
311 0         0 $position = $self->_seek_end_pos($handle);
312             }
313              
314 77         100 my $backstep = 256; # 255 + "\n"
315 77 100       189 $backstep = $position if $backstep > $position;
316 77         465 seek $handle, -$backstep, SEEK_CUR;
317 77         100 my $last_line;
318 77         834 read $handle, $last_line, $backstep;
319 77         346 return $last_line;
320             }
321              
322             sub _last_line ($) {
323 122     122   189 my ($self) = @_;
324 122   100     973 my $last_line = $self->{LastLine} || $self->_get_last_line();
325 122 50       948 $last_line =~ /(.{0,255})$/ and $last_line = $1;
326 122         552 return $last_line;
327             }
328              
329             # pos not found, reading log for the first time
330             sub _start($)
331             {
332 48     48   91 my $self = shift;
333 48         109 $self->{LogNumber} = 0;
334 48 100       442 if ($self->{start} eq 'end') { # move to the end of file
    100          
    50          
335 2         6 $self->_reopen(0);
336 2 100       73 $self->_seek_end_pos($self->{Handle}) if $self->{Handle};
337             } elsif ($self->{start} eq 'begin') { # move to the beginning of last file
338 43         192 $self->_reopen(0);
339             } elsif ($self->{start} eq 'first') { # find oldest file
340 3         8 $self->{LogNumber} = $self->{LastLogNumber};
341 3         8 $self->_reopen(0);
342             } else {
343 0         0 die; # impossible
344             }
345             }
346              
347             sub _reopen ($$)
348             {
349 185     185   326 my ($self, $position) = @_;
350              
351 185         461 my $log = $self->_log_file();
352              
353 185 100       11050 if (open my $FILE, "<$log") {
    50          
354 159         5312 my @stat = stat $FILE;
355 159 100       1233 return 0 if $stat[7] < $position;
356 140 100 100     811 return 0 if $stat[7] == 0 and $self->{LogNumber} == 0 and $self->{end} eq 'fixed';
      66        
357 134         1102 seek $FILE, $position, SEEK_SET;
358 134         473 $self->{Handle} = $FILE;
359 134         933 $self->{Inode} = $stat[1];
360 134         1501 return 1;
361              
362             } elsif (-e $log) {
363 0         0 die "log '$log' exists but is unreadable";
364             } else {
365 26         172 return;
366             }
367             }
368              
369             sub _set_last_log_number ($)
370             {
371 137     137   308 my ($self) = @_;
372 137         311 my $log = $self->{log};
373 137 100       14375 my @numbers = sort { $b <=> $a } map { /\.(\d+)$/ ? $1 : () } glob "$log.*";
  24         23763  
  100         1123  
374 137   100     1115 $self->{LastLogNumber} = $numbers[0] || 0;
375             }
376              
377             sub _set_eof ($)
378             {
379 85     85   138 my ($self) = @_;
380 85 100       1495 return unless $self->{end} eq 'fixed';
381 83         2290 my @stat = stat $self->{log};
382 83         604 my $eof = $stat[7];
383 83   100     612 $self->{EOF} = $eof || 0;
384             }
385              
386             sub _log_file ($;$)
387             {
388 191     191   247 my ($self, $number) = @_;
389 191 100       1960 $number = $self->{LogNumber} unless defined $number;
390 191         295 my $log = $self->{log};
391 191 100       800 $log .= ".$number" if $number;
392 191         1289 return $log;
393             }
394              
395             sub _print_position ($$)
396             {
397 10     10   18 my ($self, $pos) = @_;
398 10 50       33 my $lastline = defined $pos->{LastLine} ? $pos->{LastLine} : "[unknown]";
399 10 50       34 my $inode = defined $pos->{Inode} ? $pos->{Inode} : "[unknown]";
400 10 50       32 my $position = defined $pos->{Position} ? $pos->{Position} : "[unknown]";
401 10         16 my $logfile = $self->{log};
402 10         21 my $cursor = $self->{cursor};
403 10         791 return "Cursor: $cursor, LogFile: $logfile, Inode: $inode, Position: $position, LastLine: $lastline";
404             }
405              
406             # look through .log .log.1 .log.2, etc., until we'll find log with correct inode and/or checksum.
407             sub _find_log ($$)
408             {
409 52     52   210 my ($self, $pos) = @_;
410              
411 52         113 undef $self->{LastLine};
412 52         119 $self->_set_last_log_number();
413              
414 52         712 for ($self->{LogNumber} = 0; $self->{LogNumber} <= $self->{LastLogNumber}; $self->{LogNumber}++) {
415 110 100       318 next unless $self->_reopen($pos->{Position});
416 77 100 66     1032 next if ($self->{check_inode} and $pos->{Inode} and $self->{Inode} and $pos->{Inode} ne $self->{Inode});
      66        
      66        
417 73 100 66     972 next if ($self->{check_lastline} and $pos->{LastLine} and $pos->{LastLine} ne $self->_last_line());
      100        
418 42         69 while () {
419             # check if we're at the end of file
420 61 100       222 return 1 if $self->_find_end_pos($self->{Handle}) > tell $self->{Handle};
421              
422 27         218 while () {
423 35 100       169 return 0 if $self->{LogNumber} <= 0;
424 27         58 $self->{LogNumber}--;
425 27 100       70 last if $self->_reopen(0);
426             }
427             }
428             }
429              
430 10         43 die "unable to find the log: ", $self->_print_position($pos);
431             }
432              
433             ################################################# Public methods ######################################################
434              
435             =item B<< read() >>
436              
437             Read a line from the log file.
438              
439             =cut
440             sub read {
441 83     83 1 19500 my $self = shift;
442              
443 83         320 my $line;
444 83         104 while (1) {
445 89         239 my $FILE = $self->{Handle};
446 89 100       237 return undef unless defined $FILE;
447 86 100 100     697 if (defined $self->{EOF} and $self->{LogNumber} == 0) {
448 55         116 my $position = tell $FILE;
449 55 100       262 return undef if $position >= $self->{EOF};
450             }
451 81         1610 $line = <$FILE>;
452 81 100       245 if (defined $line) {
453 70 100       578 if ($line =~ /\n$/) {
454 69         126 last;
455             }
456 1         8 seek $FILE, - length $line, SEEK_CUR;
457             }
458 12 100       587 return undef unless $self->_find_log($self->position());
459             }
460              
461 69         210 $self->{LastLine} = $line;
462 69         393 return $line;
463             }
464              
465             =item B<< position() >>
466              
467             Get your current position in I as an object passible to C.
468              
469             =cut
470             sub position($)
471             {
472 55     55 1 450 my $self = shift;
473 55         131 my $pos = {};
474              
475 55 100       192 if ($self->{Handle}) {
476 52         184 $pos->{Position} = tell $self->{Handle};
477 52         128 $pos->{Inode} = $self->{Inode};
478 52         165 $pos->{LastLine} = $self->_last_line(); # undefined LastLine forces _last_line to backstep
479 52         470 $pos->{LogFile} = $self->{log}; # always .log, not .log.N
480             }
481              
482 55         1395 return $pos;
483             }
484              
485             =item B<< commit() >>
486              
487             =item B<< commit($position) >>
488              
489             Save the current position to the pos-file. You can also save some other position, previosly obtained with C.
490              
491             Pos-file gets commited using a temporary file, so it won't be lost if disk space is depleted.
492              
493             =cut
494             sub commit($;$)
495             {
496 39     39 1 4831 my ($self, $pos) = @_;
497 39   66     216 $pos ||= $self->position();
498 39 100       126 return unless defined $pos->{Position}; # pos is missing and log either => do nothing
499              
500 38         228 $self->{cursor}->commit($pos);
501             }
502              
503             =item B<< lag() >>
504              
505             Get the size of data remaining to be read, in bytes.
506              
507             It takes all log files into account, so if you're in the middle of I, it will return the size of remaining data in it, plus the size of I (if it exists).
508              
509             =cut
510             sub lag ($)
511             {
512 3     3 1 55 my ($self) = @_;
513 3 100       30 die "lag failed: missing log file" unless defined $self->{Handle};
514              
515 2         3 my $lag = 0;
516              
517 2         4 my $number = $self->{LogNumber};
518 2         4 while () {
519 4         10 my @stat = stat $self->_log_file($number);
520 4 50       13 $lag += $stat[7] if @stat;
521 4 100       12 last if $number <= 0;
522 2         4 $number--;
523             }
524              
525 2         6 $lag -= tell $self->{Handle};
526 2         12 return $lag;
527             }
528              
529             =item B<< log_number() >>
530              
531             Get the current log's number.
532              
533             =cut
534             sub log_number {
535 3     3 1 10 my ($self) = @_;
536 3         16 return $self->{LogNumber};
537             }
538              
539             =item B<< log_name() >>
540              
541             Get the log's name. Doesn't contain C<< .N >> postfix even if cursor points to old log file.
542              
543             =cut
544             sub log_name {
545 4     4 1 14 my ($self) = @_;
546 4         94 return $self->{log};
547             }
548              
549             =back
550              
551             =head1 BUGS & CAVEATS
552              
553             To find and open correct log is a race-condition-prone task.
554              
555             This module was used in production environment for many years, and many bugs were found and fixed. The only known case when position file can become broken is when logrotate is invoked twice in *very* short amount of time, which should never be a case.
556              
557             Don't set the I option on virtual hosts, especially on openvz-based ones. If you move your data, inodes of files will change and your position file will become broken. In fact, don't set I at all, it's deprecated.
558              
559             The logrotate config should not use the C option to make that module function properly. If you need to compress logs, set C option too.
560              
561             This module expects the logs to be named I, I, I, etc. Skipping some numbers in the sequence is ok, but postfixes should be *positive integers* to be properly sorted. If you use some other naming scheme, for example, I, you're out of luck. Patches welcome!
562              
563             =head1 AUTHORS
564              
565             Andrei Mishchenko C, Vyacheslav Matjukhin C.
566              
567             =head1 SEE ALSO
568              
569             L - another implementation of the same idea.
570              
571             L - console script for reading logs.
572              
573             =head1 COPYRIGHT
574              
575             Copyright (c) 2006-2013 Yandex LTD. All rights reserved.
576              
577             This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
578              
579             See
580              
581             =cut
582              
583             1;