File Coverage

blib/lib/ETL/Pipeline/Output.pm
Criterion Covered Total %
statement 11 11 100.0
branch n/a
condition n/a
subroutine 4 4 100.0
pod n/a
total 15 15 100.0


line stmt bran cond sub pod time code
1             =pod
2              
3             =head1 NAME
4              
5             ETL::Pipeline::Output - Role for ETL::Pipeline output destinations
6              
7             =head1 SYNOPSIS
8              
9                 use Moose;
10                 with 'ETL::Pipeline::Output';
11              
12                 sub open {
13                     # Add code to open the output destination
14                     ...
15                 }
16                 sub write {
17                     # Add code to save your data here
18                     ...
19                 }
20                 sub close {
21                     # Add code to close the destination
22                     ...
23                 }
24              
25             =head1 DESCRIPTION
26              
27             An I<output destination> fulfills the B<load> part of B<ETL>. This is where the
28             data ends up. These are the outputs of the process.
29              
30             A destination can be anything: a database, a file, or any other data store.
31             Destinations are customized to your environment, and you will probably have only a few.
32              
33             L<ETL::Pipeline> interacts with the output destination in 3 stages...
34              
35             =over
36              
37             =item 1. Open - connect to the database, open the file, whatever setup is appropriate for your destination.
38              
39             =item 2. Write - called once per record. This is the part that actually performs the output.
40              
41             =item 3. Close - called when processing finishes, to cleanly shut down the destination.
42              
43             =back
44              
45             This role sets the requirements for these 3 methods. It should be consumed by
46             B<all> output destination classes. L<ETL::Pipeline> relies on the destination
47             having this role.
48              
49             =head2 How do I create an output destination?
50              
51             L<ETL::Pipeline> provides a couple of generic output destinations as examples or
52             for very simple uses. The real value of L<ETL::Pipeline> comes from adding your
53             own, business-specific destinations...
54              
55             =over
56              
57             =item 1. Start a new Perl module. I recommend putting it in the C<ETL::Pipeline::Output> namespace. L<ETL::Pipeline> will pick it up automatically.
58              
59             =item 2. Make your module a L<Moose> class - C<use Moose;>.
60              
61             =item 3. Consume this role - C<with 'ETL::Pipeline::Output';>.
62              
63             =item 4. Write the L</open>, L</close>, and L</write> methods.
64              
65             =item 5. Add any attributes for your class.
66              
67             =back
68              
69             The new destination is ready to use, like this...
70              
71             $etl->output( 'YourNewDestination' );
72              
73             You can leave off the leading B<ETL::Pipeline::Output::>.
74              
75             When L<ETL::Pipeline> calls L</open> or L</close>, it passes the
76             L<ETL::Pipeline> object as the only parameter. When L<ETL::Pipeline> calls
77             L</write>, it passes two parameters - the L<ETL::Pipeline> object and the
78             record. The record is a Perl hash.
79              
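The steps above can be sketched as one small class. This is only an illustration, not a destination that ships with L<ETL::Pipeline>: the class name C<TabFile>, the C<path> attribute, and the private C<_handle> attribute are all hypothetical. It writes each record as one tab-separated line, with fields in sorted key order. Because the class defines its own C<open> and C<close> methods, the built-in functions are called as C<CORE::open> and C<CORE::close> to avoid ambiguity.

```perl
package ETL::Pipeline::Output::TabFile;    # hypothetical class name

use Moose;
use Carp qw(croak);

with 'ETL::Pipeline::Output';

# Hypothetical attribute: the file that receives the records.
has 'path' => (is => 'ro', isa => 'Str', required => 1);

# Private file handle, set by "open" and released by "close".
has '_handle' => (is => 'rw');

# Called once, with the ETL::Pipeline object as the only parameter.
sub open {
    my ($self, $etl) = @_;
    # CORE::open is the built-in, not this class's "open" method.
    CORE::open( my $fh, '>', $self->path )
        or croak 'Unable to open ' . $self->path . ": $!";
    $self->_handle( $fh );
}

# Called once per record. The record is a hash reference.
sub write {
    my ($self, $etl, $record) = @_;
    my $fh = $self->_handle;
    # Undefined values are written as empty fields.
    print {$fh} join( "\t", map { $record->{$_} // '' } sort keys %$record ), "\n";
}

# Called once, after the last record.
sub close {
    my ($self, $etl) = @_;
    my $fh = $self->_handle;
    CORE::close( $fh ) if defined $fh;
}

no Moose;
__PACKAGE__->meta->make_immutable;
1;
```

If the pipeline forwards extra arguments of C<output> to the class constructor (an assumption here, not something this document states), the destination could be selected with C<< $etl->output( 'TabFile', path => 'records.tsv' ) >>.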
80             =head2 Example destinations
81              
82             L<ETL::Pipeline> comes with a couple of generic output destinations...
83              
84             =over
85              
86             =item L<ETL::Pipeline::Output::Hash>
87              
88             Stores records in a Perl hash. Useful for loading support files and tying
89             them together later.
90              
91             =item L<ETL::Pipeline::Output::Perl>
92              
93             Executes a subroutine against the record. Useful for debugging data issues.
94              
95             =back
96              
97             =head2 Why this way?
98              
99             My work involves a small number of destinations that rarely change and a greater
100             number of sources that do change. So I designed L<ETL::Pipeline> to minimize
101             time writing new input sources. The trade-off was slightly more complex output
102             destinations.
103              
104             =head2 Upgrading from older versions
105              
106             L<ETL::Pipeline> version 3 is not compatible with output destinations from older
107             versions. You will need to rewrite your custom output destinations.
108              
109             =over
110              
111             =item Change C<configure> to L</open>.
112              
113             =item Change C<finish> to L</close>.
114              
115             =item Change C<write_record> to L</write>.
116              
117             =item Remove C<set> and C<new_record>. All records are Perl hashes.
118              
119             =item Adjust attributes as necessary.
120              
121             =back
122              
123             =cut
124              
125             package ETL::Pipeline::Output;
126              
127 10     10   7773 use 5.014000;
  10         38  
128 10     10   55 use warnings;
  10         23  
  10         330  
129              
130 10     10   129 use Moose::Role;
  10         25  
  10         97  
131              
132              
133             our $VERSION = '3.00';
134              
135              
136             =head1 METHODS & ATTRIBUTES
137              
138             =head3 close
139              
140             Shut down the output destination. This method may close files, disconnect from
141             the database, or anything else required to cleanly terminate the output.
142              
143             B<close> receives one parameter - the L<ETL::Pipeline> object.
144              
145             The output destination is closed B<after> the input source, at the end of the
146             B<ETL> process.
147              
148             =cut
149              
150             requires 'close';
151              
152              
153             =head3 open
154              
155             Prepare the output destination for use. It can open files, make database
156             connections, or anything else required to access the destination.
157              
158             B<open> receives one parameter - the L<ETL::Pipeline> object.
159              
160             The output destination is opened B<before> the input source, at the beginning
161             of the B<ETL> process.
162              
163             =cut
164              
165             requires 'open';
166              
167              
168             =head3 write
169              
170             Send a single record to the destination. The ETL process calls this method in a
171             loop. It receives two parameters - the L<ETL::Pipeline> object, and the current
172             record as a Perl hash.
173              
174             If your code encounters an error, B<write> can call L<ETL::Pipeline/error> with
175             the error message. L<ETL::Pipeline/error> automatically includes the record
176             count with the error message. You should add any other troubleshooting
177             information such as file names or key fields.
178              
179                 sub write {
180                     my ($self, $etl, $record) = @_;
181                     my $id = $record->{ID};
182                     $etl->error( "Error message here for id $id" );
183                 }
184              
185             For fatal errors, I recommend using the C<croak> function from L<Carp>.
186              
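The difference between the two kinds of error can be sketched as follows. This is a minimal, self-contained illustration in plain Perl so that it runs without the pipeline installed: C<My::Destination> and C<My::FakePipeline> are hypothetical stand-ins, and the fake pipeline only mimics the C<error> method by collecting messages. A real destination would be a Moose class consuming this role and would receive a real L<ETL::Pipeline> object.

```perl
use strict;
use warnings;

package My::Destination;    # hypothetical stand-in for a real destination
use Carp qw(croak);

sub new {
    my ($class, %args) = @_;
    return bless { %args, rows => [] }, $class;
}

sub write {
    my ($self, $etl, $record) = @_;

    # Fatal: the destination is unusable, so stop the whole run.
    croak 'Destination is not open' unless $self->{is_open};

    # Recoverable: report this record through the pipeline and carry on.
    unless (defined $record->{ID}) {
        $etl->error( 'Record has no ID; skipped' );
        return;
    }

    push @{$self->{rows}}, $record;
    return;
}

package My::FakePipeline;   # stand-in for ETL::Pipeline, illustration only

sub new { return bless { errors => [] }, shift; }

# Collects messages instead of reporting them.
sub error {
    my ($self, $message) = @_;
    push @{$self->{errors}}, $message;
    return;
}

package main;

my $etl  = My::FakePipeline->new;
my $dest = My::Destination->new( is_open => 1 );

$dest->write( $etl, { ID => 1, Name => 'stored' } );   # saved
$dest->write( $etl, { Name => 'no id' } );             # reported, not saved

# A closed destination dies instead of reporting.
my $closed = My::Destination->new( is_open => 0 );
my $died   = !eval { $closed->write( $etl, { ID => 2 } ); 1 };
```

The recoverable branch returns after reporting, so one bad record does not stop the loop. The C<croak> branch ends the run and reports the failure from the caller's point of view.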
187             =cut
188              
189             requires 'write';
190              
191              
192             =head1 SEE ALSO
193              
194             L<ETL::Pipeline>, L<ETL::Pipeline::Input>, L<ETL::Pipeline::Output::Hash>,
195             L<ETL::Pipeline::Output::Perl>, L<ETL::Pipeline::Output::UnitTest>
196              
197             =head1 AUTHOR
198              
199             Robert Wohlfarth <robert.j.wohlfarth@vumc.org>
200              
201             =head1 LICENSE
202              
203             Copyright 2021 (c) Vanderbilt University
204              
205             This program is free software; you can redistribute it and/or modify it under
206             the same terms as Perl itself.
207              
208             =cut
209              
210 10     10   32337 no Moose;
  10         26  
  10         183  
211              
212             # Required by Perl to load the module.
213             1;