File Coverage

blib/lib/Excel/Template/XLSX.pm
Criterion Covered Total %
statement 22 24 91.6
branch n/a
condition n/a
subroutine 8 8 100.0
pod n/a
total 30 32 93.7


line stmt bran cond sub pod time code
1             package Excel::Template::XLSX;
2            
3 1     1   13609 use strict;
  1         2  
  1         24  
4 1     1   3 use warnings;
  1         1  
  1         20  
5 1     1   3 use base 'Excel::Writer::XLSX';
  1         4  
  1         1278  
6            
7 1     1   159424 use version; our $VERSION = version->declare("v1.0.7");
  1         1350  
  1         5  
8            
9 1     1   67 use Archive::Zip;
  1         1  
  1         39  
10 1     1   594 use Graphics::ColorUtils 'rgb2hls', 'hls2rgb';
  1         5170  
  1         64  
11 1     1   5 use Scalar::Util 'openhandle';
  1         2  
  1         38  
12 1     1   1874 use XML::Twig;
  0            
  0            
13            
14             1; # Return True from module
15            
16             ###############################################################################
17             sub __podhead {
18            
19             =for pod
20            
21             =head1 NAME
22            
23             Excel-Template-XLSX - Create Excel .xlsx files starting from (one or more) template(s).
24            
25             =head1 SYNOPSIS
26            
27             use Excel::Template::XLSX;
28             my ($self, $workbook) = Excel::Template::XLSX->new('perl.xlsx', 'template1.xlsx', 'template2.xlsx', ...);
29             $self->parse_template();
30            
31             # Add a worksheet, ... and anything else you would do with Excel::Writer::XLSX
32             $worksheet = $workbook->add_worksheet();
33            
34             # Although Excel::Writer::XLSX says the workbook will automatically get
35             # closed during global destruction. This wrapper around Excel::Writer::XLSX may
36             # mess this up, and it is better to specifically close your workbook when you are done.
37             $workbook->close();
38            
39             =head1 DESCRIPTION
40            
41             This module is a companion to
42             L(EWX), or
43             if you prefer, a wrapper to that module. It uses EWX as a base class. It opens
44             an existing spreadsheet file (.xlsx format), and also creates a new EWX object.
45             As it parses the template file(s), it calls EWX methods to re-create the template
46             contents in the EWX object.
47            
48             When parsing is complete, the workbook object is left open for the calling perl
49             script to add additional content.
50            
51             The purpose of this module is to separate the roles of content/presentation vs
52             programming in an Excel document, in much the same way that HTML templating
53             engines work. A user who is knowledgeable in Excel can create an Excel file for
54             use as a template, without requiring the skill set of Perl or
55             Excel::Writer::XLSX. Conversely, the Perl programmer who is creating dynamic
56             content does not need design skills to layout the presentation in the template.
57            
58             =head1 WHAT IT CAN DO
59            
60             Cell Values (strings, numbers, dates, ... )
61             Cell Formulas
62             Cell Hyperlinks
63            
64             Cell Formatting (borders, shading, fonts, font sizes, colors)
65            
66             Column Widths
67             Row Widths
68            
69             Headers and Footers
70            
71             Simple template variables (via callback) See L
72            
73             =head1 WHAT IT CAN NOT DO
74            
75             Excel::Template::Excel can not modify Excel files in place! It is not
76             intended to. Since the parser directly adds content to the EWX workbook object
77             as the contents are parsed, both the template, and the output file must
78             be open at the same time.
79            
80             It may be possible to open the output file to a file handle, and
81             after parsing of the template is complete, write the contents of file
82             over the template. The author has not tried this.
83            
84             It is not the design of this module to faithfully re-create the entire
85             contents of the template file in the EWX output. If you are using this
86             module to rewrite Excel files, you are on your own.
87            
88             These items are completely dropped from the output file:
89            
90             Images in the Sheet
91             Images in Headers/Footers
92             Charts
93             Shapes
94             Themes (gradients, fonts, fills, styles)
95             macros
96             modules (vba code)
97            
98             And probably other things. See the tests (t directory of the distribution)
99             for examples of what does work.
100            
101             =head1 SUBROUTINES AND METHODS
102            
103             =head2 __podhead
104            
105             Dummy subroutine to allow me to hide this pod documentation when using code
106             folding in the editor.
107            
108             =cut
109            
110             }
111             ###############################################################################
112             sub new {
113            
114             =head2 new
115            
116             Creates a new Excel::Template::XLSX object, and also creates a new
117             Excel::Writer::XLSX object. A workbook object is created for the output file.
118            
119             Returns the Template object, and the workbook object. Workbook object is also
120             available as $self->{EWX}; If the caller is only expecting a single
121             return value, then just the $self object is returned.
122            
123             =cut
124            
125             my ( $class, $output_file, @template_files ) = @_;
126             my $self = {
127             FORMATS => [],
128             HYPERLINKS => {},
129             NEED_PROPS => 1,
130             PRINT_AREA => {},
131             PRINT_TITLES => {},
132             MERGED_RANGES => {},
133             SHARED_STRINGS => [],
134             THEMES => [],
135             ZIP => [],
136            
137             template_callback => undef,
138             };
139            
140             # Create a new Excel workbook
141             $self->{EWX} = Excel::Writer::XLSX->new($output_file);
142             if ( defined $self->{EWX} ) {
143             $self->{DEFAULT_FORMAT} = $self->{EWX}->add_format();
144             bless $self, $class;
145             }
146             else {
147             die
148             "Can't create new Excel::Writer::XLSX object using file ($output_file) $!";
149             }
150            
151             foreach my $template_file (@template_files) {
152             my $zip = Archive::Zip->new;
153             if ( openhandle($template_file) ) {
154             bless $template_file, 'IO::File'
155             if ref($template_file) eq 'GLOB'; # sigh
156             my $status = $zip->readFromFileHandle($template_file);
157             unless ( $status == Archive::Zip::AZ_OK ) {
158             warn "Can't open filehandle as a zip file, skipping";
159             $zip = undef;
160             }
161             }
162             elsif ( !ref($template_file) ) {
163             my $status = $zip->read($template_file);
164             unless ( $status == Archive::Zip::AZ_OK ) {
165             $template_file //= '(undef)';
166             warn "Can't open file '$template_file' as a zip file, skipping";
167             $zip = undef;
168             }
169             }
170             else {
171             warn
172             "Argument to 'new' must be a filename or open filehandle. skipping $template_file";
173             $zip = undef;
174             }
175            
176             # Create a list of template files to add to the workbook
177             push @{ $self->{ZIP} }, $zip;
178             }
179             if (wantarray) {
180             return ( $self, $self->{EWX} );
181             }
182             else {
183             return $self;
184             }
185             }
186             ###############################################################################
187             sub parse_template {
188            
189             =head2 parse_template
190            
191             Parses common elements of the Spreadsheet, such as themes, styles, and strings.
192             These are stored in the main object ($self).
193            
194             Finds each sheet in the workbook, and initiates parsing of each sheet.
195            
196             Properties for the created workbook are set from the first template that has
197             properties. Properties in subsequent workbooks are ignored.
198            
199             =cut
200            
201             my $self = shift;
202            
203             my $remap = {
204             title => 'title',
205             subject => 'subject',
206             creator => 'author',
207             keywords => 'keywords',
208             description => 'comments',
209            
210             manager => 'manager',
211             company => 'company',
212             category => 'category',
213             status => 'status',
214             };
215            
216             for my $z ( 0 .. $#{ $self->{ZIP} } ) {
217             my $zip = $self->{ZIP}[$z] // next;
218             $self->{PRINT_TITLES} = {};
219             $self->{SHARED_STRINGS} = [];
220             $self->{FORMATS} = [];
221            
222             my $files = $self->_extract_files($zip);
223            
224             my $callback = $self->{template_callback};
225             my $call = ref($callback) eq 'CODE';
226             if ( $self->{NEED_PROPS} ) {
227             if ( my @core_nodes
228             = $files->{core}->find_nodes('//cp:coreProperties') )
229             {
230             my $core = shift @core_nodes;
231             my %hash = map {
232             my $prop = $core->first_child( "dc:" . $_ )
233             // $core->first_child( "cp:" . $_ );
234             my %pair = ();
235             if ($prop) {
236             my $text = $prop->text();
237             $call and $self->$callback( \$text );
238             %pair = ( $remap->{$_}, $text );
239             }
240             %pair;
241             } keys %$remap;
242             $self->{EWX}->set_properties(%hash);
243             $self->{NEED_PROPS} = 0;
244             }
245             }
246            
247             $self->{THEMES}
248             = $self->_parse_themes( ( values %{ $files->{themes} } )[0] );
249            
250             $self->_parse_styles( $files->{styles} );
251             $self->_parse_shared_strings( $files->{strings} );
252            
253             # Defined Names (includes print area, print titles)
254             map {
255             my $name = $_->att('name') // '';
256             my $address = $_->text();
257            
258             # Print Titles (may contain none, one, or both. Delimited by comma if both supplied)
259             # e.g. Title_Page!$A:$A
260             if ( $name eq '_xlnm.Print_Titles' ) {
261             my @title = split( ',', $address );
262             foreach (@title) {
263             my ( $sheet_name, $range ) = split('!');
264             push @{ $self->{PRINT_TITLES}{$sheet_name} }, $range;
265             }
266            
267             # Print Area (Save it until sheets are processed)
268             }
269             elsif ( $name eq '_xlnm.Print_Area' ) {
270             my @title = split( ',', $address );
271             my ( $sheet_name, $range ) = split( '!', $address );
272             $self->{PRINT_AREA}{$sheet_name} = $range;
273             }
274             else {
275             $self->{EWX}->define_name( $name, $address );
276             }
277             } $files->{workbook}->find_nodes('//definedNames/definedName');
278            
279             # Sheets: Add a worksheet for each sheet in workbook
280             # Rename sheet if template(s) already has a sheet by that name
281             map {
282             my $name = $_->att('name');
283             my $test = $name;
284             for ( my $i = 1; ; $i++ ) {
285             last unless $self->{EWX}->get_worksheet_by_name($test);
286             $test = $name . "($i)";
287             }
288             my $sheet = $self->{EWX}->add_worksheet($test);
289            
290             my $range = $self->{PRINT_AREA}{$name};
291             $sheet->print_area($range) if $range;
292            
293             foreach my $range ( @{ $self->{PRINT_TITLES}{$name} } ) {
294            
295             # Row Range like $1:$1
296             $sheet->repeat_rows($range) if $range =~ m/\d/;
297            
298             # Column Range like $A:$A
299             $sheet->repeat_columns($range) if $range =~ m/[A-Za-z]/;
300             }
301            
302             # Parse the contents of the sheet
303             my $idx = $_->att('r:id');
304             $self->_parse_sheet( $sheet, $files->{sheets}{$idx} );
305             } $files->{workbook}->find_nodes('//sheets/sheet');
306             $self->{ZIP}[$z] = undef;
307             }
308             }
309             ###############################################################################
310             sub template_callback {
311            
312             =head2 template_callback
313            
314             Place holder method for a callback routine to modify the content of the template
315             before being written to the output spreadsheet.
316            
317             This callback is activated for all shared string (both plain and rich text
318             strings), and also for header/footer text.
319            
320             The callback is supplied with the two parameters: The object name (since this is
321             a method), and the text to be processed. This is passed as a reference to single
322             scalar.
323            
324             This method is called numerous times during processing (e.g. once for each
325             unique string in the spreadsheet, so the user is advised to keep it efficient.
326            
327             This callback approach does not force any particular templating system on the
328             user. They are free to use whatever system they choose.
329            
330             Note that templating can only do simple scalars. Complex templating (if-then-
331             else, loops, etc) do not make sense in that the callback is supplied with the
332             contents of a single cell. Having said that, remember that the full power of
333             Excel::Writer::XLSX is available to the user to modify the template after it is
334             processed.
335            
336             # A snippet of code to replace [% template %] in the
337             # template spreadsheet with 'Output'
338            
339             my ($self, $wbk) = Excel::Template::XLSX->new($output_xlsx, $template_xlsx);
340            
341             use Template::Tiny;
342             my $template = Template::Tiny->new( TRIM => 1 );
343             $self->{template_callback} = sub {
344             my ($self, $textref) = @_;
345             $template->process($textref, { template => 'Output' }, $textref );
346             };
347            
348             $self->parse_template();
349            
350             =cut
351            
352             my $self = shift;
353             my ($text) = @_;
354             }
355             ###############################################################################
356             sub _apply_tint {
357            
358             =head2 _apply_tint
359            
360             Applies tinting to a color object, if the tint attribute is encountered in
361             parsing.
362            
363             =cut
364            
365             my $self = shift;
366             my ( $color, $tint ) = @_;
367            
368             my ( $r, $g, $b ) = map { oct("0x$_") } $color =~ /#(..)(..)(..)/;
369             my ( $h, $l, $s ) = rgb2hls( $r, $g, $b );
370            
371             if ( $tint < 0 ) {
372             $l = $l * ( 1.0 + $tint );
373             }
374             else {
375             $l = $l * ( 1.0 - $tint ) + ( 1.0 - 1.0 * ( 1.0 - $tint ) );
376             }
377            
378             return scalar hls2rgb( $h, $l, $s );
379             }
380             ###############################################################################
381             sub _base_path_for {
382            
383             =head2 _base_path_for
384            
385             Manipulates the path to a member in the zip file, to find the associated
386             rels file.
387            
388             =cut
389            
390             my $self = shift;
391             my ($file) = @_;
392            
393             my @path = split '/', $file;
394             pop @path;
395            
396             return join( '/', @path ) . '/';
397             }
398             ###############################################################################
399             sub _cell_to_row_col {
400            
401             =head2 _cell_to_row_col
402            
403             Converts an A1 style cell reference to a row and column index.
404            
405             =cut
406            
407             my $self = shift;
408             my $cell = shift;
409            
410             my ( $col, $row ) = $cell =~ /([A-Z]+)([0-9]+)/;
411            
412             my $ncol = 0;
413             for my $char ( split //, $col ) {
414             $ncol *= 26;
415             $ncol += ord($char) - ord('A') + 1;
416             }
417             $ncol = $ncol - 1;
418             my $nrow = $row - 1;
419             return ( $nrow, $ncol );
420             }
421             ###############################################################################
422             sub _color {
423            
424             =head2 _color
425            
426             Parses color element (rgb, index, theme, and tint)
427            
428             =cut
429            
430             my $self = shift;
431             my ( $color_node, $fill ) = @_;
432            
433             my $themes = $self->{THEMES};
434             my $color;
435             if ( $color_node && !$color_node->att('auto') ) {
436             my $rgb = $color_node->att('rgb');
437             my $theme = $color_node->att('theme');
438             my $index = $color_node->att('indexed');
439             my $tint = $color_node->att('tint');
440            
441             # see https://rt.cpan.org/Public/Bug/Display.html?id=93065 (still needed for XLSX??)
442             # defined $index and $color = ($fill && $index == 64) ? '#FFFFFF' : $index;
443             $rgb and $color = '#' . substr( $rgb, 2, 6 );
444             defined $theme and $color = '#' . $themes->{Color}[$theme];
445             $tint and $color = $self->_apply_tint( $color, $tint );
446             }
447             return $color;
448             }
449             ###############################################################################
450             sub _extract_files {
451            
452             =head2 _extract_files
453            
454             Called by parse_template to fetch the xml strings from the zip file. XML
455             strings are parsed, except for worksheets. Individual worksheets are
456             parsed separately.
457            
458             =cut
459            
460             my $self = shift;
461             my ($zip) = @_;
462            
463             my $type_base
464             = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships';
465            
466             my $rels = $self->_parse_xml( $zip, $self->_rels_for('') );
467            
468             my $node = qq;
469             my $wb_name = ( $rels->find_nodes($node) )[0]->att('Target');
470             my $wb_xml = $self->_parse_xml( $zip, $wb_name );
471            
472             my $path_base = $self->_base_path_for($wb_name);
473             my $wb_rels = $self->_parse_xml( $zip, $self->_rels_for($wb_name) );
474            
475             my $string_xpath = qq;
476             my ($strings_xml) = map {
477             $zip->memberNamed( $path_base . $_->att('Target') )->contents
478             } $wb_rels->find_nodes($string_xpath);
479            
480             my $style_xpath = qq;
481             my $style_target
482             = ( $wb_rels->find_nodes($style_xpath) )[0]->att('Target');
483             my $styles_xml = $self->_parse_xml( $zip, $path_base . $style_target );
484            
485             my %sheet_rels;
486             my $wks_xpath = qq;
487             my %worksheet_xml = map {
488            
489             my $sheet_file = $path_base . $_->att('Target');
490             my $rels_file = $self->_rels_for($sheet_file);
491             my $sheet_rels = '';
492             if ( $zip->memberNamed($rels_file) ) {
493             $sheet_rels = $self->_parse_xml( $zip, $rels_file );
494             }
495            
496             if ( my $contents = $zip->memberNamed($sheet_file)->contents ) {
497             ( $_->att('Id') => { 'xml' => $contents, 'rels' => $sheet_rels } );
498             }
499            
500             } $wb_rels->find_nodes($wks_xpath);
501            
502             my %themes_xml = map {
503             $_->att('Id') =>
504             $self->_parse_xml( $zip, $path_base . $_->att('Target') )
505             } $wb_rels->find_nodes(qq);
506            
507             my $core_base
508             = 'http://schemas.openxmlformats.org/package/2006/relationships/metadata';
509             my $core_full = qq;
510             my $core_name = ( $rels->find_nodes($core_full) )[0]->att('Target');
511             my $core_xml = $self->_parse_xml( $zip, $core_name );
512            
513             return {
514             workbook => $wb_xml,
515             styles => $styles_xml,
516             sheets => \%worksheet_xml,
517             themes => \%themes_xml,
518             core => $core_xml,
519             ( $strings_xml ? ( strings => $strings_xml ) : () ),
520             };
521             }
522             ###############################################################################
523             sub _parse_alignment {
524            
525             =head2 _parse_alignment
526            
527             Parses horizontal and vertical cell alignments in a sheet.
528            
529             =cut
530            
531             my $self = shift;
532             my ($node) = @_;
533            
534             my %align_map = (
535             horizontal => 'align',
536             vertical => 'valign',
537             textRotation => 'rotation',
538             indent => 'indent',
539             wrapText => 'text_wrap',
540             shrinkToFit => 'shrink',
541             );
542             my %align = ();
543             if ( my $alignment = $node->first_child('alignment') ) {
544             map {
545             my $v = $alignment->att($_);
546             if ( defined $v ) {
547             $v = 'vcenter' if ( $_ eq 'vertical' ) and ( $v eq 'center' );
548             $align{ $align_map{$_} } = $v;
549             }
550             } keys %align_map;
551             }
552             return %align;
553             }
554             ###############################################################################
555             sub _parse_borders {
556            
557             =head2 _parse_borders
558            
559             Parses cell borders and diagonal borders.
560            
561             =cut
562            
563             my $self = shift;
564             my ($styles) = @_;
565            
566             my $borders = [];
567             my %border_map = (
568             dashDot => 9,
569             dashDotDot => 11,
570             dashed => 3,
571             dotted => 4,
572             double => 6,
573             hair => 7,
574             medium => 2,
575             mediumDashDot => 10,
576             mediumDashDotDot => 12,
577             mediumDashed => 8,
578             none => 0,
579             slantDashDot => 13,
580             thick => 5,
581             thin => 1,
582             );
583             push @$borders, map {
584             my $border = $_;
585            
586             # XXX specs say "begin" and "end" rather than "left" and "right",
587             # but... that's not what seems to be in the file itself (sigh)
588            
589             my %colors = ();
590             map {
591             my $color
592             = $self->_color(
593             $border->first_child($_)->first_child('color') );
594             $colors{ $_ . '_color' } = $color if $color;
595             } qw(left right top bottom);
596            
597             my %types = ();
598             map {
599             my $style = $border->first_child($_)->att('style');
600             $types{$_} = $border_map{$style} if $style;
601             } qw(left right top bottom);
602            
603             my %diag = ();
604             my $down = $border->att('diagonalDown') // 0;
605             my $up = $border->att('diagonalUp') // 0;
606             $diag{'diag_type'} = 2 * $down + $up if $down + $up;
607             my $dborder = $border->first_child('diagonal')->att('style');
608             $diag{'diag_border'} = $border_map{$dborder} if $dborder;
609             my $dcolor = $border->first_child('diagonal')->first_child('color');
610             $diag{'diag_color'} = $self->_color($dcolor) if $dcolor;
611            
612             my $border_ref = { %colors, %types, %diag };
613             } $styles->find_nodes('//borders/border');
614             return $borders;
615             }
616             ###############################################################################
617             sub _parse_fills {
618            
619             =head2 _parse_fills
620            
621             Parses styles for cell fills (pattern, foreground and background colors.
622             horizontal and horizontal and vertical cell alignments in a sheet.
623            
624             Gradients are parsed, but since EWX does not support gradients, a
625             pattern is substituted.
626            
627             =cut
628            
629             my $self = shift;
630             my ($styles) = @_;
631             my %fill_map = (
632             darkDown => 7,
633             darkGray => 3,
634             darkGrid => 9,
635             darkHorizontal => 5,
636             darkTrellis => 10,
637             darkUp => 8,
638             darkVertical => 6,
639             gray0625 => 18,
640             gray125 => 17,
641             lightDown => 13,
642             lightGray => 4,
643             lightGrid => 15,
644             lightHorizontal => 11,
645             lightTrellis => 16,
646             lightUp => 14,
647             lightVertical => 12,
648             mediumGray => 2,
649             none => 0,
650             solid => 1,
651             );
652            
653             # Pattern Fills / # Gradient Fills
654             # EWX does not support Gradient fills (yet??)
655             # so, substitute a pattern fill to keep indices aligned
656             my $fills = [];
657             push @$fills, map {
658             my ( $fill, @color );
659             my $pat = $_->first_child('patternFill');
660             if ($pat) {
661             for (qw[fg bg]) {
662             my $fgbg = $self->_color( $pat->first_child("${_}Color"), 1 );
663             push @color, ( "${_}_color", $fgbg ) if $fgbg;
664             }
665             $fill = { pattern => $fill_map{ $pat->att('patternType') }, @color };
666             }
667             my $gradient = $_->first_child('gradientFill');
668             if ($gradient) {
669             my @stop_colors = $gradient->find_nodes('stop/color');
670             my $fg = $self->_color( $stop_colors[0], 1 );
671             my $bg = $self->_color( $stop_colors[1], 1 );
672             my %hfg = ( 'fg_color' => ( $fg // 'white' ) );
673             my %hbg = ( 'bg_color' => ( $bg // 'black' ) );
674            
675             ### ?? Create a lightGrid pattern in place of a gradient for now
676             $fill = { pattern => $fill_map{'lightGrid'}, %hfg, %hbg };
677             }
678             $fill;
679             } $styles->find_nodes('//fills/fill');
680             $fills;
681             }
682             ###############################################################################
683             sub _parse_fonts {
684            
685             =head2 _parse_fonts
686            
687             Parses font information (font name, size, super/sub scripts, alignment
688             colors, underline, bold, italic, and strikeout attributes).
689            
690             =cut
691            
692             my $self = shift;
693             my ( $styles, $xpath ) = @_;
694             $xpath //= '//fonts/font';
695            
696             my $fonts = [];
697             @$fonts = map {
698            
699             my $u = $_->first_child('u');
700             my $vert = $_->first_child('vertAlign');
701             my $font;
702            
703             my $size = $_->first_child('sz')->att('val');
704             $font->{'size'} = $size if $size;
705            
706             # XXX if color tag is missing is it black?? '#000000'
707             my $color = $_->first_child('color');
708             $font->{'color'} = $self->_color($color) if $color;
709            
710             my $script_map = {
711             'superscript' => 1,
712             'subscript' => 2,
713             };
714            
715             if ( defined $vert ) {
716             my $script = $vert->att('val');
717             $font->{'font_script'} = $script_map->{$script} if $script;
718             }
719            
720             my $u_map = {
721             'single' => 1,
722             'double' => 2,
723             'singleAccounting' => 33,
724             'doubleAccounting' => 34,
725             };
726             if ( defined $u ) {
727            
728             # XXX sometimes style xml files can contain just with no
729             # val attribute. i think this means single underline, but not sure
730             my $underline = $u->att('val') // 'single';
731             $font->{'underline'} = $u_map->{$underline} if $underline;
732             }
733            
734             my $font_name = $_->first_child('name');
735             $font->{'font'} = $font_name->att('val') if $font_name;
736            
737             # Alternate for rich strings (embedded font)
738             my $rFont = $_->first_child('rFont');
739             $font->{'font'} = $rFont->att('val') if $rFont;
740            
741             my $bold = $_->first_child('b');
742             $font->{'bold'} = 1 if $bold;
743            
744             my $italic = $_->first_child('i');
745             $font->{'italic'} = 1 if $italic;
746            
747             my $strike = $_->first_child('strike');
748             $font->{'font_strikeout'} = 1 if $strike;
749            
750             $font;
751             } $styles->find_nodes($xpath);
752             return $fonts;
753             }
754             ###############################################################################
755             sub _parse_numbers {
756            
757             =head2 _parse_numbers
758            
759             Parses styles for cell number formats (financial, decimal, exponential, date-time, ...)
760            
761             =cut
762            
763             my $self = shift;
764             my ($styles) = @_;
765             my $number_format = { 0 => {} };
766             map {
767             my $id = $_->att('numFmtId') // 0;
768            
769             # defaults are from
770             #http://social.msdn.microsoft.com/Forums/en-US/oxmlsdk/thread/e27aaf16-b900-4654-8210-83c5774a179c
771             # Defaults do not need to be re-created.
772             my $code = $_->att('formatCode') // $id;
773             $number_format->{$id} = { num_format => $code } if $id;
774             } $styles->find_nodes('//numFmts/numFmt');
775             return $number_format;
776             }
777             ###############################################################################
778             sub _parse_protection {
779            
780             =head2 _parse_protection
781            
782             Parses locked and hidden attributes for a cell. These are only
783             useful if the worksheet is locked.
784            
785             This module does not lock the workbook or the worksheet.
786            
787             =cut
788            
789             my $self = shift;
790             my ($node) = @_;
791             my @protection = qw(locked hidden);
792             my %prot = ();
793             if ( my $protection = $_->first_child('protection') ) {
794             map {
795             my $v = $protection->att($_);
796             $prot{$_} = $v if defined $v;
797             } @protection;
798             }
799             return %prot;
800             }
801             ###############################################################################
802             sub _parse_shared_strings {
803            
804             =head2 _parse_shared_strings
805            
806             Parses the shared strings file. Excel does not directly store
807             string values with the cell, but stores an index into the shared
808             strings table instead, to save memory, if a string value is
809             referenced more than once. Shared strings also contain
810             formatting if multiple formats are applied within a cell (See
811             write_rich_string in EWX.
812            
813             =cut
814            
815             my $self = shift;
816             my ($strings) = @_;
817            
818             return unless $strings;
819             my $xml = XML::Twig->new(
820             twig_handlers => {
821             'si' => sub {
822             my ( $twig, $si ) = @_;
823            
824             my $callback = $self->{template_callback};
825             my $call = ref($callback) eq 'CODE';
826            
827             # plain text strings
828             my $t = $si->first_child('t');
829             if ($t) {
830             my $text = $t->text();
831             $call and $self->$callback( \$text );
832             push @{ $self->{SHARED_STRINGS} }, $text;
833             }
834            
835             # rich text strings; String item (si) with multiple
836             # text elements, with optional formatting
837             my $rich = [];
838             for my $r ( $si->find_nodes('r') ) {
839             my $text = $r->first_child('t')->text();
840             $call and $self->$callback( \$text );
841             my $rPr = $r->first_child('rPr');
842            
843             if ($rPr) {
844             my $xml = $r->first_child('rPr')->outer_xml();
845             my $twig = XML::Twig->parse($xml);
846             my $fonts = $self->_parse_fonts( $twig, '//rPr' );
847             my $format = $self->{EWX}->add_format( %{ $fonts->[0] } );
848             push @$rich, $format, $text;
849             }
850             else {
851             push @$rich, $text;
852             }
853             }
854             push( @{ $self->{SHARED_STRINGS} }, $rich ) if scalar(@$rich);
855             $twig->purge;
856             }
857             }
858             ); # } twig_handlers ) new
859             $xml->parse($strings);
860             }
861             ###############################################################################
862             sub _parse_sheet {
863            
864             =head2 _parse_sheet
865            
866             Parses an individual worksheet. This is done in two passes.
867             See _parse_sheet_pass1 and _parse_sheet_pass2 for what elements are
868             parsed. This is necessary because the parse order of XML::Twig callbacks
869             are in the wrong order for some sheet information (header/footer information,
870             hyperlinks, and merged cells).
871            
872             =cut
873            
874             my $self = shift;
875             my ( $sheet, $sheet_file ) = @_;
876            
877             # Hyperlinks are local to each sheet
878             $self->{HYPERLINKS} = {};
879             my $pass1
880             = XML::Twig->new( twig_roots => $self->_parse_sheet_pass1($sheet) );
881             $pass1->parse( $sheet_file->{xml} );
882            
883             # Half time show - track down the URLs for hyperlinks found in pass 1
884             while ( my ( $a1, $rid ) = each %{ $self->{HYPERLINKS} } ) {
885             my $xpath = qq;
886             my $url = ( $sheet_file->{rels}->find_nodes($xpath) )[0];
887             if ($url) {
888             my $target = $url->att('Target');
889             my $mode = lc( $url->att('TargetMode') );
890             $self->{HYPERLINKS}{$a1} = "$target";
891             }
892             }
893            
894             # 2nd pass: cell/row building is dependent on having parsed the merge definitions
895             # beforehand. Also header/footer margins must be parsed before setting header/footer
896             my $pass2
897             = XML::Twig->new( twig_roots => $self->_parse_sheet_pass2($sheet) );
898             $pass2->parse( $sheet_file->{xml} );
899             }
900             ###############################################################################
901             sub _parse_sheet_pass1 {
902            
903             =head2 _parse_sheet_pass1
904            
905             Parses some elements in a worksheet ( pageMargins, headerFooter,
906             hyperlinks, pageSetup, Merged Cells, Sheet Formatting Row and Column
907             heights, Sheet selection, and Tab Color)
908            
909             =cut
910            
911             my $self = shift;
912             my ($sheet) = @_;
913            
914             my $default_row_height = 15;
915             my $default_column_width = 10;
916             my %hf_margin;
917            
918             return {
919             'pageMargins' => sub {
920             my ( $twig, $margin ) = @_;
921             map {
922             my $method = "set_margin_" . $_;
923             $sheet->$method( $margin->att($_) // 0 );
924             } qw( left right top bottom );
925            
926             # Capture header/footer margin, for use with headerFooter callback
927             $hf_margin{Header} = $margin->att('header');
928             $hf_margin{Footer} = $margin->att('footer');
929             $twig->purge;
930             },
931            
932             # Headers/Footers
933             'headerFooter' => sub {
934             my ( $twig, $hf ) = @_;
935            
936             my $callback = $self->{template_callback};
937             my $call = ref($callback) eq 'CODE';
938             for (qw[Header Footer]) {
939             my $child = $hf->first_child( 'odd' . $_ );
940             my $text = $child ? $child->text() : '';
941             $call and $self->$callback( \$text );
942             my $method = 'set_' . lc($_);
943             $sheet->$method( $text, $hf_margin{$_} );
944             }
945            
946             $twig->purge;
947             },
948            
949             # Hyperlinks
950             'hyperlinks/hyperlink ' => sub {
951             my ( $twig, $link ) = @_;
952             my $a1 = $link->att('ref');
953             $self->{HYPERLINKS}{$a1} = $link->att('r:id');
954             $twig->purge;
955             },
956            
957             # Paper/page setup
958             'pageSetup' => sub {
959             my ( $twig, $setup ) = @_;
960             my %lookup = (
961             orientation => => 'set_portrait',
962             firstPageNumber => 'set_start_page',
963             scale => 'set_print_scale',
964             paperSize => 'set_paper'
965            
966             # horizontalDpi ??
967             # verticalDpi
968             );
969            
970             my @page
971             = qw(scale orientation horizontalDpi verticalDpi paperSize firstPageNumber scale);
972             foreach (@page) {
973            
974             # Ignore if we do not have a EWX method for this attribute
975             my $method = $lookup{$_} // next;
976            
977             # Ignore if no value defined for this attribute
978             next unless my $set = $setup->att($_);
979            
980             # Special case; no generic method to set portrait/landscape
981             $method = 'set_landscape' if $set eq 'landscape';
982             $sheet->$method($set);
983             }
984            
985             $twig->purge;
986             },
987            
988             # Merged cells (Create the ranges: content will be added later)
989             'mergeCells/mergeCell' => sub {
990             my ( $twig, $merge_area ) = @_;
991            
992             if ( my $ref = $merge_area->att('ref') ) {
993             my ( $topleft, $bottomright ) = $ref =~ /([^:]+):([^:]+)/;
994             # my ( $tr, $lc ) = $self->_cell_to_row_col($topleft);
995             # my ( $br, $rc ) = $self->_cell_to_row_col($bottomright);
996            
997             # Merged Ranges/Areas: save the address for pass 2.
998             # cells within the merged range will be processed with
999             # merge_range_type(), instead of write()
1000             $self->{MERGED_RANGES}{$topleft} = $ref;
1001             }
1002             $twig->purge;
1003             },
1004            
1005             # Default row height
1006             'sheetFormatPr' => sub {
1007             my ( $twig, $format ) = @_;
1008             $default_row_height //= $format->att('defaultRowHeight');
1009             $default_column_width //= $format->att('baseColWidth');
1010             $sheet->set_default_row($default_row_height);
1011             $twig->purge;
1012             },
1013            
1014             'col' => sub {
1015             my ( $twig, $col ) = @_;
1016            
1017             for my $ci ( $col->att('min') .. $col->att('max') ) {
1018             #set_column($first,$last,$width,$fmt,$hide,$level,$collapsed )
1019             $sheet->set_column( $ci - 1, $ci - 1, $col->att('width') );
1020             #?? just sets width, not $col->att('style')
1021             }
1022             $twig->purge;
1023             },
1024            
1025             'row' => sub {
1026             my ( $twig, $row ) = @_;
1027            
1028             # ?? just sets row height. No formatting yet
1029             # set_row( $row, $height, $format, $hidden, $level, $collapsed )
1030             $sheet->set_row( $row->att('r') - 1, $row->att('ht') );
1031             $twig->purge;
1032             },
1033            
1034             'sheetView/selection' => sub {
1035             my ( $twig, $selection ) = @_;
1036             my $range = $selection->att('sqref')
1037             // $selection->att('activeCell') // 'A1';
1038             $sheet->set_selection($range);
1039             $twig->purge;
1040             },
1041            
1042             'sheetPr/tabColor' => sub {
1043             my ( $twig, $tab_color ) = @_;
1044             $sheet->set_tab_color( $tab_color->att('rgb') );
1045             $twig->purge;
1046             }
1047            
1048             } # return hashref
1049             }
1050             ###############################################################################
1051             sub _parse_sheet_pass2 {
1052            
1053             =head2 _parse_sheet_pass2
1054            
1055             Parses cell contents (first by row, then by column). Cells can contain
1056             inline strings, string references, direct string values, formulas,
1057             and hyperlinks. Each cell may also contain formatting information.
1058             The format is in an index to formatting for borders, shading, alignment,
1059             font, and number formats.
1060            
1061             =cut
1062            
1063             my $self = shift;
1064             my ($sheet) = @_;
1065            
1066             return {
1067             'sheetData/row' => sub {
1068             my ( $twig, $row_elt ) = @_;
1069             for my $cell ( $row_elt->children('c') ) {
1070             my $string_index = 0;
1071             my $a1 = $cell->att('r'); # Cell Address
1072             my $t = $cell->att('t') || 'n'; # Cell Type
1073             my $s = $cell->att('s'); # Cell String Index
1074             my $val_xml
1075             = $t eq 'inlineStr'
1076             ? $cell->first_child('is')->first_child('t')
1077             : $cell->first_child('v');
1078             my $val = $val_xml ? $val_xml->text() : undef;
1079            
1080             my $format_idx = $s // 0;
1081             my $format = $self->{FORMATS}[$format_idx];
1082            
1083             # Formatted cell, no contents
1084             if ( !defined($val) ) {
1085             $sheet->write_blank($a1, $val, $format);
1086             next;
1087             }
1088            
1089             if ( $t eq 's' ) {
1090             $string_index = $val;
1091             $val = $self->{SHARED_STRINGS}[$val];
1092            
1093             my $is_array = ref($val) eq 'ARRAY';
1094             my @aval = $is_array ? @$val : ($val);
1095             if ( my $ref = $self->{MERGED_RANGES}{$a1} ) {
1096             my $type = $is_array ? 'rich_string' : 'string';
1097             $sheet->merge_range_type($type, $ref, @aval, $format );
1098             next;
1099             }
1100            
1101             # Special case for multiple formats in a cell
1102             # see _parse_shared_strings for rPr nodes
1103             if ( $is_array ) {
1104             $sheet->write_rich_string( $a1, @aval );
1105             next;
1106             }
1107             if ( my $url = $self->{HYPERLINKS}{$a1} ) {
1108             $sheet->write_url( $a1, $url, $format, $val );
1109             next;
1110             }
1111             $sheet->write_string( $a1, $val, $format );
1112             next;
1113             }
1114            
1115             # In-line string (not seen in practice)
1116             elsif ( $t eq 'str' ) {
1117             $val = '=' . $cell->first_child('f')->text();
1118             }
1119            
1120             # Formulas
1121             elsif ( $t eq 'n' ) {
1122             if ( my $form_child = $cell->first_child('f') ) {
1123             my $is_array = $form_child->att('t');
1124             my $ref = $form_child->att('ref');
1125             my $formula = $form_child->text() // q[="No Formula Found"];
1126             if ($is_array) {
1127             $sheet->write_array_formula( $ref, "=${formula}",
1128             $format, $val );
1129             }
1130             else {
1131             if ( my $ref = $self->{MERGED_RANGES}{$a1} ) {
1132             $sheet->merge_range_type('formula', $ref, "=${formula}", $format, $val);
1133             } else {
1134             $sheet->write_formula( $a1, "=${formula}", $format,
1135             $val );
1136             }
1137             }
1138             next;
1139             }
1140            
1141             }
1142             elsif ( $t eq 'b' ) {
1143             $val = $val ? "TRUE" : "FALSE";
1144             }
1145             elsif ( $t eq 'e' ) {
1146             }
1147             elsif ( $t eq 'str' || $t eq 'inlineStr' ) {
1148             }
1149             else {
1150             warn "unimplemented type $t found in cell $a1"; # XXX
1151             }
1152            
1153             $sheet->write( $a1, $val, $format );
1154             }
1155            
1156             $twig->purge;
1157             }
1158             };
1159             }
1160             ###############################################################################
1161             sub _parse_styles {
1162            
1163             =head2 _parse_styles
1164            
1165             Parses style information.
1166             Parses number formats directly. Calls subroutines to parse
1167             fonts, fills, and borders, alignment, and protection.
1168            
1169             Finally, parses Cell Xfs elements to Combine fonts, borders, number formats,
1170             alignment, patterns, into a single format specification.
1171            
1172             Calls EWX add_formats to create a format, and stores the format information
1173             in a FORMAT array within the object.
1174            
1175             =cut
1176            
1177             my $self = shift;
1178             my ($styles) = @_;
1179            
1180             # Number Formats
1181             my $numfmt = $self->_parse_numbers($styles);
1182            
1183             # Fonts / Fills / Borders
1184             my $fonts = $self->_parse_fonts( $styles, '//fonts/font' );
1185             my $fills = $self->_parse_fills($styles);
1186             my $borders = $self->_parse_borders($styles);
1187            
1188             # Cell Xfs
1189             # Combine fonts, borders, number formats, alignment, patterns, into a single format spec
1190             map {
1191             # Also has applyAlignment property, which we do not examine
1192             # same for ApplyFont, ApplyBorder ApplyProtection
1193            
1194             my %halign = $self->_parse_alignment($_);
1195             my %hprot = $self->_parse_protection($_);
1196             my %hfont = %{ $fonts->[ $_->att('fontId') // 0 ] };
1197            
1198             my $numFmtId = $_->att('numFmtId') // 0;
1199            
1200             # Use custom format, or built-in if custom not found
1201             my $ref = $numfmt->{$numFmtId} // { num_format => $numFmtId };
1202             my %hnumfmt = %$ref;
1203            
1204             my %hbord = %{ $borders->[ $_->att('borderId') // 0 ] };
1205             my %hfill = %{ $fills->[ $_->att('fillId') // 0 ] };
1206            
1207             my $fmt
1208             = $self->{EWX}
1209             ->add_format( %hfont, %hnumfmt, %hbord, %halign, %hprot, %hfill );
1210             push @{ $self->{FORMATS} }, $fmt;
1211             } $styles->find_nodes('//cellXfs/xf');
1212             }
1213             ###############################################################################
1214             sub _parse_themes {
1215            
1216             =head2 _parse_themes
1217            
1218             Parses theme information. Some color settings are referenced by an
1219             index to the theme.
1220            
1221             =cut
1222            
1223             my $self = shift;
1224             my ($themes) = @_;
1225            
1226             return {} unless $themes;
1227            
1228             my @color
1229             = map { $_->name eq 'a:sysClr' ? $_->att('lastClr') : $_->att('val') }
1230             $themes->find_nodes('//a:clrScheme/*/*');
1231            
1232             # this shouldn't be necessary, but the documentation is wrong here
1233             # see http://stackoverflow.com/questions/2760976/theme-confusion-in-spreadsheetml
1234             ( $color[0], $color[1] ) = ( $color[1], $color[0] );
1235             ( $color[2], $color[3] ) = ( $color[3], $color[2] );
1236            
1237             return { Color => \@color };
1238             }
1239             ###############################################################################
1240             sub _parse_xml {
1241            
1242             =head2 _parse_xml
1243            
1244             Low level subroutine to parse an entire member of a zip file. Used
1245             for small files, such as xxx.xml.rels, where the entire file is parsed.
1246            
1247             For larger files, XML::Twig::twig_handlers are used.
1248            
1249             =cut
1250            
1251             my $self = shift;
1252             my ( $zip, $subfile ) = @_;
1253            
1254             my $member = $zip->memberNamed($subfile);
1255             die "no subfile named $subfile" unless $member;
1256            
1257             my $xml = XML::Twig->new;
1258             $xml->parse( scalar $member->contents );
1259             return $xml;
1260             }
1261             ###############################################################################
1262             sub _rels_for {
1263            
1264             =head2 _rels_for
1265            
1266             Returns the .rels file name for a sibling workbook or worksheet.
1267            
1268             =cut
1269            
1270             my $self = shift;
1271             my ($file) = @_;
1272            
1273             my @path = split '/', $file;
1274             my $name = pop @path;
1275             $name = '' unless defined $name;
1276             push @path, '_rels';
1277             push @path, "$name.rels";
1278            
1279             return join '/', @path;
1280             }
1281             ###############################################################################
1282             sub zzpodtail {
1283            
1284             =for pod
1285            
1286             =head2 zzpodtail
1287            
1288             Dummy subroutine to allow me to hide pod documentation when using code
1289             folding in the editor.
1290            
1291             =head1 INSTALLATION
1292            
1293             Install with CPAN
1294            
1295             cpan Excel::Template::XLSX
1296            
1297             or, use the standard Unix style installation.
1298            
1299             Unzip and untar the module as follows:
1300            
1301             tar -zxvf Excel::Template::XLSX-nnn.tar.gz
1302            
1303             The module can be installed using the standard Perl procedure:
1304            
1305             perl Makefile.PL
1306             make
1307             make test
1308             make install # As sudo/root
1309            
1310             =head1 BUGS
1311            
1312             =over 4
1313            
1314             =item Large spreadsheets may cause segfaults on perl 5.14 and earlier
1315            
1316             This module internally uses XML::Twig, which makes it potentially subject to
1317             L
1318             on perl versions 5.14 and below (the underlying bug with perl weak references
1319             was fixed in perl 5.15.5). The larger and more complex the spreadsheet, the
1320             more likely to be affected, but the actual size at which it segfaults is
1321             platform dependent. On a 64-bit perl with 7.6gb memory, it was seen on
1322             spreadsheets about 300mb and above. You can work around this adding
1323             C to your code before parsing the spreadsheet,
1324             although this may have other consequences such as memory leaks.
1325            
1326             Please report any bugs to GitHub Issues at
1327             L.
1328            
1329             =back
1330            
1331             =head1 SUPPORT
1332            
1333             You can find this documentation for this module with the perldoc command.
1334            
1335             perldoc Excel::Template::XLSX
1336            
1337             You can also look for information at:
1338            
1339             =over 4
1340            
1341             =item * MetaCPAN
1342            
1343             L
1344            
1345             =item * RT: CPAN's request tracker
1346            
1347             L
1348            
1349             =item * Github
1350            
1351             L
1352            
1353             =item * CPAN Ratings
1354            
1355             L
1356            
1357             =back
1358            
1359             =head1 DEBUGGING TIPS
1360            
1361             Using the Perl debugger gets complicated because of XML::Twig. The objects
1362             created by XML::Twig are HUGE. Also, stepping through the code often results
1363             in exceeding a stack depth of >100. The author found it helpful to take
1364             advantage of the simplify() method in XML::Twig when using the debugger 'x'
1365             command to examine variables.
1366            
1367             x $node->simplify()
1368            
1369             Also, it is helpful to use the 'c' command to jump over XML::Twig subroutine calls and callbacks.
1370            
1371             =head1 BUGS
1372            
1373             Please report any bugs or feature requests to the author.
1374            
1375             =head1 TO DO
1376            
1377             Worksheet Activation
1378             Table Formatting/Styles
1379             Calculation Mode
1380            
1381             =head1 REPOSITORY
1382            
1383             The Excel::Template::XLSX source code is hosted on github:
1384             L.
1385            
1386             =head1 SEE ALSO
1387            
1388             Excel::Writer::XLSX
1389            
1390             This module does not provide much documentation on the capabilites of methods
1391             for creating Excel content. The documentation provided with EWX is excellent,
1392             and also has numerous examples included.
1393            
1394             Spreadsheet::ParseXLSX
1395            
1396             Although this module does not use Spreadsheet::ParseXLSX, the parsing and
1397             comments regarding issues involved with parsing spreadsheets came from this module.
1398            
1399             XML::Twig and Archive::Zip
1400            
1401             Excel .xlsx files are zippped .xml files. These two modules are used to
1402             unzip the .xlsx file, extract the members, and parse the relative portions
1403             of the .xml files inside.
1404            
1405             =head1 ACKNOWLEDGEMENTS
1406            
1407             This module leverages the methods in L, maintained by L
1408             to recreate the template.
1409            
1410             The parser was developed using L as a starting point, maintained by L.
1411             This parser calls methods in EWX directly when a token is resolved rather than building
1412             up an object representing the parsed content.
1413            
1414             =head1 LICENSE AND COPYRIGHT
1415            
1416             Either the Perl Artistic Licence L
1417             or the GPL L.
1418            
1419             AUTHOR
1420            
1421             David Clarke dclarke@cpan.org
1422            
1423             =cut
1424            
1425             }
1426