File Coverage

blib/lib/Data/Tab.pm

Criterion	Covered	Total	%
statement	16	159	10.0
branch	0	68	0.0
condition	0	21	0.0
subroutine	6	42	14.2
pod			n/a
total	22	290	7.5

line	stmt	bran	cond	sub	time	code
1						package Data::Tab;
2
3	1			1	39669	use 5.006;
	1				3
	1				49
4	1			1	12	use strict;
	1				3
	1				46
5	1			1	6	use warnings FATAL => 'all';
	1				7
	1				48
6	1			1	14	use Carp;
	1				2
	1				1994
7
8						=head1 NAME
9
10						Data::Tab - Iterators as tabular data structures
11
12						=head1 VERSION
13
14						Version 0.01
15
16						=cut
17
18						our $VERSION = '0.01';
19
20
21						=head1 SYNOPSIS
22
23						C is inspired by L, in that the central data
24						structure is a two-dimensional matrix of data values with named headers.
25						However, there are some significant differences, chief of which is that the data sources
26						can be lazily evaluated, that is, they can be either iterators or static arrays.
27
28						BE WARNED: this module defines a lot of API that isn't actually implemented yet.
29						It is a work in slow progress. (By "slow" I mean I need something every year or so and put it in.)
30
31						=head1 METHODS
32
33						=head2 new (data, [headers], [types], [rowheaders], [underlying])
34
35						Creates a new table, with data being either an arrayref of arrayrefs, or a coderef that should iterate a series of arrayrefs,
36						or an arrayref of arrayrefs with a terminal coderef. Case 1 is a set of static data. Case 2 is an unbuffered iterator; it
37						cannot be rewound, just read. Finally, Case 3 is a buffered iterator; during a read series the arrayref rows will be returned
38						until the coderef is encountered, after which the coderef will be asked for more rows, which will be inserted into the buffer,
39						until there are no more rows and the table is left as a static dataset.
40
41						The headers are an optional arrayref of strings to be used as the column names.
42
43						The types are an optional arrayref of (advisory) types to be used for formatting the data, or a single scalar indicating
44						the datatype for I the elements in the table.
45
46						The rowheaders are either an arrayref of names for each row (primarily useful for static datasets, obviously) or a coderef
47						for generating a name based on the data in the row. Rowheaders are not yet implemented.
48
49						The "underlying" parameter is an object inheriting from that is responsible for passing
50						changes to the data table through to the underlying object, if this is applicable. This makes the data table a live view.
51						It's not yet implemented.
52
53						=cut
54
55						sub new {
56	0			0		my $class = shift;
57	0					my $self = {
58						data => shift, # The buffered data, if any
59						more => undef, # The iterator, if any
60						headers => shift,
61						types => shift,
62						rowheaders => shift,
63						underlying => shift,
64						buffer => undef, # 0 if we don't want to keep a buffer
65						cursor => 0,
66						};
67
68	0	0				if (ref $self->{data} eq 'CODE') {
		0
69	0					$self->{more} = $self->{data};
70	0					$self->{data} = undef;
71						} elsif (ref $self->{data} eq 'ARRAY') {
72						# TODO: split out the iterator now, if there is one.
73						} else {
74						# TODO: consider other handy ways of wrapping these.
75	0					croak "invalid data type creating $class instance";
76						}
77
78	0					bless ($self, $class);
79						}
80
81						=head2 query (dbh, sql, [parameters])
82
83						If you have DBI installed - and who doesn't? - then you can query the database with this function in a single step and have a
84						lazy table returned. Clearly, if you don't have DBI installed, you can't have a dbh handle, so no error checking is done.
85
86						At some point there will be an underlying class for SQL that will allow changes made to the lazy table to be reflected in the
87						database, but that's a fight for another day.
88
89						By default, an SQL query is unbuffered, a pure iterator. Use query()->buffer() to turn the buffer on before retrieval if that's
90						what you want.
91
92						Suggested usage:
93
94						my $query = Data::Tab->query($dbh, "select * from my_table where name like ?", '%this%');
95						while ($query->get) {
96						my ($col1, $col2) = @$_;
97						...
98						}
99
100						Or this:
101
102						print Data::Tab->query($dbh, "select * from my table where customer=?", $customer)->read->show;
103
104						Magic SQL query formatting!
105
106						=cut
107
108						sub query {
109	0			0		my $class = shift;
110	0					my $dbh = shift;
111	0					my $sql = shift;
112
113	0		0			my $sth = $dbh->prepare ($sql) \|\| croak $dbh->errstr();
114	0					$sth->execute(@_);
115
116						my $self = {
117						data => undef, # no buffer by default
118	0			0		more => sub { $sth->fetchrow_arrayref },
119	0					headers => $sth->{NAME_lc},
120						types => undef, # TODO: later
121						rowheaders => undef,
122						underlying => undef,
123						buffer => 0, # 0 if we don't want to keep a buffer
124						cursor => 0,
125						};
126
127	0					bless ($self, $class);
128						}
129
130						=head2 buffer, unbuffer
131
132						Switches the table from buffered mode to unbuffered mode, or vice versa. If a table is currently buffered, the buffer is
133						discarded when unbuffering. Switching from buffered to unbuffered and back again is a good way to free up memory for longer
134						queries that still need buffering.
135
136						The buffer can also be set to a given number of rows with e.g. C. Then any incoming rows will discard old rows
137						from the beginning of the buffer. A call to C is equivalent to C.
138
139						=cut
140
141						sub buffer {
142	0			0		my $self = shift;
143	0					$self->{buffer} = shift;
144	0	0				$self->{data} = [] unless defined $self->{data};
145	0	0				return unless defined $self->{buffer};
146	0	0				if ($self->{buffer} == 0) {
147	0					$self->{data} = undef;
148	0					return;
149						}
150	0					shift @{$self->{data}} while scalar @{$self->{data}} > $self->{buffer};
	0
	0
151						}
152	0			0		sub unbuffer { shift->buffer(0); }
153
154						=head2 headers, rowheaders, types, header(n), rowheader(n), type(n)
155
156						Getter/setter functions for the various parameters of the table. That is, C will retrieve the header for the
157						n-th row, while C will set it. Similarly, C will set the scalar type array
158						for all rows.
159
160						Returns a list in list context, an arrayref in scalar context.
161
162						Of these, only C is implemented.
163
164						=cut
165
166						sub headers {
167	0			0		my $self = shift;
168	0	0				if (@_) {
169	0					my @headers = ();
170	0					push @headers, @_;
171	0					$self->{headers} = \@headers;
172						}
173	0	0				return unless defined wantarray;
174	0	0				my @return = @{$self->{headers} \|\| []};
	0
175	0	0				return @return if wantarray;
176	0					return \@return;
177						}
178	0			0		sub rowheaders {
179						}
180	0			0		sub types {
181						}
182
183	0			0		sub header {
184						}
185	0			0		sub rowheader {
186						}
187	0			0		sub type {
188						}
189
190						=head2 dimensions (not yet implemented)
191
192						Another getter/setter. So C will retrieve the current dimensions of the buffered or static data,
193						C will discard rows numbered 10 and up, C is a way to truncate the buffer,
194						C will discard columns numbered 5 and up, and C will force the dimensions
195						of the table to be 5x5. If the data is currently smaller in a dimension specified, then "blank" data will be filled
196						in; the "blank" value is the type value of the column, or C.
197
198						=cut
199
200	0			0		sub dimensions {
201						}
202
203						=head2 truncate
204
205						If there is a coderef at the end of the data, removes it. This converts an iterated, buffered data table into a
206						static one. Used on an unbuffered iterator, renders the table useless.
207
208						=cut
209
210	0			0		sub truncate { shift->{more} = undef; }
211
212						=head2 reiterate
213
214						Tacks a new iterator on the end of a static table, converting it into an iterated table.
215
216						=cut
217
218	0			0		sub reiterate { shift->{more} = shift; }
219
220						=head2 get, rewind
221
222						The table has a cursor row that starts at 0. The C function resets that row to 0 if it's been changed.
223						The C function with no parameters gets the cursor row and advances the cursor. If there's no buffer, it
224						just gets the next row from the iterator; if there is a buffer, then the cursor advances along the buffer until
225						it gets to the iterator (if there is one) and then returns/buffers rows as it goes.
226
227						However, C will get row 1 in the buffer as an arrayref, and C will get the value from row 1,
228						column 3 in the buffer. A call with an C row (e.g. C) will get the numbered column.
229
230						A call to C will get the column headed taxes, as an arrayref, while C will get the
231						I labeled 'taxes', if the rowheaders are defined and there is such a row. If these fail, the call will croak.
232						There is no column cursor, so there is no need for syntax for a columnar get with unspecified column.
233
234						The return value is always a scalar or arrayref.
235
236						=cut
237
238						sub get {
239	0			0		my $self = shift;
240	0					my $row = shift;
241	0					my $col = shift;
242
243	0	0				if (not defined $self->{data}) {
244	0	0				return undef unless defined $self->{more};
245	0					my $ret = $self->{more}->($self);
246	0	0				$self->{more} = undef unless defined $ret;
247	0	0				return undef unless defined $ret;
248	0	0				return ref $ret? $ret : [$ret];
249						}
250
251	0	0	0			if (not defined $row and not defined $col) {
252	0					$row = $self->{cursor};
253	0					$self->{cursor} += 1;
254						}
255
256	0	0				if (defined $row) {
257						# TODO: ignoring dimensions for the moment.
258	0					my $therow = $self->{data}->[$row];
259	0	0	0			if (not defined $therow and $self->{more}) {
260	0					my $bufsize = scalar @{$self->{data}};
	0
261	0					while ($bufsize < $row+1) {
262	0					$bufsize += 1;
263	0					$therow = $self->{more}->($self);
264	0	0				if (not defined $therow) {
265	0					$self->{more} = undef;
266	0					return undef;
267						}
268	0	0				my @values = ref $therow ? @$therow : ($therow);
269	0					$therow = \@values; # Have to take a copy, not reuse the same arrayref.
270	0					push @{$self->{data}}, $therow;
	0
271	0		0			shift @{$self->{data}} while defined $self->{buffer} and scalar @{$self->{data}} > $self->{buffer};
	0
	0
272						}
273						}
274
275	0	0				return $therow unless defined $col;
276	0	0				return undef unless defined $therow;
277	0					return $therow->[$col];
278						}
279	0	0				if (defined $col) {
280	0					my @values;
281	0					foreach my $r (@{$self->{data}}) {
	0
282	0					push @values, $r->[$col];
283						}
284	0					return \@values;
285						}
286						}
287
288	0			0		sub rewind { shift->{cursor} = 0; }
289
290						=head2 read (limit)
291
292						The C method, called on a buffered table with an iterator, reads the entire iterated query result list
293						into the buffer, then truncates the table to render it static. Pass a number to limit the read.
294
295						If the table is unbuffered, read turns on buffering before it starts retrieval.
296
297						=cut
298
299						sub read {
300	0			0		my ($self, $limit) = @_;
301	0					$self->buffer;
302	0					while ($self->get()) {
303	0	0				if (defined $limit) {
304	0					$limit -= 1;
305	0	0				last unless $limit > 0;
306						}
307						}
308	0					$self;
309						}
310
311						=head2 set, setrow, setcol (not yet implemented)
312
313						A call to C sets an entire row in the table, while a single element can be set with
314						C. To set a column, use C on a buffered or static table.
315						As usual, row and col can be numbers or labels.
316
317						If an C is defined for the table, then it will be notified of the change and can take appropriate
318						action to update the table's underlying object.
319
320						=cut
321
322	0			0		sub set {
323						}
324	0			0		sub setrow {
325						}
326	0			0		sub setcol {
327						}
328
329						=head2 show, show_generic
330
331						Calling C returns the table as text, with C<+-----+> type delineation. (This method only works if
332						Text::Table is installed.) This only shows the rows actually in the buffer; it will not retrieve iterator
333						rows; this allows you to set up a paged display.
334
335						The column delimiters only appear for a table with headers; this is because Text::Table is easier to
336						use this way - but think of a table with headers as a database table, and one without as a simple
337						matrix.
338
339						The C method is actually implemented using C, which takes as parameters the separator, a flag
340						whether the headers should be shown (if the 'flag' is an arrayref, you can simply specify your own headers here),
341						and a flag whether a rule should be shown at the top and bottom of the table and between the header and body - by
342						default, this rule is of the form +----+----+, but again, the 'flag' can be an arrayref of any two other characters
343						to be used instead (in the order '-' and '+' in the example).
344
345						The C method is thus C.
346
347						Unfortunately, C isn't generic enough to express an HTML table, and I considered putting a show_html
348						method here as well (L has one) - but honestly, it's rare to use undecorated HTML these days, so I
349						elected to remove temptation from your path. To generate HTML, you should use a template engine to generate
350						I HTML. Eventually I'll write one that works with D::T::Lazy out of the box - drop me a line if you'd like me
351						to accelerate that.
352
353						=cut
354
355	0			0		sub show { shift->show_generic ('\|', 1, 1); }
356						sub show_generic {
357	0			0		eval { require Text::Table; };
	0
358	0	0				croak "Text::Table not installed" if $@;
359
360	0					my $self = shift;
361	0	0				return '' unless defined $self->{data};
362	0					my $sep = shift;
363	0					my $headers = shift;
364	0					my $rule = shift;
365
366	0					my @headers = ();
367	0	0				if ($headers) {
368	0	0				if (ref $headers eq 'ARRAY') {
369	0					@headers = @$headers;
370						} else {
371	0					@headers = $self->headers;
372						}
373						}
374	0					my @c = ();
375	0	0	0			if (defined $sep and @headers) {
376	0	0				push @c, \$sep if defined $sep;
377	0					foreach my $h (@headers) {
378	0					push @c, $h, \$sep;
379						}
380						}
381	0	0				my $t = Text::Table->new($sep ? @c : @headers);
382	0					$t->load (@{$self->{data}});
	0
383
384	0					my @rule_p = ('-', '+');
385	0	0	0			@rule_p = @$rule if $rule and ref $rule eq 'ARRAY';
386	0					my $rule_text = '';
387	0	0	0			$rule_text = $t->rule(@rule_p) if @headers and $rule;
388
389	0					my $text = '';
390	0					$text .= $rule_text;
391	0	0				$text .= $t->title() if @headers;
392	0					$text .= $rule_text;
393	0					$text .= $t->body();
394	0					$text .= $rule_text;
395	0					return $text;
396						}
397
398						=head2 report (not yet implemented)
399
400						The C method is a little different from C - it's essentially good for formatting things with a sprintf
401						and suppressing repeat values, making it useful for simple presentation of things like dated entries (the date appears
402						only when it changes). I use this kind of thing a lot in my everyday utilities, so it's convenient to bundle it
403						here in generalized form.
404
405						=cut
406
407	0			0		sub report {
408						}
409
410						=head2 add, glue, insert (not yet implemented)
411
412						On the other hand, maybe you just want to append one or more rows. To do that, just C to add
413						a single row, or C to append all the rows from another table object. The second parameter,
414						if provided, is the header for the new row.
415
416						To do the same thing on the column dimension, use C to tack a new column onto the left of the
417						table, or C to glue all the columns of a table onto the left. To insert the columns somewhere
418						other than the left, use C and they'll be inserted to the right of
419						the column with that number. Or, if the column isn't a number, then the column headers will be used.
420
421						If C or C is passed a coderef, then the new row will be made by repeated calls to the coderef, each
422						call passing the table and the column for arbitrary calculation. (Or the new column will be made with each row.)
423
424						=cut
425
426	0			0		sub add {
427						}
428	0			0		sub glue {
429						}
430	0			0		sub insert {
431						}
432
433						=head2 copy (not yet implemented)
434
435						The C method copies an entire table's contents into a segment of the current table. We specify the upper
436						left corner of the target, so to copy the contents without further ado, simply C. The
437						dimensions of the target will be expanded to match. This doesn't affect the headers.
438
439						=cut
440
441	0			0		sub copy {
442						}
443
444						=head2 slice (row, rows, col, cols) (not yet implemented)
445
446						The C method is used to extract sections from a table to make a new table object. For example,
447						C slices out a two-by-two chunk from row/column 2,2. (0-indexed).
448						C slices out the "total" row only.
449
450						The return is always a new table object.
451
452						=cut
453
454	0			0		sub slice {
455						}
456
457						=head2 crop (row, rows, col, cols) (not yet implemented)
458
459						Does the same as C, but destructively in place.
460
461						=cut
462
463	0			0		sub crop {
464						}
465
466						=head2 sort (function), sortcols (not yet implemented)
467
468						Returns a sorting array for the entire table (if buffered, just the part in the buffer) that is produced
469						by applying the coderef C to an array [0, 1, 2, ... n]. The C method does this for the
470						rows, with C doing the same for the columns.
471
472						=cut
473
474	0			0		sub sort {
475						}
476	0			0		sub sortcols {
477						}
478
479						=head2 shuffle, shufflecols (not yet implemented)
480
481						Given an array of row numbers, builds a new table with those rows. (Or, column numbers for columns for C.)
482						Yes, you can just pass sort's return into shuffle to produce a sorted array - but you don't I to.
483						If there are row/column numbers missing, then those rows/columns won't appear in the new table. Finally, all header,
484						type, and rowheader data will be shuffled appropriately as well.
485
486						=cut
487
488	0			0		sub shuffle {
489						}
490	0			0		sub shufflecols {
491						}
492
493						=head2 filter (function), filtercols (not yet implemented)
494
495						Returns an array of rows (or columns) that return a positive result from a coderef. Again, this can then be used with
496						shuffle or shufflecols to produce a new array.
497
498						=cut
499
500	0			0		sub filter {
501						}
502	0			0		sub filtercols {
503						}
504
505						=head2 flip (not yet implemented)
506
507						Flips the entire array rows for columns. If there was a coderef iterator after the last row, it is discarded. (That is,
508						the table is truncated first.)
509
510						If there's an underlying object, the link will be broken. If you really want to flip the underlying object, say you want
511						to flip a section of an Excel spreadsheet, then read it in, flip the array, and write out a new sheet segment - which is
512						probably going to be messy unless the section was square to start with. In the case of an SQL database, what does flipping
513						even mean? Probably nothing. You probably want to rethink your strategy.
514
515						=cut
516
517	0			0		sub flip {
518						}
519
520						=head1 AUTHOR
521
522						Michael Roberts, C<< >>
523
524						=head1 BUGS
525
526						Please report any bugs or feature requests to C, or through
527						the web interface at L. I will be notified, and then you'll
528						automatically be notified of progress on your bug as I make changes.
529
530
531						=head1 Data::Tab::db
532
533						Just for simplicity's sake (and to save typing) we also provide a simple wrapper for the DBI class that allows us to say:
534
535						use Data::Tab::db;
536
537						my $db = Data::Tab::db->connect (my connection parameters)
538						$db->query("select * from my_table")->read->show;
539
540						Done.
541
542						=cut
543
544						package Data::Tab::db;
545
546	1			1	14	use Data::Tab;
	1				2
	1				32
547	1			1	2988	use DBI;
	0
	0
548
549						sub connect {
550						my $self = bless ({}, shift);
551						$self->{dbh} = DBI->connect(@_);
552						$self;
553						}
554
555						sub query {
556						my $self = shift;
557						Data::Tab->query($self->{dbh}, @_);
558						}
559
560						sub dbh { shift->{dbh}; }
561
562
563						=head1 SUPPORT
564
565						You can find documentation for this module with the perldoc command.
566
567						perldoc Data::Tab
568
569
570						You can also look for information at:
571
572						=over 4
573
574						=item * RT: CPAN's request tracker (report bugs here)
575
576						L
577
578						=item * AnnoCPAN: Annotated CPAN documentation
579
580						L
581
582						=item * CPAN Ratings
583
584						L
585
586						=item * Search CPAN
587
588						L
589
590						=back
591
592
593						=head1 ACKNOWLEDGEMENTS
594
595
596						=head1 LICENSE AND COPYRIGHT
597
598						Copyright 2012 Michael Roberts.
599
600						This program is free software; you can redistribute it and/or modify it
601						under the terms of the the Artistic License (2.0). You may obtain a
602						copy of the full license at:
603
604						L
605
606						Any use, modification, and distribution of the Standard or Modified
607						Versions is governed by this Artistic License. By using, modifying or
608						distributing the Package, you accept this license. Do not use, modify,
609						or distribute the Package, if you do not accept this license.
610
611						If your Modified Version has been derived from a Modified Version made
612						by someone other than you, you are nevertheless required to ensure that
613						your Modified Version complies with the requirements of this license.
614
615						This license does not grant you the right to use any trademark, service
616						mark, tradename, or logo of the Copyright Holder.
617
618						This license includes the non-exclusive, worldwide, free-of-charge
619						patent license to make, have made, use, offer to sell, sell, import and
620						otherwise transfer the Package with respect to any patent claims
621						licensable by the Copyright Holder that are necessarily infringed by the
622						Package. If you institute patent litigation (including a cross-claim or
623						counterclaim) against any party alleging that the Package constitutes
624						direct or contributory patent infringement, then this Artistic License
625						to you shall terminate on the date that such litigation is filed.
626
627						Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER
628						AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES.
629						THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
630						PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY
631						YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR
632						CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR
633						CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE,
634						EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
635
636
637						=cut
638
639						1; # End of Data::Tab