File Coverage

blib/lib/Bio/Phylo/NeXML/DOM.pm
Criterion Covered Total %
statement 21 45 46.6
branch 0 8 0.0
condition 0 3 0.0
subroutine 7 17 41.1
pod 8 8 100.0
total 36 81 44.4
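
The percentages in the summary can be re-derived from the covered and total counts. The sketch below is a minimal check, not part of the coverage tool: the `percent` helper is hypothetical, and it assumes the figures are truncated (not rounded) to one decimal place, which is inferred only from rows such as 21/45 being shown as 46.6.

```perl
use strict;
use warnings;

# Criterion, covered, total: copied from the summary table above.
my @rows = (
    [ statement  => 21, 45 ],
    [ branch     => 0,  8 ],
    [ condition  => 0,  3 ],
    [ subroutine => 7,  17 ],
    [ pod        => 8,  8 ],
);

# Hypothetical helper: percentage truncated to one decimal place.
# Truncation is an assumption inferred from the table, not a documented rule.
sub percent {
    my ( $covered, $total ) = @_;
    return 'n/a' unless $total;
    return sprintf '%.1f', int( 1000 * $covered / $total ) / 10;
}

my ( $covered_sum, $total_sum ) = ( 0, 0 );
for my $row (@rows) {
    my ( $name, $covered, $total ) = @$row;
    $covered_sum += $covered;
    $total_sum   += $total;
    printf "%-10s %3d %3d %5s\n", $name, $covered, $total,
      percent( $covered, $total );
}

# The "total" row is the sum of the per-criterion counts (36 of 81).
printf "%-10s %3d %3d %5s\n", 'total', $covered_sum, $total_sum,
  percent( $covered_sum, $total_sum );
```

Under that assumption the computed values match every row of the table, including the total of 36 covered items out of 81.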


line stmt bran cond sub pod time code
1             package Bio::Phylo::NeXML::DOM;
2 51     51   284 use strict;
  51         93  
  51         1261  
3 51     51   228 use warnings;
  51         94  
  51         1119  
4 51     51   226 use base 'Bio::Phylo';
  51         163  
  51         4554  
5 51     51   315 use Bio::Phylo::Util::CONSTANT qw'_DOMCREATOR_ looks_like_class';
  51         120  
  51         2537  
6 51     51   286 use Bio::Phylo::Util::Exceptions 'throw';
  51         99  
  51         1817  
7 51     51   265 use Bio::Phylo::Factory;
  51         85  
  51         243  
8 51     51   228 use File::Spec::Unix;
  51         98  
  51         20550  
9              
10             # store DOM factory object as a global here, to avoid proliferation of
11             # function arguments
12             our $DOM;
13             {
14             my $CONSTANT_TYPE = _DOMCREATOR_;
15             my (%format);
16             my $fac = Bio::Phylo::Factory->new;
17              
18             =head1 NAME
19              
20             Bio::Phylo::NeXML::DOM - XML DOM support for Bio::Phylo
21              
22             =head1 SYNOPSIS
23              
24             use Bio::Phylo::NeXML::DOM;
25             use Bio::Phylo::IO qw( parse );
26             Bio::Phylo::NeXML::DOM->new(-format => 'twig');
27             my $project = parse( -file=>'my.nex', -format=>'nexus' );
28             my $nex_twig = $project->doc();
29              
30             =head1 DESCRIPTION
31              
32             This module adds C<to_dom> methods to L<Bio::Phylo::NeXML::Writable>
33             classes, which provide NeXML-valid objects for document object model
34             manipulation. DOM formats currently available are C<XML::Twig> and
35             C<XML::LibXML>. For any C<XMLWritable> object, use C<to_dom> in place
36             of C<to_xml> to create DOM nodes.
37              
38             The C<doc()> method is also added to the C<Bio::Phylo::Project> class. It
39             returns a NeXML document as a DOM object populated by the current contents
40             of the C<Bio::Phylo::Project> object.
41              
42             =head1 MOTIVATION
43              
44             The NeXML parsing/writing capability of C<Bio::Phylo> goes a long way
45             towards wider adoption of this useful standard.
46              
47             However, while C<Bio::Phylo> can write NeXML-valid XML, the way in
48             which it does this natively is somewhat hard-coded and therefore
49             restricted, and is essentially oriented toward text file output. As
50             such, there is a mismatch between the sophisticated C<Bio::Phylo> data
51             structure and its own ability to manipulate and serialize that
52             structure in sophisticated but interoperable ways. Finer manipulations
53             of XML-represented data are possible through a variety of Perl
54             packages that can store and control XML according to a document
55             object model (DOM). Many of these packages allow extremely flexible
56             computation over large datasets stored in XML format, and admit the
57             use of XML-related facilities such as XPath and XSLT programmatically.
58              
59             The purpose of C<Bio::Phylo::NeXML::DOM> is to introduce integrated DOM
60             object creation and manipulation to C<Bio::Phylo>, both to make DOM
61             computation in C<Bio::Phylo> more convenient, and also to provide a
62             platform for potentially more sophisticated C<Bio::Phylo> modules to
63             come.
64              
65             =head1 DESIGN
66              
67             Besides the notion that DOM capability should be optional for the user,
68             there are two main design ideas. First, for each C<Bio::Phylo> object
69             that can be parsed/written as NeXML (i.e., for each
70             C<Bio::Phylo::NeXML::Writable> object), we provide an analogous method
71             for creating a representative DOM object, or element. These elements
72             are aggregatable in a DOM document object, whose native stringifying
73             method can be used to generate valid NeXML.
74              
75             Second, we allow flexibility and extensibility in the choice of the
76             underlying DOM package, while maintaining a consistent DOM interface
77             that is similar in semantic and syntactic style to the accessors and
78             mutators that act on the C<Bio::Phylo> objects themselves. This is
79             achieved through the DOM::DocumentI and DOM::ElementI interfaces,
80             which define a minimal subset of DOM accessors and mutators, their
81             inputs and outputs. Concrete instances of these interface classes
82             provide the bindings between the abstract methods and their
83             counterparts in the desired DOM implementation. Currently, there are
84             bindings for two popular packages, C<XML::Twig> and C<XML::LibXML>.
85              
86             Another priority was simplicity of use; most of the details remain
87             under the hood in practice. The C<Bio/Phylo/Util/DOM.pm> file defines the
88             C<to_dom()> method for each C<XMLWritable> package, as well as the
89             C<Bio::Phylo::NeXML::DOM> package proper. The C<DOM> object is a
90             factory that is used to create Element and Document objects; it is an
91             inside-out object that subclasses C<Bio::Phylo>. To curb the
92             proliferation of method arguments, a DOM factory instance (set by the
93             latest invocation of C<Bio::Phylo::NeXML::DOM-E<gt>new()>) is maintained in
94             a package global. This is used by default for object creation with DOM
95             methods if a DOM factory object is not explicitly provided in the
96             argument list.
97              
98             The underlying DOM implementation is set with the C<DOM> factory
99             constructor's single argument, C<-format>. Even this can be left out;
100             the default implementation is C<XML::Twig>, which is already required
101             by C<Bio::Phylo>. Thus, for example, one can use the DOM to convert
102             a Nexus file to a DOM representation as follows:
103              
104             use Bio::Phylo::NeXML::DOM;
105             use Bio::Phylo::IO qw( parse );
106             Bio::Phylo::NeXML::DOM->new();
107             my $project = parse( -file=>'my.nex', -format=>'nexus' );
108             my $nex_twig = $project->doc();
109             # The end.
110              
111             Underlying DOM packages are loaded at runtime as specified by the
112             C<-format> argument. Packages for unused formats do not need to be
113             installed.
114              
115             =head1 INTERFACE METHODS
116              
117             The minimal DOM interface specifies the following methods. Details can be
118             obtained from the C<Element> and C<Document> POD.
119              
120             =head2 Bio::Phylo::NeXML::DOM::Element - DOM Element abstract class
121              
122             get_tagname()
123             set_tagname()
124             get_attributes()
125             set_attributes()
126             clear_attributes()
127             get_text()
128             set_text()
129             clear_text()
130              
131             get_parent()
132             get_children()
133             get_first_child()
134             get_last_child()
135             get_next_sibling()
136             get_prev_sibling()
137             get_elements_by_tagname()
138              
139             set_child()
140             prune_child()
141              
142             to_xml_string()
143              
144             =head2 Bio::Phylo::NeXML::DOM::Document - DOM Document
145              
146             get_encoding()
147             set_encoding()
148              
149             get_root()
150             set_root()
151              
152             get_element_by_id()
153             get_elements_by_tagname()
154              
155             to_xml_string()
156             to_xml_file()
157              
158             =head1 METHODS
159              
160             =head2 CONSTRUCTOR
161              
162             =over
163              
164             =item new()
165              
166             Type : Constructor
167             Title : new
168             Usage : $dom = Bio::Phylo::NeXML::DOM->new(-format=>$format)
169             Function: Create a new DOM factory
170             Returns : DOM object
171             Args : optional: -format => DOM format (defaults to 'twig')
172              
173             =cut
174              
175             sub new {
176 0     0 1   my $self = shift->SUPER::new( '-format' => 'twig', @_ );
177 0           return $DOM = $self;
178             }
179              
180             =back
181              
182             =head2 FACTORY METHODS
183              
184             =over
185              
186             =item create_element()
187              
188             Type : Factory method
189             Title : create_element
190             Usage : $elt = $dom->create_element()
191             Function: Create a new XML DOM element
192             Returns : DOM element
193             Args : Optional:
194             -tag => $tag_name
195             -attr => \%attr_hash
196              
197             =cut
198              
199             sub create_element {
200 0 0   0 1   if ( my $format = shift->get_format ) {
201 0           return $fac->create_element( '-format' => $format, @_ );
202             }
203             else {
204 0           throw 'BadArgs' => 'DOM creator format not set';
205             }
206             }
207              
208             =item parse_element()
209              
210             Type : Factory method
211             Title : parse_element
212             Usage : $elt = $dom->parse_element($text)
213             Function: Create a new XML DOM element from XML text
214             Returns : DOM element
215             Args : An XML String
216              
217             =cut
218              
219             sub parse_element {
220 0 0   0 1   if ( my $f = shift->get_format ) {
221 0           return looks_like_class( __PACKAGE__ . '::Element::' . $f )
222             ->parse_element(shift);
223             }
224             else {
225 0           throw 'BadArgs' => 'DOM creator format not set';
226             }
227             }
228              
229             =item create_document()
230              
231             Type : Factory method
232             Title : create_document
233             Usage : $doc = $dom->create_document()
234             Function: Create a new XML DOM document
235             Returns : DOM document
236             Args : Package-specific args
237              
238             =cut
239              
240             sub create_document {
241 0 0   0 1   if ( my $format = shift->get_format ) {
242 0           return $fac->create_document( '-format' => $format, @_ );
243             }
244             else {
245 0           throw 'BadArgs' => 'DOM creator format not set';
246             }
247             }
248              
249             =item parse_document()
250              
251             Type : Factory method
252             Title : parse_document
253             Usage : $doc = $dom->parse_document($text)
254             Function: Create a new XML DOM document from XML text
255             Returns : DOM document
256             Args : An XML String
257              
258             =cut
259              
260             sub parse_document {
261 0 0   0 1   if ( my $format = shift->get_format ) {
262 0           my $implementation = __PACKAGE__ . '::' . $format;
263 0           return $implementation->parse_document(shift);
264             }
265             else {
266 0           throw 'BadArgs' => 'DOM creator format not set';
267             }
268             }
269              
270             =back
271              
272             =head2 MUTATORS
273              
274             =over
275              
276             =item set_format()
277              
278             Type : Mutator
279             Title : set_format
280             Usage : $dom->set_format($format)
281             Function: Set the format (underlying DOM package bindings) for this object
282             Returns : the invocant, which allows method chaining
283             Args : format designator as string
284              
285             =cut
286              
287             sub set_format {
288 0     0 1   my $self = shift;
289 0           $format{ $self->get_id } = shift;
290 0           return $self;
291             }
292              
293             =back
294              
295             =head2 ACCESSORS
296              
297             =over
298              
299             =item get_format()
300              
301             Type : Accessor
302             Title : get_format
303             Usage : $dom->get_format()
304             Function: Get the format designator for this object
305             Returns : format designator as string, normalized to an initial capital (e.g. 'Twig')
306             Args : none
307              
308             =cut
309              
310             sub get_format {
311 0     0 1   my $self = shift;
312 0           return ucfirst( lc( $format{ $self->get_id } ) );
313             }
314              
315             =item get_dom()
316              
317             Type : Static accessor
318             Title : get_dom
319             Usage : __PACKAGE__->get_dom()
320             Function: Get the singleton DOM object
321             Returns : instance of this __PACKAGE__
322             Args : none
323              
324             =cut
325              
326 0   0 0 1   sub get_dom { $DOM ||= __PACKAGE__->new }
327              
328             =begin comment
329              
330             Type : Internal method
331             Title : _type
332             Usage : $node->_type;
333             Function:
334             Returns : CONSTANT
335             Args :
336              
337             =end comment
338              
339             =cut
340              
341 0     0     sub _type { $CONSTANT_TYPE }
342              
343             =begin comment
344              
345             Type : Internal method
346             Title : _cleanup
347             Usage : $node->_cleanup;
348             Function: Deletes the stored format designator for this object
349             Returns : the deleted format designator
350             Args :
351              
352             =end comment
353              
354             =cut
355              
356             sub _cleanup {
357 0     0     my $self = shift;
358 0           delete $format{ $self->get_id };
359             }
360              
361             =back
362              
363             =cut
364              
365             =head1 SEE ALSO
366              
367             There is a mailing list at L<https://groups.google.com/forum/#!forum/bio-phylo>
368             for any user or developer questions and discussions.
369              
370             The DOM creator abstract classes: L<Bio::Phylo::NeXML::DOM::Element>,
371             L<Bio::Phylo::NeXML::DOM::Document>
372              
373             =head1 CITATION
374              
375             If you use Bio::Phylo in published research, please cite it:
376              
377             B<Rutger A Vos>, B<Jason Caravas>, B<Klaas Hartmann>, B<Mark A Jensen>
378             and B<Chase Miller>, 2011. Bio::Phylo - phyloinformatic analysis using Perl.
379             I<BMC Bioinformatics> B<12>:63.
380             L<http://dx.doi.org/10.1186/1471-2105-12-63>
381              
382             =head1 AUTHOR
383              
384             Mark A. Jensen (maj -at- fortinbras -dot- us), refactored by Rutger Vos
385              
386             =head1 TODO
387              
388             The C<Bio::Phylo::Annotation> class is not yet DOMized.
389              
390             =cut
391              
392             }
393             1;
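
The factory pattern in the listing above can be sketched in isolation: a constructor that records the latest instance in a package global, a lazily created default, a format designator normalized with `ucfirst(lc(...))`, and an element class chosen from that designator. This is a simplified illustration under stated assumptions, not the module's implementation: the `My::DOM` and `My::DOM::Element::Twig` packages are hypothetical, and a plain hash-based object stands in for the inside-out storage used by `Bio::Phylo`.

```perl
use strict;
use warnings;

# Hypothetical element class; it is not part of Bio::Phylo.
package My::DOM::Element::Twig;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Hypothetical factory mirroring the structure of the listing above.
package My::DOM;

our $DOM;    # most recently constructed factory, used as the default

sub new {
    my ( $class, %args ) = @_;
    my $self = bless { format => 'twig' }, $class;    # 'twig' is the default
    $self->set_format( $args{'-format'} ) if defined $args{'-format'};
    return $DOM = $self;
}

# Mutator returns the invocant so that calls can be chained.
sub set_format {
    my ( $self, $format ) = @_;
    $self->{format} = $format;
    return $self;
}

# Accessor normalizes case, so 'twig', 'TWIG' and 'Twig' all give 'Twig'.
sub get_format {
    my $self = shift;
    return ucfirst( lc( $self->{format} ) );
}

# Static accessor: create the default factory on first use.
sub get_dom { return $DOM ||= __PACKAGE__->new }

# Build the implementation class name from the normalized format.
sub create_element {
    my ( $self, %args ) = @_;
    my $format = $self->get_format
      or die "DOM creator format not set\n";
    my $class = 'My::DOM::Element::' . $format;
    return $class->new(%args);
}

package main;

my $default = My::DOM->get_dom;    # lazily built with format 'twig'
my $element = $default->create_element( -tag => 'otu' );
print ref($element), "\n";         # My::DOM::Element::Twig

my $latest = My::DOM->new( -format => 'TWIG' );
print $latest->get_format, "\n";   # Twig
print My::DOM->get_dom == $latest ? "replaced\n" : "kept\n";    # replaced
```

Because every call to `new` overwrites the package global, code that relies on the default factory sees whichever instance was constructed last. That is convenient for short scripts, but it is worth keeping in mind when two formats are used in the same process.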