File Coverage

blib/lib/Bio/Phylo/NeXML/DOM.pm
Criterion Covered Total %
statement 18 42 42.8
branch 0 8 0.0
condition 0 3 0.0
subroutine 6 16 37.5
pod 8 8 100.0
total 32 77 41.5


line stmt bran cond sub pod time code
1             package Bio::Phylo::NeXML::DOM;
2 51     51   285 use strict;
  51         90  
  51         1278  
3 51     51   225 use base 'Bio::Phylo';
  51         85  
  51         4458  
4 51     51   299 use Bio::Phylo::Util::CONSTANT qw'_DOMCREATOR_ looks_like_class';
  51         94  
  51         2388  
5 51     51   267 use Bio::Phylo::Util::Exceptions 'throw';
  51         92  
  51         1829  
6 51     51   262 use Bio::Phylo::Factory;
  51         102  
  51         269  
7 51     51   247 use File::Spec::Unix;
  51         112  
  51         20710  
8              
9             # store DOM factory object as a global here, to avoid proliferation of
10             # function arguments
11             our $DOM;
12             {
13             my $CONSTANT_TYPE = _DOMCREATOR_;
14             my (%format);
15             my $fac = Bio::Phylo::Factory->new;
16              
17             =head1 NAME
18              
19             Bio::Phylo::NeXML::DOM - XML DOM support for Bio::Phylo
20              
21             =head1 SYNOPSIS
22              
23             use Bio::Phylo::NeXML::DOM;
24             use Bio::Phylo::IO qw( parse );
25             Bio::Phylo::NeXML::DOM->new(-format => 'twig');
26             my $project = parse( -file=>'my.nex', -format=>'nexus' );
27             my $nex_twig = $project->doc();
28              
29             =head1 DESCRIPTION
30              
31             This module adds C<to_dom> methods to L<Bio::Phylo::NeXML::Writable>
32             classes, which provide NeXML-valid objects for document object model
33             manipulation. DOM formats currently available are C<XML::Twig> and
34             C<XML::LibXML>. For any C<XMLWritable> object, use C<to_dom> in place
35             of C<to_xml> to create DOM nodes.
36              
37             The C<doc()> method is also added to the C<Bio::Phylo::Project> class. It
38             returns a NeXML document as a DOM object populated by the current contents
39             of the C<Bio::Phylo::Project> object.
40              
41             =head1 MOTIVATION
42              
43             The NeXML parsing/writing capability of C<Bio::Phylo> goes a long way
44             towards wider adoption of this useful standard.
45              
46             However, while C<Bio::Phylo> can write NeXML-valid XML, the way in
47             which it does this natively is somewhat hard-coded and therefore
48             restricted, and is essentially oriented toward text file output. As
49             such, there is a mismatch between the sophisticated C<Bio::Phylo> data
50             structure and its own ability to manipulate and serialize that
51             structure in sophisticated but interoperable ways. Finer manipulations
52             of XML-represented data are possible through a variety of Perl
53             packages that can store and control XML according to a document
54             object model (DOM). Many of these packages allow extremely flexible
55             computation over large datasets stored in XML format, and admit the
56             use of XML-related facilities such as XPath and XSLT programmatically.
57              
58             The purpose of C<Bio::Phylo::NeXML::DOM> is to introduce integrated DOM
59             object creation and manipulation to C<Bio::Phylo>, both to make DOM
60             computation in C<Bio::Phylo> more convenient, and also to provide a
61             platform for potentially more sophisticated C<Bio::Phylo> modules to
62             come.
63              
64             =head1 DESIGN
65              
66             Besides the notion that DOM capability should be optional for the user,
67             there are two main design ideas. First, for each C<Bio::Phylo> object
68             that can be parsed/written as NeXML (i.e., for each
69             C<Bio::Phylo::NeXML::Writable> object), we provide an analogous method
70             for creating a representative DOM object, or element. These elements
71             are aggregatable in a DOM document object, whose native stringifying
72             method can be used to generate valid NeXML.
73              
74             Second, we allow flexibility and extensibility in the choice of the
75             underlying DOM package, while maintaining a consistent DOM interface
76             that is similar in semantic and syntactic style to the accessors and
77             mutators that act on the C<Bio::Phylo> objects themselves. This is
78             achieved through the DOM::DocumentI and DOM::ElementI interfaces,
79             which define a minimal subset of DOM accessors and mutators, their
80             inputs and outputs. Concrete instances of these interface classes
81             provide the bindings between the abstract methods and their
82             counterparts in the desired DOM implementation. Currently, there are
83             bindings for two popular packages, C<XML::Twig> and C<XML::LibXML>.
84              
85             Another priority was simplicity of use; most of the details remain
86             under the hood in practice. The C<Bio/Phylo/Util/DOM.pm> file defines the
87             C<to_dom()> method for each C<XMLWritable> package, as well as the
88             C<Bio::Phylo::NeXML::DOM> package proper. The C<DOM> object is a
89             factory that is used to create Element and Document objects; it is an
90             inside-out object that subclasses C<Bio::Phylo>. To curb the
91             proliferation of method arguments, a DOM factory instance (set by the
92             latest invocation of C<Bio::Phylo::NeXML::DOM-E<gt>new()>) is maintained in
93             a package global. This is used by default for object creation with DOM
94             methods if a DOM factory object is not explicitly provided in the
95             argument list.
96              
97             The underlying DOM implementation is set with the C<DOM> factory
98             constructor's single argument, C<-format>. Even this can be left out;
99             the default implementation is C<XML::Twig>, which is already required
100             by C<Bio::Phylo>. Thus, for example, one can use the DOM to convert
101             a Nexus file to a DOM representation as follows:
102              
103             use Bio::Phylo::NeXML::DOM;
104             use Bio::Phylo::IO qw( parse );
105             Bio::Phylo::NeXML::DOM->new();
106             my $project = parse( -file=>'my.nex', -format=>'nexus' );
107             my $nex_twig = $project->doc();
108             # The end.
109              
110             Underlying DOM packages are loaded at runtime as specified by the
111             C<-format> argument. Packages for unused formats do not need to be
112             installed.
113              
114             =head1 INTERFACE METHODS
115              
116             The minimal DOM interface specifies the following methods. Details can be
117             obtained from the C<Element> and C<Document> POD.
118              
119             =head2 Bio::Phylo::NeXML::DOM::Element - DOM Element abstract class
120              
121             get_tagname()
122             set_tagname()
123             get_attributes()
124             set_attributes()
125             clear_attributes()
126             get_text()
127             set_text()
128             clear_text()
129              
130             get_parent()
131             get_children()
132             get_first_child()
133             get_last_child()
134             get_next_sibling()
135             get_prev_sibling()
136             get_elements_by_tagname()
137              
138             set_child()
139             prune_child()
140              
141             to_xml_string()
142              
143             =head2 Bio::Phylo::NeXML::DOM::Document - DOM Document
144              
145             get_encoding()
146             set_encoding()
147              
148             get_root()
149             set_root()
150              
151             get_element_by_id()
152             get_elements_by_tagname()
153              
154             to_xml_string()
155             to_xml_file()
156              
157             =head1 METHODS
158              
159             =head2 CONSTRUCTOR
160              
161             =over
162              
163             =item new()
164              
165             Type : Constructor
166             Title : new
167             Usage : $dom = Bio::Phylo::NeXML::DOM->new(-format=>$format)
168             Function: Create a new DOM factory
169             Returns : DOM object
170             Args : optional: -format => DOM format (defaults to 'twig')
171              
172             =cut
173              
174             sub new {
175 0     0 1   my $self = shift->SUPER::new( '-format' => 'twig', @_ );
176 0           return $DOM = $self;
177             }
178              
179             =back
180              
181             =head2 FACTORY METHODS
182              
183             =over
184              
185             =item create_element()
186              
187             Type : Factory method
188             Title : create_element
189             Usage : $elt = $dom->create_element()
190             Function: Create a new XML DOM element
191             Returns : DOM element
192             Args : Optional:
193             -tag => $tag_name
194             -attr => \%attr_hash
195              
196             =cut
197              
198             sub create_element {
199 0 0   0 1   if ( my $format = shift->get_format ) {
200 0           return $fac->create_element( '-format' => $format, @_ );
201             }
202             else {
203 0           throw 'BadArgs' => 'DOM creator format not set';
204             }
205             }
206              
207             =item parse_element()
208              
209             Type : Factory method
210             Title : parse_element
211             Usage : $elt = $dom->parse_element($text)
212             Function: Create a new XML DOM element from XML text
213             Returns : DOM element
214             Args : An XML String
215              
216             =cut
217              
218             sub parse_element {
219 0 0   0 1   if ( my $f = shift->get_format ) {
220 0           return looks_like_class( __PACKAGE__ . '::Element::' . $f )
221             ->parse_element(shift);
222             }
223             else {
224 0           throw 'BadArgs' => 'DOM creator format not set';
225             }
226             }
227              
228             =item create_document()
229              
230             Type : Factory method
231             Title : create_document
232             Usage : $doc = $dom->create_document()
233             Function: Create a new XML DOM document
234             Returns : DOM document
235             Args : Package-specific args
236              
237             =cut
238              
239             sub create_document {
240 0 0   0 1   if ( my $format = shift->get_format ) {
241 0           return $fac->create_document( '-format' => $format, @_ );
242             }
243             else {
244 0           throw 'BadArgs' => 'DOM creator format not set';
245             }
246             }
247              
248             =item parse_document()
249              
250             Type : Factory method
251             Title : parse_document
252             Usage : $doc = $dom->parse_document($text)
253             Function: Create a new XML DOM document from XML text
254             Returns : DOM document
255             Args : An XML String
256              
257             =cut
258              
259             sub parse_document {
260 0 0   0 1   if ( my $format = shift->get_format ) {
261 0           my $implementation = __PACKAGE__ . '::' . $format;
262 0           return $implementation->parse_document(shift);
263             }
264             else {
265 0           throw 'BadArgs' => 'DOM creator format not set';
266             }
267             }
268              
269             =back
270              
271             =head2 MUTATORS
272              
273             =over
274              
275             =item set_format()
276              
277             Type : Mutator
278             Title : set_format
279             Usage : $dom->set_format($format)
280             Function: Set the format (underlying DOM package bindings) for this object
281             Returns : the invocant (the DOM factory object), for method chaining
282             Args : format designator as string
283              
284             =cut
285              
286             sub set_format {
287 0     0 1   my $self = shift;
288 0           $format{ $self->get_id } = shift;
289 0           return $self;
290             }
291              
292             =back
293              
294             =head2 ACCESSORS
295              
296             =over
297              
298             =item get_format()
299              
300             Type : Accessor
301             Title : get_format
302             Usage : $dom->get_format()
303             Function: Get the format designator for this object
304             Returns : format designator as string
305             Args : none
306              
307             =cut
308              
309             sub get_format {
310 0     0 1   my $self = shift;
311 0           return ucfirst( lc( $format{ $self->get_id } ) );
312             }
313              
314             =item get_dom()
315              
316             Type : Static accessor
317             Title : get_dom
318             Usage : __PACKAGE__->get_dom()
319             Function: Get the singleton DOM object
320             Returns : instance of this __PACKAGE__
321             Args : none
322              
323             =cut
324              
325 0   0 0 1   sub get_dom { $DOM ||= __PACKAGE__->new }
326              
327             =begin comment
328              
329             Type : Internal method
330             Title : _type
331             Usage : $node->_type;
332             Function:
333             Returns : CONSTANT
334             Args :
335              
336             =end comment
337              
338             =cut
339              
340 0     0     sub _type { $CONSTANT_TYPE }
341              
342             =begin comment
343              
344             Type : Internal method
345             Title : _cleanup
346             Usage : $node->_cleanup;
347             Function: Deletes this object's entry from the format lookup table
348             Returns : the deleted format designator, if any
349             Args :
350              
351             =end comment
352              
353             =cut
354              
355             sub _cleanup {
356 0     0     my $self = shift;
357 0           delete $format{ $self->get_id };
358             }
359              
360             =back
361              
362             =cut
363              
364             =head1 SEE ALSO
365              
366             There is a mailing list at L<https://groups.google.com/forum/#!forum/bio-phylo>
367             for any user or developer questions and discussions.
368              
369             The DOM creator abstract classes: L<Bio::Phylo::NeXML::DOM::Element>,
370             L<Bio::Phylo::NeXML::DOM::Document>
371              
372             =head1 CITATION
373              
374             If you use Bio::Phylo in published research, please cite it:
375              
376             B<Rutger A Vos>, B<Jason Caravas>, B<Klaas Hartmann>, B<Mark A Jensen>
377             and B<Chase Miller>, 2011. Bio::Phylo - phyloinformatic analysis using Perl.
378             I<BMC Bioinformatics> B<12>:63.
379             L<http://dx.doi.org/10.1186/1471-2105-12-63>
380              
381             =head1 AUTHOR
382              
383             Mark A. Jensen (maj -at- fortinbras -dot- us), refactored by Rutger Vos
384              
385             =head1 TODO
386              
387             The C<Bio::Phylo::Annotation> class is not yet DOMized.
388              
389             =cut
390              
391             }
392             1;
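
The DESIGN section of the POD above describes the factory as an inside-out object whose latest instance is kept in a package global and used as the default. The following is a minimal, self-contained sketch of that pattern only. It does not use Bio::Phylo: the package name My::DOMFactory, the $DEFAULT variable and the get_default method are hypothetical stand-ins for Bio::Phylo::NeXML::DOM, $DOM and get_dom, and the id counter is an assumption replacing get_id.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the DOM factory; not part of Bio::Phylo.
package My::DOMFactory;

our $DEFAULT;        # package global: the most recently created factory
my %format;          # inside-out storage: object id => format designator
my $next_id = 0;     # assumption: a simple counter instead of get_id

sub new {
    my ( $class, %args ) = @_;
    # The object is a blessed scalar holding only its id; the actual
    # state lives in the lexical %format hash, keyed by that id.
    my $self = bless \( my $id = ++$next_id ), $class;
    $format{$$self} = $args{-format} || 'twig';
    return $DEFAULT = $self;    # the latest instance becomes the default
}

# Normalise case the same way get_format does in the listing above.
sub get_format { ucfirst lc $format{ ${ $_[0] } } }

# Return the stored default, creating one lazily on first use.
sub get_default { $DEFAULT ||= $_[0]->new }

# Note: a real inside-out class must also delete its %format entry when
# an object goes away, as _cleanup does in the listing above.

package main;

my $first  = My::DOMFactory->get_default;                  # lazily created
my $second = My::DOMFactory->new( -format => 'LIBXML' );   # replaces default

print $first->get_format, "\n";                        # prints "Twig"
print My::DOMFactory->get_default->get_format, "\n";   # prints "Libxml"
```

Because new() always overwrites the package global, code that relies on the default sees whichever factory was constructed last. Passing a factory explicitly avoids that coupling.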