Xanimal: Easy XML parsing in Ruby
I was playing around with some XML parsing the other day, and allowed myself to become sidetracked. This is what resulted: a really easy way to extract data from an XML document. It’s called Xanimal for no better reason than that it contains X, M and L in that order.
It works by catching messages and building them up into XPath expressions, so that, for example, document.foo.bar[0] translates into '/foo/bar[1]' (note that it translates Ruby 0-indexing into XPath 1-indexing). It also allows iteration over nodes using each (or any other Enumerable method).
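To make the message-catching idea concrete, here is a minimal sketch of how it could be done with method_missing. It is not Xanimal's actual source: the PathBuilder class, its xpath reader and the ELEMENT_NAME pattern are names I have made up for illustration, and it only builds the XPath string without touching an XML parser.

```ruby
# Hypothetical illustration only; Xanimal's real implementation may differ.
class PathBuilder
  # Plain identifiers are treated as element names. Names starting with
  # "to_" are excluded so Ruby's implicit conversions (to_ary, to_str)
  # are not mistaken for elements.
  ELEMENT_NAME = /\A(?!to_)[A-Za-z_]\w*\z/

  attr_reader :xpath

  def initialize(xpath = '')
    @xpath = xpath
  end

  # Ruby index 0 becomes XPath position 1.
  def [](index)
    PathBuilder.new("#{@xpath}[#{index + 1}]")
  end

  # Each unknown, argument-free message appends one step to the path.
  def method_missing(name, *args)
    return super unless args.empty? && ELEMENT_NAME =~ name.to_s
    PathBuilder.new("#{@xpath}/#{name}")
  end

  def respond_to_missing?(name, include_private = false)
    !!(ELEMENT_NAME =~ name.to_s) || super
  end
end

document = PathBuilder.new
puts document.foo.bar[0].xpath # prints /foo/bar[1]
```

Each call returns a new builder rather than mutating the old one, so an intermediate path such as document.foo can be reused for several lookups.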
There’s also some potentially controversial functionality: if a non-alphabetic method is called on a node (e.g. +), the node is automatically coerced into a String, Float, or Integer, depending on its format.
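One way such coercion could work is sketched below. This is an assumption on my part rather than Xanimal's code: TextNode, its value method and the two regular expressions are hypothetical, and the rules for what counts as an Integer or a Float are deliberately simple.

```ruby
# Hypothetical illustration only; Xanimal's real rules may differ.
class TextNode
  INTEGER = /\A[-+]?\d+\z/
  FLOAT   = /\A[-+]?\d+\.\d+\z/

  def initialize(text)
    @text = text.strip
  end

  # Pick a Ruby type from the format of the node's text.
  def value
    case @text
    when INTEGER then @text.to_i
    when FLOAT   then @text.to_f
    else @text
    end
  end

  # Operators such as +, -, * and < are not defined on Object, so they
  # arrive here. Messages that start with a letter or underscore are
  # left alone.
  def method_missing(name, *args, &block)
    return super if name.to_s =~ /\A[A-Za-z_]/
    operands = args.map { |arg| arg.is_a?(TextNode) ? arg.value : arg }
    value.public_send(name, *operands, &block)
  end

  def respond_to_missing?(name, include_private = false)
    (name.to_s !~ /\A[A-Za-z_]/ && value.respond_to?(name)) || super
  end
end

p TextNode.new('2') + 3          # prints 5
p TextNode.new('1.5') * 2        # prints 3.0
p TextNode.new('alpha') + 'bet'  # prints "alphabet"
```

Note that this sketch only handles a node on the left-hand side of an operator. Supporting 0 + node would also need Ruby's coerce protocol, which is left out here.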
Here’s an example of how to use it:
    require 'xanimal'

    xml = %{
      <root>
        <a name="alpha">
          <b>1</b>
        </a>
        <a name="beta">
          <b>2</b>
          <b>3</b>
        </a>
      </root>
    }

    doc = Xanimal::Document.string(xml)

    # Attributes
    doc.root.a.attr(:name) # => "alpha"

    # Specifying nodes by index
    doc.root.a[1].b[1].to_i # => 3

    # Enumeration with automatic coercion
    doc.root.a[1].b.inject(0){ |sum, value| sum + value } # => 5

    # Unanchored search
    doc.any.b.to_i # => 1

    # Doesn't work ... yet
    # doc.root.a.b.inject(0){ |sum, value| sum + value } # => 6
Even in its basic state, for the subset of situations where its features are adequate, I think it provides a few advantages over using an XML parser directly:
- It looks like Ruby
- It behaves like Ruby (e.g. by translating indexing offsets)
- The XML parser is abstracted, so, whilst I’ve used LibXML2, it would also be possible to switch it for REXML or something else without altering client code.
The code is available via Subversion:
svn co http://paulbattley.googlecode.com/svn/xanimal/trunk xanimal
Contributions would be very welcome!