FastXML 0.1.93
May 6th, 2008
A long while ago I started work on a simple hpricot-like interface to libxml for ruby. I've finally pushed out an initial version, and it's available on the gem servers now: just run gem install fastxml. You can find the code on github.
Here's a sample of its usage:
    require 'fastxml'
    doc = FastXml( open( 'test.xml' ) )
    (doc/"//a").each { |a| puts a.attr['href'] }
Here's a simple synthetic benchmark. We just load a simple xml file and attempt an xpath search. It's worth noting that the regular libxml binding only looks very speedy on the search because it doesn't actually match anything. Libxml annoyingly wants all xpath queries namespace qualified, which doesn't work out well if you have no root namespace.
(in /Users/segfault/Devel/fastxml)
ruby ./benchmarks/speedtest.rb
                    user    system     total         real
fastxml.new     0.000000  0.000000  0.000000  ( 0.001211)
fastxml.to_s    0.000000  0.000000  0.000000  ( 0.000537)
fastxml.search  0.000000  0.000000  0.000000  ( 0.000315)
hpricot.new     0.020000  0.000000  0.020000  ( 0.021583)
hpricot.to_s    0.000000  0.000000  0.000000  ( 0.002366)
hpricot.search  0.000000  0.000000  0.000000  ( 0.000462)
libxml.new      0.000000  0.000000  0.000000  ( 0.001274)
libxml.to_s     0.000000  0.000000  0.000000  ( 0.000421)
libxml.search   0.000000  0.000000  0.000000  ( 0.000175)
REXML.new       0.020000  0.000000  0.020000  ( 0.018574)
REXML.to_s      0.010000  0.000000  0.010000  ( 0.003909)
REXML.xpath     0.000000  0.000000  0.000000  ( 0.001838)
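The namespace remark above is easier to see with a tiny example. This is only a sketch: it uses REXML (which ships with Ruby) rather than the libxml binding, and the feed, the `atom` prefix and the `qualified_entries` helper are made up for illustration. When every element sits in a default namespace, a qualified query has to bind a prefix to that namespace URI before it can match anything.

```ruby
require 'rexml/document'

# Hypothetical Atom-style feed: every element lives in the default namespace.
FEED_XML = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <entry><title>first</title></entry>
    <entry><title>second</title></entry>
  </feed>
XML

# The prefix is arbitrary; only the URI has to match the document.
ATOM_NS = { 'atom' => 'http://www.w3.org/2005/Atom' }

# Returns the entries found by a namespace-qualified query.
def qualified_entries(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '/atom:feed/atom:entry', ATOM_NS)
end

puts "qualified nodes: #{qualified_entries(FEED_XML).size}"
```

A parser that follows XPath 1.0 strictly treats an unprefixed /feed/entry as "elements in no namespace", which is why a query like that can come back empty on a feed such as this one.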
fastxml progress
September 6th, 2007
My little pet xml parser interface is coming along well. I'm starting to implement all the little detail bits of the Hpricot api, as well as can be done using libxml, that is. If anyone wants to test it out or help out, the code is available from svn://hasno.info/fastxml. I've also gone ahead and created a trac instance. Now for some random stats:
(in /Users/segfault/Devel/fastxml)
ruby ./benchmarks/unicode.rb
                      user      system       total          real
fastxml.new       0.040000    0.000000    0.040000  (  0.048961)
fastxml.to_s      0.020000    0.010000    0.030000  (  0.021063)
fastxml.search    0.000000    0.000000    0.000000  (  0.002023)
hpricot.new       0.700000    0.030000    0.730000  (  0.753549)
hpricot.to_s      0.140000    0.010000    0.150000  (  0.154857)
hpricot.search    0.280000    0.000000    0.280000  (  0.294201)
libxml.new        0.040000    0.000000    0.040000  (  0.038258)
libxml.to_s       0.010000    0.010000    0.020000  (  0.024945)
libxml.search     0.010000    0.000000    0.010000  (  0.002113)
REXML.new         1.390000    0.030000    1.420000  (  1.452444)
REXML.to_s        0.440000    0.010000    0.450000  (  0.464809)
REXML.xpath     103.720000    0.500000  104.220000  (107.125149)
xpath expression: //p
fastxml nodes: 10577
libxml nodes: 10577
hpricot nodes: 10577
REXML nodes: 10577
The unicode benchmark just runs three operations (new, to_s and an xpath query) on a well-formed xml file (~900k). It's apparent that everything is faster than REXML. I wonder if anyone's game to add a REXML wrapper onto one of these libraries in order to speed up existing apps...
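For anyone who wants to reproduce this kind of output, here is a rough sketch of a harness built on Ruby's standard Benchmark module. It is not the actual benchmarks/unicode.rb script: it only times REXML (so it runs without any extra gems), it uses a small generated document instead of the ~900k file, and the `sample_xml` and `run_benchmark` helpers are names invented for this example.

```ruby
require 'benchmark'
require 'rexml/document'

# Build a small synthetic document in memory (a stand-in for a real test file).
def sample_xml(paragraphs)
  body = (1..paragraphs).map { |i| "<p>paragraph #{i}</p>" }.join
  "<doc>#{body}</doc>"
end

# Time parse, serialize and xpath search; return the number of matched nodes.
def run_benchmark(xml, expression)
  doc = nil
  nodes = nil
  Benchmark.bm(12) do |bm|
    bm.report('REXML.new')   { doc = REXML::Document.new(xml) }
    bm.report('REXML.to_s')  { doc.to_s }
    bm.report('REXML.xpath') { nodes = REXML::XPath.match(doc, expression) }
  end
  puts "xpath expression: #{expression}"
  puts "REXML nodes: #{nodes.size}"
  nodes.size
end

run_benchmark(sample_xml(200), '//p')
```

Printing the node count next to the timings matters: a search that matches nothing will always look fast, so the count is the sanity check on the numbers.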
easy fast ruby libxml interface
July 26th, 2007
I've been working on a little project for a while now, an hpricot-styled ruby libxml library. I started the project in order to learn the ruby c extension api and create an easy to use xml library for ruby. Hpricot is great, but it is not a full-fledged xml library and isn't intended as such. Libxml has ruby bindings, but they provide the same libxml api, which is very un-ruby (imho). Rexml is just plain old slow. So my current hacking has left me with a library capable of loading xml strings, arrays, whatever into an object (from what I've seen, libxml-ruby doesn't support loading from memory/strings). I can run xpath searches and do xslt. I'm working on cleaning up the api to make it match hpricot, and then I'll probably release the parse/read-only version as v0.1 in the next few weeks.
Here's a snippet of benchmark output comparing the different libraries in use (run on a late 2k6 Macbook w/ 2gb of ram):
(in /Users/segfault/Devel/fastxml)
ruby ./benchmarks/speedtest.rb
                    user    system     total         real
fastxml.new     0.000000  0.000000  0.000000  ( 0.001102)
fastxml.to_s    0.000000  0.000000  0.000000  ( 0.000629)
fastxml.search  0.000000  0.000000  0.000000  ( 0.000207)
hpricot.new     0.010000  0.000000  0.010000  ( 0.012319)
hpricot.to_s    0.000000  0.000000  0.000000  ( 0.003164)
hpricot.search  0.010000  0.000000  0.010000  ( 0.000603)
libxml.new      0.000000  0.000000  0.000000  ( 0.001287)
libxml.to_s     0.000000  0.000000  0.000000  ( 0.000698)
libxml.search   0.000000  0.000000  0.000000  ( 0.000073)
REXML.new       0.020000  0.000000  0.020000  ( 0.024030)
REXML.to_s      0.010000  0.000000  0.010000  ( 0.011971)
REXML.xpath     0.000000  0.000000  0.000000  ( 0.001092)
xpath expression: /feed/entry
fastxml nodes: 15
libxml nodes: 0
hpricot nodes: 15
REXML nodes: 15