XML parser speed

Postby softweyr » 25 Jul 2012, 21:43

I've written a little compiler that uses the Poco DOM parser to parse its input. I initially wrote it using the Pugi XML library, then ported it to Poco: we're using Poco in our product, so we might as well use it for the tools too, right? The port was very straightforward, with a nearly one-to-one mapping of Pugi calls to Poco calls. But here's the rub:

Code:
schemaCompiler wpeters$ time ./build/Debug/schemaCompiler schema.xml > /tmp/schema.sql

real   0m40.409s
user   0m40.138s
sys   0m0.119s
schemaCompiler wpeters$ time ../pugi/schemaCompiler/build/Debug/schemaCompiler schema.xml > /tmp/schema.sql

real   0m0.108s
user   0m0.065s
sys   0m0.034s


Ugh. Why would the Poco code be roughly 400 times slower? The input file is relatively large, about 8.7 MB, but that's not all that big on a system with 8 GB of RAM. I can post the code and the data file if anyone is interested in looking at it. The file is loaded into memory using a Poco::XML::DOMParser and then traversed using getElementsByTagName, getAttributeNode, and getChildElement.
softweyr
 
Posts: 4
Joined: 24 Jul 2012, 02:32

Re: XML parser speed

Postby guenter » 26 Jul 2012, 18:52

Try the latest code in the 1.4.4 SVN repository. It may solve this issue.
guenter
 
Posts: 1107
Joined: 11 Jul 2006, 16:27
Location: Austria

Re: XML parser speed

Postby marlowabnp » 08 Aug 2012, 10:05

It might also be worthwhile to compare the speed of Poco with libxml2. I have found libxml2 to be very fast compared to Xerces. The library is rather large, and it is C rather than C++, but it is quite easy to wrap some of the libxml2 functions in a little utility class to make it more convenient for OO development (e.g. management of libxml2-allocated memory).
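A minimal sketch of what such a utility class could look like, assuming the usual RAII approach of holding the C handle in a std::unique_ptr with a custom deleter. To keep the example self-contained, CDoc, cdoc_parse and cdoc_free are hypothetical stand-ins for a C library; with libxml2 the corresponding roles would be played by xmlDoc, a parse call such as xmlReadFile, and xmlFreeDoc. The Document class name is also just an illustration.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// --- Hypothetical stand-in for a C library API (NOT libxml2 itself) ---
struct CDoc { std::string root; };
static int g_liveDocs = 0;  // counts handles that have not been freed yet

CDoc* cdoc_parse(const char* text) {
    if (text == nullptr) return nullptr;  // C-style failure signal
    ++g_liveDocs;
    return new CDoc{text};
}

void cdoc_free(CDoc* doc) {
    --g_liveDocs;
    delete doc;
}

// --- Small C++ wrapper that owns the C handle ---
class Document {
public:
    // Parses the input and takes ownership of the resulting handle.
    explicit Document(const char* text)
        : doc_(cdoc_parse(text), &cdoc_free) {
        if (!doc_) throw std::runtime_error("parse failed");
    }

    const std::string& root() const {
        assert(doc_ && "constructor guarantees a valid handle");
        return doc_->root;
    }

private:
    // unique_ptr calls cdoc_free automatically when the wrapper is destroyed,
    // including during stack unwinding after an exception.
    std::unique_ptr<CDoc, void (*)(CDoc*)> doc_;
};
```

With a wrapper like this, calling code never has to call the C free function by hand: the handle is released when the Document object goes out of scope, and the class is movable but not copyable, so a handle cannot be freed twice.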
marlowabnp
 
Posts: 80
Joined: 08 Nov 2010, 17:29

