
XmlDocument vs XPathNavigator

Posted June 16th, 2004 by Matt Berther

I have been happily using the XmlDocument object, as it was a natural progression from the MSXML4 object model, which I've used for years.

Today, however, I was amazed by the performance difference between XmlDocument and XPathNavigator. For a little demo project, I had a directory full of XML files (11,250 to be exact). The goal was to iterate over these files, load each one, and pull the relevant information out of it to populate a Lucene.Net index.

I implemented this using an XmlDocument that was loaded with the contents of each file, used SelectNodes and SelectSingleNode to get the information I wanted out of the XML, and then placed those pieces in the index. This took approximately 8.5 minutes to complete.
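As a rough illustration of that first approach, here is a minimal sketch. The sample XML, the element names, and the helper names (`ReadTitle`, `CountTags`) are assumptions made up for this example, not the actual code or schema from the project, and the Lucene.Net indexing step is left out.

```csharp
using System;
using System.Xml;

static class XmlDocumentExample
{
    // Hypothetical sample document; the real files' schema is not shown in the post.
    public const string Sample =
        "<post><title>Hello</title><tags><tag>xml</tag><tag>dotnet</tag></tags></post>";

    // SelectSingleNode returns the first match, or null when nothing matches.
    public static string ReadTitle(XmlDocument doc)
    {
        XmlNode title = doc.SelectSingleNode("/post/title");
        return title == null ? null : title.InnerText;
    }

    // SelectNodes returns every match as an XmlNodeList.
    public static int CountTags(XmlDocument doc)
    {
        return doc.SelectNodes("/post/tags/tag").Count;
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // The post loaded files from disk; doc.Load(path) is the file-based equivalent.
        doc.LoadXml(Sample);

        Console.WriteLine(ReadTitle(doc));
        Console.WriteLine(CountTags(doc));
    }
}
```

XmlDocument builds a full, editable DOM for every file, which is more than a read-only extraction job needs.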

This just seemed to be way too long, so after doing a bit of looking around online, I came across Dare’s article regarding Best Practices for Representing XML in the .NET Framework.

Since I had no need to update the XML, I was left with three APIs: System.Xml.XmlReader, System.Xml.XPath.XPathNavigator, and System.Xml.XmlDocument. I was already using System.Xml.XmlDocument, so that was out of the picture. System.Xml.XmlReader was also out of the picture, since I needed to be able to use XPath queries to get items out.

That left the XPathNavigator. I went through and updated the code to use an XPathDocument to load the XML, and then passed it off to the method that actually did the parsing. For what it's worth, I left the parameter to this method as IXPathNavigable, since both XmlDocument and XPathDocument implement this interface. This way, I could revert in case the test failed.
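A minimal sketch of that arrangement might look like the following. The sample XML and the `ReadTitle` helper are hypothetical; the only parts taken from the post are the use of XPathDocument and the IXPathNavigable parameter type.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

static class NavigatorExample
{
    // Hypothetical sample document, for illustration only.
    public const string Sample =
        "<post><title>Hello</title><tags><tag>xml</tag><tag>dotnet</tag></tags></post>";

    // Taking IXPathNavigable means callers can pass an XmlDocument or an XPathDocument.
    public static string ReadTitle(IXPathNavigable source)
    {
        XPathNavigator nav = source.CreateNavigator();
        XPathNodeIterator matches = nav.Select("/post/title");
        // The iterator starts before the first match; MoveNext advances onto it.
        return matches.MoveNext() ? matches.Current.Value : null;
    }

    static void Main()
    {
        // Read-only store, optimized for XPath queries.
        XPathDocument readOnly = new XPathDocument(new StringReader(Sample));

        // Editable DOM; kept here only to show the same method accepts both.
        XmlDocument editable = new XmlDocument();
        editable.LoadXml(Sample);

        Console.WriteLine(ReadTitle(readOnly));
        Console.WriteLine(ReadTitle(editable));
    }
}
```

Because the parsing method only depends on the interface, switching between the two document types is a one-line change at the call site.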

I then updated all the code in the parsing method to use the XPathNavigator methods, and eagerly ran the test. All the data came back the same and an identical index was created, but this time it ran in 5.75 minutes.

Absolutely amazing that using a different object model could make *that* big of a difference (right around 2 minutes and 45 seconds).

The key thing to take away from this is that unless you *absolutely* need the editing capabilities of XmlDocument, you would be much better off, performance-wise, using the XPathNavigator.

5 Comments

This entry was posted on Wednesday, June 16th, 2004 at 1:23 pm and is filed under Uncategorized.

DonXML Demsak
June 16th, 2004

See my post Checklist: XML Performance – The Contradiction ( http://donxml.com/allthingstechie/archive/2004/06/16/828.aspx ), which documents one of the major flaws (IMHO) with XPathNavigator: the XPathNavigator.Select() method always executes from the root context. It also links out to kzu’s great post on how to compile XPath statements that use variables (dynamic XPath).

Matt Berther
June 17th, 2004

Thanks for the heads-up. This is something that is worth knowing, although, in this particular case, that did not turn out to be an issue. Your link to kzu’s post did help me shave another 25 seconds off the run time. I was unaware that you could compile an XPathExpression out of any document. I assumed it had to be created in the context of the document it would be used in, so moving those out into an XPathExpressionCache class as suggested helped a little more with performance.
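For readers who want to picture the caching idea, here is a small sketch. Only the class name XPathExpressionCache comes from the comment above; the implementation is an assumption and is not kzu's code. It also assumes .NET 2.0 or later for the static `XPathExpression.Compile` method (on .NET 1.1 the equivalent would be calling `Compile` on a navigator instance), and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

// Hypothetical implementation: compiles each XPath string once and reuses it.
static class XPathExpressionCache
{
    static readonly Dictionary<string, XPathExpression> Cache =
        new Dictionary<string, XPathExpression>();

    public static XPathExpression Get(string xpath)
    {
        XPathExpression expression;
        if (!Cache.TryGetValue(xpath, out expression))
        {
            // Compilation is not tied to any particular document.
            expression = XPathExpression.Compile(xpath);
            Cache[xpath] = expression;
        }
        return expression;
    }
}

static class CacheDemo
{
    // Evaluates the cached expression against a document held in a string.
    public static string ReadTitle(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        XPathNodeIterator matches = nav.Select(XPathExpressionCache.Get("/post/title"));
        return matches.MoveNext() ? matches.Current.Value : null;
    }

    static void Main()
    {
        // Two made-up documents queried with the same compiled expression.
        Console.WriteLine(ReadTitle("<post><title>One</title></post>"));
        Console.WriteLine(ReadTitle("<post><title>Two</title></post>"));
    }
}
```

With thousands of files and a fixed set of queries, each XPath string is parsed once instead of once per file.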

Brian Lyttle
June 17th, 2004

Would XPathReader be an even quicker option for you?

http://www.microsoft.com/downloads/details.aspx?FamilyID=db0c5fae-111d-4b24-b10c-e4cdb13705da&DisplayLang=en

Matt Berther
June 17th, 2004

Thanks for the comment, Brian. I just got done looking at XPathReader, and for some reason, it is actually slower for me, by about 30 seconds.

I’m not sure if it is because it is actually going through every element in the document, whereas (I think) the XPathNavigator is only picking out the items I’m interested in.

unni
June 6th, 2007

just wondering how we can build an XPath expression if we want to load the xml from a string, not from a file.
is there no way to do that? i am not seeing an XPathDocument constructor that takes a string instead of a uri to load the xml data

clueless!…
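One possible answer to the question above, as a sketch: the XPathDocument constructor that takes a string treats it as a URI, but there is also an overload that takes a TextReader, so the XML text can be wrapped in a StringReader. The `FromString` helper and the sample XML are hypothetical names used only for this example.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class StringSourceExample
{
    // Wraps the XML text in a StringReader so the TextReader overload is used
    // rather than the overload that interprets a string as a URI.
    public static XPathNavigator FromString(string xml)
    {
        return new XPathDocument(new StringReader(xml)).CreateNavigator();
    }

    static void Main()
    {
        XPathNavigator nav = FromString("<post><title>Hello</title></post>");
        XPathNodeIterator matches = nav.Select("/post/title");
        if (matches.MoveNext())
        {
            Console.WriteLine(matches.Current.Value);
        }
    }
}
```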
