XML Parsers

Everything you need to know about parsing XML, without long-winded rambling or flag-waving for the latest and greatest frameworks.

There are two basic strategies for parsing XML.

1. DOM – Document/tree based strategy.
2. SAX – event based strategy.

DOM

  • Simple.
  • Nested tree structure.
  • Process and query with XPath or other parser-supported functions.
  • Implements a specific interface defined by the W3C.
  • You have to load the whole document into memory, so watch out if the XML is big.
  • Very large XML documents do not suit this style, since the whole tree has to fit in memory; small, simple documents are where it works best.
  • Random access to the document.
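To make the DOM points concrete, here is a minimal Java sketch using the JDK's built-in parser. The sample XML and the names DomExample, parse and titles are made up for illustration, and the code assumes a reasonably modern JDK rather than 1.5. It loads the whole document into a tree, then visits nodes in whatever order it likes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class DomExample {
    // Hypothetical sample document, invented for this sketch.
    static final String XML =
        "<library>"
      + "<book id=\"1\"><title>First</title></book>"
      + "<book id=\"2\"><title>Second</title></book>"
      + "</library>";

    // Parses the whole document into an in-memory tree.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Collects the text of every <title> element, in document order.
    static List<String> titles(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName("title");
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // Random access: read the second book before the first.
        NodeList books = doc.getElementsByTagName("book");
        Element second = (Element) books.item(1);
        Element first = (Element) books.item(0);
        System.out.println(second.getAttribute("id") + " " + first.getAttribute("id"));
        System.out.println(titles(doc));
    }
}
```

Because the tree stays in memory, you can go back to any node as often as you like. That convenience is exactly what costs you on big documents.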

SAX

  • Complex to use.
  • Event based.
  • Start and end tags, XML comments and entity declarations all raise events.
  • Flexible: you handle only the events you care about, and act on them as they are raised.
  • Once events have been fired they are gone; if you need to get back to the same bit of XML, you have to read the document again.
  • Malformed XML makes the parser throw an exception partway through, after earlier events have already been handled. Make sure your document is well formed, or be ready to handle the error.
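The event-based style is easier to see in code. The following is a rough Java sketch using the JDK's SAX support; the class names SaxExample and TitleHandler and the sample XML are hypothetical. The handler keeps its own state between events, because the parser remembers nothing for you.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxExample {
    // Records the text of each <title> element as the events arrive.
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder current; // non-null only while inside <title>

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("title".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text can arrive in several chunks, so append rather than assign.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName) && current != null) {
                titles.add(current.toString());
                current = null;
            }
        }
    }

    // Streams the document once and returns the titles seen before any parse error.
    static List<String> titles(String xml) {
        TitleHandler handler = new TitleHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Malformed XML: events fired before the error were already handled.
            System.err.println("Parse stopped: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Could not run the SAX parser", e);
        }
        return handler.titles;
    }

    public static void main(String[] args) {
        System.out.println(titles(
            "<library><title>First</title><title>Second</title></library>"));
        // Mismatched end tag: the parser gives up partway through the document.
        System.out.println(titles(
            "<library><title>First</title><title>Second</wrong></library>"));
    }
}
```

Note the design choice in titles(): this sketch returns whatever was collected before the error. Depending on your application, it may be safer to discard partial results when the parser throws.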

C# and Java XML Implementations

C#

  • Look at System.Xml.XmlReader for forward-only, streaming reads (a pull model rather than SAX-style callbacks).
  • DOM trees can be used via XmlDocument, and XPathDocument lets you interrogate the tree nodes via XPath.

Java (1.5)

  • If you are using 1.5 you are in luck, as this version bundles a parser derived from the Apache Xerces project.
  • Look at the javax.xml.* packages: javax.xml.parsers for parsing and javax.xml.xpath for XPath.
  • DOM functionality can be found in org.w3c.dom.*
  • If you are using Java 1.4 or below, you will have to install Xerces.
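
As a quick illustration of the XPath support in javax.xml.xpath, here is a small sketch. The class name XPathExample, its helper methods and the sample XML are assumptions made for this example, and it is written for a modern JDK rather than 1.5.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathExample {
    // Hypothetical sample document, invented for this sketch.
    static final String XML =
        "<library>"
      + "<book id=\"1\"><title>First</title></book>"
      + "<book id=\"2\"><title>Second</title></book>"
      + "</library>";

    // Evaluates an XPath expression and returns the result as a string.
    static String evaluate(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression or document", e);
        }
    }

    // Evaluates a numeric XPath expression, such as count(...), as an int.
    static int evaluateInt(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            Double value = (Double) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)),
                XPathConstants.NUMBER);
            return value.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression or document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("/library/book[@id='2']/title", XML));
        System.out.println(evaluateInt("count(/library/book)", XML));
    }
}
```

Each call here re-parses the XML from scratch. If you plan to run many queries against one document, parse it into a DOM tree once and pass that tree to evaluate instead.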