Join David Gassner for an in-depth discussion in this video Choosing an XML processing API, part of XML Integration with Java.
Java developers have many choices in deciding how to work with XML in their applications. To choose the right API, you should first decide what's important to you and your application. Some applications just need speed, the best possible performance. To figure out which API will be fastest for you, you'll need to take into account which platform or operating system you're using and the size and complexity of the XML content you'll be working with. Other environments need to pay attention to memory usage.
In some environments you might have plenty of memory, but if you're building an app for mobile devices, say for Android, you might be constrained. You should also pay attention to ease of programming, both in the initial development of your app and in long-term maintenance. Some of the older APIs, such as DOM and SAX, can take more code and be more complex, whereas newer APIs such as JAXB, the Java Architecture for XML Binding, can take significantly less code.
But you might also need your code to work on Android, and that will put certain limits on your choices. For example, there are no current implementations for Android for the JAXB and StAX APIs. Here are the different types of XML processors in Java. Typically, they break down into three categories: tree-based APIs, streaming APIs, and binding APIs. A tree-based XML processor represents the entire XML document as a tree of objects in memory.
This gives you a lot of convenience. You can traverse the tree forward and back, and you can inspect one part of the tree and then jump to another part pretty easily. One of the great advantages of a tree-based processor is that you can search the XML content with the XPath expression language or with tools that are specific to a particular API. But the downside of a tree-based processor is that it takes more memory, and certain tasks can be a lot slower than with a streaming API.
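To make the XPath point concrete, here is a minimal sketch that loads a small document into a DOM tree and then searches it with the XPath API that ships with Java SE. The class name `TreeSearchSketch`, the sample XML, and the `select` helper are made up for illustration and are not taken from the course files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class TreeSearchSketch {

    // Hypothetical sample data, used only for this illustration.
    static final String SAMPLE_XML =
        "<customers>"
      + "<customer id=\"1\"><name>Ann</name><city>Seattle</city></customer>"
      + "<customer id=\"2\"><name>Bob</name><city>Boston</city></customer>"
      + "<customer id=\"3\"><name>Cy</name><city>Seattle</city></customer>"
      + "</customers>";

    // Parses the whole document into memory, then returns the text of
    // every node matched by the XPath expression, in document order.
    static List<String> select(String xml, String expression) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (ParserConfigurationException | SAXException
                 | IOException | XPathExpressionException e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        // Names of all customers whose <city> is Seattle.
        System.out.println(select(SAMPLE_XML, "/customers/customer[city='Seattle']/name"));
        // prints [Ann, Cy]
    }
}
```

Because the whole tree is held in memory, the same `Document` could be queried again with a different expression without re-reading the input, which is the convenience that costs the extra memory.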
Examples of tree-based processors include the Document Object Model, or DOM, and JDOM. Streaming processors are designed to build or parse XML one node at a time. There are two kinds of streaming processors, known as pull processors and push processors. The Simple API for XML, or SAX, is a push processor. That means it pushes the data into callback methods that you design. In contrast, the Streaming API for XML, or StAX, is a pull processor, where you loop through the data and only call methods that are meaningful to you.
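The difference between push and pull is easier to see side by side. The sketch below reads the same `<name>` elements twice, once with SAX callbacks and once with a StAX loop. The class name `StreamingSketch`, the sample XML, and both helper methods are assumptions made for this illustration. It uses the StAX implementation bundled with desktop and server Java, so it is not meant for Android.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class StreamingSketch {

    // Hypothetical sample data, used only for this illustration.
    static final String SAMPLE_XML =
        "<customers>"
      + "<customer><name>Ann</name></customer>"
      + "<customer><name>Bob</name></customer>"
      + "</customers>";

    // Push model: the SAX parser drives and calls back into the handler.
    static List<String> namesWithSax(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <name>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return names;
    }

    // Pull model: our loop drives and asks the reader for the next event.
    static List<String> namesWithStax(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "name".equals(reader.getLocalName())) {
                    names.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesWithSax(SAMPLE_XML));
        System.out.println(namesWithStax(SAMPLE_XML));
    }
}
```

Notice that the SAX version has to keep its own state between callbacks to know whether it is inside a `<name>` element, while the StAX version simply skips events it doesn't care about.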
Typically, pull processors give you a more convenient programming model. But both types of streaming processors can be incredibly fast and highly memory efficient. The downside of a streaming processor is that because the complete data set isn't stored in memory at once, you can't do XPath-style searches. Also, the coding can be complex, especially for the Simple API for XML, or SAX. And the SAX API is a read-only API: it knows how to parse XML but not how to create it.
But as you'll see, if you want a streaming API for creating XML, you can use StAX, unless you're working on Android. And there's one other streaming API that's worth mentioning, called the XMLPullParser. This is an API that's been implemented in Android. So if you like the streaming model and you're working in Android, the XMLPullParser is one possibility. The binding processors are similar to DOM in the background. That is, they're tree processors that store all the data in memory at the same time, but the programming model is dramatically different.
To use a binding processor such as JAXB or the Simple XML Serialization framework, you take Java classes, POJOs, and you annotate them, indicating which properties or fields of a Java class are mapped to portions of your XML structure. And then you run very simple code to either serialize or deserialize XML content. The upside of a binding processor is that it's a very efficient programming model and it's very easy to maintain.
And about the only downside is that JAXB, the binding processor that's included with Oracle's JDK, is not available in Android. But there's a binding processor that does work in Android. It's called the Simple XML Serialization framework. That's different from the Simple API for XML, which can be confusing. But it's an open source library that you can add to your Android apps, and it works quite well there. As you've seen so far, the world of working in XML with Java is an alphabet soup of acronyms such as DOM, JDOM, JAXB, and so on.
One of the acronyms you'll see frequently is J-A-X-P, or JAXP. This stands for the Java API for XML Processing, and it's an umbrella term that describes the standards for the XML APIs that are included in Java SE. These include SAX, the Simple API for XML; DOM, the Document Object Model; StAX, the Streaming API for XML; TrAX, the Transformation API for XML; and JAXB, the Java Architecture for XML Binding.
So when you hear the term Java API for XML Processing, or JAXP, you're not referring to a specific programming model. It's the entire set of APIs that are available in Java SE without having to go and get a third-party library. If you're an Android developer, these are the APIs that specifically work in Android. SAX and DOM are included in the Android runtime. The XMLPullParser is also included in the Android SDK, although it's not a part of the Java API for XML Processing.
And finally, third-party libraries that work fine in Android include JDOM, for which you'll need version 2.0.1 or later, and the Simple framework, which also needs a JAR file. The APIs that don't work in Android are JAXB and StAX. But as you can see here, there are alternatives that give you similar styles of programming and similar benefits. And there are other XML APIs for Java developers that I don't cover in this course. These include XOM, which you can find at xom.nu, dom4j, which you can find at dom4j.sourceforge.net, and XStream.
I haven't included these libraries simply to manage the length of the course. I had to make some choices. But there are advantages and disadvantages to these APIs as well, and they're worth checking out. As you decide which API to use for your application, you might want to do some benchmark tests to find out how fast an API will be and how much memory it will use. Don't depend on the benchmark tests that are offered by the vendors or by other developers. Do your own.
Test on a platform that's as similar as possible to what your users will use. If you're building an app for Android, test on a variety of hardware. If you're building in a server environment, use the same server that you'll use in production, on similar hardware. Test with XML content that matches the size and complexity of the XML that you expect to encounter. And for application server environments, such as Java EE servers, test multi-user scenarios.
Make sure you're using Java code in a way that works in the multi-threaded environment that you'll be working in. And finally, do multiple test runs. Don't depend on a single test. Do multiple runs through each of your scenarios and then take the average. There are too many factors that can cause a single test to not be representative of what you'll actually see in production. Through your benchmarks and through your understanding of the relative ease or complexity of the APIs I'll be covering in this course, you'll have plenty of choices.
And you should be able to choose the API that's best for your application.
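As a starting point for the "multiple runs, then take the average" advice above, here is a rough sketch of a hand-rolled timing harness. Everything in it is an assumption for illustration: the class name `BenchmarkSketch`, the generated sample document, and the run counts. It compares only DOM and SAX parsing on one thread, and it measures time but not memory. A dedicated benchmarking harness such as JMH would give more rigorous numbers.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class BenchmarkSketch {

    // Runs the task a few times unmeasured (to let the JIT warm up),
    // then returns the average wall-clock time of the measured runs.
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredRuns;
    }

    // Builds a synthetic document; replace this with XML that matches
    // the size and complexity of your real content.
    static String sampleXml(int items) {
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">value ")
              .append(i).append("</item>");
        }
        return sb.append("</items>").toString();
    }

    static void parseWithDom(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    static void parseWithSax(String xml) {
        try {
            // A DefaultHandler with no overrides ignores every event.
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = sampleXml(10_000);
        System.out.printf("DOM average: %.2f ms%n",
            averageMillis(() -> parseWithDom(xml), 5, 20));
        System.out.printf("SAX average: %.2f ms%n",
            averageMillis(() -> parseWithSax(xml), 5, 20));
    }
}
```

The numbers this prints will vary from machine to machine and from run to run, which is exactly why you should run it on hardware close to what your users have.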
- Choosing a Java-based XML API
- Reading XML as a string
- Comparing streaming and tree-based APIs
- Parsing XML with SAX
- Creating and reading XML with DOM
- Adding data to an XML document with JDOM
- Reading and writing XML with StAX
- Working with JAXB and annotated classes
- Comparing Simple XML Serialization to JAXB