There are many ways to read XML files in Java, and many programming interfaces and packages are available. They are known by their acronyms: there's JDOM, StAX, and SAX. But the most venerable, the one that's been around the longest, is known as DOM, or Document Object Model. The Document Object Model approach to reading XML isn't the most fun or the most concise, but it's the most deeply embedded in the Java libraries, and so that's the version I'm going to show you here. I'm not going to show you everything about DOM; that would require a whole separate course. But I'll introduce you to some of the most critical classes that are available.
As with the classes that I've shown you for reading files from the local file system and from the network, these are classes that are available in the core Java class library. As with the previous video, I'm going to be working with this XML feed at services.explorecalifornia.org/rss/tours.php. The code I'll show you will work with any RSS feed, because the structure of an RSS file is standardized. So let's start with this empty class. The first step in reading an XML file is retrieving it, and when you're working in the Document Object Model world, your first goal is to create a Document object.
It takes a few classes to get there. I'm working in an empty project named parsexml, in its class ReadXML and its empty main method. The first step is to create an instance of a class called DocumentBuilderFactory. I'll type DocumentBuild and press Ctrl + Spacebar, and I'll choose the DocumentBuilderFactory class from javax.xml.parsers. I'll name the variable factory. Now this code is going to get a little bit wide, so I'll maximize my editor, and then I'll press Enter or Return and go to the next line. DocumentBuilderFactory is a class that you instantiate by calling its static method newInstance.
So I'll call DocumentBuilderFactory.newInstance(). Now I have an instance of that class, and I'm ready to go on to the next step: creating an instance of a class called DocumentBuilder. I'll type in the name of the class and choose it from javax.xml.parsers, the same package as the factory class. I'll name this one builder, and I'll get its reference from factory.newDocumentBuilder(). Now I'm ready to create my XML document object. The data type for this will be Document. I'll type in the name of the class and press Ctrl + Spacebar, and I'll make sure that I'm choosing the Document class from org.w3c.dom.
When you call the parse method, it takes care of downloading the file, parsing the document, and turning it into a hierarchical set of objects that you can traverse using Document Object Model programming. Once you have the document in memory, the next step is to get the data. You could walk down the XML tree one level at a time, but there's a really great convenience method that's a part of the Document class, named getElementsByTagName. If you know the tags, that is, the element names in your XML file, then you know how to get the data out.
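The steps up to this point (factory, then builder, then document) can be sketched as follows. This is a minimal sketch, not the exact code from the video: the class name BuildDocumentSketch, the parse helper, and the SAMPLE_RSS string are hypothetical, and the sample is parsed from memory so that the code runs without a network connection. The checked exceptions are wrapped in an unchecked one here only to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class BuildDocumentSketch {

    // Hypothetical stand-in for the RSS feed, so no download is needed.
    static final String SAMPLE_RSS =
        "<rss version=\"2.0\"><channel>"
        + "<item><title>First tour</title></item>"
        + "<item><title>Second tour</title></item>"
        + "</channel></rss>";

    // Factory -> builder -> document, in the order described in the walkthrough.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped only to keep this sketch compact.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_RSS);
        // The root element of an RSS 2.0 document is <rss>.
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "rss"
    }
}
```

To read a live feed instead, DocumentBuilder also has a parse(String uri) overload that accepts a URI string. For the feed used here that would be something like builder.parse("http://services.explorecalifornia.org/rss/tours.php"), where the http:// scheme is an assumption.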
So I'm going to create an object which is data typed as NodeList. In DOM, a node list is kind of like an array: it's an ordered collection of objects. It has its own API, which is a little bit different from an array, as you'll see in a moment. I'll type in NodeList, press Ctrl + Spacebar, and choose NodeList from org.w3c.dom. I'll name the variable list, and I'll populate it using the syntax doc.getElementsByTagName, passing in the name of the element I'm looking for, title.
In RSS, each news item is an element named item, and each item element has a sub-element named title. For this exercise, I'm only interested in the titles of the news articles, so that's what I'm retrieving. And now let's find out how many items we got back. I'll output a string, "There are", and then here's a way in which the NodeList class is different from an array. With an array, you determine the size of the array with its length property.
With the node list, you call a getLength method. It looks like this: list.getLength(). I pressed Ctrl + Spacebar to auto-complete the code there. Then I'll append the string "items". I'll save my changes and see if there are any errors. I see one right here that I need to fix. It tells me that there's a syntax error here, and I just missed my plus operator. Here we go. I'll save that change and see that there are errors from unhandled exceptions. As I've done in earlier exercises, I'll handle that by selecting the code and surrounding it with a try-catch block.
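Here is roughly what the code looks like at this stage, as a sketch under a few assumptions: the class name CountTitlesSketch, the countElements helper, and the inline SAMPLE_RSS string are hypothetical, and the sample is parsed from a string rather than downloaded. The three catch blocks cover the checked exceptions thrown by newDocumentBuilder() (ParserConfigurationException) and by parse() (SAXException and IOException).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class CountTitlesSketch {

    // Hypothetical two-item feed used in place of the real download.
    static final String SAMPLE_RSS =
        "<rss version=\"2.0\"><channel>"
        + "<item><title>First tour</title></item>"
        + "<item><title>Second tour</title></item>"
        + "</channel></rss>";

    // Returns how many elements with the given tag name the document contains,
    // or -1 if the XML could not be parsed.
    static int countElements(String xml, String tagName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList list = doc.getElementsByTagName(tagName);
            // NodeList exposes getLength(), not an array-style length field.
            return list.getLength();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("There are " + countElements(SAMPLE_RSS, "title") + " items");
        // prints "There are 2 items"
    }
}
```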
There are three possible exceptions, and Eclipse generates a catch block for each of them. I'll get rid of the TODO comments to shorten the code, and then I'm ready to save and run the application. And I'm told that there are 27 items. So now, the next step is to get the text value of each title element. In the Document Object Model, each part of the XML file is seen as something called a node, and each node is of a particular type. So for example, let's say that you had an XML node that looked like this: starting with title and ending with title, and then between those tags, some text.
In the Document Object Model, that's actually two separate nodes, one parent and one child. Title is an element node and the text within it is a text node, and if you know that, then you know how to get the text out of the title. I'm going to create a loop. I'll use a for loop to loop through the node list contents. I'll choose "iterate over array", and I'm going to keep on looping as long as my counter variable is less than the length of the node list.
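The parent and child relationship just described can be checked directly. In this small sketch, the class name NodeTypesSketch, the parse helper, and the title text "Big Sur Retreat" are all hypothetical; it parses a single title element and inspects both nodes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NodeTypesSketch {

    // Parses an XML string into a DOM Document (checked exceptions wrapped for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // A made-up title element, used only for illustration.
        Document doc = parse("<title>Big Sur Retreat</title>");

        Node title = doc.getDocumentElement(); // the element node
        Node text = title.getFirstChild();     // its child, the text node

        System.out.println(title.getNodeType() == Node.ELEMENT_NODE); // prints "true"
        System.out.println(text.getNodeType() == Node.TEXT_NODE);     // prints "true"

        // An element node has no node value of its own; the text lives in the child.
        System.out.println(title.getNodeValue()); // prints "null"
        System.out.println(text.getNodeValue());  // prints "Big Sur Retreat"
    }
}
```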
So I'll change args.length to list.getLength(). Within the for loop, I'll get the item in the list. I'm going to data type the item that I'm retrieving as an Element. Once again, this is a type from the DOM API. I'll name the object item, and I'll set its value using casting syntax. I'm going to be retrieving an item from the list. The node list will return it as a Node, the supertype, but I know it's an Element, the subtype.
I'll retrieve it by calling the item method and passing in the i counter variable. Finally, I'll output the value within the element. To do this, I'm going to walk down the XML tree from the element node to its child node, the text node. I know it's the first child of the element, because these title elements contain nothing but text. So I'll use System.out.println, and I'll output item.getFirstChild(), and then from there I'll call a method called getNodeValue.
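Put together, the loop described above might look like the following sketch. The class name ListTitlesSketch, the titles helper, and the SAMPLE_RSS string are hypothetical, and the titles are collected into a list before printing so that the result is easy to inspect.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ListTitlesSketch {

    // Hypothetical two-item feed used in place of the real download.
    static final String SAMPLE_RSS =
        "<rss version=\"2.0\"><channel>"
        + "<item><title>First tour</title></item>"
        + "<item><title>Second tour</title></item>"
        + "</channel></rss>";

    // Collects the text of every <title> element, in document order.
    static List<String> titles(String xml) {
        List<String> result = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList list = doc.getElementsByTagName("title");
            for (int i = 0; i < list.getLength(); i++) {
                // item(i) returns a Node; each match here is known to be an Element.
                Element item = (Element) list.item(i);
                // Step down to the text node and read its string value.
                result.add(item.getFirstChild().getNodeValue());
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String title : titles(SAMPLE_RSS)) {
            System.out.println(title);
        }
        // prints "First tour" and then "Second tour"
    }
}
```

One caveat: getFirstChild() returns null for an empty element such as <title/>, so a feed that might contain empty titles would need a null check before the call to getNodeValue().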
When you're working with a text node, the getNodeValue method returns the string value. I'll save and run the application, and there is the result: a listing of all the titles, from all the items within the RSS feed. Now if that seems like a lot of code just to get titles, you're right. It's more code than is really needed, and if you switched over to one of the third-party libraries for working with XML in Java, such as JDOM, you'd find your code was significantly smaller and easier to work with. But it's important to know what's available in the Java class library.
When you're working with XML in the class library that's a part of the SDK, the Document Object Model is always available. It works in all Java development environments: console applications, web applications, and mobile applications for Android and BlackBerry, because as always, it's just Java. There is obviously a lot to learn just about working with the Document Object Model, but you can choose between that approach and the other libraries that are available in the Java community.