Join David Powers for an in-depth discussion in this video Choosing the right tool to work with XML, part of Up and Running with PHP SimpleXML.
- PHP has so many tools for working with XML that it can be difficult to choose the right one. Six core extensions for manipulating XML are enabled by default. The two main ones are DOM, or Document Object Model, and SimpleXML. Both are capable of reading and writing XML. XML Parser is an updated version of the Simple API for XML inherited from PHP 4. It can only read XML. Unless you need to support legacy code, it's best to avoid XML Parser.
It's been superseded by XMLReader, which is a lightweight class for reading XML. It's particularly useful for handling very large documents. XMLWriter is a memory-efficient class for generating new documents. The final core extension, XSL, is basically a wrapper for Extensible Stylesheet Language Transformations, or XSLT. Using XSL to restructure XML data requires a good knowledge of XSLT syntax.
Let's take a closer look at the two main extensions for manipulating XML. First, DOM. This supports virtually every aspect of the World Wide Web Consortium's Document Object Model. It's a large API consisting of 19 built-in classes. It handles XML documents through the Document Object Model, making it possible to navigate through the document even if you don't know its structure beforehand. You can also edit elements and attributes, adding new ones, or removing or changing the values of existing ones.
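To make the idea of navigating an unfamiliar document concrete, here is a minimal sketch using DOM. The sample XML and the helper name listElements() are hypothetical, chosen only for illustration; the DOM classes and methods are the standard ones.

```php
<?php
// Hypothetical sample document; nothing in the code relies on its element names.
$doc = new DOMDocument();
$doc->loadXML('<root><a><b>text</b></a><c attr="1"/></root>');

// listElements() is an illustrative helper, not part of the DOM extension.
// It walks the tree recursively and records each element name, indented by depth.
function listElements(DOMNode $node, int $depth = 0): array
{
    $lines = [];
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $lines[] = str_repeat('  ', $depth) . $child->nodeName;
            $lines = array_merge($lines, listElements($child, $depth + 1));
        }
    }
    return $lines;
}

$names = listElements($doc);
echo implode(PHP_EOL, $names), PHP_EOL;

// Editing takes several explicit steps: create the node, then attach it.
$new = $doc->createElement('d', 'added');
$new->setAttribute('status', 'new');
$doc->documentElement->appendChild($new);

echo $doc->saveXML();
```

The recursive walk needs no advance knowledge of the structure, while the edit at the end hints at how many separate calls DOM needs for a simple change.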
It's very powerful, but its complexity makes it difficult to use. Because it follows the W3C Document Object Model, it requires a lot of steps to achieve even basic results. SimpleXML, on the other hand, is designed to be, well, simple. It consists of just two built-in classes and three functions. Its principal attraction is that it offers direct access to most elements and attributes by name.
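As a rough illustration of access by name, the sketch below reads a small document with SimpleXML. The library/book structure is a hypothetical example, not taken from the course files.

```php
<?php
// Hypothetical sample document; element and attribute names are illustrative.
$xml = <<<XML
<library>
  <book isbn="111">
    <title>First Title</title>
    <price>9.99</price>
  </book>
  <book isbn="222">
    <title>Second Title</title>
    <price>14.50</price>
  </book>
</library>
XML;

$library = simplexml_load_string($xml);

// Child elements are exposed as properties; attributes use array syntax.
$firstTitle = (string) $library->book[0]->title;
$secondIsbn = (string) $library->book[1]['isbn'];

// Same-named children can be looped over with a plain foreach.
$titles = [];
foreach ($library->book as $book) {
    $titles[] = (string) $book->title;
}

echo $firstTitle, PHP_EOL;
echo $secondIsbn, PHP_EOL;
echo implode(', ', $titles), PHP_EOL;
```

The (string) casts matter because SimpleXML returns objects rather than plain strings.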
Consequently, it's ideal for extracting data from XML documents that have a predictable structure. You can also use it to edit XML documents. It doesn't have the same flexibility as DOM, but you can change values, delete elements, and insert new elements and attributes. And if you need more control, you can import all or part of a SimpleXML element into a DOM object and then export it back to SimpleXML.
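The editing operations and the round trip through DOM might look something like this. It is a sketch under assumed data: the document and the inserted comment are made up for the example, while dom_import_simplexml() and simplexml_import_dom() are the standard conversion functions.

```php
<?php
// Hypothetical starting document.
$library = simplexml_load_string(
    '<library>'
    . '<book id="a"><title>Old</title></book>'
    . '<book id="b"><title>Keep</title></book>'
    . '</library>'
);

// Change a value.
$library->book[0]->title = 'New';

// Insert a new element with an attribute and a child.
$added = $library->addChild('book');
$added->addAttribute('id', 'c');
$added->addChild('title', 'Added');

// Delete the second book (id="b").
unset($library->book[1]);

// Hand the element to DOM for something SimpleXML cannot do directly,
// here inserting a comment node, then convert back.
$domElement = dom_import_simplexml($library);
$comment = $domElement->ownerDocument->createComment('edited');
$domElement->insertBefore($comment, $domElement->firstChild);

$backToSimple = simplexml_import_dom($domElement);
echo $backToSimple->asXML(), PHP_EOL;
```

Both objects wrap the same underlying document, so changes made through DOM are visible when you return to SimpleXML.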
Both DOM and SimpleXML are what are known as tree-based parsers. This means they load the whole XML document into memory to generate a tree structure before they can read or write. This has the advantage that you can navigate through the document in either direction. It also makes it possible not only to read, but also to write to the same document. The downside is that very large documents require a lot of memory, which can adversely affect performance.
If you're dealing with very large XML files, with tens of thousands of lines, it's more efficient to use a streaming parser. Both XML Parser and XMLReader are streaming parsers. They process the document in small chunks, which makes them very fast and memory efficient. The downside is that you can't navigate back to an element once it's been processed, nor can you edit the content. When it comes to making a decision as to which tool to use, it depends on what you want to do.
For both reading and writing, use SimpleXML for basic tasks. But consider DOM for more complex situations. If you only want to read the document, SimpleXML is ideal for documents with a known structure. But if the document is very large, you should consider using XMLReader instead. Unlike SimpleXML, you can't access elements and attributes by name. But XMLReader is fairly easy to use, and it's light on system resources.
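For comparison, a minimal XMLReader sketch is shown below. The items document is a hypothetical stand-in for a much larger file, and it is read from a string only to keep the example self-contained; with a real file you would call open() with a path instead of XML().

```php
<?php
// Hypothetical sample data standing in for a very large document.
$xml = '<items>'
    . '<item id="1">Alpha</item>'
    . '<item id="2">Beta</item>'
    . '<item id="3">Gamma</item>'
    . '</items>';

$reader = new XMLReader();
$reader->XML($xml); // open($path) would stream from a file or URL instead

$found = [];
// read() moves forward one node at a time; there is no way to step back.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        $id = $reader->getAttribute('id');
        $found[] = $id . ': ' . $reader->readString();
    }
}
$reader->close();

echo implode(PHP_EOL, $found), PHP_EOL;
```

Instead of asking for an element by name, you test each node as the cursor reaches it, which is the trade-off for the low memory use.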
If you only want to write XML, both SimpleXML and XMLWriter are capable of generating XML that follows a regular structure. However, XMLWriter does have several advantages over SimpleXML, including the ability to regularly flush the data to a file to conserve memory. It also formats the XML and automatically encodes ampersands and quotes. For more complex structures, use DOM.
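A short XMLWriter sketch along these lines follows. The catalog data is invented for the example, and the output goes to memory so the snippet runs anywhere; to write to a file and flush as you go, you would use openUri() and call flush() periodically instead.

```php
<?php
// Hypothetical rows, including characters that need encoding in XML.
$rows = [
    ['label' => 'The "best"', 'name' => 'Salt & Pepper'],
    ['label' => 'plain',      'name' => 'Sugar'],
];

$writer = new XMLWriter();
$writer->openMemory();      // openUri('file.xml') would target a file instead
$writer->setIndent(true);   // ask XMLWriter to format the output
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('catalog');

foreach ($rows as $row) {
    $writer->startElement('product');
    $writer->writeAttribute('label', $row['label']);
    $writer->writeElement('name', $row['name']);
    $writer->endElement(); // product
}

$writer->endElement(); // catalog
$writer->endDocument();

$output = $writer->outputMemory();
echo $output;
```

The raw strings are passed in untouched; the writer takes care of encoding the ampersand and the quotes.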
If you need to filter the XML data, or present it in a different way, the conventional approach is to use XSL with an XSLT stylesheet. This works fine if you're comfortable with XSLT syntax, but XSLT isn't easy to master. Often you can achieve just the same results using SimpleXML with the Standard PHP Library. By now, I hope you've got the message. Except for very big files or complex XML, SimpleXML can probably do it.
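One possible way to filter and reorder data without XSLT is sketched here. The catalog document is hypothetical, and the particular combination of xpath(), usort(), and the SPL classes ArrayIterator and LimitIterator is an assumption about how you might approach it, not necessarily the method used later in the course.

```php
<?php
// Hypothetical sample document.
$xml = <<<XML
<catalog>
  <product category="tools"><name>Hammer</name><price>12.00</price></product>
  <product category="toys"><name>Kite</name><price>8.50</price></product>
  <product category="tools"><name>Wrench</name><price>7.25</price></product>
  <product category="tools"><name>Saw</name><price>15.75</price></product>
</catalog>
XML;

$catalog = simplexml_load_string($xml);

// Filter: keep only products in the "tools" category.
$tools = $catalog->xpath('/catalog/product[@category="tools"]');

// Reorder: sort the matches by price, cheapest first.
usort($tools, function (SimpleXMLElement $a, SimpleXMLElement $b) {
    return (float) (string) $a->price <=> (float) (string) $b->price;
});

// Present: take just the first two results using SPL iterators.
$cheapest = [];
foreach (new LimitIterator(new ArrayIterator($tools), 0, 2) as $product) {
    $cheapest[] = (string) $product->name;
}

echo implode(', ', $cheapest), PHP_EOL;
```

Everything here is ordinary PHP, so there is no separate stylesheet language to learn.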
Trying SimpleXML first will usually save time and trouble.
- Loading an XML document
- Converting values to strings
- Handling errors
- Working with XML namespaces
- Using XPath
- Consuming an RSS feed with XML
- Adding and editing XML elements and attributes