Real-World XML

with Joe Marini

 


XML technologies offer web developers and designers more flexibility than ever before. In Real-World XML, industry expert Joe Marini covers the best programming practices with XML, including the tools needed to build effective XML structures. He demonstrates the implementation of XML formats, how these formats work in real-world situations, and how they can facilitate project planning and development. Exercise files accompany the course.

XML Essential Training is a prerequisite for getting the most out of this course.
Topics include:
  • Understanding the Sitemap index format
  • Integrating XML and design
  • Using XML effectively in Firefox and Internet Explorer
  • Avoiding common design mistakes
  • Understanding and implementing DOM algorithms
  • Building an XML tag set
  • Using XML with RSS and Atom
  • Processing XML data with XSLT


author
Joe Marini
subject
Developer, Web, Programming Languages
software
XML
level
Intermediate
duration
3h 34m
released
Apr 14, 2009

Introduction
Welcome
00:00With all the data we have to deal with as developers and designers, the world
00:04needs a way to transmit it, store it and describe it. Well, I have a 3-word
00:09solution for you: Extensible Markup Language.
00:12(Music playing.)
00:15Okay, I'll make it three letters: XML. I'm Joe Marini and this is Real-World XML.
00:21I have spent the better part of my career working in the web and graphics
00:24industries developing applications like Dreamweaver, Expression and
00:28QuarkXPress. During this time, I have seen XML become more and more important
00:33for information exchange on the web.
00:35Now in this title, I'll show you how XML is used in the real world from common
00:39formats like RSS and Atom to technologies for processing XML data like XSLT.
00:45I'll also show you how you can build your own XML tag set and review some XML
00:49design and development techniques I have learned over the years.
00:52I will be showing you these tools and techniques with an eye to making them
00:55applicable to your development needs, so you can use what you have learned here
00:59in your work environment. Now let's get started with Real-World XML.
Using the exercise files
00:00If you are a premium member of the lynda.com Online Training Library, or if
00:05you are watching this tutorial on a disk, then you have access to the Exercise Files
00:09used throughout this title and I have laid out the Exercise Files in this
00:13folder format here where the chapter number corresponds to the folder where the
00:18corresponding example files are located.
00:21Typically what I do is I provide, in this case, sample files but for chapters
00:27in which we provide exercises, you can see that I have laid out the files using
00:30a _start and _finished format. So if you want to follow along with me during
00:36the lesson, open up the file name that has the _start in it and type along with me.
00:41Or if you want to use the finished version to just jump ahead and
00:46see how things are done, you can just open that file up in your editor and try
00:49things out yourself.
00:50Now if you are a monthly or annual subscriber to lynda.com, you don't have
00:55access to the Exercise Files, but you can follow along with me on the screen
00:58and just pause the movie to type in the code as you see me typing it.
01:03In most of the examples I scroll through the document so you can see all the code. So,
01:06you can just go ahead and pause the movie and type the code in.
01:09All right, let's get started!
Tools for working with XML
00:00XML is essentially just a text format so you can use pretty much any text
00:03editing tool to work with it. However, there are some good tools for both
00:07Windows and Macintosh that you can use that provide some additional features.
00:11Now, the tool that I have been using in this course is Visual Web Developer,
00:15Express Edition and that's a free tool from Microsoft and I have provided the
00:19URL for you to download it if you want to try it out yourself on Windows.
00:22For the example on which we edit XML and integrate it into our website, I was using
00:28Microsoft Expression Web. That's a professional level product. It's also
00:31available from Microsoft, but you can also use programs like Adobe Dreamweaver
00:36or whatever other professional editor you have.
00:38On the Macintosh side, there are programs like BBEdit and TextMate. Those are
00:43really good editors and products like WebScript and of course Dreamweaver works
00:46on the Mac as well. The important thing to remember is that XML is just text,
00:50so you can use any text editor for it. However, text editors that have features
00:54like automatic indenting and syntax coloring and IntelliSense are going to be
01:00a lot more useful than just straight text editors without those features.
01:04Whichever tool you decide to use though, just make sure it's comfortable.
01:06That should be all you need to know, so let's go ahead and get started.
1. The XML Landscape
Reviewing XML
00:00Let's begin by taking a quick review of XML, what it is and what it looks like.
00:05Now if you haven't seen XML before or if you are new to the subject, then I suggest
00:10you take a look at another lynda.com title that's available in the Training Library
00:15called XML Essential Training and I do that title as well.
00:19It's really a foundational title. So if you are new to this and you haven't
00:23seen it before, I highly suggest you go check that title out first because that
00:28will provide the foundational knowledge that you will need to get through
00:31the rest of this title.
00:32In this course, we'll be using concepts that are introduced in XML Essential Training.
00:37So you are going to want to make sure that you have those concepts
00:41and ideas under your belt before you tackle a title like this one.
00:45XML is the Extensible Markup Language. It's tag-based like HTML is. So if you
00:51are familiar with HTML code, then XML will look very familiar to you. XML is
00:56used to describe data and the structure of that data. Because XML is extensible,
01:03you get to make your own tags up.
01:06The benefits of XML are numerous. I have called out a few here. First,
01:10XML allows you to separate the content of a document from how it's presented.
01:15XML does not contain by itself any notion of how data should be presented to the
01:20person reading it or consuming it.
01:22You can also create tag sets that target specific problems. In fact, we'll take
01:26a look at how to do that in this course. XML stores information in a way that
01:30people can easily understand. So even though XML may be intended to be consumed
01:35by another computer system, or a machine, it's still in a format that a person
01:40can read and understand with some time and depending on how large the XML file is.
01:45XML allows you to exchange data among disparate systems using technologies, for
01:50example, like web services. So two systems that may never have been designed to
01:55talk to each other can use XML to exchange data among them.
01:59Finally, XML is an open format and it's text based. So it can be processed by
02:04any program that happens to be aware of XML. Now that's not to say that XML
02:09doesn't have some drawbacks. XML, for example, is not good for storing
02:14large amounts of data.
02:15In some cases, performance can also be slower than other methods of storing and
02:20retrieving data, so binary format, for example. If you were to store say a
02:24Photoshop file as an XML file, that file would probably be a lot less efficient
02:29than the binary format that you could store the image in. XML might not be
02:34the best format for representing certain kinds of data, audio, video, that kind of binary stuff.
02:40Finally, some parts of XML, like namespaces, are kind of difficult to
02:44understand and hard to work with. Now XML documents must be what's known as
02:49well-formed. They always have a single root tag, just like HTML does, and tags
02:55have to be properly nested.
02:57In other words, you have to have an A tag completely inside of a B tag or to
03:02take an HTML example, you can't have a tag that's like bold and then italic and
03:07then close bold, and then close italic. The XML parser won't let you get away with that.
03:11Unlike regular HTML, empty tags always have to end with a slash inside the
03:17closing angle bracket, just like in XHTML for example. Attributes have to be
03:21inside quotes and they can't be minimized. If you are using XHTML,
03:24you're already familiar with these concepts.
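The well-formedness rules above can be tried out with any XML parser. What follows is a minimal sketch using Python's standard library rather than the tools shown in the course; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Properly nested tags under a single root: accepted.
print(is_well_formed("<p><b><i>hello</i></b></p>"))
# Overlapping tags (bold closed before italic): rejected.
print(is_well_formed("<p><b><i>hello</b></i></p>"))
# Empty tag written with the slash before the closing bracket: accepted.
print(is_well_formed("<p><br/></p>"))
# Empty tag left open, HTML style: rejected.
print(is_well_formed("<p><br></p>"))
# Attribute value without quotes: rejected.
print(is_well_formed("<phone type=mobile>555-0100</phone>"))
# Two root tags in one document: rejected.
print(is_well_formed("<a/><b/>"))
```

Note that this only checks well-formedness. Validating against a schema or DTD is a separate step that this sketch does not attempt.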
03:26XML documents can be what's known as valid. In other words, you can take an XML
03:32document and validate it against a schema or Document Type Definition to make
03:38sure it conforms to certain rules. This is a sample XML file. In fact, we'll be
03:44using this sample XML file a couple of times in this course.
03:47Now looking at the sample XML file, you can see that I have defined a few tags,
03:51like BusinessCard and name. So this is an XML file that represents the contact data
03:56you might find on a business card. Here we have some phone numbers and an
04:00e-mail address. The phones have certain attributes.
04:03So you can see that XML can be used to mark-up tag sets that solve a particular
04:09type of problem. In this case, we needed a way to represent contact data on a
04:14business card. Later on in the course, we'll see how to do something like this
04:17in a real life web setting.
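The sample file itself is not reproduced in this transcript, so here is a rough sketch of what a business card document along those lines might look like, read with Python's standard library. Only the BusinessCard and name tags are mentioned in the course; the phone and email tag names, the type attribute, and every value below are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical business card document. Only BusinessCard and name come
# from the course; the other tags and all of the values are invented.
CARD_XML = """<?xml version="1.0"?>
<BusinessCard>
  <name>Jane Doe</name>
  <phone type="mobile">555-0100</phone>
  <phone type="work">555-0101</phone>
  <email>jane@example.com</email>
</BusinessCard>"""

def read_card(text):
    """Pull the contact fields out of the card into a plain dictionary."""
    root = ET.fromstring(text)
    return {
        "name": root.findtext("name"),
        # Each phone tag carries a type attribute describing the number.
        "phones": {p.get("type"): p.text for p in root.findall("phone")},
        "email": root.findtext("email"),
    }

card = read_card(CARD_XML)
print(card["name"])
print(card["phones"])
```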
04:19Okay, so that's a quick review of XML. Again, if you are new to all this,
04:24I highly encourage you to go check out the XML Essential Training title that's
04:28also available at lynda.com before continuing further.
Understanding XML usage today
00:00XML is used in a number of different real-world settings today, but you can
00:05break down how XML is used into three main categories. Let's cover those now.
00:09In the data extraction usage, you are taking XML and using it to represent some
00:14type of data format.
00:16Now most modern databases can provide data in XML format today. All the large ones,
00:21like Oracle and Microsoft and MySQL, IBM for example. They can all export
00:28data using XML. The modern browsers can also load XML from different data
00:33sources. You can provide a URL or from the local file system. We'll be seeing
00:39an example of that later in this course.
00:41There are also technologies like XPath and XQuery that are used for querying
00:46XML data in XML documents. XPath is a very lightweight form of querying and
00:51it takes a syntax that looks a little bit like Directory Paths that you might be familiar with.
00:55XQuery is a bit more complex. XQuery is to XML what SQL is to structured
01:03relational data. We are not going to cover that in this course because it's
01:07fairly advanced and could easily fill a title all on its own.
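To give a feel for the directory-like XPath syntax, here is a small sketch using the limited XPath subset built into Python's standard library. The catalog document and its tag names are invented for this example and do not come from the course files.

```python
import xml.etree.ElementTree as ET

# Made-up catalog document used only to demonstrate path expressions.
CATALOG_XML = """<catalog>
  <section name="books">
    <item id="b1"><title>XML Basics</title></item>
    <item id="b2"><title>XSLT Recipes</title></item>
  </section>
  <section name="music">
    <item id="m1"><title>Blue Notes</title></item>
  </section>
</catalog>"""

root = ET.fromstring(CATALOG_XML)

# Steps are separated by slashes, much like folders in a directory path.
# The bracketed part filters sections by the value of their name attribute.
book_titles = [t.text for t in root.findall("./section[@name='books']/item/title")]

# A double slash searches at any depth below the starting point.
all_ids = [i.get("id") for i in root.findall(".//item")]

print(book_titles)
print(all_ids)
```

ElementTree supports only part of XPath, so more elaborate expressions would need a fuller XPath implementation.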
01:10In the data preparation and processing area of usage, you take the XML data
01:15you have been given and prepare it for presentation and process it further.
01:19So for example, if you have an XML file that represents a series of products or items
01:26and these items might have prices, you might do some data preparation or
01:31processing to run through all the tags and add up all the prices to arrive at a
01:35total for example, or count the number of items in a file for some reason.
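The price example just described can be sketched in a few lines. This is an illustration under assumed names: the order, item, and price tags and the amounts are invented, and Python stands in for whatever processing tool you actually use.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical order document; tag names and prices are made up.
ORDER_XML = """<order>
  <item><name>Widget</name><price>9.99</price></item>
  <item><name>Gadget</name><price>24.50</price></item>
  <item><name>Gizmo</name><price>5.01</price></item>
</order>"""

def summarize(text):
    """Return the number of priced items and the sum of their prices."""
    root = ET.fromstring(text)
    # Decimal avoids the rounding surprises of binary floats with money.
    prices = [Decimal(p.text) for p in root.iter("price")]
    return len(prices), sum(prices, Decimal("0"))

count, total = summarize(ORDER_XML)
print(count, total)
```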
01:39Technologies for doing this are XML Schema, which ensures that XML data in a
01:44document conforms to certain rules. So for example, a certain tag of a certain
01:50type has to be inside another tag of a certain type, or a tag that indicates a
01:55price has to contain only numbers and a period for the decimal place and so on.
02:01The XSLT technology, which stands for Extensible Stylesheet Language Transformations,
02:06is used to transform XML into other syntaxes like ASCII or PDF or HTML or more XML.
02:15DOM and SAX are two different programming methods used for scripting XML.
02:21In this course, we'll use mostly the DOM because the browsers don't use SAX.
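To make the difference between the two models concrete, here is a short sketch that reads the same document both ways. The course itself works with the DOM in the browser; this uses Python's standard library as a stand-in, and the items document and the ItemHandler class are invented for illustration.

```python
import xml.dom.minidom
import xml.sax

DOC = "<items><item>apple</item><item>pear</item></items>"

# DOM: the whole document is loaded into a tree that you can walk.
dom = xml.dom.minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("item")]

# SAX: the parser streams through the document and calls back on each event.
class ItemHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect it until the tag ends.
        if self._in_item:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buffer))
            self._in_item = False

handler = ItemHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_items = handler.items

print(dom_items)
print(sax_items)
```

The DOM version is shorter because the tree is already there to query. The SAX version never holds the whole document in memory, which matters more as files get larger.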
02:26For data presentation, you can use a combination of CSS, XSLT or DOM scripting
02:32in order to present data to the user. These are not mutually exclusive. You can
02:37use a combination of any of these three. During this course we'll do this a few times.
02:42Okay, let's take a look at the XML landscape as it's currently today.
02:46In the data storage and exchange side, we have some established standards, like XHTML
02:52and RSS and SVG. We'll cover RSS a little bit later in this course. RSS is
02:58essentially a way of syndicating content that changes over time.
03:03You're probably familiar with RSS from reading blogs, for example.
03:06SVG stands for Scalable Vector Graphics. That's an XML syntax that describes
03:12drawings made using vectors, such as Illustrator files. There are a number of
03:18emerging standards however. Atom, for example, is a standard for publishing
03:23syndicated content just like RSS is, only it's slightly richer.
03:26Then there are standards like RDF and XForms, which we won't get into in this course,
03:30but which solve their own types of business problems. XForms, for example, is
03:34a way of processing forms on the web. RDF stands for Resource Description Framework.
03:40Then there's XHTML 5, which is an emerging standard that aims to standardize
03:45the way that web applications are built. On the data processing side, there's
03:49DTD and Schema, which we talked about earlier and which enforce rules.
03:53There's DOM and SAX, which we also mentioned. Then there are technologies like
03:57XMLHttpRequest, which is the foundation of AJAX, and XSLT and XPath, which
04:04we have covered earlier.
04:05We also talked a little bit about XML Query for querying XML data and then
04:10there's the XLink and XPointer specifications, which are also emerging. Again,
04:15we won't cover those in this course because they are fairly advanced, but the idea
04:19is that these provide more advanced ways of linking XML documents together far
04:23beyond what the standard HTML link gives us today.
04:26Okay, so with this in mind, let's take a look at some important XML technologies.
Important XML technologies
00:00The first technology we'll look at and talk about is XPath. XPath stands for
00:04XML Path Language, and it's used to extract data from inside an
00:08XML file. It uses a path-like syntax similar to directory or folder paths.
00:14If you are not familiar with XPath, we cover it a little bit in the XML Essential
00:18Training title. So you might want to refer to that title to get familiar with it.
00:21XSLT is also another XML based language for defining style sheets.
00:27As I mentioned earlier, it's a styling language that takes an XML file and
00:31transforms it into something else, like HTML or PDF or some other file format.
00:37We talked a little bit about SAX and DOM. These are methods of processing data
00:41and Schema. Schema is a way of expressing rules for a given XML syntax.
00:46Now you may be already familiar with Document Type Definitions. You can think of Schema
00:50as the next step beyond DTDs. They define things like what tags are or are not
00:56allowed and where they can go, what kinds of data they contain, so on and so forth.
01:00In this title we'll take a look at some formats like RSS, which stands for
01:05Really Simple Syndication. This provides data in discrete chunks that can be
01:10read individually. You have probably seen this in blogs or news sites or other
01:15syndicated content. If you own a TiVo, for example, the TiVo actually makes the
01:19items that are recorded on your TiVo available as an RSS file.
01:24Atom is another format for syndicating content and, like RSS, it provides
01:28content in syndicated form, only in a richer fashion. It was adopted back in 2005 and we'll
01:33take a deeper look at the Atom format in this course.
01:35There is also better support for XML built into the browsers.
01:40Modern browsers, like Internet Explorer 7 and higher and Firefox 3 and higher,
01:45provide really good support for technologies like the DOM and XPath and XSLT
01:50and some newer things like serialization and parsing. We'll get into that later
01:55when we get to the chapter on XML and the browsers.
01:58Okay, so now that we have seen what important technologies there are in the XML
02:02world today and we have seen the XML landscape and how XML is used, let's get
02:07started and take a look at some real-world XML formats.
2. Real-World XML Formats
Understanding the Sitemap and Sitemap index formats
00:00Before we jump in and start designing our own XML format, I thought it would be
00:04instructive to take a look at some of the real-world XML formats that are in
00:08use today. We'll start out by looking at the Sitemap and the Sitemap Index formats.
00:15These formats provide a way for web masters to inform search engines about the
00:20contents of their sites that are available for searching or crawling by the
00:26search engines. The Sitemap and Sitemap index currently enjoy pretty wide
00:31support. They are supported by Google and Yahoo and Microsoft search engines,
00:35which pretty much constitute the bulk of the search engine traffic that's out
00:40there today.
00:41Now I want to point out that Sitemap and Sitemap Index don't affect the way
00:45that your sites appear or are ranked in the search engines. The whole point of
00:52these file formats is to tell the search engines how they can crawl your site
00:57more intelligently. This is not about Search Engine Optimization or anything like that.
01:03Each Sitemap is an XML file and that XML file lists information about each URL
01:10that is available on your site. It lists information like when it was last
01:15updated, and how often it changes, and so on and so forth. Now as I said, this
01:20does not guarantee that pages are going to be included in search results or
01:26that it's in any way going to affect how your page gets ranked. The whole idea
01:31here is that this is a way for your site to inform the search engines about the
01:37structure of your site, how they should search the site, that kind of thing.
01:41You can find out more information about the Sitemap and the Sitemap Index
01:46formats at the URL that you see here, www.sitemaps.org. Okay, so each Sitemap
01:54file contains a collection of tags that define the URLs that the search engines
02:01should care most about.
02:03Now Sitemap files are limited to 10 Megabytes in size. So if you have to use
02:09more than one Sitemap file, then Sitemap index files are used to group multiple
02:14Sitemap files together. You can imagine for websites that have a lot of URLs,
02:20such as say a large catalog shopping site, they want to index all of the URLs
02:26that are available. That can easily exceed 10 Megabytes in size pretty quickly.
02:30So the Sitemap index file is how you group multiple Sitemaps together.
02:35Ideally, you place these files at the root of your website and you then either
02:40include them in a robots.txt file or you submit the site directly to the search
02:46engines in order to let them know that these files exist and the sitemaps.org
02:52URL that I listed earlier has more detailed information on how to do this.
02:56These are all only just hints. The search engines don't use this to affect your
03:01site's search rankings.
03:02Let's take a look at the tags available in the Sitemap file. Each Sitemap file
03:09has a set of tags, some of them are required and some are not. This table lists
03:14all of the tags that are in the Sitemap file format. So you can see there are
03:19six tags. So it's a pretty compact, pretty focused file format that does one
03:24job and does it well.
03:26The urlset tag, the one at the top here, it's required. It encapsulates the
03:32file and it references the current protocol standard. So this basically serves
03:36as the root tag in any of the Sitemap files. Urlset tags contain one or more
03:44URL tags. This is the parent tag for each URL entry. All the other tags in this
03:51list are child tags of this url tag. As you can see, it's also required.
03:58Inside each url tag, there's one required tag and that's the loc tag right
04:04here. The loc tag stands for location and it lists the URL of the page. The URL
04:11has to begin with the protocol like HTTP. If your web server requires it, then
04:16it has to end with the trailing slash on the URL. Some web servers require it
04:22and some don't. The whole idea though is that these URLs are going to be used
04:27by the search engines to crawl your site. So if your web server requires it,
04:31then you have to include them in these tags as well.
04:33The rest of the tags are optional. The lastmod tag indicates using a date
04:40format when this URL was last modified. Now this date should be in the W3C
04:46Datetime format which you can look up on the W3.org website. If you want,
04:51you can just omit the time portion and use the format of a four-character year,
04:57followed by a two-digit month and a two-digit day.
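As a quick illustration of the two lastmod forms just described, here is a sketch that formats a date and a full timestamp in Python. The function names are made up for this example, and the dates are arbitrary.

```python
from datetime import date, datetime, timezone

def lastmod_date(d):
    """Date-only form: four-digit year, two-digit month, two-digit day."""
    return d.strftime("%Y-%m-%d")

def lastmod_datetime(dt):
    """Full form with a time portion and a time zone offset."""
    # Drop microseconds so the value stays in the plain seconds form.
    return dt.replace(microsecond=0).isoformat()

print(lastmod_date(date(2009, 4, 14)))
print(lastmod_datetime(datetime(2009, 4, 14, 8, 30, tzinfo=timezone.utc)))
```

The full form assumes a timezone-aware datetime; a naive one would come out without the offset at the end.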
05:00The next tag, changefreq, indicates the frequency that the page changes. It
05:06provides basically general information to search engines. Now this may or may
05:11not correlate exactly to how often they crawl over the page. Remember, this
05:15file's purpose in life is to provide hints to the search engines, they don't
05:19necessarily denote solid rules that the engines have to follow.
05:23So you can put in values for this tag, either always or hourly, daily, weekly,
05:30monthly, yearly and never. So if you place always in this tag, it means that
05:36the page is always changing, it's dynamic and it needs to be searched each and
05:41every time as if it were a new page. The never value, you should only use that
05:46in cases of pages that have been archived and don't need to be searched
05:51anymore. Ironically enough, that may or may not mean that search engines honor
05:56that value. They may choose to search pages listed as never anyway just in case
06:01there are unexpected changes to those pages. Again, these are hints.
06:05Then finally, there's the priority and that's also optional. This indicates the
06:10priority of this particular URL relative to the other URLs on your site.
06:17You can place values from 0, meaning least important, up to 1.0, which means most important.
06:24The default priority, if you don't specify this, is going to be 0.5. Meaning
06:28it's kind of a middle priority. Now this priority again does not affect how
06:33your page gets listed in search engine rankings. It just indicates how
06:38important the file is relative to the rest of the ones in your site.
06:42So this is what a sample Sitemap looks like. You can see at the top, there's
06:46the XML declaration. In XML version 1.0, this is optional but it's always a
06:51good idea to declare it anyway. In 1.1, this became mandatory but in XML 1.0
06:57the XML version is not needed but I always like to put it in because it's proper XML.
07:02You can see here, here is the urlset at the top of the page. It references its
07:06namespace in case we wanted to include this in another file, we wouldn't have
07:10name collisions. Then inside the urlset, you have a collection of URL tags.
07:15You can see that each one of these guys has a location tag but not all of them
07:20have, for example, priority or last modification. It turns out that each one of
07:25them has a change frequency but again, those are optional as well.
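The slide with the sample Sitemap is not part of this transcript, so here is a rough sketch that generates a comparable file in Python. The build_sitemap helper, the example.com URLs, and the dates are assumptions for illustration; the tag names, the allowed changefreq values, and the priority range follow the description above, and the namespace is the one published at sitemaps.org.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly",
                     "monthly", "yearly", "never"}

def build_sitemap(pages):
    """Build a urlset document from dicts; only 'loc' is required."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        if "lastmod" in page:
            ET.SubElement(url, "lastmod").text = page["lastmod"]
        if "changefreq" in page:
            if page["changefreq"] not in CHANGEFREQ_VALUES:
                raise ValueError("unknown changefreq: " + page["changefreq"])
            ET.SubElement(url, "changefreq").text = page["changefreq"]
        if "priority" in page:
            if not 0.0 <= page["priority"] <= 1.0:
                raise ValueError("priority must be between 0.0 and 1.0")
            ET.SubElement(url, "priority").text = "%.1f" % page["priority"]
    body = ET.tostring(urlset, encoding="unicode")
    # The declaration is optional in XML 1.0 but included here anyway.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical pages: the first uses every tag, the second only some.
sitemap_xml = build_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2009-04-14",
     "changefreq": "daily", "priority": 0.8},
    {"loc": "http://www.example.com/archive/", "changefreq": "never"},
])
print(sitemap_xml)
```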
07:28So this is a finished and complete sample Sitemap. You can see it's focused
07:33on one job. Its whole job in life is to tell search engines how often and which
07:38URLs they should crawl on your site. Okay, so moving along looking at the
07:43Sitemap index tags.
07:45Now Sitemap index files are even more compact. That's because their only
07:49purpose in life is to group together multiple Sitemap files, in the case that
07:54you build Sitemap files that are larger than 10 Megabytes, you have to break
07:57them down into smaller parts and then group them together using a Sitemap index.
08:02So all but one of these tags are required. The sitemapindex tag is required and
08:08it's the root tag of the document. The sitemap tag is also required. These go
08:12inside the sitemapindex root tag and there can be one or more of these. Each
08:17sitemap tag essentially encloses the location and lastmod tags about each
08:24Sitemap file. The location or loc tag indicates the URL of the Sitemap that it
08:31points to and lastmod is the time that the corresponding Sitemap file was last modified.
08:37It does not correspond to the time that any of the pages in that Sitemap were
08:41changed. It's the file itself. Again, this should be kept in W3C style Datetime
08:47format. Here we have a sample Sitemap index. So you can see that in this case
08:52we have a sitemapindex. This is the root and here is its namespace declaration.
08:57This sitemap index file points to two different sitemaps. This one here has an
09:02example URL. This one has another one. We indicate when they were last
09:08modified. This is what the W3C Datetime format looks like. If you want to omit
09:13the time portion, which starts from the T and goes to the end, you just can use
09:17a four-character year followed by a two-character month and a two-character day.
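A Sitemap index like the one described can be sketched the same way. The build_sitemap_index helper, the example.com URLs, and the timestamps below are invented for illustration; the tag names and namespace follow the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(sitemaps):
    """sitemaps is a list of (loc, lastmod) pairs; lastmod may be None."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in sitemaps:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        # lastmod is the one optional tag: when the Sitemap file itself changed.
        if lastmod is not None:
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical index pointing at two Sitemap files.
index_xml = build_sitemap_index([
    ("http://www.example.com/sitemap1.xml", "2009-04-14T08:30:00+00:00"),
    ("http://www.example.com/sitemap2.xml", "2009-04-01"),
])
print(index_xml)
```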
09:21That's essentially sitemaps and sitemaps index files in a nutshell. What we are
09:27going to do now is jump over to the code really quick, so we can look at it in the
09:31editor. Okay, so here we are in the code and if you have access to the sample
09:37files, then you have these files. I have included the example XML files from
09:43both the sitemap and the sitemap index files, along with the Schema files for
09:49each of these, in case you have a tool that can use Schema files in your XML design.
09:54So here you have the sample sitemap XML file that we looked at in the slides.
10:00You can see here the various tags. This is the corresponding schema that goes
10:04along with it. The schema file basically lays out the rules that an XML file
10:11has to follow. So you can see that this is defining what elements are allowed
10:15and where they can go inside the sitemap file. Same over here for the site
10:20index. This is the sample file and here is the schema that goes along with the
10:26site index file.
10:27So that's a pretty simple example to get our feet wet with a custom real world
10:32XML format. Let's take a look now at a more complex example and that's the RSS file format.
Understanding RSS
00:00Okay, so the next real-world XML format that we are going to take a look at is
00:04the RSS format. RSS stands for Really Simple Syndication and you may have heard
00:11the term used before.
00:13Essentially, RSS is a family in fact of formats that are intended to publish
00:18information that is updated over time and you have probably heard of RSS used
00:24for things like blogs or news headlines. But the reality is it can be used to
00:29publish information about any kind of content that can be syndicated, whether it's
00:34stock information or podcasts or anything like that.
00:38These are examples of information that are delivered in small, easy to consume
00:42chunks, and can be individually stamped as discrete pieces of data. Typically,
00:47RSS content is consumed by "Feed readers" that present the information in a
00:52friendly way because it is a lot easier than reading raw XML code.
00:57RSS was originally developed and published by Netscape back in 1999. But then,
01:04they abandoned work on the effort and the RSS effort was carried forward by a
01:09bunch of other individuals, and RSS actually has a pretty long and torturous
01:15history behind it, and I'm not going to bore you with all the details.
01:19But the most current version, which is version 2.0, was published back in 2002.
01:25Today, the specification for 2.0 lives at the website that you see listed on
01:33your screen here. That's http://cyber.law.harvard.edu/rss/rss.html. Now, over
01:40the years, RSS has grown and evolved through several versions.
01:44The most popular versions that are available now are RSS 1.x and RSS 2.x and it
01:52turns out that RSS 2.x outweighs RSS 1.x by a factor of almost 2:1 according to a
01:58recent (at the time of this recording) measurement by a website named Syndic8,
02:03which we'll take a look at in a moment.
02:06For a moment though, let me pop over to the browser really quick, so you can
02:09see the specification that is contained at that harvard.edu address.
02:14So I'm going to switch over to the browser. Okay, so here we are looking at the specification.
02:19This is the spec that currently describes the RSS 2.0 file format and you can
02:24see, it is a fairly long document. It explains all about what RSS is and shows
02:30some sample files and explains some of the elements in the file. Basically, it
02:35goes through all the different content that can possibly exist in RSS file. So
02:40anything that you might want to know about RSS 2.0 is contained here in this
02:46specification. You can see it is a pretty long document.
02:48So what I'm going to do is I'm only going to cover the most important parts of
02:52version 2.0, because that is the one that is most widely in use. Before I jump
02:58back to the slides however, let's take a look at the Syndic8 website that
03:03I mentioned just previously.
03:05So I'm going to go here to syndic8.com and this is a website that tracks usage
03:15statistics for various kinds of RSS and Atom feeds. What we are going to do is
03:22we are going to look here down through until we get to site statistics.
03:27You can see that there's a graph here that shows the RSS versions in use.
03:32So I'm going to click on that chart, and you can see that out of the 558,000
03:39feeds that Syndic8 is tracking, RSS is accounting for the vast bulk of that.
03:46Atom is decidedly smaller. Although Atom is a much newer file format and in
03:51fact, we'll cover Atom in the next section.
03:56So let's scroll on down here. You can see that this is the distribution of feed
04:01languages and the vast majority are in English, and there are some more
04:06interesting stats down here about feeds that are available. It's actually
04:09a really great site to go looking through to see how RSS and Atom are being used.
04:15But in any case, you can see that RSS still accounts for the vast number of
04:21feeds that are out there.
04:23In fact, I'm going to go quickly look over here on the RSS tab. You can see
04:27that this is specifically the distribution of RSS versions. The large pink area
04:32right here corresponds to this 2.0 specification right here.
04:37So because RSS 2.0 is clearly the most widely used version, that is the version
04:42I'm going to be concentrating on here in this lesson. So now that we have seen
04:46the specification and we have seen some information about usage statistics for
04:52RSS 2.0, we are ready to get started on the basics of the RSS format, and that
04:57is the subject of our next lesson.
Using required and optional elements in RSS feeds
00:00RSS feeds are composed of a collection of XML tags and some of them are
00:07required and some of them are optional. Now, all RSS feeds that have the
00:12version 2.0 as their version number have the RSS tag as their root. That goes
00:18along with the version attribute that contains the string 2.0, which clearly
00:23identifies the file as an RSS version 2.0 feed.
00:27Each RSS tag contains in turn a single Channel tag, and this is where the
00:32content of the RSS feed goes. So if we were to start building a Bare-Bones RSS
00:39feed, it would look something like this. We would have the RSS tag at the top
00:44with the version being 2.0, and then we would have a Channel tag inside the RSS tag.
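The slide itself is not reproduced in the transcript, so the following is only a sketch of the bare-bones feed being described, assuming a standard XML declaration at the top:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root tag: rss, with the version attribute set to 2.0 -->
<rss version="2.0">
  <!-- Exactly one channel tag; the feed content will go in here -->
  <channel>
  </channel>
</rss>
```

This is well-formed XML, but with an empty channel it carries no content yet.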
00:49Now, this is a Bare-Bones RSS feed and it doesn't do anything at all. So we
00:56have to figure out how to add some content to it. Now, the Channel tag itself
01:00has some required elements and the required elements of the Channel tag are the
01:07title and this refers to the name of the channel.
01:10So for example, if you have an HTML website and that website contains the same
01:15information as your RSS file, in other words, if your RSS file is just a
01:20syndicated version of the content on your site, then the title of the channel
01:24should be the same as the title of the website and I've provided an example over here.
01:29So if I have my website joemarini.com and I named the website Joe's news and
01:35information, if I had an RSS feed that provided essentially the same
01:40information as the site, I would name it the same name.
01:42On the other hand, if you have RSS feeds that provide more specialized
01:47information, for example, if you have an RSS feed that lists the number of
01:52times that you will be speaking in an upcoming given period of time or
01:56publications you've put out and when that happened, or places you have been to
02:00for lunch, and what dates you went, then obviously you are free to give those
02:06other names. But give the RSS feed that provides the same information as your
02:10site, the same name as your site.
02:12The Link tag is also required. The Link tag provides a URL which indicates the
02:17HTML website that corresponds to the channel. So for example, if I had an RSS
02:22feed that corresponded to my website and my website was joemarini.com, I would
02:26place that URL in there as well.
02:29Then finally the Description tag is also required, and this is a short
02:32description, a sentence or two, maybe three describing the channel. So if we
02:37were to update our previous example code using what we now know, the RSS feed
02:44will start to look like this. So we would have the RSS and Channel tags that we
02:47had before, and we would have the title, link, and description tags.
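As a rough sketch of a channel with its three required elements (the title and site come from the transcript; the exact URL form and the description wording are assumptions made for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <!-- Required: name of the channel, matching the website's name -->
    <title>Joe's News and Information</title>
    <!-- Required: URL of the HTML website that corresponds to the channel -->
    <link>http://www.joemarini.com/</link>
    <!-- Required: a sentence or two describing the channel (wording assumed) -->
    <description>News and information from Joe.</description>
  </channel>
</rss>
```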
02:52Now, this is beginning to look a little bit more like a real RSS feed, but not
02:56exactly very useful because it doesn't contain anything except for the
03:00information about the channel. In order to make this RSS feed useful, we have
03:04to add item tags to it, and item tags go inside the channel, and they specify
03:10information about each individual piece of syndicated content, and that's what
03:15we see here.
03:16So each item tag also has a set of sub-tags or child tags that are required or
03:24optional. I have listed the tags here, and technically speaking, all of the
03:29tags specified for item are optional. However, at least one title or one
03:35description has to be present.
03:38So the Title tag, right here, that is the title of the individual item, and
03:43I have provided an example. So one item might be named Joe goes to the movies,
03:47and the link which is a URL to that item, this is a URL that a feed reader can
03:53use to open up the larger piece of syndicated content.
03:57In this case, it might be something like URL to my website and then
04:01JoeAtTheMovies.html, and the Description tag which contains a sentence or two
04:08or three or so which describes the content of the item. In this case, it's just
04:13a short description. Here is what I saw at the movies last weekend.
04:17Now, just a quick note. According to the RSS 2.0 specification, the Description
04:22tag is allowed to have HTML content in it, and since you are going to be
04:27embedding HTML content inside an XML file, you have to do some special encoding,
04:33and I'll cover that later.
04:35If we were to now take our RSS feed and apply what we now know, it would look
04:41something like this. We have our RSS tag, we have our Channel tag, we have got
04:46the tags up here which describe the channel, and then we have a couple of
04:50items. So this item here is Joe goes to the movies. It has got a link, and a
04:55description, and this item here is Joe has lunch, and there's the link and
04:59description of what I had for lunch.
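A feed along the lines just described might look like the sketch below. The item titles and the movies link come from the transcript; the lunch URL and the description text for the channel and the lunch item are placeholders, not taken from the slide:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <!-- Required channel elements -->
    <title>Joe's News and Information</title>
    <link>http://www.joemarini.com/</link>
    <description>News and information from Joe.</description>

    <!-- Each item needs at least a title or a description -->
    <item>
      <title>Joe Goes to the Movies</title>
      <link>http://www.joemarini.com/JoeAtTheMovies.html</link>
      <description>Here is what I saw at the movies last weekend.</description>
    </item>
    <item>
      <title>Joe Has Lunch</title>
      <!-- Hypothetical URL for the second item -->
      <link>http://www.joemarini.com/JoeHasLunch.html</link>
      <description>Here is what I had for lunch.</description>
    </item>
  </channel>
</rss>
```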
05:01This here is actually a fully formed and proper RSS feed. Again, it is not
05:06particularly rich or useful because it doesn't contain things like publication
05:11dates or information about authors and so on, and we'll cover that more in the
05:16next section when we see more of the RSS file format.
Enriching the RSS feed
00:00Okay, let's continue our coverage of the RSS file format by taking a look at
00:06some of the optional elements of the Channel tag.
00:09Now, this is not an exhaustive list. If you want to see every single tag that
00:15the channel can possibly contain, I urge you to refer to the specification at
00:21the URL I provided earlier. These are just a selection of some of the more
00:26important tags that you should know about.
00:28So the Language tag starting right here at the top indicates the language that
00:32the feed is written in. This is written using the W3C style language
00:38indicators. So for example, en-us indicates English in the United States, and
00:45you can look up a whole bunch of these language codes on the W3C's website.
00:50The Generator tag indicates the software package that created the feed and in
00:54this case, I have got an example here called MyRSSPackage 2.0. But if you edit
01:00it by hand, you can also just simply include the string by hand.
01:03The Image tag specifies an image that can be displayed with the channel, and
01:09there are constraints on the size that this image can be. You can
01:15refer to the spec for that, but it is not very big: the default is 88 pixels
01:19wide by 31 pixels high, and the maximum is 144 pixels wide by 400 pixels high.
01:21The Copyright tag is the copyright notice for the channel content.
01:25So if you want your information to be copyrighted, you can put a copyright notice in
01:29using this tag.
01:30The Publication date for the channel content is indicated by the pubDate field.
01:35Now, this is a date field that indicates the date on which the publication happens.
01:41So for example, if you publish this everyday, then this date would flip
01:46every 24 hours.
01:48The Category tag indicates the category for the news and you can use as many
01:53category tags as you want. So here I have got one example for news, but if
01:58you had a channel that fit into more than one category, you can use as many of
02:04these category tags as you feel adequately describes the categories your feed fits into.
02:09Then there's the lastBuildDate. The lastBuildDate is different from the pubDate.
02:13This is the last time that the channel content changed. So you may
02:18publish on a regular schedule and the content may change on a different
02:23schedule. So the two dates don't necessarily need to be the same.
02:26The last tag I'm going to point out which is optional for the channel is the
02:30Rating tag, which is the rating for the channel and this conforms to the PICS
02:36standard which is specified by the W3C, and it's described at this URL right
02:43here. So if you want to learn more about that, you can investigate that URL.
02:47The Item tag also has optional elements. The author, right there at the top,
02:51that indicates the person who wrote this particular item and the email address
02:56of the author. So for example, each individual item can have a separate author.
03:03This is what the content will look like. In this case, it is joe@joe.com.
03:06The Category indicates the category for this item. So just as you can have
03:10categories for the Channel tag, you can also have categories for an individual
03:15item, and again, you can use more than one here.
03:19The pubDate is the publication date for this particular item. So this is the
03:23date at which time this particular piece of information was published and added
03:28to the feed.
03:29The next field is interesting. It is called guid. A guid means a globally
03:34unique identifier and it is an identifier that is unique for this item. There
03:38are no rules for the format of guids. You can use any one of a number of
03:44schemes, though they usually take the form of URIs or URLs.
03:50If the guid has an optional attribute named isPermaLink="true", then the blog
03:56reader or the feed reader to be more specific can assume that the guid can be
04:01used to open a link to the item in the browser.
04:04Then finally, there's the Enclosure optional element which describes a media
04:09object that is attached to the item. This is how podcasting is achieved. There
04:14are three required attributes if you are going to use the enclosure.
04:17There is a URL attribute which indicates where the item is located on the
04:21Internet and you can see I have got an example over here. So for example,
04:24if I was creating a podcast and I was creating mp3 files, for each item I would use
04:30an enclosure tag which specifies the URL to the mp3 file.
04:35There is the Length, which is the size of the item in bytes. So I have got that
04:39here and then the MIME type of the item, which in this case would be
04:45audio/mpeg. But it could be anything else based upon what the content of the
04:51information is. It might be some other audio file format or video or what have you.
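To illustrate the three required enclosure attributes, here is a fragment for a single podcast item. The episode title, file URL, and byte count are hypothetical values chosen for the sketch:

```xml
<item>
  <title>Joe's Podcast, Episode 1</title>
  <!-- url: where the media file lives; length: size in bytes; type: MIME type -->
  <enclosure url="http://www.joemarini.com/podcasts/episode1.mp3"
             length="12216320"
             type="audio/mpeg" />
</item>
```

The item would sit inside the channel tag alongside any other items in the feed.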
04:55So now, let's go back and take a look at our RSS sample feed, because now we
05:01have much richer information that we can include in the feed format. So here we
05:06have the RSS and Channel tags that we started out with and the title link and
05:11description which are the required parts of the channel definition.
05:16Well, now we have also added the optional language here, specified as US
05:20English and the generator. I've put in this By Hand because I don't have a
05:25software package that made this one, and the publication date. The publication
05:28date was Wednesday, 5th of March, 2009 at 2 AM.
05:32So far this feed only has one item in it and again, we have added some more
05:37rich information here. Rather than just the title link and description, there's
05:41also a pubDate and author and a guid which happens to be a PermaLink.
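A sketch of such an enriched feed follows. The element choices mirror the description above; the description text is assumed, and the dates use the RFC 822 style that RSS 2.0 expects. Note that 5 March 2009 fell on a Thursday on the calendar, so the sample uses "Thu" rather than the weekday mentioned in the narration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <!-- Required channel elements -->
    <title>Joe's News and Information</title>
    <link>http://www.joemarini.com/</link>
    <description>News and information from Joe.</description>

    <!-- Optional channel elements -->
    <language>en-us</language>
    <generator>By Hand</generator>
    <pubDate>Thu, 05 Mar 2009 02:00:00 GMT</pubDate>

    <item>
      <title>Joe Goes to the Movies</title>
      <link>http://www.joemarini.com/JoeAtTheMovies.html</link>
      <description>Here is what I saw at the movies last weekend.</description>
      <!-- Optional item elements -->
      <pubDate>Thu, 05 Mar 2009 02:00:00 GMT</pubDate>
      <author>joe@joe.com</author>
      <!-- isPermaLink="true" tells a feed reader the guid can be opened as a URL -->
      <guid isPermaLink="true">http://www.joemarini.com/JoeAtTheMovies.html</guid>
    </item>
  </channel>
</rss>
```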
05:47So let's talk a little bit about including HTML content in RSS feeds. There's a
05:52couple of ways that you can achieve doing this. I'm going to talk about the two
05:56most common. Now, as I said earlier, the RSS specification for RSS 2.0 allows
06:01HTML to be included in the Description tags of items and channels.
06:06In order to make this work, however, you can't just simply shove HTML code with
06:10all of its angle brackets and everything inside the RSS file. You have to
06:15encode the HTML before you put it in.
06:18Now, the first way of doing this is encoding the HTML tags by doing what is
06:22known as escaping. You can see here that I have got the entity encoding for the
06:29less than sign, and then a bold and then a greater than sign, and over here,
06:34I have got another bold tag.
06:36If you look at this in HTML, this &lt; would be a left angle bracket, like the one on
06:42this description tag right here, and then this &gt; would be the right angle
06:49bracket. So I have entity encoded the HTML here, and included it in the
06:53Description tag.
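A small sketch of the escaping approach, assuming the bold tag wraps a single word of the description (the exact slide text is not in the transcript):

```xml
<!-- The HTML <b>movies</b> is written with &lt; and &gt; in place of the angle brackets -->
<description>Here is what I saw at the &lt;b&gt;movies&lt;/b&gt; last weekend.</description>
```

An XML parser sees only text inside the description; a feed reader that decodes the entities gets the original HTML back.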
06:55The other way to do it is to leave the HTML just as it is, but put it within
07:00what is known as a CDATA section. CDATA sections are standard parts of XML and
07:07they are declared by using an angle bracket with an exclamation point,
07:11a bracket, the word CDATA, and then another opening bracket.
07:15And then you can just put your HTML code
07:17right here inside the CDATA section and then close it off by two brackets and an angle bracket.
07:24The CDATA section basically tells the XML parser that's reading the XML feed,
07:29don't worry about what is in here. It is character data. You don't have to
07:32worry about parsing it. You can just skip over it for the purposes of trying to find tags.
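The same description written with a CDATA section could look like this (again a sketch with assumed wording):

```xml
<!-- Everything between <![CDATA[ and ]]> is treated as character data, not markup -->
<description><![CDATA[Here is what I saw at the <b>movies</b> last weekend.]]></description>
```

One caveat worth knowing: the enclosed text cannot itself contain the sequence `]]>`, since that is what ends the section.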
07:36Okay. Well, now we know enough to create our own RSS feeds. Let's move on now
07:42and take a look at our next real-world XML file format, which is the Atom file format.
Understanding the Atom Syndication feed
00:00Okay, the next real-world XML format that we are going to take a look at is
00:03called the Atom Syndication Feed and the Atom Format is a term that applies to
00:10two related formats. The first one is called the Atom Syndication Feed and that
00:13refers to web data feeds. You can think of this as analogous to RSS.
00:19Atom also defines what's known as a publishing protocol, and that's a
00:24specification that deals with creating and maintaining resources on the web.
00:30We are not going to deal with that particular specification in this section
00:33because it's fairly complex. So we are going to focus on the syndication feed
00:38in order to see how Atom implements an XML format.
00:41So just like RSS, Atom is used to provide information in the form of easily
00:47consumable chunks of data from sites on the web that are updated periodically
00:53or, to use another term, syndicated. These are typically things like
00:57blogs or news sites. The Syndication Spec that we are going to be looking at for
01:03web feeds was adopted back in 2005 and you can learn more about the Atom
01:10Specification at a website called atomenabled.org and I have provided the link
01:15there. And there's also the IETF site, which contains the link to the full
01:20specification for the Atom Syndication Format, and that's RFC 4287.
01:26And you can see I have also provided the link for that as well.
01:29Atom feeds are composed of a collection of XML tags just like RSS is. Again,
01:34just like RSS, some of these tags are required and some of them are optional.
01:40All Atom feeds like all XML documents have a root tag and in the case of Atom
01:45feeds the root tag is known as the feed tag. The feed tag contains some
01:50required child tags along with zero or more entry tags and each entry
01:55represents an individual piece of content.
01:58Now I say zero or more because technically speaking, they are optional,
02:02but Atom feeds aren't very useful if they don't have any entries in them. So
02:05we'll take a look at that in a moment. So to define an Atom feed, you can see
02:10I have created a feed tag there and I have included the XML namespace that
02:14specifies the namespace for Atom in my xmlns attribute. We are not going to be
02:19using that in this example.
02:20But if you wanted to include content from an Atom feed in another document say
02:26an XHTML document, you would use the namespace for that. Okay, so that's a
02:29quick introduction to what Atom is. Let's get into the basics of the Atom
02:34format now and start building our first feed.
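For reference, an empty feed tag carrying the Atom namespace might be sketched as follows. The namespace URI is the one defined by RFC 4287; the XML declaration is an assumption about how the file would start:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Root tag of every Atom feed; xmlns names the Atom namespace -->
<feed xmlns="http://www.w3.org/2005/Atom">
</feed>
```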
Using required and optional elements in Atom
00:00Atom feed tags have some required elements in order to make the feed useful and
00:06I have listed the three elements that are required on the feed tag here.
00:11The first one is the title tag. The title tag indicates the name of what the feed is
00:16and this is usually, but it's not required to be, the name of the website
00:20that supplies the feed.
00:22Now if it's not the name of the website, you can make it whatever you want, but
00:25in any case, you should not leave this field blank because it's what most feed
00:30readers use to display the name of the feed to the user and I have provided
00:33some examples of each one of these tags; you can see them there on the right.
00:36The next tag is the id tag and the id tag is a unique identifier for your feed.
00:42Now, you are not limited to using URLs, you can use any value here that is
00:47going to be guaranteed to be unique and there are numerous schemes out there on
00:51the web that create unique ids for you. One you might want to consider looking
00:55up, in addition to URLs, is a UUID generator on the web. There are several of those.
01:02Usually, though, you will just use your website's domain address in your feed's id.
01:06The next required tag is the updated tag, and this indicates the last time
01:11that the feed was modified in a "significant" way. Now the spec does not say
01:16what the word significant means, it leaves that up to the publisher. Usually
01:20what this means is the last time that the content of the feed was modified.
01:25Not necessarily when you fixed typos or anything like that, but when the content
01:29itself was changed and this specifies a date. Date values have to conform to
01:34one of the formats I have listed there in the description field for updated.
01:38And you can look up any of these on the web. But I have provided an example
01:42over there on the right-hand side which specifies a date using the four
01:45character year, a two character month and a two character day, and then the
01:49character T, which separates the date from the time, given relative to the GMT zone.
01:54Okay, so if you go back now and take a look at our feed tag. It's been updated
01:58to reflect the required tags, <title>, <updated> and <id>. You can see what it
02:02looks like now. So now we have our title, which is my Atom feed. We have the
02:06updated tag put in there, and we have an id which points to my website. Okay,
02:10let's continue on, looking at some of the recommended and optional elements of the feed tag.
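The feed tag with its three required elements could be sketched like this. The title and id follow the narration; the timestamp is an illustrative value in the year-month-day, T, time format described above, with Z standing for GMT/UTC:

```xml
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- Required: the name feed readers show to the user -->
  <title>My Atom Feed</title>
  <!-- Required: last significant modification (illustrative date) -->
  <updated>2009-03-05T02:00:00Z</updated>
  <!-- Required: unique identifier, here the website address -->
  <id>http://www.joemarini.com/</id>
</feed>
```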
02:16The top table here lists two recommended tags, author and link, and the lower
02:23table lists elements that are considered to be optional elements of the feed tag. So the
02:28author tag, and you can have more than one of these. It indicates the author of
02:33the feed and as I said, you can have multiple authors. The author tag is
02:37required unless all of the <entry> elements in the feed have authors as well.
02:43And you specify an author tag using the Atom "Person" construct. The "Person"
02:48construct, which you can look up in the spec, essentially contains three tags:
02:51name, email and uri.
02:54I have specified the name and the email over there in the example. Name is the
02:57only one that's required. Email and uri are optional. The link tag identifies a
03:04web page that's related to this feed and every feed should provide a link to
03:08itself and you can see over on the right-hand side there I have provided an
03:11example link and we'll see this in action later.
03:14Moving on to the optional elements, the category element specifies a category
03:20that the feed belongs to, you can have more than one of these. The category
03:24basically contains a term attribute and inside the term attribute you specify
03:28the category that you want your feed to belong to and if you have multiple
03:32categories, you can just use multiple categories and specify a term for each.
03:37The contributor tag is similar to the author. This identifies a person who
03:42contributes to the feed and like authors there can be multiple contributors and
03:47these are specified using the same format as the author tag with name, email
03:51and uri, and again email and uri are optional.
03:55The generator tag indicates the software that generated the feed and you can
03:59put any value in here you want. And then there's the icon tag, which specifies
04:04an optional icon for the feed. And in the icon tag, you essentially provide a
04:08path on your site to the icon.
04:11Okay, so now let's go back and take a look at the feed source. We have updated
04:15it now to reflect the feed tag elements, <author>, <category>, <link> and
04:22<icon> and you can see I have put them in bold there. Okay. So that's a brief
04:27introduction to the Atom Format. What we are going to do now in the next
04:31section is move on to adding entries to our Atom feed and we'll see a complete
04:36example at the end of the section.
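A sketch of the feed with the recommended and optional elements added. The author name and email follow the narration; the self link, category term placement, and icon location are hypothetical values used only for illustration:

```xml
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- Required elements -->
  <title>My Atom Feed</title>
  <updated>2009-03-05T02:00:00Z</updated>
  <id>http://www.joemarini.com/</id>

  <!-- Recommended: author as a Person construct (name required, email optional) -->
  <author>
    <name>Joe Marini</name>
    <email>joe@joe.com</email>
  </author>
  <!-- Recommended: a link back to the feed itself (hypothetical feed URL) -->
  <link rel="self" href="http://www.joemarini.com/feed.atom"/>

  <!-- Optional: category given through the term attribute -->
  <category term="news"/>
  <!-- Optional: location of an icon for the feed (hypothetical) -->
  <icon>http://www.joemarini.com/images/icon.png</icon>
</feed>
```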
Adding entry tags to the Atom feed
00:00Okay, let's move on and take a look at how we would add some entry tags to our
00:05Atom feed. You can see there on the screen I have a basic entry and it looks
00:10something like this XML construct. You can see that there's an entry tag that
00:14wraps some other tags. There's a title, a link, an id, an updated date and a summary.
00:21So we'll take a look at how we specify each one of these tags. Just like the
00:25feed tag, the entry tag has some required elements and not surprisingly, they
00:32are pretty much the same as the required elements of the feed tag. So the title
00:38for an entry is the name of the particular entry and you should not leave this
00:43blank because again this is how most feed readers will present the name of the
00:47entry to the user and you can see I have provided examples again for each one of these.
00:52The id again is a unique identifier for this entry and like in the feed tag
00:57you can use any value here that's unique. Usually, you will use your website
01:01address domain along with some additional data that identifies the entry,
01:06either a path to the HTML file that specifies the entry or some other kind of
01:12identifier that your blogging system might use or some other type of unique
01:17identifier. The important thing here is that it be unique. The updated
01:21tag, just like on the feed tag, indicates the last time that this entry was modified
01:24in a significant way and again the spec leaves it up to the publisher to
01:29determine what significant means.
01:32So for example, if you fixed a typo, that's not necessarily significant. So
01:36you wouldn't need to update the date there. And again, dates here have to
01:39conform to the formats that the feed updated tag needs to conform to and I have
01:45provided an example but you can look any of these up on the web and see how
01:48they work. And like the feed tag, the entry tag comes along with some
01:52recommended elements. Now these are not required, but they are strongly
01:56recommended because they provide richer information about a particular entry.
01:59So entries can have authors just like the feed can have an author. So the
02:03author tag specifies one author of the entry, and again just like the feed
02:09you can have multiple authors. Now if the feed tag that encloses the entries in
02:14this particular feed does not have an author tag, then the author tag becomes
02:18required for entries.
02:20So you need to put authors on your entries if your feed does not have an author tag and
02:25it's probably a good idea to put authors on there anyway in case your entry is
02:30for some reason copied or referenced somewhere else.
02:33The link tag identifies a web page that's related to this entry somehow and the
02:40spec for Atom contains a lot more detailed information about what links can
02:44contain. But in this example, I have shown that this link tag links to a
02:49related web page that describes this particular entry. Now the content tag
02:54performs the bulk of the work in the entry because it contains or links to the
02:58complete content for this entry.
03:01If there's no summary tag, which follows next then this should be provided and
03:07the content as we'll see later can contain text, or HTML content. It can
03:12contain a whole bunch of different things. And then finally the summary tag
03:15provides a brief summary of what this entry says. If there's no content or if
03:20the content is not provided in line in this particular entry, in other words
03:25it's linked to, then the entry should provide a summary.
03:28Okay, let's finish up by taking a look at some of the optional elements of the
03:32entry tag. So entries can have contributors just like they can have authors and
03:37the contributor tag is used to indicate one of the contributors and it follows
03:41the same format as the author tag. You can have multiple of these for various
03:46contributors and one of the nice things about the Atom Specification is that
03:49contributors are distinct from authors. Entries can have category tags as
03:55well. So the category tag specifies the category that an entry belongs to.
04:00Over there on the right, you see I have a category and the term is news. The
04:04published date indicates the initial publication time that this entry was made
04:08available. Now this is different from the updated date: published means this was
04:12the first time that the world got to see this particular entry and it follows
04:16the same date rules as the updated tag follows.
04:18The source tag is used in the case where this particular entry was copied from
04:23another feed. The source tag is then used to preserve the child tags of the
04:28entry that the entry was copied from.
04:31For example, if I had copied this entry from some other feed, I would use the
04:35source to provide necessary things like the title that it came from, when it
04:40was updated, the copyright information, so on and so forth and I would also
04:43provide the id. You can see on the right-hand side there, its main purpose in
04:47life is to preserve information about the source of this entry and then finally
04:51there's a rights tag and you can put any copyright notice that this entry might
04:56have in it and you can see I have provided an example there with a copyright
05:00and my name.
05:01Okay, so let's go back and take a look at our updated entry tag. You can see
05:05that we have added the tags that we were looking at earlier. So in addition to
05:09the title and id and updated tag, I have added the link, a summary, a category,
05:17a published date and some content. Now to put it all together, this is what a
05:22finished Atom feed would look like.
05:25Right at the top here, you have the XML declaration and then that's followed by
05:28the feed tag which encapsulates the entire feed and then we have a title and id
05:33and updated tags for the feed, those three are required and then we have a link
05:37that specifies where the feed came from and the author and that would be me and
05:43then we have our entry which we just looked at.
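Putting the pieces together, a finished feed of this shape might look like the sketch below. The structure follows the narration; the entry title, dates, self link, and text content are assumptions, since the slide is not part of the transcript:

```xml
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- Required feed elements -->
  <title>My Atom Feed</title>
  <id>http://www.joemarini.com/</id>
  <updated>2009-03-05T02:00:00Z</updated>
  <!-- Recommended feed elements (hypothetical feed URL) -->
  <link rel="self" href="http://www.joemarini.com/feed.atom"/>
  <author>
    <name>Joe Marini</name>
    <email>joe@joe.com</email>
  </author>

  <entry>
    <!-- Required entry elements -->
    <title>Joe Goes to the Movies</title>
    <id>http://www.joemarini.com/JoeAtTheMovies.html</id>
    <updated>2009-03-05T02:00:00Z</updated>
    <!-- Recommended and optional entry elements -->
    <link href="http://www.joemarini.com/JoeAtTheMovies.html"/>
    <summary>Here is what I saw at the movies last weekend.</summary>
    <category term="news"/>
    <published>2009-03-04T18:30:00Z</published>
    <content type="text">A longer write-up of the films would go here.</content>
  </entry>
</feed>
```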
05:45Okay, so this is a finished example of an Atom feed. Now like RSS, you can
05:51include HTML at various points inside your Atom feeds and specifically the
05:57tags title, summary, content and rights can contain HTML code. There's a type
06:04attribute that you place on these tags that determines how this information is
06:09encoded. Now the default is text. So if you don't specify it then that's the
06:12default value, and if the type is text, then the element contains just plain
06:17text with no HTML in it.
06:19You can see there's an example of that right here. If the type attribute
06:23contains the string html, then the element contains entity escaped HTML. And
06:30you see an example of that down here. So in this example, the content tag
06:34has a type of html, and then inside the content, I have escaped out the angle
06:40brackets that you would normally see on HTML tags.
06:43So instead of putting the b tag and then new title with a closing b tag in order
06:48to make these words bold, I have to convert the angle brackets into their
06:52entity escaped equivalents: the ampersand, then lt with a semicolon,
06:56and the ampersand, then gt with a semicolon. There's a list of these and
07:00I'll get to those in a minute. And then finally, if the type is equal to xhtml, then
07:05the element contains XHTML code and it is wrapped up in a single div tag. And
07:11that's an important thing you need to realize.
07:14So here you see an example of that. The content contains XHTML and I have a
07:19div tag with the XHTML namespace on it and then right inside the div tag I can
07:25put XHTML code straight in there and I don't have to escape it or anything like that.
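The three values of the type attribute can be compared side by side. These are separate fragments, not one document (an entry holds only one content tag), and the "New title" text is taken from the narration:

```xml
<!-- type="text" (the default): plain text, no markup -->
<content type="text">New title</content>

<!-- type="html": HTML with its angle brackets entity escaped -->
<content type="html">&lt;b&gt;New title&lt;/b&gt;</content>

<!-- type="xhtml": unescaped XHTML wrapped in a single div in the XHTML namespace -->
<content type="xhtml">
  <div xmlns="http://www.w3.org/1999/xhtml"><b>New title</b></div>
</content>
```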
07:29So I found a pretty good list on the web that lists all of the entity escaping
07:34characters that you can use inside XHTML and HTML. So if you want to follow
07:39that link, there you will see all the different ways that you can entity escape
07:42characters in HTML for inclusion in Atom.
07:46Okay, that brings us to the close of the Atom format. Hopefully, you learned enough
07:51now to go out and make your own Atom feeds or at least read existing ones and
07:55now we are going to move on to the next chapter.
3. XML and JavaScript
Using XML support in browsers
00:00One of the greatest improvements to come along in recent years in the browsers
00:03has been the dramatic improvement in the way that they support XML natively and
00:08that's going to be the subject of this section of the course.
00:11Using the modern browsers like Internet Explorer 6.0 and later and Firefox 1.0
00:16and later you can work with XML right inside the browser environment. You don't
00:20have to resort to server side stuff, you can just work with it right there in
00:23the browser using JavaScript and other standard technologies.
00:27Now each Browser supports a slightly different set of functions and objects and
00:32properties for working with XML. Now Firefox had the benefit of being written
00:36after the DOM Level 2 Specification came along. So they used the DOM Level 2
00:41methods for things like creating and loading XML. Internet Explorer had XML
00:46support a little bit earlier and since the DOM had not yet addressed some of
00:50these issues, they used their own API for things like loading XML and creating
00:54it from scratch, and they used the MSXML2.DOMDocument ActiveX object.
01:00Now the main differences are in creating and transforming documents, things
01:04like parsing and serializing and we'll get into all these terms in a moment.
01:07But the important thing to remember is that the DOM API for working with XML
01:12content like nodes and document elements and so on, that's consistent across
01:16the browsers.
01:17So what can you do with the built-in capabilities of the browsers? So if you
01:22want to work with XML in the Browsers there are a bunch of things that you can
01:24do just natively working with their existing capabilities. You can create new
01:28XML documents from scratch, you can also load documents from the network or
01:33from local files, and you can do these independently from the AJAX Objects that
01:38you might be familiar with. This capability has existed for some time now in
01:42the Browsers.
01:43You can load XML documents directly from a string content using XML parsing
01:48that's built into the browser itself. You can transform XML using XPath and
01:53XSLT and that's the subject that we'll cover a bit later.
01:57You can also serialize an XML document to a string. The word serialize refers
02:01to the process of taking structured content like XML and saving it out to a
02:06format that can be persisted somewhere usually a string or file or something
02:10like that. And you can manipulate XML content using the XML DOM.
02:14We are at the point now where we can take a look at how Firefox supports XML. So
02:19let's go ahead and do that.
Collapse this transcript
Understanding XML in Firefox
00:00Okay, so as I mentioned earlier the browsers have the ability to create XML
00:03documents from scratch, and to do that in Firefox, you call the function
00:07document.implementation.createDocument. That's the method that you use in
00:12order to create a new document and you can see the example right here. This is
00:16how it's called. It takes a few arguments. The first argument is the Namespace
00:20URL to use and we are not going to get too deeply into namespaces right now
00:24because they are fairly complex objects, but you should just know that you can
00:26pass an empty string for this.
00:28The second argument is the name of the RootTag that you want to be at the base
00:33of the XML document. And again you can pass an empty string for that, but if
00:36you pass a string here that will become the root tag that's at the base of the
00:41document. And the third argument is the document type and again this is a bit of
00:45an advanced concept.
00:47For now we are going to use the constant null and in fact in most of the real
00:51world situations, this is what you will use any way. So I'm not going to go too
00:54deeply into that and as I mentioned the NamespaceURL and sRootTag can both be
00:58empty strings, but typically what you will want to do is at least pass a tag
01:03name as the second argument to save yourself a little bit of typing.
01:07Now once the document has been created, you can use the standard DOM methods to
01:12create content. Here is an example of that. This is a complete example in and
01:16of itself. You see at the first line what we are doing here is we have got a
01:19variable named xmlDoc and we are calling the createDocument method, and here
00:24I'm passing an empty string for the namespace and the word myroot as the RootTag
01:31name and the third argument is null.
01:33So once the XML document has been created we can start making content using
01:37the standard DOM methods. So in this example I'm creating a new paragraph tag
01:41using the createElement function. That's going to create this P tag and then
01:45I'm going to create some text to go inside the paragraph, and I do that using
01:49the standard DOM createTextNode function.
01:52Once I have done that, I append the text into the paragraph and I append the
01:57paragraph into the document. And if we were to look at this in XML text form,
02:03the result will look like this. We have myroot, because that's the RootTag that
02:06we passed into the createDocument function, and there's our paragraph tag and
02:10this is the string 'this is some text.' So this is an example of how you can
02:14create XML right in Firefox.
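The slide code itself is not captured in this transcript. A minimal sketch of what the narration describes might look like the following. The function name createXMLDocument comes from the later exercise file; the optional impl parameter is an assumption added here so the DOMImplementation can be supplied explicitly, and it falls back to the browser's document.implementation.

```javascript
// Sketch: build <myroot><p>This is some text</p></myroot> with standard DOM calls.
// Assumption: `impl` is an optional DOMImplementation; in a browser it
// defaults to document.implementation.
function createXMLDocument(impl) {
  impl = impl || document.implementation;
  // Arguments: namespace (empty string), root tag name, document type (null).
  var xmlDoc = impl.createDocument("", "myroot", null);
  var oPara = xmlDoc.createElement("p");
  var oText = xmlDoc.createTextNode("This is some text");
  oPara.appendChild(oText);                   // text goes inside the paragraph
  xmlDoc.documentElement.appendChild(oPara);  // paragraph goes inside <myroot>
  return xmlDoc;
}

// Usage in a browser page:
//   var xmlDoc = createXMLDocument();
```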
02:16Now you can also create an XML document by directly parsing a text string that
02:22contains XML code. In some instances this may be easier and faster because if
02:27you have a small piece of XML code that you need to parse in, this is just a
02:31few lines of code.
02:32So the way you do this in Firefox is by creating a DOMParser object and then
02:37you call its parseFromString method to parse the data. We can see an example
02:41for that right here. So here I have a variable named oParser and it's an object
02:46reference to this new DOMParser object that I'm creating and again this is an
02:50example that works in Firefox, we'll cover IE in a bit.
02:53Then we have a variable here named xmlDoc and xmlDoc is being assigned the
02:57result of the Parser's parseFromString method. So this is another way of
03:01creating a document from scratch. You can see here that I'm passing in the XML
03:06code in text form and this is the same text that we had in our previous example.
03:10And the second argument to parseFromString is the MIME type that you are going
03:15to assign to the content and for XML this is going to be application/xml and
03:19the result of this will be an XML document just like we saw in the previous
03:25example. The only difference here is that we are creating from a string rather
03:29than using the DOM methods to create the tags.
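As a rough sketch of the parsing approach just described (the slide code is not in the transcript), something like this could be used. The optional ParserCtor parameter is an assumption added here; it defaults to the browser's DOMParser.

```javascript
// Sketch: turn a string of XML into a document with DOMParser.
// Assumption: `ParserCtor` is optional and defaults to the browser's DOMParser.
function parseXMLDocument(xmlText, ParserCtor) {
  ParserCtor = ParserCtor || DOMParser;
  var oParser = new ParserCtor();
  // The second argument is the MIME type assigned to the content.
  return oParser.parseFromString(xmlText, "application/xml");
}

// Usage in a browser page:
//   var xmlDoc = parseXMLDocument("<myroot><p>This is some text</p></myroot>");
```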
03:31We are not done yet however. There are other ways to get hold of XML documents.
03:37You can load XML content from a URL and this can either be from the network or
03:43from the local file system, which is useful when you are building your
03:46application and debugging it. And the way that you do this is by using the load
03:50method and XML content can be loaded either synchronously or asynchronously and
03:55we'll look at both examples.
03:56So the example here, I have a variable named xmldoc and I'm creating it using
04:02the createDocument method like I did earlier on. In this case, however
04:06I'm going to load it synchronously. Now the default is to load documents
04:10asynchronously. So you have to explicitly set the async property on
04:15the document object to false, if you want to load things in a synchronous fashion.
04:20So once I have done that, I call the xmldoc.load function and I pass in the URL
04:25where I want to load things from. Now I can just give it a file name and it
04:29will load it from the same directory that this page came from, or I can give it
04:33an http address and it will load from that address as well, subject to all the
04:37security restrictions that your browser has.
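A minimal sketch of the synchronous load described above, assuming a browser whose XML documents still expose the load method and async property. Both are legacy, non-standard features that current browsers may no longer provide. The optional impl parameter is an assumption added here.

```javascript
// Sketch: synchronous load with the legacy document load() method.
// Assumption: `impl` is optional and defaults to document.implementation.
function loadXMLDocument(url, impl) {
  impl = impl || document.implementation;
  var xmlDoc = impl.createDocument("", "", null);
  xmlDoc.async = false;  // the default is asynchronous, so say so explicitly
  xmlDoc.load(url);      // does not return until loading has finished
  return xmlDoc;
}

// Usage in a browser page, with the file in the same folder as the page:
//   var xmlDoc = loadXMLDocument("businesscard.xml");
```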
04:40Okay, you can also as I mentioned load from a URL asynchronously. And what that
04:46means is when you call the load function, the load function is going to go off
04:51and start loading the document but it's going to return immediately, so your
04:55script can continue executing. And when the document has finished loading, an
04:59event will be fired by the browser and you can use that event to call a
05:03function that needs to be called when the document finishes loading.
05:06In this example I have got my same document and I have created it up here using
05:11document.implementation.createDocument and I'm passing in root and a null
05:17value. And then I set the async flag here to be true. Now when I do that I need
05:22to do things a little bit differently. I need to set the onload event handler
05:26for the document to be a function that's going to be called when the document
05:29finishes loading. In this case I have set it to be a function called docLoaded
05:34and I'm passing in the xmldoc as an argument to that function.
05:38So now when I call the load function the browser is going to go ahead and go
05:43off and start loading the document, but any JavaScript statements I have
05:46following the load statement here are just going to keep right on executing. So
05:51you shouldn't execute any statements that depend on the document being loaded
05:54until your asynchronous event handler has been called, and that's this function right here.
06:00So in this case docLoaded will be fired when the document has finished loading.
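The asynchronous pattern above can be sketched as follows. This again relies on the legacy load method and onload handler on XML documents, and the docLoaded callback and optional impl parameter are passed in as arguments here for illustration.

```javascript
// Sketch: asynchronous load; docLoaded runs only after the document arrives.
// Assumption: `impl` is optional and defaults to document.implementation.
function loadXMLDocumentAsync(url, docLoaded, impl) {
  impl = impl || document.implementation;
  var xmlDoc = impl.createDocument("", "root", null);
  xmlDoc.async = true;
  xmlDoc.onload = function () {
    docLoaded(xmlDoc);   // safe to use the document's content from here on
  };
  xmlDoc.load(url);      // returns immediately
  // Statements placed here must not depend on the loaded content.
  return xmlDoc;
}

// Usage in a browser page:
//   loadXMLDocumentAsync("businesscard.xml", function (doc) { /* use doc */ });
```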
06:05Now that we have seen how to work with XML in Firefox, it's time to look at
06:10some real life examples.
Using XML in Firefox
00:00Okay, so here we are in the code. What I'm going to do now is write a few
00:05example functions that exercise some of the methods we just learned about for
00:10working with XML in Firefox.
00:13So this is the document that I have here. I'm at my starting point. Let me just
00:17scroll down so you can see the code. We have got four functions that we need to write:
00:21createXMLDocument, loadXMLDocument, loadXMLDocumentAsync and
00:26parseXMLDocument.
00:27So this is going to be a little test harness for us to try out our newfound XML
00:33skills in Firefox. So I'm going to write each one of these functions and you can
00:36see that in this script block that is going to execute right here at the bottom
00:40of the script tag. So I'm not going to do anything fancy like set up event
00:44handlers or anything like that.
00:46So let's write the createXMLDocument example first. Remember to create an XML
00:52document the first thing that we need to do is have a variable to hold the document.
00:55So I'll write var xmlDoc. And then I'll write document.implementation.createDocument.
01:09Okay. Now also recall that createDocument takes a few arguments so I'm going to
01:13pass in an empty string for the name space and I'm going to pass in the
01:16string myroot for the root tag name and pass in null for the last parameter.
01:23So this will create the document.
01:25Now we are going to use the DOM functions to create the content like we saw in
01:28the slide earlier. So I'm going to write var oPara = and to create elements,
01:36we tell the document to create them. So we say xmlDoc.createElement and
01:42we're going to create a p tag here.
01:47Now we are going to create the text to go inside the paragraph. So we'll say
01:50var oText = xmlDoc.createTextNode and inside the TextNode we are going to put
02:02'This is some text'.
02:05Now we need to append that text into the paragraph. I'm going to write
02:09oPara.appendChild and that's going to append in the TextNode. Now we need to
02:18put the paragraph into the document, so I'm going to say
02:20xmlDoc.documentElement, because the documentElement recall from the DOM
02:27is always at the root of the XML document. And we are going to tell
02:32the documentElement to appendChild and that's going to be the oPara.
02:39So now we are going to see a trick that we have not directly covered in the slides
02:44yet: this is how to serialize an XML document into a string and we are
02:49going to do this so that we can call the alert function and display the
02:52contents. So what I'm going to do now is write alert and that's going to show
02:56the XML code.
02:58What I'm going to do now is type in new XMLSerializer and that creates a new
03:05XMLSerializer object and this works in Firefox and to get the text content of
03:12the XML document or more accurately to get a text representation of all the
03:17tags and all the content, I'm going to call the serializeToString method and
03:28I just need to pass in the XML document node.
03:31Now I can do this for any node in the document; it doesn't have to be the XML
03:35document itself. But in this case, I want the text for the entire document,
03:40so I'm going to pass in the root node of the document.
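The serialization step just described could be sketched like this. The helper name serializeNode and the optional SerializerCtor parameter are hypothetical, introduced here for illustration; XMLSerializer and serializeToString are the names used in the narration.

```javascript
// Sketch: get a text representation of a node and everything beneath it.
// Assumption: `SerializerCtor` is optional and defaults to XMLSerializer.
function serializeNode(node, SerializerCtor) {
  SerializerCtor = SerializerCtor || XMLSerializer;
  return new SerializerCtor().serializeToString(node);
}

// Usage in a browser page:
//   alert(serializeNode(xmlDoc));                  // the whole document
//   alert(serializeNode(xmlDoc.documentElement));  // or any single node
```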
03:43So now we are at a place where we can try this out. So I'm going to browse this
03:48in Firefox. Browse With, and you can do this using whatever tool you happen
03:54to be using if it has some built-in way of launching a browser. If not just save the file,
03:58go out to the file system and bring it up in your browser.
04:01So we're going to launch Firefox here. You can see that the alert is being called
04:06and sure enough there's the root tag, myroot, and there's the paragraph we created
04:10and there's the text content.
04:12So it seems to work just fine so let's move on to next example. So now we are
04:17going to create the loadXMLDocument feature. In the loadXMLDocument example
04:22what we are going to do is load a local file and the local file is this right here,
04:27the businesscard.xml file. So this is saved in the same directory as the
04:33page we are working on. So I'm going to go switch back to the code here.
04:36So to load a document, remember what we need to do. First, we need to have our variable to
04:42hold the document, so we'll write var xmlDoc and let's say
04:47document.implementation.createDocument.
04:55Now in this case, I'm going to pass empty strings for both the root tag
04:59and the name space, because we are going to load an entirely finished
05:01document, so I don't need to pass a root tag in. Then I'm going to type in null
05:06for the third parameter. So now I have created an empty XML document.
05:10Now I'm going to load this document synchronously. So to do that I need to say
05:14xmlDoc.async = false. Otherwise this would default to true, so I need to
05:21explicitly do this. And it's always a good idea to explicitly write what your
05:26intentions are anyway rather than rely on implicit behavior, because for all
05:30you know in the future that may change.
05:32So once I have done that I just need to write xmlDoc.load and since the file
05:38is in the same folder, all I need to do is write the name of the file here.
05:43So that's businesscard.xml and we are going to use the same serialization trick in
05:52all of our examples. So I'm just going to go ahead and paste that in down here.
05:55So let's go ahead and comment out the createDocument since we already know that
05:59that works and now we are going to call loadXMLDocument. All right, so let's
06:03go ahead and view this in the browser.
06:10And you can see that the XML document
06:13loaded properly and it got serialized out to a string and here it is in the alert.
06:18All right, so far so good. We're two for two, let's keep on going.
06:21Now we're going to do the same thing. We're going to load the XML document, but
06:25we're going to do it asynchronously. So once again I'm going to create the document.
06:29I'm just going to copy and paste this line here and now this time we are going
06:33to do it asynchronously. So to do it asynchronously, I need to set the
06:37xmlDoc.async = true. Now again this is the default, but I find it's better to
06:43be explicit.
06:45Now we are going to set the onload handler for the xmlDoc. So that's this guy here,
06:49xmlDoc.onload, and we are going to write = function and inside this function,
06:57we are just going to say alert. And we are going to do the same
07:03serialization trick that we have been doing. So I'm going to copy that and
07:08I'm going to paste it into my alert here.
07:12So now that we have written the asynchronous version, let's go down and comment out
07:16the one we know works. So I need to add the actual call to load the document.
07:23So I'm going to write xmlDoc.load and it is going to load the
07:32businesscard.xml. So let's go browse this in Firefox.
07:41There we go and you can see that it loaded asynchronously.
07:45So that's an example of loading the file both synchronously and asynchronously.
07:49So now I'm going to close the browser.
07:52The last example that we are going to write now is parsing a document from a
07:57text string and so what we are going to do is write the parseXMLDocument
08:04function and that's this guy right here. So let's go comment out the previous
08:10example. So what we are going to do now is write var oParser = new DOMParser.
08:25Once we do that we need to call the parser's parseFromString method and assign
08:31that to be an XML document, so we'll write var xmlDoc equals and now it's
08:38just a matter of calling oParser.parseFromString and we are going to pass in
08:51the text example from our earlier slide, the whole myroot thing. So I need to type
08:58myroot and we need to give a closing tag myroot and we are going to put the
09:05paragraph in here along with a closing paragraph and we are going to write in
09:11'This is some text'.
09:15So now we have a text string that represents a complete XML Document and once
09:21we have called the parseFromString method, once again we are going to use the
09:24XML Serializer trick to get a string that we can alert and that's going to be
09:28this guy right here.
09:29All right so now we are ready to try this example in Firefox. Oh, actually
09:33before I browse it, I forgot that there's actually-- parseFromString
09:37actually takes another argument, which is the MIME type of the content, so I'll pass in
09:46application/xml because this is XML.
09:48All right, so now we are ready to browse this in Firefox, so let's go ahead and do that.
09:59And you can see that we parsed this document from the string and
10:02we are showing it in the alert here.
10:05Okay, so that's using XML in Firefox. I think we are ready to move on now and
10:11look at the same kind of capabilities that Internet Explorer provides.
Understanding XML in Internet Explorer
00:00All right, let's take a look at how XML is handled in Internet Explorer.
00:04Now that we have seen how Firefox handles XML, it's IE's turn.
00:08So to create a new XML document in Internet Explorer, you use a slightly different
00:12syntax. You use the ActiveXObject to create an instance of the DOMDocument
00:17object type and the most recent version of this is DOMDocument.6.0.
00:22The way that you do this is shown in the example down here.
00:26So I have a variable named xmlDoc and instead of calling the
00:31document.implementation.createDocument function like I do in Firefox, what I do here is
00:36create a new ActiveXObject and I pass in the string MSXML2.DOMDocument.6.0.
00:42This will create an XML document the same as it does in Firefox and in this
00:48example, you can see I'm doing pretty much the same kind of thing with creating
00:51document content using the DOM methods. From this point on, it's pretty much
00:56the same method as it is in Firefox.
00:58So now here we are creating an element named rootTag and appending that into
01:02the document and creating a paragraph, creating some text, appending the text
01:07into the paragraph and putting the paragraph into the document. So once you
01:11have got the document created, the DOM API for manipulating document content
01:17using the standard DOM functions is consistent across the browsers.
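A hedged sketch of the Internet Explorer version described above. The function name createXMLDocumentIE and the optional ActiveX parameter are hypothetical, and the root tag name myroot is assumed to match the Firefox example. ActiveXObject exists only in Internet Explorer.

```javascript
// Sketch: the same document as the Firefox example, created through MSXML.
// Assumption: `ActiveX` is optional and defaults to IE's ActiveXObject.
function createXMLDocumentIE(ActiveX) {
  ActiveX = ActiveX || ActiveXObject;
  var xmlDoc = new ActiveX("MSXML2.DOMDocument.6.0");
  // No root tag is passed at creation time, so append one explicitly.
  xmlDoc.appendChild(xmlDoc.createElement("myroot"));
  var oPara = xmlDoc.createElement("p");
  oPara.appendChild(xmlDoc.createTextNode("This is some text"));
  xmlDoc.documentElement.appendChild(oPara);
  return xmlDoc;
}

// Usage in Internet Explorer:
//   var xmlDoc = createXMLDocumentIE();
```

From the createElement call onward, the code is the same DOM API used in the Firefox example; only the way the document is instantiated differs.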
01:21Now just like Firefox, you can parse XML directly from a text string in
01:27Internet Explorer and the way that you do that is actually really pretty
01:31straightforward, you don't have to create a DOM parser or anything like that,
01:35IE makes this pretty easy. All you need to do is create the XML document like
01:39you did in the previous example and then it's a simple matter of calling
01:44the loadXML method on the document object. And in this case, I'm passing in a
01:50string which represents a functionally complete XML file.
01:55So these two lines of code do -- essentially what the Firefox example does,
01:59only we don't have to create any objects other than the document itself
02:02because the XML document in IE has a convenience function that just loads XML
02:07right from the string.
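The two lines of code referred to above are not in the transcript. A sketch of them, wrapped in a hypothetical function parseXMLDocumentIE with an optional ActiveX parameter added for illustration, might be:

```javascript
// Sketch: parse XML text in Internet Explorer with the loadXML method.
// Assumption: `ActiveX` is optional and defaults to IE's ActiveXObject.
function parseXMLDocumentIE(xmlText, ActiveX) {
  ActiveX = ActiveX || ActiveXObject;
  var xmlDoc = new ActiveX("MSXML2.DOMDocument.6.0");
  xmlDoc.loadXML(xmlText);  // note: loadXML takes XML text, load takes a URL
  return xmlDoc;
}

// Usage in Internet Explorer:
//   var xmlDoc = parseXMLDocumentIE("<myroot><p>This is some text</p></myroot>");
```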
02:08You can also load XML from a URL in Internet Explorer, just like you can in
02:13Firefox and you use the load method. Now amazingly enough the code for
02:18Internet Explorer is exactly the same as that for Firefox. I know it's amazing,
02:24but they both got this the same. The only difference is how you create the document.
02:29So you see here I have got the ActiveXObject and in Firefox, you use
02:32the createDocument method but other than that it's exactly the same.
02:35The async property is the same and the load method is the same.
02:39Now again the default behavior is to load documents asynchronously, so I need
02:44to explicitly set the async property to be false before I call the load
02:49method if I want this to be handled synchronously. And you can see that
02:53the example that I'm using here is the same as the example I used in the Firefox example.
02:59And of course, you can load a document asynchronously. It's a little more
03:04involved than what you need to do in Firefox but not much. So here's the line
03:10of code where we create the XML document. In this case, I'm explicitly setting
03:14the async property to true because we want to load the document asynchronously.
03:19Now instead of handling the onload event like you do in Firefox, we have an
03:25event here called onreadystatechange and this is Internet Explorer's way of
03:29handling the asynchronous events. So onreadystatechange takes a function.
03:34I'm declaring that here and inside the function there's a check of a property called
03:38readyState, and readyState can be one of a bunch of different values.
03:43I won't go into all of them here because you can find documentation on that pretty easily.
03:48All you need to know is that when the readyState property is equal to 4
03:51that means that the document has been fully loaded and documents go through various stages.
03:56They go through, you know, the request stage, the data is being
03:58downloaded stage and so on. When state reaches 4, it's been loaded and so
04:03we call our docLoaded function with the document object as an argument and then
04:09we go ahead and call the load function. And in this case, any statements that
04:13follow the load function will just go ahead and keep right on executing.
04:16So you don't want to execute any statements that depend on the document being
04:20loaded until your call back function has been called and the document has been fully loaded.
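The readyState check described above can be sketched as follows. The function name loadXMLDocumentAsyncIE and the optional ActiveX parameter are hypothetical; docLoaded is the callback name used in the narration, passed in here as an argument.

```javascript
// Sketch: asynchronous load in Internet Explorer using onreadystatechange.
// Assumption: `ActiveX` is optional and defaults to IE's ActiveXObject.
function loadXMLDocumentAsyncIE(url, docLoaded, ActiveX) {
  ActiveX = ActiveX || ActiveXObject;
  var xmlDoc = new ActiveX("MSXML2.DOMDocument.6.0");
  xmlDoc.async = true;
  xmlDoc.onreadystatechange = function () {
    // The handler fires at each stage; 4 means the document is fully loaded.
    if (xmlDoc.readyState == 4) {
      docLoaded(xmlDoc);
    }
  };
  xmlDoc.load(url);  // returns immediately; do not touch the content yet
  return xmlDoc;
}

// Usage in Internet Explorer:
//   loadXMLDocumentAsyncIE("businesscard.xml", function (doc) { alert(doc.xml); });
```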
04:26Okay, so now that we have seen how Internet Explorer handles loading and
04:30creating and parsing XML, let's go take a look at some real life examples on how to do this.
Using XML in Internet Explorer
00:00All right, this is the code for the Internet Explorer examples and just like in
00:06the Firefox examples we have got some functions we need to fill out in order to
00:10accomplish the same kind of tasks that we did in the Firefox example previously.
00:14So you can see it's pretty much the same file. It's got empty content right now
00:19and we have the four functions that we are going to fill out to demonstrate
00:23creating, loading, and parsing XML, the same way that we did in the earlier
00:27Firefox example.
00:29So let's begin by creating an XML document in IE, and recall that to do that,
00:33it's a little bit different than Firefox. We write xmlDoc = and now we write new
00:39ActiveXObject and the ActiveXObject we need to instantiate here is
00:44MSXML2.DOMDocument.6.0. Okay, so now that we have created the document,
00:58we need to put some content into it.
01:01So what I'm going to do here is say xmlDoc.appendChild and we are going to
01:10create the root tag here. So we are going to say xmlDoc.createElement and
01:18we'll call it myroot just like in the previous example. Then we'll create a
01:28paragraph, xmlDoc.createElement, paragraph element and now we'll create our
01:40text element. Some text. Okay, and we'll put the text inside the paragraph and
02:04now we'll put the paragraph inside document.
02:15Now to serialize XML content to a string in Internet Explorer, it's incredibly easy.
02:23You don't have to create any XMLSerializer object or anything like that.
02:28All you need to do is watch this because if you blink, you may miss it.
02:31I'm just going to say alert (xmlDoc.xml). So the XML property is a convenience
02:40property provided by IE on every DOM node in an XML document.
02:45So just by referencing this property you can get a string representation of
02:48that node. It's a really incredibly easy way to do it. So we are going to alert
02:54the XML content that's in this document after we finish creating it. All right,
02:58so we are ready to go ahead and test this out. Let's bring it up in IE,
03:01see what happens. Okay, so you can see it worked. Here is the <myroot>, <p> and
03:08'This is some text', just like we saw in the previous Firefox example.
03:13So now that we have created the create XML document example, let's look at how
03:19we load an XML document. First let me comment out the create XML document example,
03:25because we have already completed that one. So recall that loading
03:29an XML document in IE is the same as in Firefox. The only difference is the way
03:34that we create the document. So we do that right here. I'm just going to copy
03:37that and paste it in. So once I have done that I'm going to load this document
03:42synchronously, which means I have to explicitly set the XML document's async
03:48property to false. Then I call xmlDoc.load because, recall, this is the same code as for
03:56Firefox. So we type in businesscard.xml. That's the file we are going
04:05to load and we are going to use the same serialization trick that we used in
04:09the previous example just alerting the XML property on the XML document.
04:14So now we are ready to test things out and let me make sure I have commented
04:17out the previous example, and I have. All right, so let's bring this up in IE
04:21and see what happens. Okay, you can see that it worked. So the businesscard.xml file
04:31got loaded and here we are looking at it in the alert. Everything seems to be
04:35working fine. So let's go back and go on to the next example.
04:38So the next example we are going to load the same document, but we are going to
04:41do it asynchronously. So before I do that, let me go down here and comment out
04:46the previous example. So remember that loading the document asynchronously is
04:53a little different than doing it in Firefox, but we need to create the document.
04:58So I'll just copy that and paste that in here and we are going to use the same
05:02serialization trick. So I'm going to copy that and paste that in here.
05:07So now in this case, I'm going to set the xmlDoc.async property to be true
05:14because we are doing this asynchronously. So instead of using the onload
05:18handler like in Firefox, I'm going to type in xmlDoc.onreadystatechange and I set
05:27that to be a function. Now in this function, which will be called when
05:34the document's readystatechange event gets fired, I need to check to see if
05:39the readyState is equal to 4.
05:41So if xmlDoc.readyState == 4, then we are going to just alert the XML
05:55document's XML content and that pretty much means we don't need to do this here.
06:02All right, so now we are pretty much just about ready to try this.
06:06All that's left to do is xmlDoc.load and this will kick off the loading process.
06:15So we'll type in businesscard.xml. All right, and I should be ready to go.
06:21So here we are creating the document, setting the async to true. We have defined
06:26the readystatechange function and then we load the document. So let's go ahead and
06:31try this in the browser.
06:32Okay, seems to be working. Let's refresh just to make sure. Yeah, there it is.
06:41It's loading. Everything seems to work. All right, let's move on to the final example.
06:45The final example is parsing an XML document from a string. To parse
06:53a document from a string in Internet Explorer is a little bit easier than it is
06:58in Firefox in that you don't have to create any objects to do this. There's a
07:03convenience function called loadXML, which does this for you, but we still have
07:07to create the document.
07:08So I want to copy that line there and paste it in here and we still want to use
07:13the serialization trick to alert the content. So I want to copy that from here
07:18and paste that here. Now the only thing we need to do is write xmlDoc.loadXML
07:26and the loadXML function takes a string, which contains the XML code that
07:32we want to have loaded.
07:33I am going to just write that: myroot, closing myroot, and we are going to put
07:46in the paragraph tag and the close paragraph tag and 'This is some text'.
07:55That's all there is to it. So we have commented out the previous examples, we have got
07:59parse XML document ready to be called. So let's go ahead and view this in
08:03the browser to see what happens.
08:09And there you have it. So the XML is being loaded and we are displaying
08:13the content in the alert. Okay, so now you have seen how to work with XML in
08:17Internet Explorer and in Firefox. So let's move on to our next lesson.
Serializing XML to a string
00:00Now as I demonstrated in the previous examples both Firefox and Internet
00:04Explorer have a notion of serializing XML to a string and serializing is
00:11basically the process of converting an XML document to a format that can be
00:15saved like a string.
00:17Now serialization can take other forms as well, but string is the most common,
00:21especially in most of the real world scenarios you will probably run into. And
00:26you would like to do this for several reasons. First, you can use it for saving
00:30XML content either to a file or some other persistent method, whether it's a
00:35stream or something like that.
00:36You can also use this to interchange XML content with another system, something
00:41that works with text processing like say regular expressions which we
00:45investigated in the Practical and Effective JavaScript title, also available
00:50here at Lynda.com.
00:52And you also might want to do this to aid in debugging. So if you are working
00:56with XML and you've got a whole bunch of logic, there will probably come times
01:00when you want to display some debugging messages that contain XML content and
01:06in order to do that you need to serialize the XML to a string before you can
01:09write it out to a console or display it in an alert or whatever debugging method you use.
01:14And as I showed, Firefox and Internet Explorer both have ways of serializing
01:21XML to strings. Firefox provides support for an object called the XMLSerializer and
01:28you just call the serializeToString method on that object. Whereas Internet
01:32Explorer makes things really easy, it just simply provides an XML property on
01:36each XML node in the document, which provides a text representation of the
01:42document tree starting at that node.
01:44And just to review, to perform Firefox serialization to a string, you can see
01:49in the top example we've got a document fragment where we create a paragraph
01:54element with some text inside of it and to serialize it out, we instantiate a
01:59new XMLSerializer object and call the serializeToString function with the
02:04document as the argument.
02:05And as I mentioned earlier that argument to serializeToString, this can be any node.
02:10It doesn't have to be the document. You can serialize starting at any
02:13node in the tree. So we could have passed in the paragraph or the text or
02:16anything like that.
02:17And in the Internet Explorer case we use the XML property on the XML document,
02:23and again you can use this on any node. It doesn't have to be the document, but
02:28this is how you get the string representation of XML in Internet Explorer.
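To put the two approaches side by side, here is a small sketch of a helper that picks whichever one the node supports. The name toXMLString and the optional SerializerCtor parameter are hypothetical and not part of the course files; the helper only combines the xml property and the XMLSerializer object described above.

```javascript
// Sketch: serialize a node with whichever mechanism is available.
// Assumption: `SerializerCtor` is optional and defaults to XMLSerializer.
function toXMLString(node, SerializerCtor) {
  // Internet Explorer: every XML node carries an xml property.
  if (typeof node.xml === "string") {
    return node.xml;
  }
  // Firefox: use an XMLSerializer object instead.
  SerializerCtor = SerializerCtor || XMLSerializer;
  return new SerializerCtor().serializeToString(node);
}

// Usage in either browser:
//   alert(toXMLString(xmlDoc));
```

Checking for the property on the node, rather than detecting the browser by name, keeps the helper working for any node regardless of how its document was created.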
02:32Okay, so now that we have covered using XML in Firefox and IE and we have seen
02:37how to serialize the data, it's time to move on to our next lesson.
Understanding cross-browser actions with the Sarissa library
00:00At this point we could take what we have learned using Firefox and Internet
00:04Explorer and the ways that they handle XML differently and we could sit down
00:10and write our own cross browser library for handling XML code.
00:15Now thankfully we don't need to do that, because someone else already has and
00:20that library is known as the Sarissa Library and that's what we are going to
00:25cover in this section.
00:26So the Sarissa Library is a free, open source library that's used for handling XML
00:32data and it was written by a guy named Emmanouil Batsis and it's available for
00:36download from the sarissa.sourceforge.net website.
00:41The Sarissa libraries essentially provide a cross browser way of working with XML.
00:47Processing it, loading it, serializing it, transforming using XSLT,
00:52a whole bunch of cross browser support for common XML functions, and the API that
00:58it uses basically takes the best of both browsers, Firefox and IE, and creates
01:05one unified API for working with the XML content.
01:10Now to use the Sarissa Library in your projects essentially all you need to do
01:16is include the sarissa.js file in the document where you want to use the
01:21functions and we'll take a look at few examples of that in a few moments.
01:25Creating a document using the Sarissa Library is accomplished by calling
01:30the getDomDocument() method with the namespace and root tag that you want to use
01:36and this code here works cross browser. So unlike Firefox or IE, you don't have
01:42to do any browser detection or write any special code; all you need to do is
01:46call the getDomDocument method.
01:48So looking at the previous examples that we have used in both the Firefox and
01:53IE sections, what we have done here is changed the way we instantiate the XML
01:57document to just call the Sarissa Library's getDomDocument method and we are
02:02passing in two empty strings here, and then it's just a matter of using the
02:06standard DOM functions to create the document content.
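A sketch of the Sarissa version described above, assuming sarissa.js has already been included in the page so that a global Sarissa object exists. The function name createXMLDocumentSarissa and the optional sarissa parameter are hypothetical, and the root tag myroot is assumed to match the earlier examples.

```javascript
// Sketch: create the document through Sarissa, then use the standard DOM.
// Assumption: `sarissa` is optional and defaults to the global Sarissa object
// provided by sarissa.js.
function createXMLDocumentSarissa(sarissa) {
  sarissa = sarissa || Sarissa;
  // Two empty strings: no namespace and no root tag, as in the narration.
  var xmlDoc = sarissa.getDomDocument("", "");
  xmlDoc.appendChild(xmlDoc.createElement("myroot"));
  var oPara = xmlDoc.createElement("p");
  oPara.appendChild(xmlDoc.createTextNode("This is some text"));
  xmlDoc.documentElement.appendChild(oPara);
  return xmlDoc;
}

// Usage in a page that includes sarissa.js:
//   var xmlDoc = createXMLDocumentSarissa();
```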
02:10So what Sarissa does is it creates an XML document that's native to the browser
02:15that you are in. But the way that it creates the document is cross browser. And
02:20the reason why this works well is because recall that the DOM API for
02:24manipulating XML content once you have a document is the same in both browsers.
02:28So once we have the document instantiated using this line, the rest of the code
02:33is cross browser just by its very nature.
02:36Sarissa also provides support for loading an XML document. In fact the
02:42asynchronous version for loading the document is the same as it is in
02:47Internet Explorer. So if we look at the example here, you can see that the
02:52example for loading an XML document is slightly different because of the way
02:57you get the document. So here we are calling the getDomDocument function on the
03:01Sarissa Library and from that point forward it's pretty much the same in IE and
03:06Firefox. You set the async property to false and you just load the XML you want to load.
03:10For the asynchronous case, the Sarissa Library emulates the Internet Explorer
03:15method inside of Firefox for you. So here we have our loadXMLDocumentAsync
03:22function and here we are instantiating the document by using the Sarissa
03:26Library's version and we set the async property to true. Now Sarissa implements
03:31the onreadystatechange and readystate properties in Firefox, so you can write
03:35the same code in IE and Firefox and it will just work cross browser.
03:40The Sarissa Library also implements the Serializer and Parser objects that we
03:46saw in the earlier Firefox example, only these versions are cross browser and
03:52you can call the code the same way that you would in Firefox. So if we look at
03:56our parse.xml document example from earlier, you can see that the example
04:00pretty much looks exactly the way it would in Firefox only now once you have
04:04included the Sarissa Library, this code also works in Internet Explorer.
04:08Okay, we have enough to see if we can go back, rewrite our earlier Firefox and
04:15Internet Explorer examples using the Sarissa Library, and test them in a cross
04:19browser environment.
Creating Sarissa examples
00:00Okay, so here we are in the code. In this example, we're going to use the
00:05Sarissa library to go back and rewrite our earlier Firefox and IE examples
00:10using this cross-browser library for handling XML.
00:15You can see here in the code that this is pretty much the same document that
00:19we started out with both in the Firefox and IE examples with one major difference.
00:26I have included the Sarissa library here using this script tag. Again, I've
00:33provided the URL for downloading Sarissa in the slides part of the section.
00:38So you can go ahead and do that on your computer and just get the sarissa.js
00:43library and you'll be good to go. Let's go ahead and start rewriting our
00:47example to use the Sarissa library and see if it works cross-browser, like it
00:52says it does.
00:53For creating a document, what we need to do is have a variable named xmlDoc.
00:58Now remember that what we're going to do here is use the Sarissa version of
01:02getting a DOM document. So I'm going to type Sarissa.getDomDocument and
01:11I'm going to pass in two empty strings.
01:15Okay, so now that I have the document, I'm going to go ahead and start creating
01:19the document content using the DOM. So I'm going to write xmlDoc.appendChild.
01:30We're going to create a root element named myroot. We're going to create the
01:42paragraph tag, like we did in the last examples. We'll create the TextNode.
02:04Okay, so far so good. We'll put the text inside the paragraph and we'll put the
02:17paragraph inside the document.
02:29Now we're going to do the serialization trick,
02:33so that we can display an alert containing the XML's document content.
02:38So we're going to say alert (new XMLSerializer().serializeToString())
02:52and that is going to take the XML document as an argument. Okay, it looks like we're
03:00ready to test this out. We are creating the document, creating some content and
03:06then calling this XMLSerializer().serializeToString.
03:09Now this looks pretty much like the Firefox example we did earlier, except for
03:14this code right here which instantiates the document using Sarissa.
03:17So we'll try it in Firefox first to see what happens.
03:24Okay, you can see that it worked.
03:26Here we have the XML document and it's being displayed in this alert.
03:31All right, so now the real test is does this work in IE? So let's see if that works.
03:37Okay, and it does. So now we've created a document using a cross-browser
03:42syntax that works in both IE and Firefox. All right, let's continue onto the next example.
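The document-creation example narrated above can be sketched roughly as follows. This is a reconstruction from the narration, not the course's exercise file: it assumes sarissa.js has been included in the page, the function name createDocument is hypothetical, and the paragraph tag name "p" is a guess because the transcript only calls it "the paragraph tag".

```javascript
// Sketch of the "create a document" example; needs a browser with sarissa.js loaded.
function createDocument() {
  // Sarissa hands back a browser-native XML document. Both arguments
  // (namespace and root tag) are empty strings, as in the lesson.
  var xmlDoc = Sarissa.getDomDocument("", "");

  // Everything below is plain DOM code, identical in both browsers.
  var oRoot = xmlDoc.createElement("myroot");
  var oPara = xmlDoc.createElement("p"); // tag name "p" is an assumption
  var oText = xmlDoc.createTextNode("This is some text");
  oPara.appendChild(oText);  // text goes inside the paragraph
  oRoot.appendChild(oPara);  // paragraph goes inside the root
  xmlDoc.appendChild(oRoot); // root goes inside the document
  return xmlDoc;
}

// In the page, the "serialization trick" would then be:
// alert(new XMLSerializer().serializeToString(createDocument()));
```

Only the first line of the function is Sarissa-specific, which is the point the lesson makes: once the document exists, the rest is standard DOM.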
03:47So in the next example we're going to use Sarissa to load an XML document. So
03:52let me comment out the previous example right there. Okay, so for loading a
03:57document, remember, this is the easy part. So we need to create the document.
04:01So I'm going to go ahead and copy the line from up here that does that.
04:06We're going to use the same serialization trick to show the content. So I'll copy
04:11that line and paste that in down here.
04:13Now to load the document, we need to set the async property to false to make
04:19sure that we load it synchronously. Then it should be a simple matter of saying
04:24xmlDoc.load. We're going to load the same businesscard.xml file that we loaded
04:32in the previous examples. Okay, so let's make sure that we've got that function
04:37being called and it is.
04:38So let's first try it in Firefox. Okay, there it is and you can see it's working.
04:50Now let's try the same thing in IE. Okay, so far so good. We're two for two.
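A minimal sketch of the synchronous load just described, again assuming sarissa.js is present in the page. The function name loadXMLDocument is hypothetical; the async property and load() call are the ones named in the narration.

```javascript
// Sketch of the synchronous load; needs a browser with sarissa.js loaded.
function loadXMLDocument(url) {
  var xmlDoc = Sarissa.getDomDocument("", "");
  xmlDoc.async = false; // wait for the file before continuing
  xmlDoc.load(url);     // e.g. "businesscard.xml" from the lesson
  return xmlDoc;
}

// In the page:
// alert(new XMLSerializer().serializeToString(loadXMLDocument("businesscard.xml")));
```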
04:57Let's move on to the asynchronous example. So I'll comment out
05:03the previous one here.
05:05The asynchronous example follows IE's model. So we're going to go ahead and
05:10copy the line where we instantiate the document. In this case, we're going to
05:17be doing this asynchronously. So let's say async = true. Now what we need to do
05:25is define the onreadystatechange(). So I'll write xmlDoc.onreadystatechange = function().
05:40In the onreadystatechange, we need to check to see if the xmlDoc's readyState
05:48property is equal to 4. If it is, we're going to do our little serialization
05:55trick to show that it loaded. So we'll copy that guy there, paste it in down
05:59here. Then all we need to do after the readystatechange is load the document.
06:04So I'll say xmlDoc.load and it's the same businesscard.xml file we've been
06:14using in our example so far. Okay, so that should be pretty much all I need to
06:19do in order to test the asynchronous loading. So let's bring this up in Firefox
06:25first. Make sure I've got it running and I do, okay good. Let's bring up
06:28Firefox. All right, there it is. So far so good. One more time in IE, okay.
06:42This is looking pretty good.
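The asynchronous version follows the same shape. The sketch below is reconstructed from the narration and assumes sarissa.js is loaded; the function name loadXMLDocumentAsync matches the slide, while the onLoaded callback parameter is an addition for illustration (the lesson simply calls alert inside the handler).

```javascript
// Sketch of the asynchronous load; needs a browser with sarissa.js loaded.
function loadXMLDocumentAsync(url, onLoaded) {
  var xmlDoc = Sarissa.getDomDocument("", "");
  xmlDoc.async = true;
  // Assign the handler before calling load(), as in the lesson.
  xmlDoc.onreadystatechange = function () {
    if (xmlDoc.readyState == 4) { // 4 means loading has completed
      onLoaded(xmlDoc);
    }
  };
  xmlDoc.load(url);
  return xmlDoc;
}

// In the page:
// loadXMLDocumentAsync("businesscard.xml", function (doc) {
//   alert(new XMLSerializer().serializeToString(doc));
// });
```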
06:44So far we've made it through three of the examples and the cross-browser promise
06:48is holding up. So the final example is parsing the XML document from a string.
06:55Now the Sarissa library implements a cross-browser version of the DOMParser object.
07:01So that's what we're going to use. We'll write var oParser = new
07:10DOMParser(); We'll write var xmlDoc = oParser.parseFromString();
07:27Remember that when we call parseFromString(), it takes two arguments.
07:30We need to pass in the string that we want parsed as well as the MIME type. So,
07:35we'll pass in "application/xml" as the MIME type. We're going to type in our
07:45<myroot>, paragraph tags, and This is some text.
07:58Okay, so now we've rewritten the example to look pretty much like it did in the
08:04Firefox example using the DOMParser object. The only thing we would love to do
08:08here is the serialization alert trick. So I'll copy that and paste it in here.
08:15All right, let's hold our breath. Try it out in Firefox.
08:25All right, so it worked in Firefox. We were able to parse from a string.
08:28That's good news. Finally, let's try it in Internet Explorer and there it is. So using
08:37the Sarissa library you can see how we were able to implement cross-browser
08:41examples of handling XML code.
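The parse-from-string example can be sketched like this. It relies on the DOMParser object, which the lesson says Sarissa supplies in Internet Explorer; the function name parseXMLString and the "p" tag in the usage comment are assumptions.

```javascript
// Sketch of parsing XML from a string; needs a browser DOMParser
// (native in Firefox, supplied by sarissa.js in IE per the lesson).
function parseXMLString(xmlText) {
  var oParser = new DOMParser();
  // First argument: the markup to parse. Second argument: the MIME type.
  return oParser.parseFromString(xmlText, "application/xml");
}

// In the page:
// var xmlDoc = parseXMLString("<myroot><p>This is some text</p></myroot>");
// alert(new XMLSerializer().serializeToString(xmlDoc));
```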
08:44So I strongly encourage you to go download the Sarissa library. It's free.
08:48You can download it from SourceForge.net and I've provided the URL. It provides
08:52pretty robust handling of XML content across browsers.
Understanding the ECMAScript standard (E4X)
00:00Okay, so the last technology I'm going to cover in this section is called
00:05ECMAScript for XML. It's typically known by its abbreviated name E4X.
00:10ECMAScript for XML or E4X is an international standard. It's defined by the
00:18ECMA-357 specification that was adopted back in December 2005.
00:23The whole purpose behind ECMAScript for XML is to add support for XML as a
00:31first class datatype in ECMAScript, which forms the basis for languages like
00:36JavaScript and ActionScript, if you've ever used Flash or Flex.
00:40The real power of ECMAScript is that it lets me treat XML as a built-in
00:45datatype. So you see the example I've got here where I've got a line of code
00:49declaring a variable named j and setting it to the numerical value 3 or a
00:54variable like myStr and setting it to a string variable.
00:57Using E4X, I can do something like this. I can declare a variable named myXML
01:03and just set it to XML code right in the script without having to do any DOM
01:08manipulation or any other kind of fancy tricks. This is a very powerful
01:12feature. It's one of the great things about ECMAScript as a scripting language
01:16is that it provides this kind of support for working with XML.
01:19Its whole purpose, as I said, is to allow you to work with XML as a native
01:23datatype. So where can you find an implementation of E4X? Well, the known
01:30implementations as of this recording, Firefox 1.5 and later has native support
01:36built in for E4X. IE does not support this technology yet.
01:41Adobe ActionScript version 3.0 and later, which is present in Flash Creative
01:47Suite 3.0 and Adobe AIR and Adobe Flex as well as Acrobat version 8.0 and
01:52later, both in the Reader and in the full Acrobat versions and version 1.6 of
01:59the "Rhino" version of the JavaScript Interpreter engine as well as Aptana's
02:05"Jaxer" AJAX Application Server which uses the Mozilla code on the back end on the server.
02:11These are so far some of the better-known implementations. So if you're using
02:15Firefox or if you're writing code in ActionScript version 3.0 or later, you can
02:21use E4X in your code.
02:23So there are two main ways of creating XML using E4X. The first, which we've
02:29already seen, is to just assign XML code directly to a variable in your script.
02:34The second way is to use the XML object as a constructor using the new XML
02:41operator. This is what both examples look like.
02:44So you've seen the first one already in the previous slide where I have got a
02:48variable and I'm assigning just an XML code straight to it. The second example
02:52down here is using the XML constructor. Both of these are functionally
02:56equivalent, you can use either one of these. The new XML syntax is obviously a
03:01bit more object-oriented. So if that's your preference, you can go that way,
03:05but either one of these is perfectly fine and valid.
03:08What's nice about E4X is the way that it interacts with JavaScript's native types.
03:13So you can evaluate JavaScript expressions using the brace syntax and
03:20I've got an example of that here. Suppose I had a variable name which contained
03:25a text string and I wanted to have that inserted into XML based on an
03:30expression evaluation.
03:32I can do that using the brace syntax as you've seen here. So if I write
03:36something like var myXML, and using E4X I just type out some XML code. I put
03:41the name of this JavaScript variable inside these two braces. Then the result
03:46will be as if that variable name gets substituted using the contents of that
03:50variable in the XML. This is a really powerful way of working with XML code and
03:55mixing it with JavaScript logic.
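The slides described above are not reproduced in the transcript, so here is a rough sketch of what such declarations look like. It is E4X code, which means it only runs in an engine with E4X support (the course uses Firefox 1.5 or later) and will not run in Node.js or in browsers without E4X. The element names and string values are placeholders, not the course's data.

```javascript
// E4X sketch: requires an E4X-capable engine such as the Firefox versions in the lesson.
var j = 3;                    // a number
var myStr = "some text";      // a string (placeholder value)

// 1. Assign XML directly to a variable.
var myXML = <greeting>Hello</greeting>;

// 2. Use the XML constructor; functionally equivalent to the literal above.
var sameXML = new XML("<greeting>Hello</greeting>");

// Brace syntax: the expression inside { } is evaluated and its value
// is substituted into the XML.
var name = "Joe Marini";
var card = <BusinessCard><name>{name}</name></BusinessCard>;
```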
03:56E4X provides several ways of accessing content that's in XML. So it's very
04:03straightforward and the good news is you use common JavaScript syntax in order
04:08to do it. So for example, the bracket and dot operators work the same way for
04:14E4X content, as they do for objects. The at operator, which is a syntax that's
04:20borrowed from XPath, is used to access attributes on a tag. We'll be seeing
04:26live examples of this in a few moments.
04:28You can also use arbitrary selectors. For example, if you have an element that
04:33has attributes on it and you've got multiple of these elements with attributes,
04:38you can do some basic filtering by checking to see if an attribute is equal to
04:42a certain value to filter out the selection of certain nodes in your E4X XML
04:48content which is really cool and we'll see how that works.
04:51You can also use some built-in functions, like length() to see how many child
04:55tags a parent tag has. Creating XML content in E4X is also really easy and
05:00straightforward. In fact, one of the main sources of the power of E4X comes
05:05from the way it starts to draw the line between what's a JavaScript object and
05:10what is XML content. So for example, you can use JavaScript operators like the
05:14brackets I've already shown, you can also use the += syntax, etcetera, to
05:20create new content in the XML.
05:23You can also modify the XML data directly in place by placing an E4X expression
05:29on the left-hand side of an assignment operator. So imagine I had a bunch of
05:33XML content in E4X format. If I wrote myXML.title (imagine there's a title element)
05:39and set it to a string, I would actually modify the contents of that XML
05:43element directly in place by using common JavaScript notation.
05:48Deleting XML content is also really straightforward. You just use the delete
05:51keyword. For example, if I had XML content and I want to delete the title
05:56element, I would just simply type delete myXML.title. The title element or all
06:02of them, if there was more than one, would be gone.
06:04I can also delete individual attributes by using the at syntax. So for example,
06:09to delete the name attribute from the title, I would simply write delete
06:12myXML.title.@name. As I mentioned earlier, you can delete multiple instances
06:17of a given element by just referring to the name of the tag.
06:21So a couple of things to note about E4X and the way it inter-operates with the DOM.
06:27It's important to note that E4X content are XML objects, they're not DOM
06:34objects. So E4X content and DOM XML are not the same. The reason for this is
06:39because E4X creates its own object types and they don't directly operate with
06:44the DOM API that's provided in the browser. However, we can be clever about
06:49this in a couple of ways.
06:51You can achieve some measure of interoperability by using the toString()
06:55method on the E4X content. Then you can go ahead and pass that to a DOMParser
07:01object, which will create a DOM representation of the XML code for you, because
07:06remember using the DOMParser we can create XML content directly from strings.
07:12As long as we have a string that we can pass to a parser, we can create a fully
07:15formed DOM document. Remember going the opposite direction, you can serialize a
07:20DOM document to a string using the XMLSerializer class and you can pass that to
07:27the XML constructor to create E4X content.
07:32Okay, so I think we've reached the point now where we've had enough theory,
07:36let's go ahead and look at E4X in action!
Using E4X
00:00Okay, time for some examples with E4X. So I'm here in the code and I have got
00:06my E4X example file open and let me just scroll through the document so you can
00:11see what we are going to be doing.
00:13So right up here at the top, I have got a couple of variables defined.
00:18One of them is a string that contains my name and the other is XML code that
00:24I'm assigning to a variable named myXML. And we have got some functions that we are
00:29going to write that demonstrate some examples and that's pretty much it for this file.
00:35So we are going to write these functions in order and we are going to exercise
00:39some of the capabilities of E4X. Now, remember as far as browsers are
00:43concerned, this only works in Firefox. E4X works also in the latest version of
00:50ActionScript but for purposes of our demonstrations, I'm only going to be
00:53showing you these in Firefox because it has native support built-in.
00:57So the first thing I'm going to do is you can see I have created an alert down here
01:02and the alert is going to show the contents of the myXML variable once
01:07the XML has been loaded. And we are using the same BusinessCard data that
01:13we have been using in the external file cases for loading XML in Firefox and IE
01:20and what I have done is I have copied that XML data right here into my
01:23document. So you can go ahead and copy it along with me if you don't have
01:27access to the sample files.
01:29The other thing I want to point out is that I have got a brace syntax here in
01:34the name field, so I have replaced my name with that syntax and you can see
01:38I have assigned that variable up here. So the first thing I'm going to do is save this
01:42and bring it up in Firefox to see if it works and you can see that it does.
01:49So the BusinessCard logic got parsed correctly and you can see that
01:55the name inside the brace syntax got replaced with the expression that evaluated to
02:00the string content that's my name and rest of the XML will be just fine.
02:04Okay, so let's go back to the example. So now that we know that that's working,
02:09I'm going to go ahead and comment that line out. So the first thing that we are going
02:12to do is write a few exercises to see how we can access E4X content. So,
02:20the first thing we are going to do is write a variable named name and
02:25we're going to do something very simple. We're going to say myXML.name and we'll alert that.
02:32So, using the dot syntax in E4X you can refer to any one of the elements in the
02:39XML data. You don't have to be very fancy about how you access it.
02:43Here I'm just referring to the variable and referring to the name of the element that I want.
02:47Let's see how that works. I'm going to go ahead and bring up the browser and
02:55you can see that that's bringing up the text content of the tag that contains
03:00my name. So it's not giving me the entire tag including the angle brackets and
03:04the tag name; it's just the text content which is pretty useful. And in fact,
03:07in most real-world scenarios that's what you want to have happen.
03:10Let's get a little more sophisticated. I'm going to declare another variable
03:15named phones and I'm going to assign that myXML.phone. And what I'm going to do
03:23here is alert phones.length, plus a string, plus the phones variable and
03:39I'm going to say plus phone tags found and we are going to alert the contents of
03:47the phones and we'll comment out these two guys, okay.
03:52So what I'm doing now is accessing the phone. Notice that there's more than one phone.
03:58So let's see what this does. I'm going to bring up the browser.
04:06See, you can see in this case that by referring to a tag that has more than one instance
04:10in the XML data it actually came back with an array of those tags and you can
04:15also see that the length function returned the number of tags that were in that
04:20array. So if you get a little bit more powerful here, this is really great stuff.
04:24Using this kind of information, I would be able to do things like write loops
04:28and that kind of stuff so let's move on and get a little bit more
04:31sophisticated. So now we have seen how to access an entire array of XML data.
04:37Let's write some code that accesses a specific element in the array.
04:43So I'm going to say var phoneType = myXML.phone. Only this time I'm going to use
04:54the bracket syntax to get a specific phone. So I'm going to use index zero and
05:00I'm going to write .@type.
05:03So let's take a look at the XML code to see what's going on here. You can see
05:07that phone 0 is going to refer to the first index in the array of phone tags.
05:13The @type syntax refers to the type attribute that's on that phone.
05:19So if all goes well, this should return the string mobile. So let's test that out.
05:26alert (phoneType) and let's see if that works and let's open the browser.
05:39Okay, and you can see that worked.
05:40So that's an example of extracting an attribute from the XML code. Let's keep
05:47on going. What I'm going to do now is do some basic filtering. So suppose
05:53I wanted to find the phone that corresponds to a particular value of the type
06:02attribute. Suppose I didn't know which index I wanted to get the, say, fax
06:07phone and I had to find it somehow. So what I'm going to do is write var phone
06:14= myXML.phone. And then in parentheses I'm going to write @type == "fax".
06:28Now, this is where we start getting really powerful with E4X. You can see that
06:31we are doing some pretty interesting filtering operations here using a very simple
06:35JavaScript-like syntax. So now I'm going to alert the phone.
06:44So what this should do is come back with the phone that matches the phone tag that has the
06:50type of fax on it. So let's go ahead and do that.
06:56And sure enough it works.
06:57So, okay, let's move on to some more stuff. E4X introduces a new looping
07:05construct into the ECMAScript or in this case JavaScript syntax and perhaps
07:12you have used for loops. You might have even used for in loops. ECMAScript for XML
07:19introduces the concept of the for each loop. So I'm going to write for each var
07:27tag in myXML, alert tag. So the 'for in' construct in JavaScript loops over all
07:40the properties of a given object, but in this case what I'm looping over is
07:45I'm looping over for each tag that's in the XML code, and let's see what happens
07:50when I run this.
07:56So there's the name, right. That's the content of the name tag. All right,
08:00there's the first phone, there's the second phone, there's the third phone and
08:06there's the email address. That's a pretty powerful construct to be able to
08:10access XML in E4X syntax.
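Collected in one place, the access examples narrated above look roughly like the sketch below. It is E4X, so it needs an E4X-capable engine such as the Firefox versions used in the lesson. It assumes myXML already holds the BusinessCard data (one name element, three phone elements with type attributes, one email element); the variable names follow the narration.

```javascript
// E4X sketch: requires an E4X-capable engine; myXML is the BusinessCard data.
var name = myXML.name;       // dot syntax: the <name> element
alert(name);

var phones = myXML.phone;    // every <phone> element, as a list
alert(phones.length() + " phone tags found");

var phoneType = myXML.phone[0].@type;     // attribute of the first phone
alert(phoneType);

var phone = myXML.phone.(@type == "fax"); // filter on an attribute value
alert(phone);

// The lesson loops over myXML itself; myXML.* (all child elements)
// is used here as the more explicit form.
for each (var tag in myXML.*) {
  alert(tag);
}
```

Note that length is called as a method, length(), rather than read as a property.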
08:14So enough with accessing examples. Let's move on to creating XML using E4X.
08:22So what we are going to do now is switch gears a little bit and we are going to
08:25write some example code that creates new XML content on the fly using this XML
08:32as a starting point.
08:36So in the create E4X function what I'm going to do here is I'm going to write
08:40myXML.BusinessCard, because that's the root tag in the XML that we have above.
08:48I'm going to write += and I'm going to write phone type = home
08:58and I'll write some sample code in here and make sure it works okay and
09:05we'll write 415-555-1111. Okay, and now we're going to alert myXML to see what happened.
09:17Okay, so if all goes well, this should be adding a new phone tag to the end of
09:25the BusinessCard content, at the inside of the closing tag right here. So after
09:30this email that's what we are going to do. We're going to put some new content
09:34in there. Let's see what happens when we run this in Firefox.
09:42And you can see that here is the XML that we had originally defined and there's the new phone
09:47that just got added and that's my new fictitious home phone.
09:52So let's move on to another example. This time we are going to use the new XML syntax.
09:58So let me comment these guys out and this time I'm going to write
10:03myXML.BusinessCard += and now I'm going to write new XML. And in this case,
10:15I'm going to write phone type = pager and I'll put some fictitious pager in there.
10:28So like 415-555-6789, and we'll close off that phone tag.
10:39So now I'm using the new XML. This is more the object-oriented syntax and we'll just
10:46copy and paste this alert in there. Let's see what happens in this case.
10:52And lo and behold you can see that the new XML syntax works just as well.
11:01Moving on, let's try one more example of creating new XML content. What we are
11:07going to do is we are going to modify the existing content of this BusinessCard
11:13construct. What we are going to do is we are going to modify the contents of
11:17this phone tag right here. We are going to change the number. So to do that
11:22I write myXML.phone and since it is a zero-based index I'm going to write 1 and
11:32I'm going to write = <phone type="work"> and I'm going to write 800-555-0000.
11:45Close off the tag, okay. So this should replace what's currently in the second
11:53phone tag (index 1) and you can see the current number is 555-9876. So when this code
12:01executes the 9876 should be replaced with these four zeros. All right,
12:05let's run that to make sure.
12:12And sure enough you can see that's exactly what happens.
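The three creation and modification steps above, written out as typed in the narration. This is an E4X sketch that needs an E4X-capable engine, and it assumes myXML holds the BusinessCard data; in the lesson each step is run on its own, with the previous one commented out.

```javascript
// E4X sketch: requires an E4X-capable engine; myXML is the BusinessCard data.

// Add a phone using an XML literal. The lesson reports that the new
// element ends up after <email>, inside the BusinessCard tag.
myXML.BusinessCard += <phone type="home">415-555-1111</phone>;

// The same kind of addition using the XML constructor.
myXML.BusinessCard += new XML('<phone type="pager">415-555-6789</phone>');

// Replace the phone at index 1 in place.
myXML.phone[1] = <phone type="work">800-555-0000</phone>;

alert(myXML);
```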
12:15So using E4X it's really easy to manipulate and change the content of XML using
12:23standard scripting constructs. Let's move on to the next example.
12:29I'm going to comment these guys out.
12:31So now we are going to look at the ways to delete content from E4X using the
12:36delete keyword. So the first thing we are going to do is type delete myXML.name
12:44and this should remove the name tag from the XML construct. So when we do the alert,
12:54this tag should not be there. So let's go ahead and browse.
13:03And you can see that it is gone. So far, so good. Let's move on to the next example.
13:08We're going to write delete. In this case we are going to delete a whole series of tags
13:13and we are going to write delete myXML.phone.
13:19Now because phone refers to a tag that has multiple instances, it's going to
13:26delete all of them. Let's go ahead and look.
13:33Oops! Uh, there we go. I left both alerts in there.
13:36Okay, so you can see that now the only thing left is the email tag, all right.
13:41So we'll go ahead and click OK. For the last example we're just going to delete one
13:45single attribute. So what I'm going to do is comment these guys out. So what
13:50I'm going to do now is delete a single attribute and I'm going to delete the
13:53type attribute from the very first phone tag. Type delete myXML.phone.@type and
14:02that should just simply delete this attribute right here. It's going to get rid
14:08of the type = mobile. Okay, so let's go back down to the code. All right, save
14:14and we are going to bring this up in Firefox.
14:18And you can see that the type attribute is now gone, okay.
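The delete examples can be sketched as below (E4X, so an E4X-capable engine is required, and myXML is assumed to hold the BusinessCard data). The statements are ordered so they can run one after another, which differs from the order in the lesson. The attribute example indexes the first phone explicitly to match the stated intent; that index is an assumption about what was typed.

```javascript
// E4X sketch: requires an E4X-capable engine; myXML is the BusinessCard data.

// Remove the type attribute from the first <phone> only.
delete myXML.phone[0].@type;

// Remove the <name> element.
delete myXML.name;

// Remove every <phone> element, since the name refers to all of them.
delete myXML.phone;

alert(myXML);
```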
14:22Let's move on to our last E4X example. What I'm going to do here is show you
14:26how you can work with strings and XML and the DOM to interchange data between
14:34E4X and DOM construct. So a couple of things we need to do here. What we are
14:39going to do is first create a DOM parser. So we'll type var oParser =
14:46new DOMParser, and if you haven't seen the examples earlier on in this section,
14:53you might want to go back and take a look at the lessons where I cover what
14:57the DOMParser does because we'll be using it here.
15:00So now I'm going to write var xmlDoc = oParser.parseFromString and
15:12parseFromString takes two arguments and the second one is the mime type,
15:18so I'm going to type application/xml. And in the case of the first argument what
15:24I'm going to do is get a little bit clever and I'm going to write myXML.toString.
15:29So I'm going to convert the XML code in the E4X construct into a string.
15:33I'm going to serialize it into a string. I'm going to pass it off to the DOMParser.
15:37This will give me back an honest to goodness DOM document.
15:40So now I'm going to do a little bit of DOM manipulation on the content that we
15:44just created. So I'm going to write var oNode = xmlDoc.createElement and
15:54I'm going to create an element named createdInDom. To make it easier to
16:02read there, okay.
16:04So, I have got an element now called createdInDom and now I need to get the
16:09BusinessCard root tag and put this at the inside of the business card.
16:15So I'm going to write var oBC = xmlDoc.getElementsByTagName and I'm looking for
16:28the BusinessCard element and there's only one of those. So I'm going to get the
16:33zeroth element that comes back from that array and I'm going to write
16:38oBC.appendChild. So using the DOM, I'm going to put in the node that we just created.
16:48And now I'm going to write alert
16:54new XMLSerializer().serializeToString and we are going to serialize out
17:03the XML document. Once we do that, we are going to write myXML = new XML();
17:15So we are going to convert the DOM back into E4X content now and basically
17:20we're going to take the same call that we just did here. We're going to serialize
17:24the DOM out to a string, pass that back to E4X, and then we'll just alert myXML.
17:32Okay, so let's go over quickly what we were doing here before I run the
17:36example. So I have created a DOM parser and I have converted the E4X XML into a string
17:42and now I'm going to create an XML DOM document out of that. When I have
17:45the DOM document created, I'm going to create a new element called createdInDOM
17:49and I'm going to put that at the end of the business card root tag and then
17:54I'm going to alert to make sure it worked. And then I'm going to take the DOM
17:57and convert it back into E4X content using the new XML construct. All right,
18:04drum roll please. Let's see if it works.
18:10Bring up Firefox.
18:13Here you can see here is the business card and there's my
18:16createdinDOM element. So it looks like that worked just fine. So we now have a
18:24DOM document that we created from our E4X content and I was able to manipulate it
18:28using the DOM. Now when I click OK, what should happen is the DOM should get
18:33serialized back out to a string and converted back into E4X content and it did.
18:39There it is. You can see that it is the E4X content. There's the createdInDOM
18:44element that we created.
18:45So this is a way of interchanging content between the E4X constructs and the DOM.
18:52So you can use E4X for what it's good for, you can use the DOM for what
18:57it's good for, and if you need to, you can transmit information back and forth using
19:01serialization. Now you are ready to go out and use this in your own projects.
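The round trip walked through above can be sketched as a single helper. The function name addElementViaDom and its tagName parameter are hypothetical; the DOM calls are the ones named in the narration. The helper itself uses no E4X-only syntax, so it accepts anything whose toString() returns XML markup, and the final conversion back to E4X is shown as a comment because it needs an E4X-capable engine.

```javascript
// Sketch of the E4X -> string -> DOM -> string hand-off; needs a browser
// that provides DOMParser and XMLSerializer.
function addElementViaDom(myXML, tagName) {
  // 1. Serialize the E4X content and parse it into a DOM document.
  var oParser = new DOMParser();
  var xmlDoc = oParser.parseFromString(myXML.toString(), "application/xml");

  // 2. Manipulate it with ordinary DOM calls.
  var oNode = xmlDoc.createElement(tagName);
  var oBC = xmlDoc.getElementsByTagName("BusinessCard")[0];
  oBC.appendChild(oNode);

  // 3. Serialize the DOM back out to a string.
  return new XMLSerializer().serializeToString(xmlDoc);
}

// In an E4X-capable engine, convert the result back into E4X content:
// myXML = new XML(addElementViaDom(myXML, "createdInDOM"));
// alert(myXML);
```

Strings are the only thing the two worlds share, which is why both directions go through serialization.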
4. Designing and Implementing an XML Format
Understanding XML formats
00:00We have now reached the point in the course where we have seen enough and
00:03learned enough to design our own XML format. Before we do that, I'm going to
00:09show you an example of where we are going to be using our XML format and why we
00:14are going to be designing it.
00:15So let me switch over to the browser here for a moment. Here I'm in a browser
00:19and we are looking at a fictitious company called Teacloud. Teacloud is a
00:25website and company that people go to to learn about teas and brewing teas.
00:30It's basically all about tea.
00:33In addition to being a destination site to learn about tea, Teacloud has a
00:38product catalog and they sell products online. So we are going to switch over
00:41to the Our Products section. You can see here that there are two different
00:45categories of products that they have. They have Kettles & Teapots and they
00:48have Teacloud Teas.
00:50So here in the Kettles & Teapots section, you can see that there's a table of
00:54products here and each one has a picture along with the name and a price and
00:59there's a description to go along with each.
01:01So let's switch over to the Teas for a moment. Now in the case of the Teas,
01:07there's no picture but there's a name and there's a description and the price
01:12is for a given unit of weight. In this case, it's per pound.
01:16So this is the site that we are going to be working on. Our job is going to be
01:20to implement this product catalog using XML. Once we have designed the XML
01:25format, we are going to look at the code that's used to read the XML data in
01:30and present it here in the webpage both in Internet Explorer and in Firefox.
01:36Okay, so let's go ahead and get started.
Avoiding common design mistakes
00:00Okay, so before we get started designing our own XML file format, let's take a
00:04look at some common XML design mistakes, so we can make sure that we don't
00:09repeat them.
00:11Mistake number 1 is using the word XML as the document root. I have seen this
00:17from time to time. In fact, it's not just using XML. It's using any word that
00:22begins with XML. Now this is not specifically an error, but you shouldn't do this.
00:27The reason is that the term XML, along with any name that begins with the
00:33letters xml, is reserved for use by the W3C and the XML specifications.
00:39Besides, root tags in documents should be descriptive and they should reflect
00:44the type of document that they are representing. If you name your root tag XML,
00:49you are not following that principle. Of course it's XML. What else would it
00:52be? In fact, if you were to add the XML declaration above this, you would make
00:57it even more obvious. So don't name your root tags XML, choose a descriptive name.
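As a rough illustration of mistake number 1, here is a minimal Python sketch using the standard library's ElementTree. The two sample documents and the helper name root_name_is_reserved are made up for this example; they are not from the course files. It shows that a parser will happily accept a root name that starts with xml, so it is up to you to avoid it.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: one with a reserved-looking root name, one descriptive.
BAD_DOC = "<xmlData><item>Green tea</item></xmlData>"
GOOD_DOC = "<teaCatalog><item>Green tea</item></teaCatalog>"


def root_name_is_reserved(xml_text):
    """Return True if the root tag name starts with 'xml' in any letter case."""
    root = ET.fromstring(xml_text)
    return root.tag.lower().startswith("xml")


# The parser raises no error for either document; the check is a design rule.
print(root_name_is_reserved(BAD_DOC))
print(root_name_is_reserved(GOOD_DOC))
```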
01:02Common mistake number 2 is including information in an XML file that is not
01:08itself XML. You can see an example of that here. You can see this is a tag that
01:14describes a file and the fileInfo. The name is pretty clear but then there's
01:18this thing called attributes. If you have ever worked on a file system,
01:22you would know that files have got permissions on who can read, who can write and
01:26who can execute them.
01:27The problem with this is that you are forcing the person consuming this XML
01:30file to do further processing that number 1, the XML Parser can't do for them
01:37and number 2, is not in a very obvious format. If I didn't know that files had
01:42attributes like this indicating who had read and write and execute permissions
01:48and that these permissions were grouped into three categories for the world,
01:51group and individual, then I would have no idea what this means.
01:55So don't include data in an XML file that has to be processed further beyond
02:01what the XML parser can already do. You have got this powerful XML parser.
02:05You should make it do all the hard work. Don't force people to do extra work to consume your
02:09document format. This is an example of including a format inside the XML file
02:14that's either proprietary or specific to one system or another. You could
02:20easily rewrite this format using XML and that's what should have been done here.
02:24The idea is to aim for clarity and ease of use in the XML rather than
02:29compacting the syntax down as much as possible in order to save room. Don't be
02:34afraid of being verbose in your XML files. XML files are supposed to be as
02:39self-documenting as they can possibly be. I should be able to consume data in
02:44an XML file without having to worry about which system things came from or
02:48having to use processing techniques beyond what the parser gives me.
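To make mistake number 2 concrete, here is a small sketch of how a packed permissions string could be rewritten as plain XML. The slide itself is not part of the transcript, so the mode string "rwxr-xr--", the tag names owner, group and world, and the function permissions_to_xml are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def permissions_to_xml(name, mode):
    """Expand a 9-character mode string such as 'rwxr-xr--' into explicit XML."""
    if len(mode) != 9:
        raise ValueError("expected a 9-character mode string")
    file_info = ET.Element("fileInfo", {"name": name})
    perms = ET.SubElement(file_info, "permissions")
    # Assumed ordering of the three groups: owner first, then group, then world.
    for index, who in enumerate(("owner", "group", "world")):
        read, write, execute = mode[index * 3 : index * 3 + 3]
        ET.SubElement(perms, who, {
            "read": str(read == "r").lower(),
            "write": str(write == "w").lower(),
            "execute": str(execute == "x").lower(),
        })
    return file_info


element = permissions_to_xml("report.txt", "rwxr-xr--")
print(ET.tostring(element, encoding="unicode"))
```

The result is more verbose, but anyone reading it can get at each permission with the parser alone and no knowledge of the original packed format.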
02:53Mistake number 3 involves being too precise with your tag names. So let's take
02:58a look at this example. On the left hand side, we have a snippet of XML code
03:03that encapsulates a furniture order. You can imagine that this had come from
03:08some furniture store or a factory or something. We have the same thing on the right.
03:13Now on the left you notice that each tag is named individually and they are
03:18very specific. So this one here is a sleepersofa. We have a queenbed,
03:21coffeetable and so on. Whereas on the right, we have got a much looser
03:26coupling between tag names.
03:28So instead of calling this sleepersofa, this over here is just sofa, with an
03:32attribute of isSleeper. Here we have bed and beds can have different sizes. So
03:37we have size = "queen" and so on down the line.
03:40Now on the left, the reason why this is a problem is that this causes an
03:43unnecessarily tight coupling, as we call it, with the underlying processing
03:48code. Imagine a situation where I had some XML processing code that counted up
03:53the number of tables in the furniture order.
03:57Well, on the left hand side, if I add a new table type and I call it something
04:02else, like nightstand or endtable or whatever. If I want to do that, I then
04:06have to go back and change the processing code because I have to account for
04:10the new table name because I have added a new tag.
04:13In the example on the right hand side, that code could be left as it is because
04:17all I would do is add a new table tag. I would simply describe it in the
04:22attribute that I have here for type. So whatever code I had that did things
04:28like counted up tables or retrieved all the table tags and added up their
04:32prices that were in a different attribute, all of that would remain the same.
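The table-counting scenario can be sketched like this. The two XML fragments are reconstructions of the left and right examples, and the prices and the KNOWN_TABLE_TAGS list are invented for illustration. The point is only that the tightly coupled version needs a code change for every new table tag, while the loose version does not.

```python
import xml.etree.ElementTree as ET

# Reconstructed examples; all prices are made-up placeholder values.
TIGHT = """<order>
  <sleepersofa price="899.00"/>
  <queenbed price="1200.00"/>
  <coffeetable price="150.00"/>
  <endtable price="80.00"/>
</order>"""

LOOSE = """<order>
  <sofa isSleeper="true" price="899.00"/>
  <bed size="queen" price="1200.00"/>
  <table type="coffee" price="150.00"/>
  <table type="end" price="80.00"/>
</order>"""

# Tight coupling: the code must list every table-like tag name in advance,
# so adding a nightstand tag later means editing this tuple.
KNOWN_TABLE_TAGS = ("coffeetable", "endtable")
tight_root = ET.fromstring(TIGHT)
tight_count = sum(len(tight_root.findall(tag)) for tag in KNOWN_TABLE_TAGS)

# Loose coupling: one tag name covers every table, whatever its type attribute.
loose_root = ET.fromstring(LOOSE)
tables = loose_root.findall("table")
loose_count = len(tables)
table_total = sum(float(table.get("price")) for table in tables)

print(tight_count, loose_count, table_total)
```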
04:35The other problem with the code on the left versus the code on the right is
04:39that it makes writing things like XPath expressions and XSLT templates a lot
04:44more complex. Because again I have gotten so specific with my tag names that
04:49every time I add a new one or if I want to change the code around or if I want
04:53to change the order in which tags appear in the XML, chances are I'm going to
04:57have to go back and change the corresponding XPath or XSLT templates.
05:02So this is a bit more of an art than it is a science. There's never any real
05:07right or wrong answer, except maybe in this case. The idea here is try to
05:12choose tag names that represent base level classes or objects and save the
05:18descriptions for attributes or child tags.
05:22So a good example here is the sleepersofa. Sleeper is clearly an adjective
05:26describing a noun. So try to use nouns as your tag names and save the
05:33adjectives for things like attributes and so on.
05:37Okay, mistake number 4 involves being far too compact with your syntax. Take a
05:44look at the example here. Okay, there I have got some XML code and it's really
05:48compact, but I have no idea what any of this means. Like, what is an f-o?
05:53What is isSl? Just by looking at this I have no idea what any of this is.
05:57Well, I can make it a lot clearer, suppose I did this. Now it's a lot clearer.
06:01See now my tag names are a lot more descriptive and we can see that this is an
06:05evolution of the furniture order that we had from the previous example.
06:10Once again, we have drastically improved the readability and usability of this
06:16XML file just by being a lot more descriptive with our tag names. So we can see
06:21now that this is a furniture order and it is grouped by living room and bedroom
06:26and there are tags that represent items that would go in each. We have expanded
06:30the tag names and attributes and we have added some pricing information.
06:34This is much easier to understand.
06:36So the lesson here is don't worry too much about making your code compact.
06:40You need to aim for readability and maintainability. Verbosity is not necessarily
06:44a bad thing in XML.
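Here is one way to picture the difference between the compact and the descriptive versions. Only the names f-o and isSl come from the lesson; the rest of both fragments, including the price, is a guess at what the slides might have looked like. The helper describe just lists each tag with its attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstructions of the two slides; the price is a placeholder.
COMPACT = '<f-o><lr><s isSl="t" p="899.00"/></lr></f-o>'
VERBOSE = """<furnitureOrder>
  <livingRoom>
    <sofa isSleeper="true" price="899.00"/>
  </livingRoom>
</furnitureOrder>"""


def describe(xml_text):
    """List (tag name, attributes) for every element in document order."""
    root = ET.fromstring(xml_text)
    return [(element.tag, dict(element.attrib)) for element in root.iter()]


for tag, attributes in describe(COMPACT):
    print(tag, attributes)
for tag, attributes in describe(VERBOSE):
    print(tag, attributes)
```

Both documents parse equally well and have the same shape, but only the second listing tells a reader what the data means.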
06:46Okay, so now that we have seen how not to do things, let's take a look at some
06:50design tips for how to make good XML and that's our next lesson.
Planning design and development
00:00Okay, let's take a look at some design and development tips that you can follow
00:03when creating an XML format. To begin with, create the requirement summary that
00:08you are trying to implement because this helps you ferret out the tags and the
00:12attributes that you are going to need.
00:14What you will find is when you write things down, nouns will typically become
00:18tags and adjectives will typically become attributes, although they may become
00:23tags themselves if they are somewhat complex. Verbs will typically become functions.
00:29Once you have done that, you can identify the base tags. Now these are tags
00:34that will serve as containers and wrappers for other tags. The reason you want
00:40to do this is because this ensures that you get all of your collection tags in place first.
00:45So in the previous example, we would have things like maybe tables or rooms
00:51encompassing things like the living room or bedroom, sections of the furniture
00:55order. Once you have done that, you define the tags that are going to make up
00:59the bulk of the data. These are your base level objects. These are the things
01:03like the sofas and tables that we saw in the furniture example in the previous lesson.
01:09Once you have done that, if you feel up to it you can create the associated
01:12schema or DTD. In fact, you should probably create the schema or DTD in
01:18parallel with your XML data or maybe even beforehand. That's another tip that
01:21we'll look at in a minute. Now we are not going to do that in this title
01:25because that's relatively involved and we want to get straight to the data. So
01:30we are going to skip that part.
01:31Then once you have created the data, you can go ahead and write the associated
01:35style sheet or CSS or script that goes along with your data. One of the common
01:40questions that people who are designing XML tag sets come across is: when
01:45should I choose between using tags and using attributes? The short answer is
01:53there's really no hard and fast rule. You need to use your best judgment, but
01:57there are some guiding principles you can follow.
01:59Typically, you should prefer tags when you have to represent data that's
02:04relatively complex or data that can be broken down into multiple parts.
02:10You should use attributes as modifiers for data or containers for simple data.
02:17Now another guiding principle that you should follow on top of that is if data is
02:22suitable for an attribute but it could end up as multiple attributes on the
02:27same element, then you should probably use child tags instead.
02:31So let's take a look at an example. Here I have an XML fragment that defines a
02:36movie. You can imagine that this XML fragment is used in XML dataset for, say,
02:44a video rental store. So here we have a tag that defines movie. There's an
02:49attribute inStock and that can either be true or false.
02:54So in this case inStock is clearly an attribute. Why? Because it describes a
02:59property of the movie. So this is an adjective and it describes whether the
03:02movie is in or out of stock. Now a movie can either be in stock or out of
03:07stock. There's no chance of inStock appearing more than once on the movie tag.
03:12It's a relatively simple attribute.
03:14Now consider the case of something like title. You might be wondering
03:18why isn't title an attribute? The reason is because sometimes movies have
03:22different titles in different languages. So although I could have solved this
03:27problem by putting something like title-us = movietitle and title-fr =
03:35theFrenchversion, that kind of defeats the purpose of using child tags. What
03:40I'm doing here is I'm using the base tag title and then decorating that with a
03:45language attribute that describes which language that title is.
03:49Same idea with price.
03:51I could have placed price on the movie as an attribute. In fact, if I only sold
03:57the movie in the United States, I might just do that. However, if you sell your
04:01products in multiple countries for multiple different pricing units or if
04:06you have prices that reflect discounts, that might not be such a great idea.
04:10So follow these principles and you will usually end up at the right conclusion.
04:14You will probably find out pretty quickly if you got it wrong, but in essence, use
04:19tags for complex data, and use attributes for simple data and for modifiers of data.
04:25Remember that if an attribute could possibly be used more than once, then break
04:30it up into a child tag.
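The movie example might look something like the sketch below. The transcript only names the movie tag, the inStock attribute, and the idea of a title tag with a language attribute; the lang and currency attribute names, the titles, and the prices are placeholders chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder data: titles, prices and the lang/currency names are assumptions.
MOVIE = """<movie inStock="true">
  <title lang="en">Example Movie</title>
  <title lang="fr">Film Exemple</title>
  <price currency="USD">14.99</price>
  <price currency="EUR">12.99</price>
</movie>"""

movie = ET.fromstring(MOVIE)


def title_for(movie_element, lang):
    """Return the title text for one language, or None if it is missing."""
    node = movie_element.find("title[@lang='" + lang + "']")
    return node.text if node is not None else None


def price_for(movie_element, currency):
    """Return the price as a float for one currency, or None if it is missing."""
    node = movie_element.find("price[@currency='" + currency + "']")
    return float(node.text) if node is not None else None


print(movie.get("inStock"), title_for(movie, "fr"), price_for(movie, "EUR"))
```

Because title and price are child tags, adding another language or currency is just one more element, and the lookup code stays the same.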
04:32We talked about this a little bit beforehand. You should create your schema or
04:35your Document Type Definition (DTD) before or maybe during your design process,
04:41not at the end. This is assuming that you are even going to do this. The reason
04:45for this is because it removes the temptation to try to retrofit the rules of
04:49the schema to fit your data, instead of designing the data correctly right upfront.
04:54Now if you follow this process and think about your schema beforehand, you will
04:59catch design errors sooner. If you do this, you can also do data testing and
05:03validation in parallel with the development process. Like I said, we are not
05:08going to do this in this example because it is pretty involved and we want to
05:11keep this a little bit higher level and instructional.
05:14Okay, so we have reached the point now where we are ready to go ahead and start
05:18designing our tag sets. So let's do that in the next lesson.
Creating the Tag set
00:00Let's create the tag set for our particular web site here. So as I indicated in
00:05the last lesson, it's usually a good idea to write things out. That will lead
00:08you to the tags that you need to create for your particular tag set.
00:13So let's read what I have got here. So 'In addition to being a place to learn
00:16about tea, Teacloud sells products on our site: kettles and teas. Kettles have
00:23an associated product image, name and price along with a description.
00:26Teas have a name, price, and unit for the given price and weight. They don't have an
00:33associated product image.'
00:34So if we go ahead and underline the various nouns that we have here in
00:40the paragraph, we can very quickly get an idea of what our tags are going to need to be.
00:47Let's switch over to the code and take a look at our site and the XML file
00:53that we need to build.
00:54So I'm here in the code now for the XML file that we need to build. I have got
01:01a little bit of a start here by listing out all of the products that the site sells.
01:06So the first three here are teakettles, and the next three are specific teas.
01:13So we need to turn this start file into a finished XML file.
01:19So recall that when we wrote out our paragraph describing the various nouns
01:24and adjectives and so on. We very quickly realized that there were products
01:28that we sell. So it's probably a good place to begin.
01:30So let's begin by writing first of all our XML declaration, because that
01:35should come at the top of every XML file. So we'll write:
01:40<?xml version="1.0" encoding="UTF-8"?>
01:55Now this will usually be done for you by your XML editor, but the default
01:59encoding is going to be UTF-8 anyway. I just want to be explicit about it.
02:03So let's begin by making our base class tag. Now this is the tag that someone
02:08reading the XML file would look at and quickly get an idea of what the contents of
02:12the file are.
02:14So it seems to me that since we are selling products for the TeaCloud site,
02:18as good a name as any is going to be something like teaCloudProducts.
02:27So now that we have our base tag, we can start putting in our container tags.
02:32The container tags are going to be tags that group together individual tags
02:36that are related.
02:37So recall back from the example when we wrote out the paragraph that our website
02:42sells kettles and teas. So what I'm going to do here is create a tag named
02:48kettles and I'm going to create one named teas. Okay, so far so good.
02:57Now we have got three of each here. We have three kettles and we have three
03:01teas that are available for sale. So I'm going to go ahead and inside
03:04the Kettle section I'm going to create a tag named kettle. Each kettle is going to
03:12have information associated with it. So at this point let's stop and take
03:16a look at some of the kettles.
03:18So we can see that each teakettle has a name and it has a price and there's a
03:22path in the assets folder that has its related image.
03:27So if we scroll over here and look in the images, we can see under Products,
03:34under Kettles... So there are images that correspond to each one of these guys.
03:39Now a kettle is not likely to have more than one image path in
03:43this example. In fact, there's only one. So what we are going to do is
03:46we'll make the image an attribute, and we'll just copy this data right here.
03:55This is also the place to put the name. So we'll make the name an attribute
04:00as well and we'll just put that right up in here.
04:10We have got one piece of information left to go in attributes. That's price.
04:14Now in this case, we only sell in the U.S. let's say and we only have one price.
04:17So I'll make that an attribute. That's 49.95 for this one.
04:25Last but not least, we have the description. So the description is some pretty
04:29long text and I'm going to make that the text content of the kettle tag.
04:37So now we just need to do this for each one of the kettle tags. We'll just copy
04:45that information to each one.
04:47So here is the Earl's Grey one. In fact, I need to entity escape that
04:52apostrophe right there. So I'm going to write ampersand, apos, semicolon.
04:57It's safest to do entity escaping when you're putting things like quotes inside attribute values in XML files.
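If you generate the file from code rather than typing it by hand, the escaping can be done for you. This is a small sketch using Python's xml.sax.saxutils.escape; the product name is a placeholder, and the attribute is deliberately delimited with single quotes so that the apostrophe really does have to be escaped.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Placeholder product name containing an apostrophe.
name = "Earl's Grey Kettle"

# escape() handles &, < and > by default; quotes are added through the
# optional entities mapping.
escaped = escape(name, {"'": "&apos;", '"': "&quot;"})

# The attribute value is wrapped in single quotes, so a raw apostrophe
# would end the value early and break the document.
kettle_xml = "<kettle name='" + escaped + "'/>"
kettle = ET.fromstring(kettle_xml)

print(escaped)
print(kettle.get("name"))
```

The parser turns the entity back into a plain apostrophe, so code reading the attribute never sees the escaped form.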
05:03We'll have that be that price and the image here is Earl's Grey instead.
05:13So I'll replace that.
05:16And the description. Okay, that's the second one. We have got one more to go.
05:25So we'll do the description...
05:30and we'll do the name...
05:36and we'll do the price...
05:40and we'll change the image name. Okay.
05:48Looks like our XML file is taking shape nicely here. So now we just need to go
05:53ahead and do the teas. So inside the Teas section, I'm going to go ahead and
05:58make a tea tag for each one of these guys.
06:02Now each tea product has a name and a price, but there's no image. So we don't
06:07have to worry about that. But each one does have a description. So I'll take
06:10the description for each and put that inside the tea tag. What I'm going to do
06:16is make three copies here, because I've got three teas. I'll copy this one
06:23and paste that one in and then finally this one. Okay.
06:31Now I just need to deal with the names and the prices. So the names are pretty straightforward,
06:36because they are the same as in the kettle case. So I'll just
06:39have a name and we'll do some copy and pasting here and this one too...
06:52and finally on this one.
06:59Now we need to deal with the prices. You will notice that the price is not just
07:03a price. It's per unit. So we could just take the approach of saying, well,
07:09let's just call this price = and then 26.95 per pound. Now I don't suggest you
07:17do this and here's why. What you are doing is you are combining two different
07:20pieces of information in one single attribute.
07:23Let's suppose we have some processing code on the back end that calculated
07:27the total price of an order that someone put together and they ordered a couple of
07:32kettles and a couple of teas. Well, adding up the prices for the kettles is
07:36pretty straightforward, because you have got the numbers in here, but to do
07:39that for the teas you have got to go through some additional processing to
07:42strip off this per pound indicator.
07:46So it's not a good idea to try to combine different pieces of information and
07:50besides, suppose in the future we decide to sell teas per ounce for some
07:55really expensive teas, or we decide to switch over to the metric system and
07:59sell teas per kilogram. You want some easy way to change that.
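The extra processing is easy to see in a quick sketch. The tea name is a placeholder and the back-end code here is hypothetical; only the 26.95 per pound value comes from the lesson.

```python
import xml.etree.ElementTree as ET

# Two ways of writing the same tea; the name is a placeholder.
COMBINED = '<tea name="Sample Tea" price="26.95 per pound"/>'
SEPARATE = '<tea name="Sample Tea" price="26.95" unit="pound"/>'

combined = ET.fromstring(COMBINED)
separate = ET.fromstring(SEPARATE)

# With the combined value, a plain numeric conversion fails.
try:
    float(combined.get("price"))
    direct_conversion_works = True
except ValueError:
    direct_conversion_works = False

# The consumer has to know the format and strip off the "per pound" part.
combined_price = float(combined.get("price").split()[0])

# With separate attributes, each value can be used as it is.
separate_price = float(separate.get("price"))
unit = separate.get("unit")

print(direct_conversion_works, combined_price, separate_price, unit)
```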
08:04So what I'm going to do is leave off the per pound unit and make a separate
08:10attribute named "unit" and that's going to be pound. I'm going to do that for
08:16each one of these guys. price = and this one is 16.95 and the unit is pound,
08:25and then finally this one as well. In this case the price is 18.95 and the unit
08:35is pound. Okay.
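Put together, the finished file might look roughly like the XML in the sketch below. The tag and attribute names and the prices 49.95, 26.95, 16.95 and 18.95 follow the lesson; the product names, descriptions, image paths and the second kettle's price are placeholders, because the real values are only shown on screen. The Python around it just parses the file and adds up a hypothetical order.

```python
import xml.etree.ElementTree as ET

# Placeholder names, descriptions, image paths and second kettle price.
CATALOG = """<?xml version="1.0" encoding="UTF-8"?>
<teaCloudProducts>
  <kettles>
    <kettle image="assets/images/sample-kettle.jpg" name="Sample Kettle" price="49.95">A sample kettle description.</kettle>
    <kettle image="assets/images/earls-grey.jpg" name="Earl&apos;s Grey Kettle" price="39.95">Another sample description.</kettle>
  </kettles>
  <teas>
    <tea name="Sample Tea A" price="26.95" unit="pound">Sample description A.</tea>
    <tea name="Sample Tea B" price="16.95" unit="pound">Sample description B.</tea>
    <tea name="Sample Tea C" price="18.95" unit="pound">Sample description C.</tea>
  </teas>
</teaCloudProducts>"""

# Parse from bytes so the encoding declaration is honored by the parser.
root = ET.fromstring(CATALOG.encode("utf-8"))
kettles = root.findall("kettles/kettle")
teas = root.findall("teas/tea")


def order_total(kettle, tea, pounds):
    """Price of one kettle plus a number of pounds of one tea."""
    return float(kettle.get("price")) + pounds * float(tea.get("price"))


total = order_total(kettles[0], teas[0], 2)
print(len(kettles), len(teas), round(total, 2))
```

Because price and unit are separate attributes, the total needs no string stripping at all.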
08:40Well, that's pretty much it. We are now done creating the XML file format.
08:45So we'll save it and now it's time to integrate this with the HTML code.
08:51That's the next lesson.
Integrating XML with design
00:00Okay, so in this lesson we are going to take some of the concepts that we have
00:03learned up until now and specifically in the previous chapter to integrate
00:08the XML file that we have just built into our TeaCloud website.
00:12So let's go ahead and open up our XML file that we have just built. So recall
00:18that this was the XML file that we created to represent the kettles and teas
00:23that our TeaCloud site lists for sale online. So we go to the index page.
00:30This is our TeaCloud site and to see the product listing, we go to the Our Products section.
00:36Now in the Our Products section there are two individual files, one for
00:41teas and one for kettles. So under the product section here, what we are going
00:46to do is open the file for the kettles and tea pots, and that's this one here,
00:52and we are going to open up the HTML file for the teas and that's this file here.
00:57Okay, so you can see that there's nothing in the page where the product listing
01:00is going to appear. So let's switch over to the Code view and see what's going on.
01:07So I'm going to switch to the Split view here, I'm going to click down here.
01:12Okay, now I'm going to go to the Code view.
01:15So you can see that there's an empty table right here in the teas page that's
01:20got an id of products on it and it has no content, and we are going to take a
01:24look over at the kettles page because this one is also similar. I'm going to
01:29click there, go to the code, and you can see that in the kettles page,
01:32same idea, right? We have a table that's empty and it has an id of products on it.
01:37So that's very important for a reason. Our XML processing code is going to
01:41build the content for these tables and then place it in there when it's done being built.
01:48So now let's scroll to the top of the file and you can see a couple of things.
01:56First here, I'm including the cross-platform sarissa.js library, which
02:02we covered in the previous chapter. Recall that this is the library that allows me
02:06to use XML functions across browsers like Firefox and IE. And the next script line here,
02:13this is the code that's going to be used to build our products list.
02:19Let me scroll down a little bit and you can see that in my window's onload function,
02:25in addition to the other stuff going on, I have a function here
02:28called buildProductsList.
02:30The story is pretty much the same for the kettles side. So up here, we have
02:35the sarissa library. That's that right there and here is my products.
02:39Now the function called buildProductsList is being called with two arguments.
02:43There's this products string and in this case there's the kettles string. Over here
02:48it's products and teas.
02:50So the buildProductsList function is going to build a list of products depending on
02:55whether it's teas or kettles, and you can see that each one of these
02:59corresponds to the name of a tag here in the XML data and let me switch back over here.
03:08So let's take a look at the code and see how this works.
03:13Okay, so I'm just going to go ahead and click on buildProductsList and that opens up the code.
03:21So here we are in the JavaScript code that is responsible for building the products list.
03:25So the first thing that this buildProductsList function does is it retrieves a
03:31DOM document reference using the sarissa library.
03:34So recall from our earlier examples in the previous chapter that this gets us
03:38an empty document and what I'm going to do now is load
03:43the teacloudproducts.xml file. So we are going to set the asynchronous property to false,
03:47because I want this to load synchronously and then on the document I just created,
03:52I call the load function, which loads the teacloudproducts data which
03:56we created. Once I have done that, I get a reference to the table whose id was
04:02supplied as the id argument when we called this function. So this is
04:06the products table back in the HTML file.
04:10Once I have that, I create a new table body, because this tbody element is
04:14going to hold the created table for the products. Now I check to see which
04:19string was passed in and remember the second argument was the products that
04:24we're building a list for. So it's going to be one of two cases. Either it's going
04:27to be teas or it's going to be kettles and that is this argument here.
04:34So in the case of teas, what we do is we have a variable here that we declare
04:40and that is the result of calling getElementsByTagName on tea. This is the
04:48set of tea elements that we created in our XML file, and that's going to come back
04:53as an array of tea tags. So we store aside the number of tea tags by
04:58calling the length property on the array.
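The first half of that function can be approximated outside the browser. The sketch below is not the course's JavaScript and does not use Sarissa; it uses Python's xml.dom.minidom, which exposes the same DOM calls (getElementsByTagName and the length property). The XML is inlined with placeholder product data, and the function name product_nodes is made up for this example.

```python
from xml.dom.minidom import parseString

# Inlined stand-in for teacloudproducts.xml; product data is placeholder text.
PRODUCTS_XML = """<teaCloudProducts>
  <kettles>
    <kettle name="Sample Kettle" price="49.95" image="assets/sample-kettle.jpg">A sample kettle.</kettle>
  </kettles>
  <teas>
    <tea name="Sample Tea A" price="26.95" unit="pound">A sample tea.</tea>
    <tea name="Sample Tea B" price="16.95" unit="pound">Another sample tea.</tea>
  </teas>
</teaCloudProducts>"""


def product_nodes(xml_text, products):
    """Return the tea or kettle elements, depending on the products argument."""
    doc = parseString(xml_text)
    if products == "teas":
        return doc.getElementsByTagName("tea")
    if products == "kettles":
        return doc.getElementsByTagName("kettle")
    raise ValueError("products must be 'teas' or 'kettles'")


teas = product_nodes(PRODUCTS_XML, "teas")
num_teas = teas.length  # same idea as storing aside the array length
kettle_nodes = product_nodes(PRODUCTS_XML, "kettles")
print(num_teas, kettle_nodes.length)
```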
05:01Now we are going to loop over each one of these guys and build up the table row
05:05and the table cells that's going to contain each piece of data. We do this line
05:11of code here, so we create a new table row element and then we create a table cell
05:15to go inside of it and we append the table cell into the table row.
05:20Now once we have done that, we need to extract the name of the tea and the
05:24price per unit to build up the string representation of the product. So we are
05:29going to create a div that's going to hold all this information and we do that
05:32by calling the create element function. So we create a div. Then we create
05:36a paragraph that's going to hold the text for the tea node. Then we create a span,
05:40because I'm going to wrap the tea name in a bold text to make it stand out.
05:46So once we have created these three elements, we retrieve the information from
05:50the XML data. So as we are looping through this loop here, we are counting over
05:55the contents of this array. So each one of these array elements is a tea tag.
06:00So for each tea tag, we retrieve the name, the price and the units. So this will
06:05be the name of the tea, the price and this is going to be the weight unit in
06:09either pounds or whatever we make it in the future.
06:12So we have set the span's className attribute to be productName and in order
06:17to make this work across browsers, we have to fix this. The attribute name className is
06:21used in IE, because class is a reserved word in JavaScript, whereas Firefox
06:26doesn't have that problem.
06:28So what we're going to say is oSpan.setAttribute. I'm going to quickly modify this to check
06:33window.event. If that's not equal to null, then we know we are in IE and we use className.
06:39Otherwise we are in a non-IE browser, and we can just say class. We are going to set
06:46that to be the CSS style productName, and I've defined a CSS style and
06:50we can look at that really quickly. I'm just going to open the CSS style file here.
06:54We scroll down to the bottom. You can see I have defined a productName class
06:59that just simply sets the font weight to be bold and I have also decided
07:04to create a style for the table cells, which puts a border around the cells.
07:12So now that we have done that, we set the content of that span to be the name
07:18of the tea, which we have got right here, and we put that span into the paragraph
07:23and then we add on to the end of that text, this string right here.
07:27So it will be an open parenthesis with the price and then a forward slash and
07:33then the text of the unit attribute with a closing parenthesis. So that's
07:38going to end up looking something like this, like 49.95/lb. That's what
07:48it's going to end up looking like.
07:53And then we put that content also into the paragraph.
07:55So now we have a paragraph containing the tea name and the price.
07:59Now we need to retrieve the description from the interior of the tea tag,
08:03the text that we put inside the tea tag. To do that, we use the firstChild DOM
08:09property on the tea tag. That gets us the text node and then the data property
08:15on the text node gets us the actual text data. So once we have that in the
08:19description, we create another paragraph tag, and add that description into it
08:25by calling the appendChild function.
08:29Once we have done that, we add that paragraph to our parent div and put the
08:33div inside the table cell. We do that, we add the completed table row to
08:38the table body and then the loop goes back up and runs again.
08:43So this will loop over each one of the teas and build up table rows for each one.
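The whole tea loop can be sketched with the same DOM calls the lesson describes: createElement, appendChild, setAttribute, getAttribute and firstChild.data. This is a Python approximation with xml.dom.minidom, not the course's browser script, so there is no IE versus Firefox branch for the class attribute. The tea names and descriptions are placeholders and build_tea_rows is a hypothetical name.

```python
from xml.dom.minidom import parseString, getDOMImplementation

# Placeholder teas; only the prices and the unit come from the lesson.
TEAS_XML = """<teas>
  <tea name="Sample Tea A" price="26.95" unit="pound">A bright sample tea.</tea>
  <tea name="Sample Tea B" price="16.95" unit="pound">A mellow sample tea.</tea>
</teas>"""


def build_tea_rows(xml_text):
    """Build a tbody element holding one table row per tea tag."""
    data = parseString(xml_text)
    # A separate document stands in for the HTML page being built.
    page = getDOMImplementation().createDocument(None, "tbody", None)
    tbody = page.documentElement
    for tea in data.getElementsByTagName("tea"):
        row = page.createElement("tr")
        cell = page.createElement("td")
        row.appendChild(cell)

        div = page.createElement("div")
        para = page.createElement("p")
        span = page.createElement("span")

        name = tea.getAttribute("name")
        price = tea.getAttribute("price")
        unit = tea.getAttribute("unit")

        # Bold product name in a span, followed by "(price/unit)".
        span.setAttribute("class", "productName")
        span.appendChild(page.createTextNode(name))
        para.appendChild(span)
        para.appendChild(page.createTextNode(" (" + price + "/" + unit + ")"))
        div.appendChild(para)

        # The description is the text node directly inside the tea tag.
        description = tea.firstChild.data
        desc_para = page.createElement("p")
        desc_para.appendChild(page.createTextNode(description))
        div.appendChild(desc_para)

        cell.appendChild(div)
        tbody.appendChild(row)
    return tbody


tbody = build_tea_rows(TEAS_XML)
print(tbody.toxml())
```

Note that firstChild only gives you the description when the text starts immediately after the opening tag, which is how the tea tags are written here.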
08:48Okay, so let's save this. I'm going to go back over to the teacloudproducts.xml file.
08:53Okay, you can see here is all the kettles and teas. So for each tea,
08:58we have the name, the price, we have the unit, and we have the text inside.
09:02So we have now extracted each piece of information. I need to save this as the
09:07teacloudproducts.xml file. So we Save As, take off the word start, there we go.
09:15Let's head back to the index page. I'm going to preview this in the browser.
09:21So we'll go over to the Products page and you can see here that we have now
09:25built the teas and we have built the teakettle. So you can see the teas are
09:31showing up properly. There's the bold name with the 26.95/lb as
09:36the parenthesized string and there's the description. Let's make sure it works also
09:40in Internet Explorer. I'm going to go to the teas. You can see it's working there as well.
09:48Now let's go back to the code and take a look at how we build up the kettles.
09:55The kettles code is pretty much the same. The only difference in the kettle section
09:58is that we have in addition to the name and the price and the description,
10:04we also have an image that needs to be inserted. So the code is pretty much the same.
10:09You can see here where we get the teas and save aside the number of items,
10:14we have a loop. What we are doing here is the same thing only in this case
10:17we are doing it for the kettle tag, not the teas.
10:21So now the loop goes through and it's the same idea. We create a table row and
10:28a table cell. Now the difference here is we have two table cells because one
10:32holds the image whereas one holds the product information. So the first table cell,
10:36we create an image to hold the image that's going to represent the kettle.
10:41We then get the path for that image and that's going to correspond to
10:49this attribute right here. So we are retrieving the image path and that's this.
10:55So we also retrieve the name attribute.
10:58Once we have the image, we set the source of the image to be the path that we
11:02retrieved and to be nice and accessible, we set the alt attribute of the image
11:08to be the name of the product as well. So now that we have created the image
11:12and we have gotten the attributes, we then append the image into the table cell.
11:16Okay, that's the first part of it.
11:18The second part of it is to create the second table cell now which will hold
11:22the product information and this is pretty much the same as in the tea case.
11:25We create a div and a paragraph and a span and these are going to hold
11:29the pieces of data. The div is going to wrap everything up.
11:31So we retrieved the price using getAttribute on the kettle that we are
11:36currently looping over and once again referring back to XML data, you can see
11:41in the kettles each one has a price right here. So we retrieved the price,
11:48we do our little setAttribute and in this case, we got to do the same thing again
11:52for both IE and Netscape. So if we're in IE, we use className; otherwise we'll use class.
12:03Once we have set the class so that things show up properly, we then
12:07append the name and then once again, we do our little string trick. Only this time
12:13we don't have a unit of weight. We just have the price inside parentheses.
12:19So we put that inside the span, we put the span inside the paragraph,
12:23we put the paragraph inside the div.
12:26Now we get the description. The description is the same process.
12:35We get the firstChild of the kettle tag and that gets us the text node and the data gets us
12:39the text out of the text node. We then create a paragraph with that
12:40description inside of it and we put that inside the div, put the div inside
12:47the second table cell, and put the table row into the body, and then the whole loop completes.
12:54Once we are done, we tell the table to put the table body inside the table.
13:00Okay, so let's save, let's go back to the index, and I'm going to browse this in IE.
13:08Go to the Products section and you can see here that the image is being set
13:12and you can see that the alt text is showing us the name of each image as we
13:16mouse over. Here is the name and the price and it's showing up as bold text
13:22inside that span and here is the description. Switch over to Teas,
13:27yup, all that's working fine.
13:30Now let's browse in Firefox. Okay, here we go. There's the name and the price,
13:39the image and the description and let's switch over to Teas and everything worked.
13:45Okay, so now you have seen an end-to-end example of building an XML tag set
13:50from scratch and integrating it into your web pages in a cross-browser
13:55fashion. That brings us to the close of this lesson. Let's move on to our next one.
5. Real-World DOM Algorithms
Understanding the uses of DOM algorithms
00:00During the course of working with XML in the real-world there will be certain
00:04situations that you see coming up again and again, and in this section we'll
00:09talk about ways that you can deal with those situations using some real-world DOM algorithms.
00:15So as I was saying, when you're working with XML there's a number of common
00:18processing tasks that you are going to have to perform and you will come across
00:22these fairly regularly. There are some common algorithms that can be written as
00:27some standard functions which you can reuse in your specific projects.
00:32So I have got a few here. I'm going to provide about a half dozen of these. The code
00:36I'm going to show you is code that you can just take and re-use in your own projects.
00:41So I'm going to start off by talking about the concept of node traversal and
00:45what this basically means is there are common situations in processing XML
00:49where your code will need to visit nodes in an XML document, whether it's an
00:56XML data document or an XHTML file, whatever. It's common practice that
01:01you will have to write some code that visits either all the nodes or some subset of
01:06nodes in the XML file and node traversal is the way that you do that. The word
01:11traversal means you visit nodes in a certain order and we'll look at both
01:15depth-first and breadth-first node traversal.
01:18The rest of the algorithms I provide here are functions that you
01:22can use in your code to perform some common utility tasks. For example,
01:28there's the isContainedBy function and you can use that function to see if a
01:32given node is contained within a given type or within a specific node.
01:39The containsNode function is the opposite. You can use that to see if a certain node
01:44or an element in a document contains another node, either of a given type
01:49or a specific instance.
01:51The hasSibling function is used to see if a node has a sibling of either a
01:57specific type or a specific instance. Finally, the getElementsByAttr function
02:03can be used to get elements if they have an attribute that matches a specific
02:08value. Let me begin by talking about document traversal.
02:11Document traversal is a process by which you process the nodes in an XML
02:16document and this is a fairly common task in just about any real-world setting
02:22and there are two common ways of traversing an XML document. One of them is
02:25called depth-first and one of them is called breadth-first.
02:29The depth-first version refers to the fact that each node in a document is visited
02:33from the top down to the bottom. Actually, it might be more accurate when
02:38you see the example to call it bottom to top. The idea is that for any node before
02:42we process it, we first visit all of its child nodes. So in the most extreme
02:47example, we would start with the document root, go all the way down to the
02:50leftmost leaf node and then work our way back up. Now in breadth-first
02:55traversal, all of a node's siblings are processed in order before the child
03:00nodes of each one.
03:01Okay, so now that I have described what document traversal is, let's take a
03:05look at an example in action.
Understanding depth-first document traversal
00:00Okay, let's start by looking at a depth-first traversal pattern and how it
00:06looks when we are operating on an XML file. So let's imagine that the structure
00:10you see here is the node structure of an XML file and at the top of the
00:15document you have the A tag as the root and then underneath A you have got B
00:20and then underneath B you have C, D and E and so on down to F and G here and
00:25then on the other side of A, you have got H, I and J.
00:28So if these are all tags in an XML document and we wanted to do a depth-first
00:34traversal of this document starting at the A tag, this is pretty much how it
00:39would look. First, we would pass the A tag to the function that started off the
00:45traversal and it turns out that A has two child nodes. So we would visit the B
00:50node first and then from B we would see that B has child nodes as well. So,
00:56before we process B we would travel down to C and it turns out that C also has
01:00two child nodes. So before we do any processing on node C, we would first
01:03travel down to node F.
01:05Now node F is the bottom of the tree and that's called a leaf node and it has
01:10no child nodes. So, we would do whatever processing we have for node F,
01:15we would travel back up to node C and we're not done here yet. There's one more
01:19child to go so we would travel down to node G, we would process the other leaf
01:24node which is node G here, then we go back up to C.
01:27Now we would process node C. We would do whatever we would have to do on the node,
01:31if we needed to do anything at all and then from C we would go back up to
01:34B. So this would go on. We would go back down to the next child, D, and then back
01:38up to B, then down to E and back up to B. Then we would process B and head back up to A.
01:43Then we would do the other side of tree, down to H, down to I, process I, back up to
01:49H, down to J, process J, back up to H. Now all the child nodes are done so
01:55we would process H and then we would finally process A.
01:59So if you look at the order in which the nodes were visited, the order would
02:04look something like this. We would first process F and then G because those are
02:09the two leaf nodes in the far left. Then we would do C, because at that point
02:13all of its child nodes would be done. Then we would do D and E because those
02:17would be the remaining child nodes of B. Then we would travel back up to B and
02:22then the whole process would start on the other side of the tree. We would go
02:25all the way down to do I and J and then back up to H, and then back up to A.
02:30To implement this as a function using code, we would do something like this.
02:36Now this is a function called depth-first traversal and it takes as an argument
02:39the node in the document that you want to start on. Now this is a pretty bare
02:44bones example, it doesn't do anything like checking for a node type or a node
02:48name or anything like that. It just handles visiting all of the child nodes in
02:53order from left to right, starting at the very bottom of the tree.
02:57So the way it works is you call this function on the node where you want to
03:01start processing, and it doesn't need to be the document root; it can be
03:03anywhere in the tree. The very first thing we do is check to see if
03:07the node that we were given is not equal to null, and if it's not, then we need
03:11to check to see if it has any child nodes.
03:13So we declare a temporary variable and we get the first child of the node that we were
03:18given, and we have the for loop check the condition where the node is
03:22not equal to null to make sure we can keep going. Then to advance, we would get
03:26the node's next sibling. This is how we would travel left to right across a
03:29node's child nodes.
03:31Before we do anything however, you notice that we're calling the depth-first
03:34traversal again inside this for loop. This is called recursion. It's a function
03:39that calls itself and this is what's going to get us all the way down to the
03:44bottom of the tree. Because you see what we were doing essentially is each time
03:47we come through here, we get the first child, then we call this function again,
03:52only now the node has been set to the first child.
03:54So we come in here and then we get that node's first child and so on and so on
03:59as long as it has child nodes until we reach null. And if it's null, then we
04:05can fall out of this loop. And if we reached the bottom of the tree and there
04:09are no further children to process, then this falls out and we do any leaf
04:14node processing in here. Actually, it's really processing for any node but it's
04:18going to start with the leaf nodes.
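The function described above can be sketched roughly as follows. This is a minimal sketch rather than the course's exercise file: makeNode is a hypothetical helper that stands in for parsed XML so the example runs without a browser or the Sarissa library, and the tree it builds is assumed to match the A through J diagram. The names oNode, theNode and g_sVisitOrder follow the transcript.

```javascript
// Hypothetical stand-in for a DOM node so the sketch runs outside a browser.
// It carries only the properties the traversal reads.
function makeNode(name, children) {
  var kids = children || [];
  var node = {
    nodeName: name,
    parentNode: null,
    nextSibling: null,
    firstChild: kids.length ? kids[0] : null
  };
  for (var i = 0; i < kids.length; i++) {
    kids[i].parentNode = node;
    kids[i].nextSibling = i + 1 < kids.length ? kids[i + 1] : null;
  }
  return node;
}

// Assumed sample tree: a > (b > (c > (f, g), d, e), h > (i, j))
var a = makeNode("a", [
  makeNode("b", [
    makeNode("c", [makeNode("f"), makeNode("g")]),
    makeNode("d"),
    makeNode("e")
  ]),
  makeNode("h", [makeNode("i"), makeNode("j")])
]);

// Records node names in the order they are processed.
var g_sVisitOrder = "";

function depthFirstTraversal(oNode) {
  if (oNode != null) {
    // Recurse into every child, left to right, before touching this node.
    for (var theNode = oNode.firstChild; theNode != null; theNode = theNode.nextSibling) {
      depthFirstTraversal(theNode);
    }
    // All children are done, so now this node is processed.
    g_sVisitOrder += oNode.nodeName + " ";
  }
}

depthFirstTraversal(a);
console.log(g_sVisitOrder); // f g c d e b i j h a
```

Because the processing line sits after the loop, every node is recorded only once all of its children have been recorded, which is why the leaf nodes come out first.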
04:20Perhaps, it's most illustrative to see this in action as a live code example,
04:25so let's jump over to the code and take a look at that now.
Filling out the depth-first function
00:00Okay, so this is the code for the depth-first traversal. What we are going to
00:04do here is fill out this function, the depthFirstTraversal function.
00:09So before we do that, let's take a look at the rest of the file. It's an HTML file.
00:12A couple of things to point out. I have included the Sarissa library here
00:17so that I can do things that will work cross-browser with the XML. I have
00:22declared a function here called loadXMLData.
00:26The loadXMLData function essentially sets up the test XML for this exercise.
00:32It defines a string and you can see here, <a><b><c>, this is the sample document
00:37that we saw back in the slide. So this essentially constructs the same XML
00:43structure that we just looked at during the lesson portion.
00:46So I'm using the DOMParser object and again, this is going to work
00:50cross-browser now because I have got the Sarissa library included. The parser
00:55creates an XML document, which I store here in this global variable which is
00:59parsed from the TestData string.
01:02Then down here, I have my window.onload function which loads the XML data and
01:08then does the depthFirstTraversal and then shows an alert, which is this global
01:14variable string which we are going to be alerting on. That will be built up
01:18over time in a moment. What it's going to do is it's simply going to record the
01:22document nodes in the order that we visit them.
01:25Now if this were a more practical example, we would be processing each node as
01:30we visited it for some reason, but since this is an illustrative example,
01:34we are just going to build up the string in the order that we visit the nodes.
01:39So this is the function we need to write. So what we are going to do is first
01:43check to see if the node that we were given is not equal to null. Because if the
01:48node is equal to null, then we can't really operate on it. So we'll say if
01:51(oNode != null). Okay, then we can do our operation.
01:58So we are going to write that for loop. The purpose of the for loop, remember,
02:01is to get us down to the lowest level of the tree first and then work our way
02:06back up and process each node after we have already visited all of its children.
02:14So I'm going to write for (var theNode = oNode.firstChild;)
02:24Then we need to make sure that theNode is not equal to null because if theNode
02:31is equal to null, then we've reached the end of the child list. To advance it,
02:36we say theNode = theNode.nextSibling.
02:44So all we are going to do inside this loop is call this function again,
02:48depthFirstTraversal, only this time we are going to call with theNode. So after
02:54this for loop completes, because we've run out of child nodes, what we are
02:58going to do is in the VisitOrder string, we are going to say g_sVisitOrder.
03:04I'm going to append a value to it. We are going to append the node name to indicate
03:11that we were here, +, and we'll put some space in there to make it look good.
03:19So we have reached the point now where we can try this out in the browser.
03:22You can see that after this function completes, we are going to just alert whatever
03:27this string is. So let's go ahead and bring this up in IE.
03:34So you can see that what happened was, we visited the nodes in the same order
03:38as we indicated back in the slide. So we went all the way down to the child
03:41nodes f and g, then we went back up to c, d and e and then up to b, and then
03:46all the way down the right-hand side we did i and j, back up to h, back up to a.
03:50Then we finally visited the document parent node. That's the #document
03:55node right there.
03:58Now let's try the same thing in Firefox to make sure that works there.
04:05It's the same result. You can see that we have visited all the child nodes first. Even
04:09though we passed in a, a is the very last tag that we visited, followed by
04:14the #document node, which is the parent in the DOM tree of the root tag.
04:19That's the example of doing depth-first node traversal. Let's move on now and
04:24take a look at how we would do a breadth-first node traversal.
Understanding breadth-first document traversal
00:00Now that we have seen how to do a depth-first traversal of the nodes in a document,
00:05let's now do a breadth-first traversal. Here in this diagram that
00:09you see we have the same XML node structure that we had in the previous example,
00:13only now we are going to traverse the nodes in a different order. This is called
00:17breadth-first and in breadth-first traversal you pick a node to start at
00:22whether it's A here or B here, whichever one, and the idea is you visit all of
00:28the node's siblings and child siblings in order.
00:31So for example if we were to start off with node A, we would first visit node B
00:36and we would process B right at this moment. We wouldn't wait to come back to it.
00:39Then we travel over to H and we'll process node H. Now we have processed
00:44all of the nodes in sibling order under A. Then we'll travel all way over to C
00:49because now we are going to go do all of these child siblings and from C we
00:53would go to D and then over to E and from E we would go right back to the left,
01:00go down to F and over to G. And when that happens we would travel all the way
01:04back over to I and then to J.
01:06So in this case the node visit order would be A and then B and then H and then
01:12we would do C, D and E as the siblings underneath B and then we would do F and
01:16G as the siblings underneath C and then we would go over to I and J which are
01:23the sibling nodes underneath H.
01:26The algorithm that does this is here. This is called the breadth-first
01:31traversal and it takes the node that we want to start at. And the first thing
01:35we do is check to see if the node is not equal to null and that it actually has
01:40some child nodes that we can process.
01:43So we have our for loop and we say for theNode = oNode.firstChild, and we make
01:50sure that the node is not equal to null, and then we advance the node
01:53to get to the next sibling. And now what we are doing is we are changing the
01:56visit order to process it right here upfront before we go down to the next
02:01level and this is my string that I'm building up to illustrate the order in
02:06which we are visiting the nodes but the code you would write in here is
02:09whatever processing you want to do for the node when it gets visited.
02:12Okay. So that For loop is going to execute. It's going to do all the sibling nodes.
02:16Now we need to go down to the next level, which is where we get the
02:20first child for the node. We have the same kind of loop only now we do that
02:24recursive function call right here.
02:26So we call this function back into itself only this time we are going to
02:29process all of the child node siblings. So once again it's probably best you see
02:34this in action in order to understand it.
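The two loops described above can be sketched like this. It is a minimal sketch, not the course's exercise file: makeNode is a hypothetical helper that stands in for parsed XML, and the #document wrapper node is an assumption made so that a shows up in the output the way it does in the walkthrough. The names oNode, theNode and g_sVisitOrder follow the transcript.

```javascript
// Hypothetical stand-in for a DOM node so the sketch runs outside a browser.
function makeNode(name, children) {
  var kids = children || [];
  var node = {
    nodeName: name,
    parentNode: null,
    nextSibling: null,
    firstChild: kids.length ? kids[0] : null,
    hasChildNodes: function () { return this.firstChild !== null; }
  };
  for (var i = 0; i < kids.length; i++) {
    kids[i].parentNode = node;
    kids[i].nextSibling = i + 1 < kids.length ? kids[i + 1] : null;
  }
  return node;
}

// Assumed sample tree, wrapped in a hypothetical document node.
var a = makeNode("a", [
  makeNode("b", [
    makeNode("c", [makeNode("f"), makeNode("g")]),
    makeNode("d"),
    makeNode("e")
  ]),
  makeNode("h", [makeNode("i"), makeNode("j")])
]);
var doc = makeNode("#document", [a]);

var g_sVisitOrder = "";

function breadthFirstTraversal(oNode) {
  if (oNode != null && oNode.hasChildNodes()) {
    var theNode;
    // First loop: process every child of this node, left to right.
    for (theNode = oNode.firstChild; theNode != null; theNode = theNode.nextSibling) {
      g_sVisitOrder += theNode.nodeName + " ";
    }
    // Second loop: only now descend into each child in turn.
    for (theNode = oNode.firstChild; theNode != null; theNode = theNode.nextSibling) {
      breadthFirstTraversal(theNode);
    }
  }
}

breadthFirstTraversal(doc);
console.log(g_sVisitOrder); // a b h c d e f g i j
```

Note that this visits f and g before i and j, matching the order given in the walkthrough. A queue-based level-order traversal would instead finish a whole level first and visit i and j before f and g.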
02:37Okay so here we are in the code for the breadth-first example and you can see
02:41it's the same code that we looked in the depth-first example. The difference is
02:46that the function name has changed but everything else is the same, the sample
02:51document structure is the same, and I'm including the Sarissa library so I can
02:55do this cross-browser. So this is the function we need to write here.
02:59The loadXMLData is going to load our sample document structure using the parser
03:04and after this function gets finished, we are going to display an alert with
03:09this string in it. So let's write the code and the code is going to look
03:14something like this. We are going to write if (oNode != null && oNode.hasChildNodes()),
03:26then we are going to do a loop and the loop will say for (var theNode =
03:34oNode.firstChild; theNode != null; theNode = theNode.nextSibling).
03:53Okay, so inside this loop we are going to write g_sVisitOrder +=
04:07theNode.nodeName+ and some pretty printing to make it look good. So that's what
04:17takes care of visiting the nodes and processing them. Now we need to process
04:21all of the sub-child nodes. So we are going to do for and essentially it's the
04:27same loop. So I'm just going to copy this and paste that in and I'm going to
04:36put the braces in and I'm going to call breadthFirstTraversal(theNode).
04:45That's the function. So let's see how it works.
04:49I am going to view this in the browser and you could see that the nodes were
04:55visited in the sibling order. So we did node A and then we did node B and H and
05:01then C, D and E, which were all underneath B, then we did F and G, which are
05:07underneath C, and then we did I and J, which are underneath H. So things executed
05:13in the order that we expected. Before I go any further let's make sure it works
05:19in Firefox, and I can see that it did.
05:23Okay the same results. A, B, H then
05:26C, D, E then F, G and then I, J.
05:28Okay, so you are probably wondering if you can maybe change the order in which
05:32things operate. Instead of going left to right, can you go right to left? And
05:36the answer is yes you can. If you wanted to go right to left for example,
05:40instead of doing firstChild you would do lastChild and then instead of
05:44nextSibling you do previousSibling and so on, and that would give you
05:47the reverse order.
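As a rough illustration of that swap, here is the depth-first function with lastChild and previousSibling in place of firstChild and nextSibling. The function name depthFirstRightToLeft and the makeNode helper are hypothetical and exist only for this sketch; the tree is assumed to match the sample diagram.

```javascript
// Hypothetical stand-in for a DOM node so the sketch runs outside a browser.
function makeNode(name, children) {
  var kids = children || [];
  var node = {
    nodeName: name,
    previousSibling: null,
    lastChild: kids.length ? kids[kids.length - 1] : null
  };
  for (var i = 0; i < kids.length; i++) {
    kids[i].previousSibling = i > 0 ? kids[i - 1] : null;
  }
  return node;
}

// Assumed sample tree: a > (b > (c > (f, g), d, e), h > (i, j))
var a = makeNode("a", [
  makeNode("b", [
    makeNode("c", [makeNode("f"), makeNode("g")]),
    makeNode("d"),
    makeNode("e")
  ]),
  makeNode("h", [makeNode("i"), makeNode("j")])
]);

var g_sVisitOrder = "";

function depthFirstRightToLeft(oNode) {
  if (oNode != null) {
    // Start at the last child and walk backwards through the siblings.
    for (var theNode = oNode.lastChild; theNode != null; theNode = theNode.previousSibling) {
      depthFirstRightToLeft(theNode);
    }
    g_sVisitOrder += oNode.nodeName + " ";
  }
}

depthFirstRightToLeft(a);
console.log(g_sVisitOrder); // j i h e d g f c b a
```

The children are still processed before their parent; only the direction across each row of siblings changes.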
05:49Now that we have seen how to do breadth-first traversal and depth-first traversal
05:53let's move on to the rest of our DOM algorithms.
Using the isContainedBy() algorithm
00:00Another really useful DOM algorithm is the isContainedBy function, which
00:05determines if a node is contained within another node and this is another
00:09example that happens all the time in real-world DOM usage.
00:13The way the algorithm works, the isContainedBy function is given two
00:17parameters. The first parameter is the node that we want to check for
00:21containment, and the TestNode is either a specific instance of a
00:26node that we want to test as the container, or a string that
00:31gives the name of the type of node that might contain
00:35the first argument.
00:36For example, we could pass in an instance of a node here and then for oTestNode
00:41we can give it a string like body or div and this function will return true if
00:47the first argument is contained within the node of that type. Or we can pass it
00:51a specific instance of a node in which case, the function will return true if
00:56the first argument was contained within the specific node specified by
01:00the second argument.
01:01So the way it works is, regardless of whether we are testing for strings or
01:04objects, we start off by declaring a TmpNode and we set that to be the
01:08parentNode of the node that we are checking, the first argument. And then while
01:13that node is not null, we in the case of strings check to see if the nodeName
01:19is equal to the string that we were given, and if they match we return true
01:23because we found a match in that case. Otherwise, we just set the TmpNode to
01:27be the TmpNode's parent and we keep doing that inside the while loop.
01:31Now eventually parentNode is going to be null because there's no more parents
01:34left and that will cause this to be set to null, which will cause the while
01:38loop to terminate, and if that happens then this return false statement
01:42gets executed. In the case of a specific object instance, the same thing is done,
01:47except instead of comparing the node name of the temporary node, we just
01:52compare the two object instances to see if they match each other.
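A minimal sketch of the parent-walking check described above. The transcript builds separate loops for the string and object cases; this version folds them into one loop, which is a simplification of my own. makeNode is a hypothetical helper standing in for parsed XML, and the tree is assumed to match the sample document.

```javascript
// Hypothetical stand-in for a DOM node so the sketch runs outside a browser.
function makeNode(name, children) {
  var node = { nodeName: name, parentNode: null };
  (children || []).forEach(function (child) {
    child.parentNode = node;
  });
  return node;
}

// Assumed sample tree: a > (b > (c > (f, g), d, e), h > (i, j))
var g = makeNode("g");
var c = makeNode("c", [makeNode("f"), g]);
var b = makeNode("b", [c, makeNode("d"), makeNode("e")]);
var h = makeNode("h", [makeNode("i"), makeNode("j")]);
var a = makeNode("a", [b, h]);

// Returns true if oNode sits somewhere inside a node that matches oTestNode.
// oTestNode is either a node name (string) or a specific node (object).
function isContainedBy(oNode, oTestNode) {
  var oTmpNode = oNode.parentNode;
  while (oTmpNode != null) {
    if (typeof oTestNode === "string") {
      // Comparing node types by name.
      if (oTmpNode.nodeName === oTestNode) return true;
    } else if (typeof oTestNode === "object") {
      // Comparing against one specific instance.
      if (oTmpNode === oTestNode) return true;
    }
    // No match yet, so climb to the next parent in the chain.
    oTmpNode = oTmpNode.parentNode;
  }
  // Ran out of parents without finding a match.
  return false;
}

console.log(isContainedBy(g, "b")); // true
console.log(isContainedBy(g, "h")); // false
console.log(isContainedBy(g, a));   // true
```

The walk only ever moves upward through parentNode, so the cost depends on how deep the node is rather than on the size of the document.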
01:56Okay, so let's take a look at a live example in the code to see how it works.
02:01Here we are in the code. This is the same code that we have been using for the
02:07previous examples. Here is my string representing the sample document and it's
02:11the same as the examples we have been using so far. And this is the function we
02:15need to write here. This is called isContainedBy and when my window loads,
02:22there's a couple of tests that we are going to run.
02:24First, we are going to get a reference to the g tag, which if you look in the
02:28string up here, you will see that g is contained inside c, which is
02:32contained inside b, which is contained
02:37inside a, which is contained within the document. So we'll get
02:41the g tag and then we'll see if it's contained by a node of type b. We'll test it
02:47against a node of type h and then we'll test it against a specific instance,
02:51in this case, the documentElement itself.
02:53So this one should evaluate to true because g is in fact inside an element of
02:58type b. We are not comparing a specific instance. This one should evaluate to
03:02false because g is not inside a node of type h. h is way over here, so it's not
03:10containing g. This one should return true as well because the g tag is in fact
03:17inside this document. So let's go ahead and write the function.
03:21So we'll start off by checking to see if we are doing strings or objects. I'll write
03:24if (typeof (oTestNode) == "string"). Then we are comparing node types, else if
03:38(typeof (oTestNode)== "object"), then comparing specific object instance.
03:51All right, let's do the string case first. So we are going to write var oTmpNode.
03:57So we start off by getting the parent of the node we were given and while
04:03that's not null, so while we have a node to test against,
04:11we are going to keep on doing these comparisons. So we'll see in the case of
04:14string, if the name of this node matches the name that we were given to look for,
04:21then congratulations, we have got a match and we return true. Otherwise,
04:29we just get the next parent in the chain and that's eventually going to run out of parents.
04:38So if that happens, then we return false.
04:42Now for the object case, it's pretty much exactly the same algorithm except
04:47for the name comparison. We are not going to be comparing names; what we are
04:51going to be comparing is the specific instance. So we'll take the node name off.
04:56Let's go back and take a look. So this should return true, and then false,
05:02and then true. See what happens.
05:08Okay, so the first one is true because g is inside b and it's false because g
05:13is not inside h and that's true because g is inside the document. So let's go
05:18and check the Firefox case.
05:24And there it is true, and false, and true, same result.
05:29So, now you know how to check to see if a node isContainedBy another node,
05:34either a type or a specific instance. Let's move on to the next example.
Using the containsNode() algorithm
00:00The containsNode algorithm is also a really useful algorithm to use in real
00:05world XML processing. The containsNode algorithm takes a node and sees if it
00:11contains another node of a given type or a specific instance and the way it
00:16works is it's given two parameters, this node here and the TestNode and we are
00:22going to check to see if the first parameter contains a node either of the
00:26given type in the case of TestNode being a string or if TestNode is an object,
00:32the specific node referred to by TestNode.
00:35So for example, we can pass in for TestNode the string div and see if Node
00:41contains a type of tag named div or paragraph or table or whatever. Or we can
00:46give it a specific node and we can ask, hey does oNode contain the specific
00:52node that we are talking about over here in oTestNode? Let's take a look at how
00:56the algorithm works. We default our local variable bFound to false
01:01because we assume that nothing is going to be found.
01:03And then we check to see if TestNode is of type string. And if it is, we are
01:08comparing the node name of the node that we were given to the string to see if
01:13they match and if they do, we return true because we have the match. On the
01:18other hand, if oTestNode is an object then we are checking to see if oNode
01:23contains the specific oTestNode. And in this case, we don't compare the node
01:28name, we compare the node itself against the TestNode we were given and if they
01:32match then we return true. If there's no match then we need to do our
01:36containsNode function over again. Only this time we need to process all the
01:41children contained within oNode.
01:43And this is where we see the appearance of our real-life depth-first traversal
01:49algorithm, which we talked about at the beginning of this chapter. For each one
01:53of the child nodes contained underneath oNode, we are going to check to see if
01:57it contains the node that we are looking for.
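The steps above can be sketched as follows. This is a minimal sketch under stated assumptions: makeNode is a hypothetical helper standing in for parsed XML, the tree is assumed to match the sample document, and the names oNode, oTestNode, theNode and bFound follow the transcript.

```javascript
// Hypothetical stand-in for a DOM node so the sketch runs outside a browser.
function makeNode(name, children) {
  var kids = children || [];
  var node = {
    nodeName: name,
    nextSibling: null,
    firstChild: kids.length ? kids[0] : null
  };
  for (var i = 0; i < kids.length; i++) {
    kids[i].nextSibling = i + 1 < kids.length ? kids[i + 1] : null;
  }
  return node;
}

// Assumed sample tree: a > (b > (c > (f, g), d, e), h > (i, j))
var b = makeNode("b", [
  makeNode("c", [makeNode("f"), makeNode("g")]),
  makeNode("d"),
  makeNode("e")
]);
var a = makeNode("a", [b, makeNode("h", [makeNode("i"), makeNode("j")])]);

// Returns true if oNode, or anything beneath it, matches oTestNode.
// oTestNode is either a node name (string) or a specific node (object).
function containsNode(oNode, oTestNode) {
  var bFound = false; // assume no match until one turns up

  if (typeof oTestNode === "string") {
    if (oNode.nodeName === oTestNode) return true;
  } else if (typeof oTestNode === "object") {
    if (oNode === oTestNode) return true;
  }

  // No match here, so search each child subtree, stopping at the first hit.
  for (var theNode = oNode.firstChild; theNode != null && !bFound; theNode = theNode.nextSibling) {
    bFound = containsNode(theNode, oTestNode);
  }
  return bFound;
}

console.log(containsNode(b, "g")); // true
console.log(containsNode(b, "h")); // false
console.log(containsNode(a, b));   // true
```

One consequence of comparing the starting node itself is that a node counts as containing its own type, so containsNode(b, "b") also returns true in this sketch.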
02:00Let's see the code in action because it's probably a little bit easier to
02:04understand that way. So I'm going to go ahead and switch over to the code. Here
02:09we are in the code. It's the same example code I have been using through all
02:12the examples up until now. It's a very small HTML file, as you can see. Up here,
02:18I have got my test data and the test string is the XML file we'll be executing
02:24against and this is the function that we need to write, this containsNode
02:27function right here.
02:29What we are going to do is execute a couple of test cases. We are going to get
02:33a reference to the b tag right here. You can see the b tag is right below the
02:39a. It's near the top of the document. And then we are going to check to see if
02:43the b tag contains a tag of type g, which you can see that it does. It's right
02:47there. We are going to check to see if it contains a tag of type h, which it
02:52does not. You can see that the h is all the way over here outside the b and
02:56then we are going to check to see if the document element contains the b tag,
03:03the specific one, not the type.
03:05So this one should return true because g is inside b. This one should return
03:10false and this one should return true because the document does contain that
03:14specific b tag. All right, so let's go ahead and write the code. We are going
03:18to start off with our local variable bFound and we'll set it to false because
03:23we are going to assume that no matches exist and that's what we are going to
03:27return from the function.
03:29So the first thing we are going to do is check to see if the type that we were
03:34given for the TestNode is a string. And if it is, we are doing the string
03:41comparison. Otherwise, if the type that we were given for oTestNode is an
03:50object we are doing specific object comparisons. And if that doesn't work,
03:56we are going to execute our loop, which does the depth-first traversal, and
04:06we are going to call containsNode again.
04:11So let's write the comparisons. First one is in the case of the string,
04:15I'm going to check the node name which all nodes have to see if it matches the
04:21TestNode we were given and if it does, we return true. And in the case of the
04:29object, we check to see if the object itself matches and if it does, we return
04:38true, otherwise we have to do this loop here. So we'll say var theNode, get the
04:47firstChild. Okay, then we need to make sure that the node is not null, because
04:58we have to have something to compare against.
05:01And we need to make sure that we haven't found anything yet. So if we have a
05:07node to search and we need to keep on searching because we haven't found
05:10anything yet, then we are going to do the function call and we are going to
05:14proceed to the next sibling down the line if we have to. Let me throw on the
05:23parameters here. This is theNode and oTestNode.
05:28Okay, so we have our recursive function call, we have got our test in place and we
05:33have our comparisons in place and we are returning the ultimate result right
05:37here. So once again we are going to see if b contains a g and an h and then we
05:47are going to see if the document element contains this specific b tag. So we
05:51should have true, false and true. So let's go ahead and view that in the
05:56browser. true, false and true. That works. Let's do it in Firefox and there we
06:11go true and false and true.
06:16Okay, so now you know how to find out if a node contains another node, let's
06:22look at our next example.
Using the hasSibling() algorithm
00:00This real-world XML algorithm is called hasSibling and you basically use this one
00:05to determine if a node has a sibling node that either has a name of a given
00:10type or a specific sibling. That is, a node that exists at the same level as the
00:16node we are interested in comparing on either side of it. So let's see how it works.
00:21So hasSibling takes two arguments: oNode and oTestNode. So oNode is the one
00:26that we are interested in testing against and oTestNode is either going to be a
00:31string, which indicates a type of node that we are interested in finding, or
00:36it could be an object, which indicates a specific instance of a node that we are looking for.
00:41So for example, using this function we can check to see if a paragraph has a
00:46sibling of another type, like another paragraph or a div or something, or we can
00:50check to see if a node has a specific sibling in mind. You might want to check
00:56to see if a button control has an adjoining edit field, for example.
01:02Let's take a look at how it works. We start off declaring a temporary variable
01:05that holds the previous sibling of the node that we are looking at. So
01:10the first thing we are going to do is search to the left, then we are going to
01:13search to the right if we don't have any matches. So while we have a node to
01:18compare against, we check to see if the type of argument that we were given for
01:22test node is a string and if it is, then we compare the node name against the test node.
01:27Otherwise, if it's an object, we are looking for a specific instance.
01:30In that case, we compare the two objects together and return true if they match.
01:34If we don't have a match, then we simply get the previous sibling and this will
01:39eventually become null if we run out of siblings and we have no matches, which
01:42will cause oTmpNode to become null and then this while loop will fall through.
01:47When that happens, we look the other way. We start looking at nextSibling. Now we are
01:50going to search to the right and the whole process starts all over again. So,
01:54let's look at this in the code and see how it works.
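The steps just described can be sketched roughly as follows. This is a minimal sketch rather than the course's exercise file: the helper makeSiblings is hypothetical and builds plain objects that stand in for DOM nodes, so the example can run outside a browser. Only the nodeName, previousSibling and nextSibling properties are assumed.

```javascript
// Sketch of hasSibling as described above. oTestNode is either a string
// (a node name to look for) or an object (a specific node instance).
function hasSibling(oNode, oTestNode) {
  // Search to the left first.
  var oTmpNode = oNode.previousSibling;
  while (oTmpNode) {
    if (typeof oTestNode == "string") {
      if (oTmpNode.nodeName == oTestNode) return true;
    } else if (typeof oTestNode == "object") {
      if (oTmpNode == oTestNode) return true;
    }
    oTmpNode = oTmpNode.previousSibling;
  }
  // No match on the left, so search to the right.
  oTmpNode = oNode.nextSibling;
  while (oTmpNode) {
    if (typeof oTestNode == "string") {
      if (oTmpNode.nodeName == oTestNode) return true;
    } else if (typeof oTestNode == "object") {
      if (oTmpNode == oTestNode) return true;
    }
    oTmpNode = oTmpNode.nextSibling;
  }
  return false;
}

// Hypothetical stand-in for DOM nodes: plain objects linked as siblings.
function makeSiblings(aNames) {
  var aNodes = aNames.map(function (sName) {
    return { nodeName: sName, previousSibling: null, nextSibling: null };
  });
  for (var i = 0; i < aNodes.length; i++) {
    aNodes[i].previousSibling = aNodes[i - 1] || null;
    aNodes[i].nextSibling = aNodes[i + 1] || null;
  }
  return aNodes;
}

var aSiblings = makeSiblings(["c", "d", "e"]);
var oC = aSiblings[0];
var oE = aSiblings[2];

console.log(hasSibling(oC, "e")); // true: a sibling named e exists
console.log(hasSibling(oC, "i")); // false: no sibling named i
console.log(hasSibling(oC, oE));  // true: this specific node is a sibling
```

In a browser the same function should work on real DOM nodes, since they expose the same three properties.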
01:56Okay, so here we are in the code and we need to write the hasSibling function.
02:03You can see that I'm using the same example file that I have been using all
02:08along. There's the sample XML code up there. So what we are going to do in this
02:12example is get a reference to the c tag, which is right here. We are going to
02:17check to see if the c tag has a sibling of type e, of type i and
02:24then we're going to see if it has a specific instance of e next to it.
02:30So we are going to get a reference to the e tag that's up here and check to see
02:35if this e tag is a sibling of it. So you can see that the c tag here does in fact
02:42have siblings. It has two of them actually. There's a c, a d and an e, they are all
02:45at the same level. i is all the way over here so it's not a sibling.
02:49So that should return false and since we are getting the specific instance of e that
02:55we compared up here, this one should also return true. So in this case it says,
03:01hey, does my node have a sibling of type e? And this one says, hey, does my node
03:05have this specific node as a sibling?
03:08So let's go ahead and write the hasSibling function. What we are going to do is
03:12declare our temporary variable and this is going to hold the node that we do
03:18our comparing against. And we are going to set it to be the previous sibling to
03:23start with and while we have an oTmpNode to compare against, we are going to do
03:36the comparisons. So if the type of oTestNode that we were given is a string,
03:44we are going to do a string comparison. Otherwise if we were given an object...
03:58we are going to do an object comparison. And if no match happens, then we'll just
04:05simply move on to the previous sibling.
04:13Okay, and let me close this off right there and move
04:20this up to the right level.
04:25All right, now if this does not result in a search match, then we start on the
04:31other side looking at next siblings. So in that case, we are going to set
04:35oTmpNode = oNode.nextSibling and in this case, the logic of the top half just
04:46repeats again. So I'll copy that, paste it down here and in this case,
04:52we are not looking for the previousSibling anymore. We are looking for the nextSibling
04:56and in the case of strings we are going to compare the temp node's name.
05:01So if (oTmpNode.nodeName == oTestNode) then return true and in the case of objects,
05:14we are not comparing the nodeName. It's just the object itself and the same thing
05:22down in the nextSibling case. Just copy and paste those guys
05:33and if this while loop falls to the bottom, then we just need to return false.
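Since the two loops differ only in the direction they walk, one possible variation is to factor the shared logic into a helper that takes the direction as a property name. This is a sketch of that design choice, not what the course writes; scanSiblings and makeSiblings are hypothetical names, and plain objects stand in for DOM nodes.

```javascript
// Hypothetical helper: walk siblings in one direction, given by the
// property name "previousSibling" or "nextSibling".
function scanSiblings(oStart, sDirection, oTestNode) {
  for (var oTmpNode = oStart[sDirection]; oTmpNode; oTmpNode = oTmpNode[sDirection]) {
    var bMatch = (typeof oTestNode == "string")
      ? oTmpNode.nodeName == oTestNode   // compare by node name
      : oTmpNode == oTestNode;           // compare by object identity
    if (bMatch) return true;
  }
  return false;
}

// Same contract as before: look left first, then right.
function hasSibling(oNode, oTestNode) {
  return scanSiblings(oNode, "previousSibling", oTestNode) ||
         scanSiblings(oNode, "nextSibling", oTestNode);
}

// Hypothetical stand-in for DOM nodes: plain objects linked as siblings.
function makeSiblings(aNames) {
  var aNodes = aNames.map(function (sName) {
    return { nodeName: sName, previousSibling: null, nextSibling: null };
  });
  for (var i = 0; i < aNodes.length; i++) {
    aNodes[i].previousSibling = aNodes[i - 1] || null;
    aNodes[i].nextSibling = aNodes[i + 1] || null;
  }
  return aNodes;
}

var aRow = makeSiblings(["c", "d", "e"]);
var oD = aRow[1];

console.log(hasSibling(oD, "c")); // true, found to the left
console.log(hasSibling(oD, "e")); // true, found to the right
console.log(hasSibling(oD, "d")); // false, the node itself is never examined
```

The copy-and-paste version is easier to follow line by line; the helper version keeps the comparison logic in one place, so a fix only has to be made once.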
05:41Looks like we are ready to give this a spin. Let's go ahead and bring this up
05:44in the browser. So remember, we are looking for true, false and true.
05:54There's true, there's false and there's true. Let's try it with Firefox.
06:03There's true, false and true. Now you know how to check to see if a node has a sibling of a
06:10given type or a specific instance. So now we are going to move on to our last example.
Using the getElementsByAttrVal() algorithm
00:00Okay. So for this last example we are going to write another algorithm that
00:04uses a depth-first traversal, and this is a really useful function.
00:10It retrieves all the elements that have an attribute that matches a given attribute
00:14value and it's called, oddly enough, getElementsByAttrVal.
00:19It works by taking three arguments. There's the node that we are interested in,
00:24that's the starting point, and we are going to look for elements inside oNode
00:28here that have an attribute named sAttName with a value that's equal to
00:35sAttVal right here.
00:37The way it works is we have an internal array here named aNodes and we also
00:42have a locally defined function inside our outer function here. The locally
00:47defined function is called processNodes. This is our depth-first traversal
00:51algorithm that's going to look through all of the nodes and check to see if
00:56they in fact have an attribute that matches the value we are looking for.
00:59Each time we find a match, we are going to add that node to this internal array
01:05right here, and when we are all done, we are going to return that array back to the caller.
01:10This is a really powerful function. You can use it to quickly build up lists of
01:15nodes that have attributes that have a given value. It also illustrates
01:19the concept of having a nested function inside of another function.
01:23Let's have a look at the code in action. So here we are in the sample code.
01:29This is pretty much the same sample code that I have been using up until now
01:32with a major difference. You can see I have gone up to the testXML data here
01:39and I have added some attributes to some of the elements.
01:44So I have added attribute type='test' to the b, the g, the i elements, and on
01:54the j element I have added type='blah'. So for testing, we are going to be
02:00looking for all the elements that have type='test'.
02:06So down here in the testing area, you can see that what I'm doing is
02:11I'm getting a reference to the top level a tag, that's the root of the document,
02:17and I'm calling getElementsByAttrVal on the top level tag and I'm looking for
02:23elements that have a type attribute equal to the string test.
02:28I am going to have a string variable down here. It says, "Nodes with attribute
02:32'test': "; and then I'm going to have a loop that's going to build up a list of
02:36nodes that match the result. So this result right here that's returned by
02:41getElementsByAttrVal, this is going to be an array of all the elements that
02:45match this criteria right here.
02:48Let's go ahead and write the code. So I'm going to declare my local array and
02:56initialize it to being empty, and then I'm going to write my internal function.
03:01This is my internal processNodes function. I'm going to call this from within
03:08my getElementsByAttrVal function. So I need to call processNodes and this is
03:14what kicks off the process here.
03:17At the end of the day, we are going to return the aNodes array. And for
03:23processNodes, we are going to pass in oNode and we are going to pass in the
03:27aNodes array, because that's going to be modified and then we are going to pass
03:32in the attribute name and value that we are looking for.
03:39So the processNodes function, that's going to take these arguments right here.
03:45So let me go and Copy those guys onto this function right here, and I'm going
03:52to change the names slightly just to avoid confusion while we are reading the code.
03:56So I'm going to call this aNodeList, and that should be good enough. So inside
04:04processNodes, we are going to check to see if (oNode.nodeType, and this is
04:11something that all nodes have. They all have a nodeType. We are going to
04:14compare that against the constant value on the Node class,
04:18Node.ELEMENT_NODE, and the reason we are going to do that is because only
04:22element nodes can have attributes.
04:25So there's really no point in comparing other kinds of nodes, like text nodes or
04:29comments or CDATA sections. All of these are XML node types, but they can't have
04:35attributes, so we might as well exclude them.
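As a small illustration of the node type check, the sketch below filters a list of nodes down to elements only. The numeric values are the nodeType codes defined by the DOM specification; they are written out as local constants here as an assumption, so the snippet does not depend on a global Node object being available. The sample nodes are plain objects made up for the example.

```javascript
// nodeType codes as defined by the DOM specification.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;
var CDATA_SECTION_NODE = 4;
var COMMENT_NODE = 8;

// Only element nodes carry attributes, so only they are worth testing.
function canHaveAttributes(oNode) {
  return oNode.nodeType == ELEMENT_NODE;
}

// Made-up sample nodes, one of each kind mentioned above.
var aSamples = [
  { nodeName: "b", nodeType: ELEMENT_NODE },
  { nodeName: "#text", nodeType: TEXT_NODE },
  { nodeName: "#cdata-section", nodeType: CDATA_SECTION_NODE },
  { nodeName: "#comment", nodeType: COMMENT_NODE }
];

var aElementNames = aSamples.filter(canHaveAttributes).map(function (oNode) {
  return oNode.nodeName;
});

console.log(aElementNames); // [ 'b' ]
```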
04:37Then we are going to have a local variable called sAttr and we are going to get
04:44the attribute that we are looking for from the node, and we do that by calling
04:49the getAttribute DOM function with the sAttName. If this is not equal to null,
04:56then we need to check to see if it's equal to the value that we are looking for.
05:02We do that by saying if (sAttr == sAttVal). And if they match, then we say
05:14aNodeList.push() and we are going to push the node onto the array list.
05:24Now that we have done that, we have to recursively call the function to do our
05:29depth-first traversal. So we are going to say for (var n = oNode.firstChild;
05:42n!= null; n = n.nextSibling). And inside this for loop, we are going to call
05:57processNodes again and processNodes will be given the n argument, because that's
06:04the child node that we are going to process.
06:07We have to pass in the NodeList so that we can keep on adding results and
06:11we have to pass in the attribute we are looking for and the value that we are looking for.
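Putting the pieces together, the function might look roughly like the sketch below. It follows the walkthrough (a local array, a nested processNodes function, a nodeType check, getAttribute, and a recursive loop over child nodes), but it is an approximation rather than the course's exercise file. The helpers el and text are hypothetical and build plain objects in place of DOM nodes, and the small tree is invented for the example; it is not the course's sample XML.

```javascript
var ELEMENT_NODE = 1; // nodeType code for elements in the DOM specification
var TEXT_NODE = 3;    // nodeType code for text nodes

function getElementsByAttrVal(oNode, sAttName, sAttVal) {
  var aNodes = []; // collects every matching element

  // Nested depth-first traversal: test this node, then recurse into children.
  function processNodes(oNode, aNodeList, sAttName, sAttVal) {
    if (oNode.nodeType == ELEMENT_NODE) {
      var sAttr = oNode.getAttribute(sAttName);
      if (sAttr != null && sAttr == sAttVal) {
        aNodeList.push(oNode);
      }
    }
    for (var n = oNode.firstChild; n != null; n = n.nextSibling) {
      processNodes(n, aNodeList, sAttName, sAttVal);
    }
  }

  processNodes(oNode, aNodes, sAttName, sAttVal);
  return aNodes;
}

// Hypothetical stand-ins for DOM nodes so the sketch runs outside a browser.
function el(sName, oAttrs, aChildren) {
  var oEl = {
    nodeName: sName,
    nodeType: ELEMENT_NODE,
    firstChild: aChildren.length ? aChildren[0] : null,
    nextSibling: null,
    getAttribute: function (sAtt) {
      return Object.prototype.hasOwnProperty.call(oAttrs, sAtt) ? oAttrs[sAtt] : null;
    }
  };
  for (var i = 0; i + 1 < aChildren.length; i++) {
    aChildren[i].nextSibling = aChildren[i + 1];
  }
  return oEl;
}

function text(sValue) {
  return { nodeName: "#text", nodeType: TEXT_NODE, nodeValue: sValue,
           firstChild: null, nextSibling: null };
}

// Invented tree: b, g and i carry type="test"; j carries type="blah".
var oRoot = el("a", {}, [
  el("b", { type: "test" }, [el("g", { type: "test" }, []), text("hello")]),
  el("j", { type: "blah" }, []),
  el("i", { type: "test" }, [])
]);

var aResult = getElementsByAttrVal(oRoot, "type", "test");
var sNames = aResult.map(function (oNode) { return oNode.nodeName; }).join(", ");
console.log("Nodes with attribute 'test': " + sNames);
// prints: Nodes with attribute 'test': b, g, i
```

Because processNodes is nested inside getElementsByAttrVal, it could also read aNodes directly through its enclosing scope; passing the array in explicitly, as the walkthrough does, keeps the data flow visible.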
06:19At this point, we should be in a place where we can test this out. Let's look
06:24down here. So remember, starting at the a tag, we are looking to build a list
06:30of all the elements that have the type attribute set to 'test'. So if we look
06:35back up in the sample XML, that should be the b tag, the g tag, the i tag, and
06:48it should exclude the j, because it has a type attribute but it's not equal to 'test'.
06:52It's equal to 'blah'.
06:53Let's go ahead and run this in the browser and see what happens.
07:04So you can see nodes with attribute 'test' b, g, and i were all found, and j was excluded
07:11like it should have been.
07:13So let's try it in Firefox.
07:19Okay, same result. You can see nodes with attribute
07:22test b, g, and i, and j was excluded the way it should be.
07:26Okay. Once again, we have built a real-world DOM function that you can use in
07:31your real-world work. So please feel free to go ahead and use that code in your projects
07:36and that concludes this chapter.
Conclusion
Goodbye
00:00Okay, that concludes Real-World XML. I hope you enjoyed working along through
00:04the examples with me and I hope you learned a lot. You should have a good
00:08foundation now to go out and work with XML in the real world, especially since
00:11we saw how to use XML and JavaScript in the browsers.
00:15We walked through designing and implementing our own XML format. We took a look
00:20at some of the real-world XML formats that are out there and used today.
00:23We took a look at some real-world DOM algorithms that you can use in your pages
00:28and projects to make you more productive in your work environment.
00:31Hope you enjoyed yourself. Thanks for listening.

