Introduction
| Welcome| 00:00 | With all the data we have to deal with
as developers and designers, the world
| | 00:04 | needs a way to transmit it, store it
and describe it. Well, I have a 3-word
| | 00:09 | solution for you: Extensible Markup Language.
| | 00:12 | (Music playing.)
| | 00:15 | Okay, I'll make it three letters: XML.
I'm Joe Marini and this is Real-World XML.
| | 00:21 | I have spent the better part of my
career working in the web and graphics
| | 00:24 | industries developing
applications like Dreamweaver, Expression and
| | 00:28 | QuarkXPress. During this time, I have
seen XML become more and more important
| | 00:33 | for information exchange on the web.
| | 00:35 | Now in this title, I'll show you how
XML is used in the real world from common
| | 00:39 | formats like RSS and Atom to
technologies for processing XML data like XSLT.
| | 00:45 | I'll also show you how you can build
your own XML tag set and review some XML
| | 00:49 | design and development
techniques I have learned over the years.
| | 00:52 | I will be showing you these tools and
techniques with an eye to making them
| | 00:55 | applicable to your development needs,
so you can use what you have learned here
| | 00:59 | in your work environment. Now
let's get started with Real-World XML.
| Using the exercise files| 00:00 | If you are a premium member of the
lynda.com Online Training Library, or if
| | 00:05 | you are watching this tutorial on a disk,
then you have access to the Exercise Files
| | 00:09 | used throughout this title and
I have laid out the Exercise Files in this
| | 00:13 | folder format here where the chapter
number corresponds to the folder where the
| | 00:18 | corresponding example files are located.
| | 00:21 | Typically what I do is I provide, in
this case, sample files but for chapters
| | 00:27 | in which we provide exercises, you can
see that I have laid out the files using
| | 00:30 | a _start and _finished format. So if
you want to follow along with me during
| | 00:36 | the lesson, open up the file name that
has the _start in it and type along with me.
| | 00:41 | Or if you want to use the finished
version to just jump ahead and
| | 00:46 | see how things are done, you can just
open that file up in your editor and try
| | 00:49 | things out yourself.
| | 00:50 | Now if you are a monthly or annual
subscriber to lynda.com, you don't have
| | 00:55 | access to the Exercise Files, but you
can follow along with me on the screen
| | 00:58 | and just pause the movie to type in
the code as you see me typing it.
| | 01:03 | In most of the examples I scroll through
the document so you can see all the code. So,
| | 01:06 | you can just go ahead and pause
the movie and type the code in.
| | 01:09 | All right, let's get started!
| Tools for working with XML| 00:00 | XML is essentially just a text
format so you can use pretty much any text
| | 00:03 | editing tool to work with it. However,
there are some good tools for both
| | 00:07 | Windows and Macintosh that you can use
that provide some additional features.
| | 00:11 | Now, the tool that I have been using
in this course is Visual Web Developer,
| | 00:15 | Express Edition and that's a free tool
from Microsoft and I have provided the
| | 00:19 | URL for you to download it if you want
to try it out yourself on Windows.
| | 00:22 | For the example on which we edit XML and
integrate it into our website, I was using
| | 00:28 | Microsoft Expression Web. That's a
professional level product. It's also
| | 00:31 | available from Microsoft, but you can
also use programs like Adobe Dreamweaver
| | 00:36 | or whatever other professional editor you have.
| | 00:38 | On the Macintosh side, there are
programs like BBEdit and TextMate. Those are
| | 00:43 | really good editors and products like
WebScript and of course Dreamweaver works
| | 00:46 | on the Mac as well. The important
thing to remember is that XML is just text,
| | 00:50 | so you can use any text editor for it.
However, text editors that have features
| | 00:54 | like automatic indenting and syntax
coloring and IntelliSense are going to be
| | 01:00 | a lot more useful than just straight
text editors without those features.
| | 01:04 | Whichever tool you decide to use
though, just make sure it's comfortable.
| | 01:06 | That should be all you need to know,
so let's go ahead and get started.
1. The XML Landscape
| Reviewing XML| 00:00 | Let's begin by taking a quick review of XML,
what it is and what it looks like.
| | 00:05 | Now if you haven't seen XML before or if
you are new to the subject, then I suggest
| | 00:10 | you take a look at another lynda.com
title that's available in the Training Library
| | 00:15 | called XML Essential Training
and I do that title as well.
| | 00:19 | It's really a foundational title. So
if you are new to this and you haven't
| | 00:23 | seen it before, I highly suggest you go
check that title out first because that
| | 00:28 | will provide the foundational
knowledge that you will need to get through
| | 00:31 | the rest of this title.
| | 00:32 | In this course, we'll be using concepts
that are introduced in XML Essential Training.
| | 00:37 | So you are going to want to
make sure that you have those concepts
| | 00:41 | and ideas under your belt before
you tackle a title like this one.
| | 00:45 | XML is the Extensible Markup Language.
It's tag-based like HTML is. So if you
| | 00:51 | are familiar with HTML code, then XML
will look very familiar to you. XML is
| | 00:56 | used to describe data and the structure
of that data. Because XML is extensible,
| | 01:03 | you get to make your own tags up.
| | 01:06 | The benefits of XML are numerous.
I have called out a few here. First,
| | 01:10 | XML allows you to separate the content
of a document from how it's presented.
| | 01:15 | XML does not contain by itself any notion
of how data should be presented to the
| | 01:20 | person reading it or consuming it.
| | 01:22 | You can also create tag sets that
target specific problems. In fact, we'll take
| | 01:26 | a look at how to do that in this course.
XML stores information in a way that
| | 01:30 | people can easily understand. So even
though XML may be intended to be consumed
| | 01:35 | by another computer system, or a machine,
it's still in a format that a person
| | 01:40 | can read and understand, given some time
and depending on how large the XML file is.
| | 01:45 | XML allows you to exchange data among
disparate systems using technologies, for
| | 01:50 | example, like web services. So two
systems that may never have been designed to
| | 01:55 | talk to each other can use
XML to exchange data among them.
| | 01:59 | Finally, XML is an open format and
it's text based. So it can be processed by
| | 02:04 | any program that happens to be aware
of XML. Now that's not to say that XML
| | 02:09 | doesn't have some drawbacks. XML,
for example, is not good for storing
| | 02:14 | large amounts of data.
| | 02:15 | In some cases, performance can also be
slower than other methods of storing and
| | 02:20 | retrieving data, like a binary format,
for example. If you were to store say a
| | 02:24 | Photoshop file as an XML file, that
file would probably be a lot less efficient
| | 02:29 | than the binary format that you could
store the image in. XML might not be
| | 02:34 | the best format for representing certain kinds
of data, audio, video, that kind of binary stuff.
| | 02:40 | Finally, some parts of XML, like
namespaces, are kind of difficult to
| | 02:44 | understand and hard to work with. Now
XML documents must be what's known as
| | 02:49 | well-formed. They always have a single
root tag, just like HTML does, and tags
| | 02:55 | have to be properly nested.
| | 02:57 | In other words, you have to have an A
tag completely inside of a B tag or to
| | 03:02 | take an HTML example, you can't have a
tag that's like bold and then italic and
| | 03:07 | then close bold, and then close italic. The
XML parser won't let you get away with that.
| | 03:11 | Unlike regular HTML, empty tags
always have to end with a slash inside the
| | 03:17 | closing angle bracket, just like in
XHTML for example. Attributes have to be
| | 03:21 | inside quotes and they can't be
minimized. If you are using XHTML,
| | 03:24 | you're already familiar with these concepts.
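The well-formedness rules just described can be tried out with a small sketch. It assumes Python's standard-library `xml.etree.ElementTree` parser as a stand-in for "the XML parser"; the sample strings and the helper name `is_well_formed` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, one per rule mentioned in the lesson.
samples = {
    "properly nested": "<p><b><i>text</i></b></p>",
    "overlapping tags": "<p><b><i>text</b></i></p>",
    "empty tag with slash": "<p><br/></p>",
    "empty tag without slash": "<p><br></p>",
    "unquoted attribute": "<p class=note>text</p>",
    "minimized attribute": "<option selected>text</option>",
    "two root tags": "<a/><b/>",
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the properly nested sample and the self-closed empty tag should be accepted; the parser refuses the rest outright rather than guessing at a repair the way an HTML browser would.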
| | 03:26 | XML documents can be what's known as
valid. In other words, you can take an XML
| | 03:32 | document and validate it against a
schema or Document Type Definition to make
| | 03:38 | sure it conforms to certain rules.
This is a sample XML file. In fact, we'll be
| | 03:44 | using this sample XML file a
couple of times in this course.
| | 03:47 | Now looking at the sample XML file, you
can see that I have defined a few tags,
| | 03:51 | like BusinessCard and name. So this is
an XML file that represents the contact
| | 03:56 | you might find on a business card.
Here we have some phone numbers and an
| | 04:00 | e-mail address. The phones
have certain attributes.
| | 04:03 | So you can see that XML can be used to
create tag sets that solve a particular
| | 04:09 | type of problem. In this case, we
needed a way to represent contact data on a
| | 04:14 | business card. Later on in the course,
we'll see how to do something like this
| | 04:17 | in a real life web setting.
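For readers without the slide in front of them, here is a rough sketch of what such a business card document might look like and how a program could read it. Only the `BusinessCard` and `name` tags come from the lesson; the `phone` and `email` tag names, the `type` attribute, and all of the values are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical business card; only BusinessCard and name are named in the lesson.
CARD_XML = """<?xml version="1.0"?>
<BusinessCard>
  <name>Jane Example</name>
  <phone type="mobile">555-0100</phone>
  <phone type="work">555-0101</phone>
  <email>jane@example.com</email>
</BusinessCard>"""

card = ET.fromstring(CARD_XML)

# Element text holds the data; attributes describe it (here, the kind of phone).
name = card.findtext("name")
phones = {phone.get("type"): phone.text for phone in card.findall("phone")}
email = card.findtext("email")

print(name, phones, email)
```

Nothing in the document says how the card should be displayed, which is the separation of content from presentation mentioned earlier.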
| | 04:19 | Okay, so that's a quick review of XML.
Again, if you are new to all this,
| | 04:24 | I highly encourage you to go check out
the XML Essential Training title that's
| | 04:28 | also available at lynda.com
before continuing further.
| Understanding XML usage today| 00:00 | XML is used in a number of different
real-world settings today, but you can
| | 00:05 | break down how XML is used into three
main categories. Let's cover those now.
| | 00:09 | In the data extraction usage, you are
taking XML and using it to represent some
| | 00:14 | type of data format.
| | 00:16 | Now most modern databases can provide
data in XML format today. All the large ones,
| | 00:21 | like Oracle and Microsoft and
MySQL, IBM for example. They can all export
| | 00:28 | data using XML. The modern browsers
can also load XML from different data
| | 00:33 | sources. You can provide a URL or from
the local file system. We'll be seeing
| | 00:39 | an example of that later in this course.
| | 00:41 | There are also technologies like XPath
and XQuery that are used for querying
| | 00:46 | XML data in XML documents. XPath is a
very lightweight form of querying and
| | 00:51 | it takes a syntax that looks a little bit like
Directory Paths that you might be familiar with.
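To give a feel for that path-like syntax, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath. The catalog document and its tag names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog used only to demonstrate path expressions.
doc = ET.fromstring("""
<catalog>
  <book id="b1"><title>XML Basics</title><price>20.00</price></book>
  <book id="b2"><title>XSLT in Practice</title><price>35.50</price></book>
</catalog>""")

# Step through children much like folders in a directory path.
titles = [t.text for t in doc.findall("./book/title")]

# A predicate in square brackets filters on an attribute value.
second_title = doc.find("./book[@id='b2']/title").text

# A double slash searches at any depth below the starting point.
prices = [p.text for p in doc.findall(".//price")]

print(titles, second_title, prices)
```

Full XPath offers functions and axes that this subset leaves out, so treat the sketch as an introduction to the notation rather than a tour of the language.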
| | 00:55 | XQuery is a bit more complex. XQuery
is to XML what SQL is to structured
| | 01:03 | relational data. We are not going to
cover that in this course because it's
| | 01:07 | fairly advanced and could
easily fill a title all on its own.
| | 01:10 | In the data preparation and processing
area of usage, you take the XML data
| | 01:15 | you have been given and prepare it for
presentation and process it further.
| | 01:19 | So for example, if you have an XML file that
represents a series of products or items
| | 01:26 | and these items might have prices,
you might do some data preparation or
| | 01:31 | processing to run through all the tags
and add up all the prices to arrive at a
| | 01:35 | total for example, or count the
number of items in a file for some reason.
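The totaling and counting just described might look like the following sketch, which uses Python's standard library. The `order`, `item`, `name`, and `price` tags and their values are hypothetical.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical list of items with prices.
ORDER_XML = """<order>
  <item><name>Notebook</name><price>4.50</price></item>
  <item><name>Pen</name><price>1.25</price></item>
  <item><name>Backpack</name><price>39.99</price></item>
</order>"""

order = ET.fromstring(ORDER_XML)
items = order.findall("item")

# Count the item tags, then add up the text of every price tag.
item_count = len(items)
total = sum((Decimal(item.findtext("price")) for item in items), Decimal("0"))

print(f"{item_count} items, total {total}")
```

`Decimal` is used instead of `float` so that currency amounts add up exactly.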
| | 01:39 | Technologies for doing this are XML
Schema, which ensures that XML data in a
| | 01:44 | document conforms to certain rules. So
for example, a certain tag of a certain
| | 01:50 | type has to be inside another tag of a
certain type, or a tag that indicates a
| | 01:55 | price has to contain only numbers and a
period for the decimal place and so on.
| | 02:01 | The XSLT technology, which stands for
Extensible Stylesheet Language Transformations,
| | 02:06 | is used to transform XML into other
syntaxes like ASCII or PDF or HTML or more XML.
| | 02:15 | DOM and SAX are two different
programming methods used for scripting XML.
| | 02:21 | In this course, we'll use mostly the
DOM because the browsers don't use SAX.
| | 02:26 | For data presentation, you can use a
combination of CSS, XSLT or DOM scripting
| | 02:32 | in order to present data to the user.
These are not mutually exclusive. You can
| | 02:37 | use a combination of any of these three.
During this course we'll do this a few times.
| | 02:42 | Okay, let's take a look at the XML
landscape as it stands today.
| | 02:46 | In the data storage and exchange side,
we have some established standards, like XHTML
| | 02:52 | and RSS and SVG. We'll cover RSS a
little bit later in this course. RSS is
| | 02:58 | essentially a way of syndicating
content that changes over time.
| | 03:03 | You're probably familiar with RSS
by reading blogs, for example.
| | 03:06 | SVG stands for Scalable Vector Graphics.
That's an XML syntax that describes
| | 03:12 | drawings made using vectors, such as
Illustrator files. There are a number of
| | 03:12 | emerging standards, however. Atom,
for example, is a standard for publishing
| | 03:23 | syndicated content just like
RSS is. Only it's slightly richer.
| | 03:26 | Then there are standards like RDF and
XForms, which we won't get into in this course,
| | 03:30 | but solve their own types of
business problems. XForms, for example, is
| | 03:34 | a way of processing forms on the web.
RDF stands for Resource Description Framework.
| | 03:40 | Then there's XHTML 5, which is an
emerging standard that aims to standardize
| | 03:45 | the way that web applications are built.
On the data processing side, there's
| | 03:49 | DTD and Schema, which we talked
about earlier and which enforce rules.
| | 03:53 | There's DOM and SAX, which we also
mentioned. Then there's technologies like
| | 03:57 | XMLHttpRequest, which is the foundation
of AJAX, and XSLT and XPath, which
| | 04:04 | we have covered earlier.
| | 04:05 | We also talked a little bit about XML
Query for querying XML data and then
| | 04:10 | there's the XLink and XPointer
specifications, which are also emerging. Again,
| | 04:15 | we won't cover those in this course
because they are fairly advanced, but the idea
| | 04:19 | is that these provide more advanced
ways of linking XML documents together far
| | 04:23 | beyond what the standard
HTML link gives us today.
| | 04:26 | Okay, so with this in mind, let's
take a look at some important XML technologies.
| Important XML technologies| 00:00 | The first technology we'll look at and
talk about is XPath. Now XML Path
| | 00:04 | Language is what XPath stands for and
it's used to extract data from inside an
| | 00:08 | XML file. It uses a path-like syntax
similar to directory or folder paths.
| | 00:14 | If you are not familiar with XPath, we
cover it a little bit in the XML Essential
| | 00:18 | Training title. So you might want to
refer to that title to get familiar with it.
| | 00:21 | XSLT is also another XML based
language for defining style sheets.
| | 00:27 | As I mentioned earlier, it's a styling
language that takes an XML file and
| | 00:31 | transforms it into something else,
like HTML or PDF or some other file format.
| | 00:37 | We talked a little bit about SAX and DOM.
These are methods of processing data
| | 00:41 | and Schema. Schema is a way of
expressing rules for a given XML syntax.
| | 00:46 | Now you may be already familiar with Document
Type Definitions. You can think of Schema
| | 00:50 | as the next step beyond DTDs. They
define things like what tags are or are not
| | 00:56 | allowed and where they can go, what kinds
of data they contain, so on and so forth.
| | 01:00 | In this title we'll take a look at
some formats like RSS, which stands for
| | 01:05 | Really Simple Syndication. This
provides data in discrete chunks that can be
| | 01:10 | read individually. You have probably
seen this in blogs or news sites or other
| | 01:15 | syndicated content. If you own a TiVo,
for example, the TiVo actually makes the
| | 01:19 | items that are recorded on your
TiVo available as an RSS file.
| | 01:24 | So ATOM is another format for
syndicating content and like RSS, it provides
| | 01:28 | content in a richer syndicated fashion.
It was adopted back in 2005 and we'll
| | 01:33 | take a deeper look at the
ATOM format in this course.
| | 01:35 | There is also better support
for XML built into the browsers.
| | 01:40 | Modern browsers, like Internet Explorer 7
| | 01:40 | and higher and Firefox 3 and higher,
| | 01:45 | provide really good support for
technologies like the DOM and XPath and XSLT
| | 01:50 | and some newer things like
serialization and parsing. We'll get into that later
| | 01:55 | when we get to the
chapter on XML and the browsers.
| | 01:58 | Okay, so now that we have seen what
important technologies there are in the XML
| | 02:02 | world today and we have seen the XML
landscape and how XML is used, let's get
| | 02:07 | started and take a look at
some real-world XML formats.
2. Real-World XML Formats
| Understanding the Sitemap and Sitemap index formats| 00:00 | Before we jump in and start designing
our own XML format, I thought it would be
| | 00:04 | instructive to take a look at some of
the real-world XML formats that are in
| | 00:08 | use today. We'll start out by looking at
the Sitemap and the Sitemap Index formats.
| | 00:15 | These formats provide a way for web
masters to inform search engines about the
| | 00:20 | contents of their sites that are
available for searching or crawling by the
| | 00:26 | search engines. The Sitemap and
Sitemap index currently enjoy pretty wide
| | 00:31 | support. They are supported by Google
and Yahoo and Microsoft search engines,
| | 00:35 | which pretty much constitute the bulk
of the search engine traffic that's out
| | 00:40 | there today.
| | 00:41 | Now I want to point out that Sitemap
and Sitemap Index don't affect the way
| | 00:45 | that your sites appear or are ranked in
the search engines. The whole point of
| | 00:52 | these file formats is to tell the
search engines how they can crawl your site
| | 00:57 | more intelligently. This is not about
Search Engine Optimization or anything like that.
| | 01:03 | Each Sitemap is an XML file and that
XML file lists information about each URL
| | 01:10 | that is available on your site. It
lists information like when it was last
| | 01:15 | updated, and how often it changes, and
so on and so forth. Now as I said, this
| | 01:20 | does not guarantee that pages are
going to be included in search results or
| | 01:26 | that it's in any way going to affect
how your page gets ranked. The whole idea
| | 01:31 | here is that this is a way for your
site to inform the search engines about the
| | 01:37 | structure of your site, how they
should search the site, that kind of thing.
| | 01:41 | You can find out more information
about the Sitemap and the Sitemap Index
| | 01:46 | formats at the URL that you see here,
www.sitemaps.org. Okay, so each Sitemap
| | 01:54 | file contains a collection of tags that
define the URLs that the search engines
| | 02:01 | should care most about.
| | 02:03 | Now Sitemap files are limited to 10
Megabytes in size. So if you have to use
| | 02:09 | more than one Sitemap file, then
Sitemap index files are used to group multiple
| | 02:14 | Sitemap files together. You can
imagine for websites that have a lot of URLs,
| | 02:20 | such as say a large catalog shopping
site, they want to index all of the URLs
| | 02:26 | that are available. That can easily
exceed 10 Megabytes in size pretty quickly.
| | 02:30 | So the Sitemap index file is how
you group multiple Sitemaps together.
| | 02:35 | Ideally, you place these files at the
root of your website and you then either
| | 02:40 | include them in a robots.txt file or
you submit the site directly to the search
| | 02:46 | engines in order to let them know that
| | 02:46 | these files exist and the sitemaps.org
| | 02:52 | URL that I listed earlier has more
detailed information on how to do this.
| | 02:56 | These are all only just hints. The
search engines don't use this to affect your
| | 03:01 | site's search rankings.
| | 03:02 | Let's take a look at the tags available
in the Sitemap file. Each Sitemap file
| | 03:09 | has a set of tags, some of them are
required and some are not. This table lists
| | 03:14 | all of the tags that are in the
Sitemap file format. So you can see there are
| | 03:19 | six tags. So it's a pretty compact,
pretty focused file format that does one
| | 03:24 | job and does it well.
| | 03:26 | The urlset tag, the one at the top
here, it's required. It encapsulates the
| | 03:32 | file and it references the current
protocol standard. So this basically serves
| | 03:36 | as the root tag in any of the Sitemap
files. Urlset tags contain one or more
| | 03:44 | URL tags. This is the parent tag for
each URL entry. All the other tags in this
| | 03:51 | list are child tags of this url tag.
As you can see, it's also required.
| | 03:58 | Inside each url tag, there's one
required tag and that's the loc tag right
| | 04:04 | here. The loc tag stands for location
and it lists the URL of the page. The URL
| | 04:11 | has to begin with the protocol like HTTP.
If your web server requires it, then
| | 04:16 | it has to end with the trailing slash
on the URL. Some web servers require it
| | 04:22 | and some don't. The whole idea though
is that these URLs are going to be used
| | 04:27 | by the search engines to crawl your
site. So if your web server requires it,
| | 04:31 | then you have to include
them in these tags as well.
| | 04:33 | The rest of the tags are optional.
The lastmod tag indicates using a date
| | 04:40 | format when this URL was last modified.
Now this date should be in the W3C
| | 04:46 | Datetime format which you can look up
on the W3.org website. If you want,
| | 04:51 | you can just omit the time portion and
use the format of a four-digit year,
| | 04:57 | followed by a two-digit
month and a two-digit day.
| | 05:00 | The next tag, changefreq, indicates
the frequency that the page changes. It
| | 05:06 | provides basically general information
to search engines. Now this may or may
| | 05:06 | not correlate exactly to how often
they crawl over the page. Remember, this
| | 05:15 | file's purpose in life is to provide
hints to the search engines, they don't
| | 05:19 | necessarily denote solid rules
that the engines have to follow.
| | 05:23 | So you can put in values for this tag,
either always or hourly, daily, weekly,
| | 05:30 | monthly, yearly and never. So if you
place always in this tag, it means that
| | 05:36 | the page is always changing, it's
dynamic and it needs to be searched each and
| | 05:41 | every time as if it were a new page.
The never value, you should only use that
| | 05:46 | in cases of pages that have been
archived and don't need to be searched
| | 05:51 | anymore. Ironically enough, that may or
may not mean that search engines honor
| | 05:56 | that value. They may choose to search
pages listed as never anyway just in case
| | 06:01 | there are unexpected changes to
those pages. Again, these are hints.
| | 06:05 | Then finally, there's the priority and
that's also optional. This indicates the
| | 06:10 | priority of this particular URL
relative to the other URLs on your site.
| | 06:17 | You can place values from 0, meaning least
important, up to 1.0, which means most important.
| | 06:24 | The default priority, if you don't
specify this, is going to be 0.5. Meaning
| | 06:28 | it's kind of a middle priority. Now
this priority again does not affect how
| | 06:33 | your page gets listed in search
engine rankings. It just indicates how
| | 06:38 | important the file is relative to
the rest of the ones in your site.
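The allowed values for `changefreq` and `priority` described above are easy to check before a file is published. This is a minimal sketch in Python; the function name `check_url_entry` and the wording of the messages are invented for illustration, and a real project would more likely validate against the published schema.

```python
# Values the lesson lists for the changefreq tag.
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}
DEFAULT_PRIORITY = 0.5  # assumed by search engines when the tag is omitted


def check_url_entry(changefreq=None, priority=None):
    """Return a list of problems found in one url entry's optional tags."""
    problems = []
    if changefreq is not None and changefreq not in ALLOWED_CHANGEFREQ:
        problems.append(f"changefreq {changefreq!r} is not an allowed value")
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append(f"priority {priority} is outside the range 0.0 to 1.0")
    return problems


print(check_url_entry("weekly", 0.8))
print(check_url_entry("sometimes", 1.5))
```

Both tags are optional, so passing nothing at all is treated as valid.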
| | 06:42 | So this is what a sample Sitemap looks
like. You can see at the top, there's
| | 06:46 | the XML declaration. In XML version
1.0, this is optional but it's always a
| | 06:51 | good idea to declare it anyway. In 1.1,
this became mandatory but in XML 1.0
| | 06:57 | the XML version is not needed but I always
like to put it in because it's proper XML.
| | 07:02 | You can see here, here is the urlset at
the top of the page. It references its
| | 07:06 | namespace in case we wanted to include
this in another file, we wouldn't have
| | 07:10 | name collisions. Then inside the urlset,
you have a collection of URL tags.
| | 07:15 | You can see that each one of these guys
has a location tag but not all of them
| | 07:20 | have, for example, priority or last
modification. It turns out that each one of
| | 07:25 | them has a change frequency but
again, those are optional as well.
| | 07:28 | So this is a finished and complete
sample Sitemap. You can see it's focused
| | 07:33 | on one job. Its whole job in life is to
tell search engines how often and which
| | 07:38 | URLs they should crawl on your site.
Okay, so moving along looking at the
| | 07:43 | Sitemap index tags.
| | 07:45 | Now Sitemap index files are even
more compact. That's because their only
| | 07:49 | purpose in life is to group together
multiple Sitemap files, in the case that
| | 07:54 | you build Sitemap files that are
larger than 10 Megabytes, you have to break
| | 07:57 | them down into smaller parts and then
group them together using a Sitemap index.
| | 08:02 | So all but one of these tags are
required. The sitemapindex tag is required and
| | 08:08 | it's the root tag of the document. The
sitemap tag is also required. These go
| | 08:12 | inside the sitemapindex root tag and
there can be one or more of these. Each
| | 08:17 | sitemap tag essentially encloses the
location and lastmod tags about each
| | 08:24 | Sitemap file. The location or loc tag
indicates the URL of the Sitemap that it
| | 08:31 | points to and lastmod is the time that the
corresponding Sitemap file was last modified.
| | 08:37 | It does not correspond to the time
that any of the pages in that Sitemap were
| | 08:41 | changed. It's the file itself. Again,
this should be kept in W3C style Datetime
| | 08:47 | format. Here we have a sample Sitemap
index. So you can see that in this case
| | 08:52 | we have a sitemapindex. This is the
root and here is its namespace declaration.
| | 08:57 | This sitemap index file points to two
different sitemaps. This one here has an
| | 09:02 | example URL. This one has another
one. We indicate when they were last
| | 09:08 | modified. This is what the W3C Datetime
format looks like. If you want to omit
| | 09:13 | the time portion, which starts from the
T and goes to the end, you just can use
a four-digit year followed by a
two-digit month and two-digit day.
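To make the structure concrete, here is a sketch of a Sitemap index with two entries, read back with Python's standard library. The example.com URLs and the dates are placeholders; the namespace URI is the one published in the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Placeholder index: one entry uses the full datetime, one the date-only form.
INDEX_XML = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml</loc>
    <lastmod>2009-01-15T08:30:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml</loc>
    <lastmod>2009-01-10</lastmod>
  </sitemap>
</sitemapindex>"""

root = ET.fromstring(INDEX_XML.encode("utf-8"))

# Each sitemap tag wraps the location and last-modified time of one Sitemap file.
sitemaps = [
    (entry.findtext("sm:loc", namespaces=NS),
     entry.findtext("sm:lastmod", namespaces=NS))
    for entry in root.findall("sm:sitemap", NS)
]
for loc, lastmod in sitemaps:
    print(loc, lastmod)
```

Because the file declares a default namespace, the lookups have to name that namespace too, which is why every path carries the `sm:` prefix.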
| | 09:21 | That's essentially sitemaps and
sitemaps index files in a nutshell. What we are
| | 09:27 | going to do now is jump over to the
code really quick, so we can look at it in the
| | 09:31 | editor. Okay, so here we are in the
code and if you have access to the sample
| | 09:37 | files, then you have these files.
I have included the example XML files from
| | 09:43 | both the sitemap and the sitemap index
files, along with the Schema files for
| | 09:49 | each of these, in case you have a tool
that can use Schema files in your XML design.
| | 09:54 | So here you have the sample sitemap
XML file that we looked at in the slides.
| | 10:00 | You can see here the various tags.
This is the corresponding schema that goes
| | 10:04 | along with it. The schema file
basically lays out the rules that an XML file
| | 10:11 | has to follow. So you can see that
this is defining what elements are allowed
| | 10:15 | and where they can go inside the
sitemap file. Same over here for the site
| | 10:20 | index. This is the sample file and here
is the schema that goes along with the
| | 10:26 | site index file.
| | 10:27 | So that's a pretty simple example to
get our feet wet with a custom real world
| | 10:32 | XML format. Let's take a look now at a more
complex example and that's the RSS file format.
| Understanding RSS| 00:00 | Okay, so the next real-world XML format
that we are going to take a look at is
| | 00:04 | the RSS format. RSS stands for Really
Simple Syndication and you may have heard
| | 00:11 | the term used before.
| | 00:13 | Essentially, RSS is a family in fact
of formats that are intended to publish
| | 00:18 | information that is updated over time
and you have probably heard of RSS used
| | 00:24 | for things like blogs or news headlines.
But the reality is it can be used to
| | 00:29 | publish information about any kind of
content that can be syndicated, whether it's
| | 00:34 | stock information or
podcast or anything like that.
| | 00:38 | These are examples of information that
are delivered in small, easy to consume
| | 00:42 | chunks, and can be individually stamped
as discrete pieces of data. Typically,
| | 00:47 | RSS content is consumed by "Feed
readers" that present the information in a
| | 00:52 | friendly way because it is a lot
easier than reading raw XML code.
| | 00:57 | RSS was originally developed and
published by Netscape back in 1999. But then,
| | 01:04 | they abandoned work on the effort and
the RSS effort was carried forward by a
| | 01:09 | bunch of other individuals, and RSS
actually has a pretty long and tortuous
| | 01:15 | history behind it, and I'm not
going to bore you with all the details.
| | 01:19 | But the most current version, which is
version 2.0, was published back in 2002.
| | 01:25 | Today, the specification for 2.0 lives
at the website that you see listed on
| | 01:33 | your screen here. That's
http://cyber.law.harvard.edu/rss/rss.html. Now, over
| | 01:40 | the years, RSS has grown and
evolved through several versions.
| | 01:44 | The most popular versions that are
available now are RSS 1.x and RSS 2.x and it
| | 01:52 | turns out that RSS 2.x outweighs RSS 1.x
by a factor of almost 2:1 according to a
| | 01:58 | measurement, recent at the time of this recording,
by a website named Syndic8
| | 02:03 | which we'll take a look at in a moment.
| | 02:06 | For a moment though, let me pop over
to the browser really quick, so you can
| | 02:09 | see the specification that is
contained at that harvard.edu address.
| | 02:14 | So I'm going to switch over to the browser.
Okay, so here we are looking at the specification.
| | 02:19 | This is the spec that currently
describes the RSS 2.0 file format and you can
| | 02:24 | see, it is a fairly long document. It
explains all about what RSS is and shows
| | 02:30 | some sample files and explains some of
the elements in the file. Basically, it
| | 02:35 | goes through all the different content
that can possibly exist in an RSS file. So
| | 02:40 | anything that you might want to know
about RSS 2.0 is contained here in this
| | 02:46 | specification. You can see
it is a pretty long document.
| | 02:48 | So what I'm going to do is I'm only
going to cover the most important parts of
| | 02:52 | version 2.0, because that is the one
that is most widely in use. Before I jump
| | 02:58 | back to the slides however, let's
take a look at the Syndic8 website that
| | 03:03 | I mentioned just previously.
| | 03:05 | So I'm going to go here to syndic8.com
and this is a website that tracks usage
| | 03:15 | statistics for various kinds of RSS and
Atom feeds. What we are going to do is
| | 03:22 | we are going to look here down
through until we get to site statistics.
| | 03:27 | You can see that there's a graph here
that shows the RSS versions in use.
| | 03:32 | So I'm going to click on that chart,
and you can see that out of the 558,000
| | 03:39 | feeds that Syndic8 is tracking, RSS is
accounting for the vast bulk of that.
| | 03:46 | Atom is decidedly smaller. Although
Atom is a much newer file format and in
| | 03:51 | fact, we'll cover Atom in the next section.
| | 03:56 | So let's scroll on down here. You can
see that this is the distribution of feed
| | 04:01 | languages and the vast majority are
in English, and there are some more
| | 04:06 | interesting stats down here about
feeds that are available. It's actually
| | 04:09 | a really great site to go looking through
to see how RSS and Atom are being used.
| | 04:15 | But in any case, you can see that RSS
still accounts for the vast majority of
| | 04:21 | feeds that are out there.
| | 04:23 | In fact, I'm going to go quickly look
over here on the RSS tab. You can see
| | 04:27 | that this is specifically the
distribution of RSS versions. The large pink area
| | 04:32 | right here corresponds to this
2.0 specification right here.
| | 04:37 | So because RSS 2.0 is clearly the most
widely used version, that is the version
| | 04:42 | I'm going to be concentrating on here
in this lesson. So now that we have seen
| | 04:46 | the specification and we have seen some
information about usage statistics for
| | 04:52 | RSS 2.0, we are ready to get started on
the basics of the RSS format, and that
| | 04:57 | is the subject of our next lesson.
| Using required and optional elements in RSS feeds| 00:00 | RSS feeds are composed of a
collection of XML tags and some of them are
| | 00:07 | required and some of them are
optional. Now, all RSS feeds that have the
| | 00:12 | version 2.0 as their version number
have the RSS tag as their root. That goes
| | 00:18 | along with the version attribute that
contains the string 2.0, which clearly
| | 00:23 | identifies the file as an RSS version 2.0 feed.
| | 00:27 | Each RSS tag contains in turn a
single Channel tag, and this is where the
| | 00:32 | content of the RSS feed goes. So if we
were to start building a Bare-Bones RSS
| | 00:39 | feed, it would look something like this.
We would have the RSS tag at the top
| | 00:44 | with the version being 2.0, and then we
would have a Channel tag inside the RSS tag.
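As a rough sketch of the bare-bones feed described above (the exact slide markup is an assumption), the following uses Python's standard xml.etree.ElementTree module, which is not part of this course, only to confirm that the markup is well formed and has the expected shape.

```python
import xml.etree.ElementTree as ET

# Assumed markup: an rss root with version="2.0" wrapping one empty channel.
BARE_BONES_FEED = """<rss version="2.0">
  <channel>
  </channel>
</rss>"""

root = ET.fromstring(BARE_BONES_FEED)
print(root.tag, root.get("version"))   # rss 2.0
print(len(root.findall("channel")))    # 1
```

The feed parses, but there is nothing in the channel for a feed reader to show yet.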
| | 00:49 | Now, this is a Bare-Bones RSS feed
and it doesn't do anything at all. So we
| | 00:56 | have to figure out how to add some
content to it. Now, the Channel tag itself
| | 01:00 | has some required elements and the
required elements of the Channel tag are the
| | 01:07 | title and this refers to
the name of the channel.
| | 01:10 | So for example, if you have an HTML
website and that website contains the same
| | 01:15 | information as your RSS file (in
other words, your RSS file is just a
| | 01:20 | syndicated version of the content on
your site), then the title of the channel
| | 01:24 | should be the same as the title of the
website and I've provided an example over here.
| | 01:29 | So if I have my website joemarini.com
and I named the website Joe's news and
| | 01:35 | information, if I had an RSS feed
that provided essentially the same
| | 01:40 | information as the site,
I would name it the same name.
| | 01:42 | On the other hand, if you have RSS
feeds that provide more specialized
| | 01:47 | information, for example, if you have
an RSS feed that lists the number of
| | 01:52 | times that you will be speaking in
an upcoming given period of time or
| | 01:56 | publications you've put out and when
that happened, or places you have been to
| | 02:00 | for lunch and what dates you went,
then obviously you are free to give those
| | 02:06 | other names. But name your RSS feed
that provides the same information as your
| | 02:10 | site, the same name as your site.
| | 02:12 | The Link tag is also required. The Link
tag provides a URL which indicates the
| | 02:17 | HTML website that corresponds to the
channel. So for example, if I had an RSS
| | 02:22 | feed that corresponds to my website and
my website was joemarini.com, I would
| | 02:26 | place that URL in there as well.
| | 02:29 | Then finally the Description tag is
also required, and this is a short
| | 02:32 | description, a sentence or two, maybe
three describing the channel. So if we
| | 02:37 | were to update our previous example
code using what we now know, the RSS feed
| | 02:44 | will start to look like this. So we
would have the RSS and Channel tags that we
| | 02:47 | had before, and we would have the
title, link and description tags.
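A minimal sketch of the channel with its three required elements might look like the following. It builds the feed with Python's xml.etree.ElementTree; the description text and the trailing slash on the URL are placeholders, not taken from the slides.

```python
import xml.etree.ElementTree as ET

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")

# The three required channel elements: title, link, description.
ET.SubElement(channel, "title").text = "Joe's News and Information"
ET.SubElement(channel, "link").text = "http://www.joemarini.com/"
# Hypothetical description text, used only for illustration.
ET.SubElement(channel, "description").text = "News and updates from Joe's site."

feed_xml = ET.tostring(rss, encoding="unicode")
print(feed_xml)
```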
| | 02:52 | Now, this is beginning to look a little
bit more like a real RSS feed, but not
| | 02:56 | exactly very useful because it
doesn't contain anything except for the
| | 03:00 | information about the channel. In
order to make this RSS feed useful, we have
| | 03:04 | to add item tags to it, and item tags
go inside the channel, and they specify
| | 03:10 | information about each individual piece
of syndicated content, and that's what
| | 03:15 | we see here.
| | 03:16 | So each item tag also has a set of sub-
tags or child tags that are required or
| | 03:24 | optional. These are the tags I have listed
here and, technically speaking, all of the
| | 03:29 | tags specified for item are optional.
However, at least one title or one
| | 03:35 | description has to be present.
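The rule that an item needs at least a title or a description can be checked mechanically. This is only a sketch: the helper name item_is_valid is made up here, and treating an empty or whitespace-only element as missing is an assumption rather than something the spec spells out.

```python
import xml.etree.ElementTree as ET

def item_is_valid(item: ET.Element) -> bool:
    """Return True if the item has a non-empty title or description."""
    for name in ("title", "description"):
        text = item.findtext(name)
        # Assumption: whitespace-only content counts as absent.
        if text and text.strip():
            return True
    return False

ok_item = ET.fromstring("<item><title>Joe goes to the movies</title></item>")
bad_item = ET.fromstring("<item><link>http://www.joemarini.com/</link></item>")
print(item_is_valid(ok_item), item_is_valid(bad_item))  # True False
```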
| | 03:38 | So the Title tag, right here, that is
the title of the individual item, and
| | 03:43 | I have provided an example. So one item
might be named Joe goes to the movies,
| | 03:47 | and the link which is a URL to that item,
this is a URL that a feed reader can
| | 03:53 | use to open up the larger
piece of syndicated content.
| | 03:57 | In this case, it might be
something like the URL to my website and then
| | 04:01 | JoeAtTheMovies.html, and the
Description tag which contains a sentence or two
| | 04:08 | or three or so which describes the
content of the item. In this case, it's just
| | 04:13 | a short description. Here is what
I saw at the movies last weekend.
| | 04:17 | Now, just a quick note. According to
the RSS 2.0 specification, the Description
| | 04:22 | tag is allowed to have HTML content
in it, and since you are going to be
| | 04:27 | embedding HTML content inside an XML file,
you have to do some special encoding,
| | 04:33 | and I'll cover that later.
| | 04:35 | If we were to now take our RSS feed
and apply what we now know, it would look
| | 04:41 | something like this. We have our RSS tag,
we have our Channel tag, we have got
| | 04:46 | the tags up here which describe the
channel, and then we have a couple of
| | 04:50 | items. So this item here is Joe goes
to the movies. It has got a link, and a
| | 04:55 | description, and this item here is
Joe has lunch, and there's the link and
| | 04:59 | description of what I had for lunch.
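Putting the pieces together, a feed along the lines described above might look like this sketch. The item titles and the movie URL follow the narration; the lunch URL and the description strings are assumptions. The Python code just parses the markup and lists the item titles, the way a very simple feed reader might.

```python
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Joe's News and Information</title>
    <link>http://www.joemarini.com/</link>
    <description>News and updates from Joe's site.</description>
    <item>
      <title>Joe goes to the movies</title>
      <link>http://www.joemarini.com/JoeAtTheMovies.html</link>
      <description>Here is what I saw at the movies last weekend.</description>
    </item>
    <item>
      <title>Joe has lunch</title>
      <link>http://www.joemarini.com/JoeHasLunch.html</link>
      <description>Here is what I had for lunch.</description>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(SAMPLE_FEED).find("channel")
titles = [item.findtext("title") for item in channel.findall("item")]
print(titles)  # ['Joe goes to the movies', 'Joe has lunch']
```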
| | 05:01 | This here is actually a fully formed
and proper RSS feed. Again, it is not
| | 05:06 | particularly rich or useful because it
doesn't contain things like publication
| | 05:11 | dates or information about authors and
so on, and we'll cover that more in the
| | 05:16 | next section when we see
more of the RSS file format.
| Enriching the RSS feed| 00:00 | Okay, let's continue our coverage of
the RSS file format by taking a look at
| | 00:06 | some of the optional
elements of the Channel tag.
| | 00:09 | Now, this is not an exhaustive list.
If you want to see every single tag that
| | 00:15 | the channel can possibly contain,
I urge you to refer to the specification at
| | 00:21 | the URL I provided earlier. These are
just a selection of some of the more
| | 00:26 | important tags that you should know about.
| | 00:28 | So the Language tag starting right here
at the top indicates the language that
| | 00:32 | the feed is written in. This is
written using the W3C style language
| | 00:38 | indicators. So for example, en-us
indicates English in the United States, and
| | 00:45 | you can look up a whole bunch of
these language codes on the W3C's website.
| | 00:50 | The Generator tag indicates the
software package that created the feed and in
| | 00:54 | this case, I have got an example here
called MyRSSPackage 2.0. But if you edit
| | 01:00 | it by hand, you can also just
simply include the string by hand.
| | 01:03 | The Image tag specifies an image that
can be displayed with the channel, and
| | 01:09 | there are constraints on the
size that this image can be, and you can
| | 01:15 | refer to the spec for that. It is not
very big: the spec allows at most 144 pixels
| | 01:19 | wide by 400 pixels high, and defaults to 88 by 31.
| | 01:21 | The Copyright tag is the copyright
notice for the channel content.
| | 01:25 | So if you want your information to be
copyrighted, you can put a copyright notice in
| | 01:29 | using this tag.
| | 01:30 | The Publication date for the channel
content is indicated by the pubDate field.
| | 01:35 | Now, this is a date field that indicates
what date the publication happens on.
| | 01:41 | So for example, if you publish
this everyday, then this date would flip
| | 01:46 | every 24 hours.
| | 01:48 | The Category tag indicates the
category for the news and you can use as many
| | 01:53 | category tags as you want. So here
I have got one example for news, but if
| | 01:58 | you had a channel that fit into more
than one category, you can use as many of
| | 02:04 | these category tags as you feel adequately
describes the categories your feed fits into.
| | 02:09 | Then there's the lastBuildDate. The
lastBuildDate is different from the pubDate.
| | 02:13 | This is the last time that
the channel content changed. So you may
| | 02:18 | publish on a regular schedule and
the content may change on a different
| | 02:23 | schedule, so the two dates don't
necessarily need to be the same.
| | 02:26 | The last tag I'm going to point out
which is optional for the channel is the
| | 02:30 | Rating tag, which is the rating for
the channel and this conforms to the PICS
| | 02:36 | standard which is specified by the W3C,
and it's described at this URL right
| | 02:43 | here. So if you want to learn more
about that, you can investigate that URL.
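Here is a sketch of a channel carrying several of the optional elements just described. The copyright text, the second category and the date are placeholders. RSS 2.0 dates are written in the RFC 822 style, and Python's email.utils.format_datetime happens to produce that layout, so the sketch uses it.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

def rss_date(moment: datetime) -> str:
    """Format a UTC datetime in the RFC 822 style used by RSS 2.0 dates."""
    return format_datetime(moment, usegmt=True)

channel = ET.Element("channel")
ET.SubElement(channel, "language").text = "en-us"
ET.SubElement(channel, "generator").text = "MyRSSPackage 2.0"
# Hypothetical copyright notice.
ET.SubElement(channel, "copyright").text = "Copyright 2009 Joe Marini"
# A channel may carry as many category tags as apply.
for name in ("News", "Technology"):
    ET.SubElement(channel, "category").text = name

# Hypothetical date; pubDate and lastBuildDate need not be equal in practice.
published = datetime(2009, 1, 1, 12, 0, tzinfo=timezone.utc)
ET.SubElement(channel, "pubDate").text = rss_date(published)
ET.SubElement(channel, "lastBuildDate").text = rss_date(published)

print(channel.findtext("pubDate"))  # Thu, 01 Jan 2009 12:00:00 GMT
```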
| | 02:47 | The Item tag also has optional elements.
The author, right there at the top,
| | 02:51 | that indicates the person who wrote
this particular item and the email address
| | 02:56 | of the author. So for example, each
individual item can have a separate author.
| | 03:03 | This is what the content will look
like. In this case, it is joe@joe.com.
| | 03:06 | The Category indicates the category
for this item. So just as you can have
| | 03:10 | categories for the Channel tag, you can
also have categories for an individual
| | 03:15 | item, and again, you can use more than one here.
| | 03:19 | The pubDate is the publication date
for this particular item. So this is the
| | 03:23 | date at which time this particular
piece of information was published and added
| | 03:28 | to the feed.
| | 03:29 | The next field is interesting. It is
called guid. A guid means a globally
| | 03:34 | unique identifier and it is an
identifier that is unique for this item. There
| | 03:38 | are no rules for the format of guids.
You can use any one of a number of
| | 03:44 | schemes. Though they usually
take the form of URIs or URLs.
| | 03:50 | If the guid has the optional attribute
isPermaLink="true", then the blog
| | 03:56 | reader or the feed reader to be more
specific can assume that the guid can be
| | 04:01 | used to open a link to the item in the browser.
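A feed reader's handling of the guid could be sketched like this. The function name permalink_from_guid and the sample guid values are made up for illustration. One detail worth noting: in the RSS 2.0 spec the isPermaLink attribute defaults to true when it is omitted, which the sketch follows.

```python
import xml.etree.ElementTree as ET

def permalink_from_guid(item: ET.Element):
    """Return the guid text when it may be opened as a URL, else None."""
    guid = item.find("guid")
    if guid is None or not (guid.text or "").strip():
        return None
    # isPermaLink is optional; RSS 2.0 treats a missing attribute as "true".
    if guid.get("isPermaLink", "true").strip().lower() == "true":
        return guid.text.strip()
    return None

linked = ET.fromstring(
    '<item><guid isPermaLink="true">'
    "http://www.joemarini.com/JoeAtTheMovies.html</guid></item>"
)
opaque = ET.fromstring('<item><guid isPermaLink="false">item-0042</guid></item>')

print(permalink_from_guid(linked))  # http://www.joemarini.com/JoeAtTheMovies.html
print(permalink_from_guid(opaque))  # None
```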
| | 04:04 | Then finally, there's the Enclosure
optional element which describes a media
| | 04:09 | object that is attached to the item.
This is how podcasting is achieved. There
| | 04:14 | are three required attributes if
you are going to use the enclosure.
| | 04:17 | There is a URL attribute which
indicates where the item is located on the
| | 04:21 | Internet and you can see I have got
an example over here. So for example,
| | 04:24 | if I was creating a podcast and I was
creating mp3 files, for each item I would use
| | 04:30 | an enclosure tag which
specifies the URL to the mp3 file.
| | 04:35 | There is the Length, which is the size
of the item in bytes. So I have got that
| | 04:39 | here and then the MIME type of the
item, which in this case would be
| | 04:45 | audio/mpeg. But it could be anything else
based upon what the content of the
| | 04:51 | information is. It might be some other
audio file format or video or what have you.
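An enclosure for a podcast episode might be built like this sketch. The episode title, the URL and the byte count are invented; the three attribute names (url, length, type) are the ones the spec requires.

```python
import xml.etree.ElementTree as ET

item = ET.Element("item")
ET.SubElement(item, "title").text = "Episode 1"  # hypothetical title
ET.SubElement(item, "enclosure", {
    "url": "http://www.joemarini.com/podcasts/episode1.mp3",  # hypothetical URL
    "length": "12216320",   # size of the file in bytes (made-up value)
    "type": "audio/mpeg",   # MIME type of the attached media
})

enclosure = item.find("enclosure")
missing = [name for name in ("url", "length", "type") if name not in enclosure.attrib]
print(missing)  # []
print(ET.tostring(enclosure, encoding="unicode"))
```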
| | 04:55 | So now, let's go back and take a look
at our RSS sample feed, because now we
| | 05:01 | have much richer information that we
can include in the feed format. So here we
| | 05:06 | have the RSS and Channel tags that we
started out with and the title, link and
| | 05:11 | description which are the
required parts of the channel definition.
| | 05:16 | Well, now we have also added the
optional language here, specified as US
| | 05:20 | English and the generator. I've put
in By Hand for this because I don't have a
| | 05:25 | software package that made this one,
and the publication date. The publication
| | 05:28 | date was Wednesday, 5th of March, 2009 at 2 AM.
| | 05:32 | So far this feed only has one item in
it and again, we have added some more
| | 05:37 | rich information here. Rather than just
the title, link and description, there's
| | 05:41 | also a pubDate and author and a
guid which happens to be a PermaLink.
| | 05:47 | So let's talk a little bit about
including HTML content in RSS feeds. There's a
| | 05:52 | couple of ways that you can achieve
doing this. I'm going to talk about the two
| | 05:56 | most common. Now, as I said earlier,
the RSS specification for RSS 2.0 allows
| | 06:01 | HTML to be included in the
Description tags of items and channels.
| | 06:06 | In order to make this work however,
you can't just simply shove HTML code with
| | 06:10 | all of its angle brackets and
everything inside the RSS file. You have to
| | 06:15 | encode the HTML before you put it in.
| | 06:18 | Now, the first way of doing this is
encoding the HTML tags by doing what is
| | 06:22 | known as escaping. You can see here
that I have got the entity encoding for the
| | 06:29 | less than sign, and then a bold and
then a greater than sign, and over here,
| | 06:34 | I have got another bold tag.
| | 06:36 | If you look at this in HTML, this &lt;
would be a left angle bracket, like in
| | 06:42 | this description right here, and then
this &gt; would be the right angle
| | 06:49 | bracket. So I have entity encoded
the HTML here, and included it in the
| | 06:53 | Description tag.
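Entity encoding does not have to be done by hand. As one way to do it (an illustration, not the tool used in the course), Python's xml.sax.saxutils.escape turns the ampersand, less-than and greater-than characters into their entities, and unescape reverses it.

```python
from xml.sax.saxutils import escape, unescape

html_fragment = "Here is what I saw at the <b>movies</b> last weekend."

# Replace &, < and > with their entity forms so the text is safe inside XML.
encoded = escape(html_fragment)
print(encoded)

# A feed reader reverses the encoding before rendering the HTML.
decoded = unescape(encoded)
print(decoded == html_fragment)  # True
```

The encoded string is what would go between the opening and closing description tags.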
| | 06:55 | The other way to do it is to leave the
HTML just as it is, but put it within
| | 07:00 | what is known as a CDATA section. CDATA
sections are standard parts of XML and
| | 07:07 | they are declared by using an
angle bracket with an exclamation point,
| | 07:11 | a bracket, the word CDATA,
and then another opening bracket.
| | 07:15 | And then you can just put your HTML code
| | 07:17 | right here inside the CDATA section and then
close it off by two brackets and an angle bracket.
| | 07:24 | The CDATA section basically tells the
XML parser that's reading the XML feed,
| | 07:29 | don't worry about what is in here.
It is character data. You don't have to
| | 07:32 | worry about parsing it. You can just skip
over it for the purposes of trying to find tags.
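The CDATA approach can be sketched as follows. The item content is illustrative. Parsing it with Python's xml.etree.ElementTree shows that the parser hands back the HTML as plain character data and does not treat the b tag as a child element.

```python
import xml.etree.ElementTree as ET

ITEM_WITH_CDATA = """<item>
  <title>Joe goes to the movies</title>
  <description><![CDATA[Here is what I saw at the <b>movies</b> last weekend.]]></description>
</item>"""

item = ET.fromstring(ITEM_WITH_CDATA)
description = item.find("description")

print(description.text)   # Here is what I saw at the <b>movies</b> last weekend.
print(len(description))   # 0, because the <b> inside CDATA is not parsed as a tag
```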
| | 07:36 | Okay. Well, now we know enough to
create our own RSS feeds. Let's move on now
| | 07:42 | and take a look at our next real-world XML
file format, which is the Atom file format.
| Understanding the Atom Syndication feed| 00:00 | Okay, the next real-world XML format
that we are going to take a look at is
| | 00:03 | called the Atom Syndication Feed and
the Atom Format is a term that applies to
| | 00:10 | two related formats. The first one is
called the Atom Syndication Feed and that
| | 00:13 | refers to web data feeds. You can think of this
as analogous to RSS.
| | 00:19 | Atom also defines what's known as
a publishing protocol and that's a
| | 00:24 | specification that deals with creating
and maintaining resources on the web.
| | 00:30 | We are not going to deal with that
particular specification in this section
| | 00:33 | because it's fairly complex. So we are
going to focus on the syndication feed
| | 00:38 | in order to see how Atom
implements an XML format.
| | 00:41 | So just like RSS, Atom is used to
provide information in the form of easily
| | 00:47 | consumable chunks of data from sites
on the web that are updated periodically
| | 00:53 | or, to use another term, syndicated.
Then these are typically things like
| | 00:57 | blogs or news sites. The Syndication
Spec that we are going to be looking at for
| | 01:03 | web feeds was adopted back in 2005
and you can learn more about the Atom
| | 01:10 | Specification at a website called
atomenabled.org and I have provided the link
| | 01:15 | there. And there's also the IETF
site, which contains the full
| | 01:20 | specification for the Atom Syndication
Format, and that's RFC 4287.
| | 01:26 | And you can see I have also
provided the link for that as well.
| | 01:29 | Atom feeds are composed of a
collection of XML tags just like RSS is. Again,
| | 01:34 | just like RSS, some of these tags are
required and some of them are optional.
| | 01:40 | All Atom feeds like all XML documents
have a root tag and in the case of Atom
| | 01:45 | feeds the root tag is known as the
feed tag. The feed tag contains some
| | 01:50 | required child tags along with
zero or more entry tags and each entry
| | 01:55 | represents an individual piece of content.
| | 01:58 | Now I say zero or more because
technically speaking, they are optional,
| | 02:02 | but Atom feeds aren't very useful if
they don't have any entries in them. So
| | 02:05 | we'll take a look at that in a moment.
So to define an Atom feed, you can see
| | 02:10 | I have created a feed tag there and
I have included the XML namespace that
| | 02:14 | specifies the namespace for Atom in my
xmlns attribute. We are not going to be
| | 02:19 | using that in this example.
| | 02:20 | But if you wanted to include content
from an Atom feed in another document say
| | 02:26 | an XHTML document, you would use the
namespace for that. Okay, so that's a
| | 02:29 | quick introduction to what Atom is.
Let's get into the basics of the Atom
| | 02:34 | format now and start building our first feed.
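To make the namespace point concrete, here is a small sketch. The Atom namespace URI is the one defined in RFC 4287; the title is a placeholder. When a namespace-aware parser such as Python's xml.etree.ElementTree reads the feed, every tag name is qualified by that URI, so lookups have to mention the namespace.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

feed = ET.fromstring(f'<feed xmlns="{ATOM_NS}"><title>My Atom Feed</title></feed>')

print(feed.tag)             # {http://www.w3.org/2005/Atom}feed
print(feed.find("title"))   # None, because the unqualified name does not match

# Map a prefix of our choosing to the Atom namespace for lookups.
title = feed.findtext("atom:title", namespaces={"atom": ATOM_NS})
print(title)                # My Atom Feed
```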
| Using required and optional elements in Atom| 00:00 | Atom feed tags have some required
elements in order to make the feed useful and
| | 00:06 | I have listed the three elements that
are required on the feed tag here.
| | 00:11 | The first one is the title tag. The title
tag indicates the name of what the feed is
| | 00:16 | and this is usually, but it's not
required to be, the name of the website
| | 00:20 | that supplies the feed.
| | 00:22 | Now if it's not the name of the website,
you can make it whatever you want, but
| | 00:25 | in any case, you should not leave this
field blank because it's what most feed
| | 00:30 | readers use to display the name of
the feed to the user and I have provided
| | 00:33 | some examples of each one of these tags,
you can see them there on the right.
| | 00:36 | The next tag is the id tag and the id
tag is a unique identifier for your feed.
| | 00:42 | Now, you are not limited to using URLs,
you can use any value here that is
| | 00:47 | going to be guaranteed to be unique and
there are numerous schemes out there on
| | 00:51 | the web that create unique ids for you,
one you might want to consider looking
| | 00:55 | up in addition to URLs is a UUID
generator on the web. There are several of those.
| | 01:02 | Usually though, you will just use your
website's domain address in your feed's id.
| | 01:06 | The next required tag is the
updated tag and this indicates the last time
| | 01:11 | that the feed was modified in a
"significant" way. Now the spec does not say
| | 01:16 | what the word significant means, it
leaves that up to the publisher. Usually
| | 01:20 | what this means is the last time that
the content of the feed was modified.
| | 01:25 | Not necessarily when you fixed typos or
anything like that, but when the content
| | 01:29 | itself was changed and this specifies
a date. Date values have to conform to
| | 01:34 | one of the formats I have listed there
in the description field for updated.
| | 01:38 | And you can look up any of these on
the web. But I have provided an example
| | 01:42 | over there on the right-hand side
which specifies a date using the four
| | 01:45 | character year, a two character month
and a two character day and then the
| | 01:49 | character T, which separates the date from
the time, followed by the offset relative to GMT.
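The date layout described here is the RFC 3339 timestamp that Atom uses. As a sketch (the helper name atom_date and the sample date are assumptions), it can be produced from a timezone-aware datetime like this, with the trailing Z standing for a zero offset from GMT.

```python
from datetime import datetime, timezone

def atom_date(moment: datetime) -> str:
    """Format an aware datetime as an RFC 3339 timestamp in UTC."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(atom_date(datetime(2009, 3, 5, 2, 0, tzinfo=timezone.utc)))
# 2009-03-05T02:00:00Z
```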
| | 01:54 | Okay, so if you go back now and take a
look at our feed tag. It's been updated
| | 01:58 | to reflect the required tags, <title>,
<updated> and <id>. You can see what it
| | 02:02 | looks like now. So now we have our
title, which is my Atom feed. We have the
| | 02:06 | updated tag put in there, and we have
an id which points to my website. Okay,
| | 02:10 | let's continue on by looking at some of the
recommended and optional elements of the feed tag.
| | 02:16 | The top table here lists two
recommended tags, author and link, and the lower
| | 02:23 | table lists elements that are
considered to be optional for the feed tag. So the
| | 02:28 | author tag, and you can have more than
one of these. It indicates the author of
| | 02:33 | the feed and as I said, you can have
multiple authors. The author tag is
| | 02:37 | required unless all of the <entry>
elements in the feed have authors as well.
| | 02:43 | And you specify an author tag using
the Atom "Person" construct. The "Person"
| | 02:48 | construct (you can look this up in the
spec) essentially contains three tags:
| | 02:51 | name, email and uri.
| | 02:54 | I have specified the name and the email
over there in the example. Name is the
| | 02:57 | only one that's required. Email and uri
are optional. The link tag identifies a
| | 03:04 | web page that's related to this feed
and every feed should provide a link to
| | 03:08 | itself and you can see over on the
right-hand side there I have provided an
| | 03:11 | example link and we'll see this in action later.
| | 03:14 | Moving on to the optional elements, the
category element specifies a category
| | 03:20 | that the feed belongs to, you can
have more than one of these. The category
| | 03:24 | basically contains a term attribute
and inside the term attribute you specify
| | 03:28 | the category that you want your feed
to belong to and if you have multiple
| | 03:32 | categories, you can just use multiple
categories and specify a term for each.
| | 03:37 | The contributor tag is similar to the
author. This identifies a person who
| | 03:42 | contributes to the feed and like
authors there can be multiple contributors and
| | 03:47 | these are specified using the same
format as the author tag with name and email
| | 03:51 | and uri and again email and uri are optional.
| | 03:55 | The generator tag indicates the
software that generated the feed and you can
| | 03:59 | put any value in here you want. And
then there's the icon tag, which specifies
| | 04:04 | an optional icon for the feed. And in
the icon tag, you essentially provide a
| | 04:08 | path on your site to the icon.
| | 04:11 | Okay, so now let's go back and take a
look at the feed source. We have updated
| | 04:15 | it now to reflect the feed tag
elements, <author>, <category>, <link> and
| | 04:22 | <icon> and you can see I have put them
in bold there. Okay. So that's a brief
| | 04:27 | introduction to the Atom Format.
What we are going to do now in the next
| | 04:31 | section is move on to adding entries to
our Atom feed and we'll see a complete
| | 04:36 | example at the end of the section.
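The feed-level tags covered in this section can be assembled as in the following sketch. The feed URL, icon path and date are placeholders, and the helper atom() is just a local convenience. The code builds the elements in the Atom namespace with Python's xml.etree.ElementTree and serializes them.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # write Atom as the default namespace

def atom(tag: str) -> str:
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"

feed = ET.Element(atom("feed"))
# Required elements.
ET.SubElement(feed, atom("title")).text = "My Atom Feed"
ET.SubElement(feed, atom("id")).text = "http://www.joemarini.com/"
ET.SubElement(feed, atom("updated")).text = "2009-03-05T02:00:00Z"
# Recommended elements: an author (a Person construct) and a link to the feed itself.
author = ET.SubElement(feed, atom("author"))
ET.SubElement(author, atom("name")).text = "Joe Marini"
ET.SubElement(author, atom("email")).text = "joe@joe.com"
ET.SubElement(feed, atom("link"), {"rel": "self", "href": "http://www.joemarini.com/feed.atom"})
# Optional elements.
ET.SubElement(feed, atom("category"), {"term": "news"})
ET.SubElement(feed, atom("icon")).text = "/images/feedicon.png"

feed_xml = ET.tostring(feed, encoding="unicode")
print(feed_xml)
```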
| Adding entry tags to the Atom feed| 00:00 | Okay, let's move on and take a look at
how we would add some entry tags to our
| | 00:05 | Atom feed. You can see there on the
screen I have a basic entry and it looks
| | 00:10 | something like this XML construct. You
can see that there's an entry tag that
| | 00:14 | wraps some other tags. There's a title, a
link and an id and updated date and a summary.
| | 00:21 | So we'll take a look at how we specify
each one of these tags. Just like the
| | 00:25 | feed tag, the entry tag has some
required elements and not surprisingly, they
| | 00:32 | are pretty much the same as the
required elements of the feed tag. So the title
| | 00:38 | for an entry is the name of the
particular entry and you should not leave this
| | 00:43 | blank because again this is how most
feed readers will present the name of the
| | 00:47 | entry to the user and you can see I have
provided examples again for each one of these.
| | 00:52 | The id again is a unique identifier for
this entry and like in the feed tag
| | 00:57 | you can use any value here that's unique.
Usually, you will use your website's
| | 01:01 | domain address along with some
additional data that identifies the entry,
| | 01:06 | either a path to the HTML file that
specifies the entry or some other kind of
| | 01:12 | identifier that your blogging system
might use or some other type of unique
| | 01:17 | identifier. The important thing here
is that it be unique, and the updated
| | 01:21 | tag just like the feed tag indicates
the last time that this entry was modified
| | 01:24 | in a significant way and again the
spec leaves it up to the publisher to
| | 01:29 | determine what significant means.
| | 01:32 | So for example, if you fixed a typo,
that's not necessarily significant. So
| | 01:36 | you wouldn't need to update the date
there. And again, dates here have to
| | 01:39 | conform to the formats that the feed
updated tag needs to conform to and I have
| | 01:45 | provided an example but you can look
any of these up on the web and see how
| | 01:48 | they work. And like the feed tag,
the entry tag comes along with some
| | 01:52 | recommended elements. Now these are
not required, but they are strongly
| | 01:56 | recommended because they provide
richer information about a particular entry.
| | 01:59 | So entries can have authors just
like the feed can have an author. So the
| | 02:03 | author tag specifies one author of the
entry, and again just like the feed
| | 02:09 | you can have multiple authors. Now if
the feed tag that encloses the entries in
| | 02:14 | this particular feed does not have an
author tag, then the author tag becomes
| | 02:18 | required for entries.
| | 02:20 | So you need to put authors on your
entries if your feed does not have an author tag and
| | 02:25 | it's probably a good idea to put
authors on there anyway in case your entry is
| | 02:30 | for some reason copied or
referenced somewhere else.
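The author rule can be expressed as a short check. This is a sketch: the function name authors_are_sufficient is invented, and it looks only at author tags placed directly on the feed and on each entry.

```python
import xml.etree.ElementTree as ET

NS = {"a": "http://www.w3.org/2005/Atom"}

def authors_are_sufficient(feed: ET.Element) -> bool:
    """True if the feed has an author, or every entry carries its own."""
    if feed.find("a:author", NS) is not None:
        return True
    entries = feed.findall("a:entry", NS)
    # With no feed-level author, each entry must supply one.
    return all(entry.find("a:author", NS) is not None for entry in entries)

with_feed_author = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<author><name>Joe Marini</name></author><entry/></feed>"
)
missing_author = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<entry><author><name>Joe Marini</name></author></entry><entry/></feed>"
)
print(authors_are_sufficient(with_feed_author))  # True
print(authors_are_sufficient(missing_author))    # False
```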
| | 02:33 | The link tag identifies a web page
that's related to this entry somehow and the
| | 02:40 | spec for Atom contains a lot more
detailed information about what links can
| | 02:44 | contain. But in this example, I have
shown that this link tag links to a
| | 02:49 | related web page that describes this
particular entry. Now the content tag
| | 02:54 | performs the bulk of the work in the
entry because it contains or links to the
| | 02:58 | complete content for this entry.
| | 03:01 | If there's no summary tag, which
follows next then this should be provided and
| | 03:07 | the content as we'll see later can
contain text, or HTML content. It can
| | 03:12 | contain a whole bunch of different
things. And then finally the summary tag
| | 03:15 | provides a brief summary of what this
entry says. If there's no content or if
| | 03:20 | the content is not provided in line in
this particular entry, in other words
| | 03:25 | it's linked to, then the
entry should provide a summary.
| | 03:28 | Okay, let's finish up by taking a look
at some of the optional elements of the
| | 03:32 | entry tag. So entries can have
contributors just like they can have authors and
| | 03:37 | the contributor tag is used to indicate
one of the contributors and it follows
| | 03:41 | the same format as the author tag. You
can have multiple of these for various
| | 03:46 | contributors and one of the nice
things about the Atom Specification is that
| | 03:49 | contributors are distinct from authors.
Entries can have category tags as
| | 03:55 | well. So the category tag specifies
the category that an entry belongs to.
| | 04:00 | Over there on the right, you see
I have a category and the term is news. The
| | 04:04 | published date indicates the initial
publication time that this entry was made
| | 04:08 | available. Now this is different from
| | 04:08 | the updated date; published means this was
| | 04:12 | the first time that the world got to
see this particular entry and it follows
| | 04:16 | the same date rules as the updated tag follows.
| | 04:18 | The source tag is used in the case
where this particular entry was copied from
| | 04:23 | another feed. The source tag is then
used to preserve the metadata of the
| | 04:28 | feed that the entry was copied from.
| | 04:31 | For example, if I had copied this
entry from some other feed, I would use the
| | 04:35 | source to provide necessary things
like the title that it came from, when it
| | 04:40 | was updated, the copyright information,
so on and so forth and I would also
| | 04:43 | provide the id. You can see on the
right-hand side there, its main purpose in
| | 04:47 | life is to preserve information about
the source of this entry and then finally
| | 04:51 | there's a rights tag and you can put
any copyright notice that this entry might
| | 04:56 | have in it and you can see I have
provided an example there with a copyright
| | 05:00 | and my name.
| | 05:01 | Okay, so let's go back and take a look
at our updated entry tag. You can see
| | 05:05 | that we have added the tags that we
were looking at earlier. So in addition to
| | 05:09 | the title and id and updated tag,
I have added the link, a summary, a category,
| | 05:17 | a published date and some content. Now
to put it all together, this is what a
| | 05:22 | finished Atom feed would look like.
| | 05:25 | Right at the top here, you have the XML
declaration and then that's followed by
| | 05:28 | the feed tag which encapsulates the
entire feed and then we have a title and id
| | 05:33 | and updated tags for the feed, those
three are required and then we have a link
| | 05:37 | that specifies where the feed came from
and the author and that would be me and
| | 05:43 | then we have our entry which we just looked at.
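A finished feed along these lines might look like the sketch below. The element order follows the walkthrough; the dates, the feed URL and the content text are placeholders. The Python code parses it and prints the title and updated date of each entry.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}

FINISHED_FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>My Atom Feed</title>
  <id>http://www.joemarini.com/</id>
  <updated>2009-03-05T02:00:00Z</updated>
  <link rel="self" href="http://www.joemarini.com/feed.atom"/>
  <author>
    <name>Joe Marini</name>
  </author>
  <entry>
    <title>Joe goes to the movies</title>
    <id>http://www.joemarini.com/JoeAtTheMovies.html</id>
    <updated>2009-03-05T02:00:00Z</updated>
    <published>2009-03-04T18:30:00Z</published>
    <link href="http://www.joemarini.com/JoeAtTheMovies.html"/>
    <category term="news"/>
    <summary>Here is what I saw at the movies last weekend.</summary>
    <content type="text">The full review would go here.</content>
  </entry>
</feed>"""

# Encode to bytes so the parser can honor the encoding in the XML declaration.
feed = ET.fromstring(FINISHED_FEED.encode("utf-8"))
entries = feed.findall("atom:entry", NS)
for entry in entries:
    print(entry.findtext("atom:title", namespaces=NS),
          entry.findtext("atom:updated", namespaces=NS))
```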
| | 05:45 | Okay, so this is a finished example
of an Atom feed. Now like RSS, you can
| | 05:51 | include HTML at various points inside
your Atom feeds and specifically the
| | 05:57 | tags, title, summary, content and
rights can contain HTML code. There's a type
| | 06:04 | attribute that you place on these tags
| | 06:04 | that determines how this information is
| | 06:09 | encoded. Now the default is text. So
if you don't specify it then that's the
| | 06:12 | default value, and if the type is text,
then the element contains just plain
| | 06:17 | text with no HTML in it.
| | 06:19 | You can see there's an example of
that right here. If the type attribute
| | 06:23 | contains the string HTML, then the
element contains entity escaped HTML. And
| | 06:30 | you see an example of that down here.
So in this example, the content tag
| | 06:34 | has a type of html and then inside
the content, I have escaped out the angle
| | 06:40 | brackets that you would
normally see on HTML tags.
| | 06:43 | So instead of putting the term b and
then new title with a closing b in order
| | 06:48 | to make these words bold, I have to
convert the angle brackets into their
| | 06:52 | entity escaped equivalents: the
ampersand and then lt with a semicolon
| | 06:56 | and then the ampersand gt with a
semicolon. There's a list of these and
| | 07:00 | I'll get to those in a minute. And then
finally, if the type is equal to XHTML, then
| | 07:05 | the element contains, XHTML code and it
is wrapped up in a single div tag. And
| | 07:11 | that's an important thing you need to realize.
| | 07:14 | So here you see an example of that.
The content tag contains XHTML and I have a
| | 07:19 | div tag with the XHTML namespace on it
and then right inside the div tag I can
| | 07:25 | put XHTML code straight in there and
I don't have to escape it or anything like that.
| | 07:29 | So I found a pretty good list on the
web that lists all of the entity escaping
| | 07:34 | characters that you can use inside
XHTML and HTML. So if you want to follow
| | 07:39 | that link, there you will see all the
different ways that you can entity escape
| | 07:42 | characters in HTML for inclusion in Atom.
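The escaping step for type="html" content can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the helper names escapeForAtomHtml and atomHtmlContent are hypothetical and do not come from the course files, and only the ampersand and the two angle brackets are handled.

```javascript
// Hypothetical helper: entity-escape a string of HTML so it can be
// placed inside an Atom element that declares type="html".
function escapeForAtomHtml(html) {
  return html
    .replace(/&/g, "&amp;") // ampersand first, so later entities are not escaped twice
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap the escaped markup in a content element.
function atomHtmlContent(html) {
  return '<content type="html">' + escapeForAtomHtml(html) + "</content>";
}

// Prints: <content type="html">&lt;b&gt;New title&lt;/b&gt;</content>
console.log(atomHtmlContent("<b>New title</b>"));
```

With type="xhtml" none of this is needed, because the markup goes unescaped inside the wrapping div.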
| | 07:46 | Okay, that brings us to the close of the
Atom format. Hopefully, you learned enough
| | 07:51 | now to go out and make your own Atom
feeds or at least read existing ones, and
| | 07:55 | now we are going to move on to the next chapter.
3. XML and JavaScript
| Using XML support in browsers| 00:00 | One of the greatest improvements to
come along in recent years in the browsers
| | 00:03 | has been the dramatic improvement in
the way that they support XML natively and
| | 00:08 | that's going to be the subject
of this section of the course.
| | 00:11 | Using the modern browsers like Internet
Explorer 6.0 and later and Firefox 1.0
| | 00:16 | and later you can work with XML right
inside the browser environment. You don't
| | 00:20 | have to resort to server side stuff,
you can just work with it right there in
| | 00:23 | the browser using JavaScript
and other standard technologies.
| | 00:27 | Now each browser supports a slightly
different set of functions and objects and
| | 00:32 | properties for working with XML. Now
Firefox had the benefit of being written
| | 00:36 | after the DOM Level 2 Specification
came along. So they used the DOM Level 2
| | 00:41 | methods for things like creating and
loading XML. Internet Explorer had XML
| | 00:46 | support a little bit earlier and since
the DOM had not yet addressed some of
| | 00:50 | these issues, they used their own API
for things like loading XML and creating
| | 00:54 | it from scratch, and they used the
MSXML2.DOMDocument ActiveX object.
| | 01:00 | Now the main differences are in
creating and transforming documents, things
| | 01:04 | like parsing and serializing and we'll
get into all these terms in a moment.
| | 01:07 | But the important thing to remember is
that the DOM API for working with XML
| | 01:12 | content like nodes and document
elements and so on, that's consistent across
| | 01:16 | the browsers.
| | 01:17 | So what can you do with the built-in
capabilities of the browsers? So if you
| | 01:22 | want to work with XML in the Browsers
there are a bunch of things that you can
| | 01:24 | do just natively working with their
existing capabilities. You can create new
| | 01:28 | XML documents from scratch, you can
also load documents from the network or
| | 01:33 | from local files, and you can do these
independently from the AJAX Objects that
| | 01:38 | you might be familiar with. This
capability has existed for some time now in
| | 01:42 | the Browsers.
| | 01:43 | You can load XML documents directly
from string content using XML parsing
| | 01:48 | that's built into the browser itself.
You can transform XML using XPath and
| | 01:53 | XSLT and that's the subject
that we'll cover a bit later.
| | 01:57 | You can also serialize an XML document
to a string. The word serialize refers
| | 02:01 | to the process of taking structured
content like XML and saving it out to a
| | 02:06 | format that can be persisted somewhere
usually a string or file or something
| | 02:10 | like that. And you can
manipulate XML content using the XML DOM.
| | 02:14 | We are at the point now where we can take
a look at how Firefox supports XML. So
| | 02:19 | let's go ahead and do that.
| Understanding XML in Firefox| 00:00 | Okay, so as I mentioned earlier the
browsers have the ability to create XML
| | 00:03 | documents from scratch, and to do
that in Firefox, you call the function
| | 00:07 | document.implementation.createDocument.
That's the method that you use in
| | 00:12 | order to create a new document and you
can see the example right here. This is
| | 00:16 | how it's called. It takes a few
arguments. The first argument is the Namespace
| | 00:20 | URL to use and we are not going to
get too deeply into namespaces right now
| | 00:24 | because they are fairly complex objects,
but you should just know that you can
| | 00:26 | pass an empty string for this.
| | 00:28 | The second argument is the name of the
RootTag that you want to be at the base
| | 00:33 | of the XML document. And again you can
pass an empty string for that, but if
| | 00:36 | you pass a string here that will become
the root tag that's at the base of the
| | 00:41 | document. And the third argument is the
document type, and again this is a bit of
| | 00:45 | an advanced concept.
| | 00:47 | For now we are going to use the
constant null and in fact in most of the real
| | 00:51 | world situations, this is what you will
use any way. So I'm not going to go too
| | 00:54 | deeply into that and as I mentioned
the NamespaceURL and sRootTag can both be
| | 00:58 | empty strings, but typically what you
will want to do is at least pass a tag
| | 01:03 | name as the second argument to
save yourself a little bit of typing.
| | 01:07 | Now once the document has been created,
you can use the standard DOM methods to
| | 01:12 | create content. Here is an example of
that. This is a complete example in and
| | 01:16 | of itself. You see at the first line
what we are doing here is we have got a
| | 01:19 | variable named xmlDoc and we are
calling the createDocument method, and here
| | 01:24 | I'm passing an empty string for the
namespace and the word myroot as the RootTag
| | 01:31 | name and the third argument is null.
| | 01:33 | So once the XML document has been
created we can start making content using
| | 01:37 | the standard DOM methods. So in this
example I'm creating a new paragraph tag
| | 01:41 | using the createElement function.
That's going to create this P tag and then
| | 01:45 | I'm going to create some text to go
inside the paragraph, and I do that using
| | 01:49 | the standard DOM createTextNode function.
| | 01:52 | Once I have done that, I append the
text into the paragraph and I append the
| | 01:57 | paragraph into the document. And if we
were to look at this in XML text form,
| | 02:03 | the result will look like this. We have
myroot, because that's the RootTag that
| | 02:06 | we passed into the createDocument
function, and there's our paragraph tag and
| | 02:10 | there's the string 'This is some text.'
So this is an example of how you can
| | 02:14 | create XML right in Firefox.
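The steps just described can be sketched as a single function. This is a minimal sketch, not the course's exercise file: it assumes the argument hostDocument is the browser's own document object (the parameter name is made up here), and it appends the paragraph to the root element through documentElement.

```javascript
// Builds <myroot><p>This is some text</p></myroot> using standard DOM calls.
// `hostDocument` is assumed to be the browser's `document` object.
function createXMLDocument(hostDocument) {
  // Arguments: namespace URI, root tag name, document type.
  var xmlDoc = hostDocument.implementation.createDocument("", "myroot", null);

  var oPara = xmlDoc.createElement("p");
  var oText = xmlDoc.createTextNode("This is some text");
  oPara.appendChild(oText);

  // The root element of the new document is available as documentElement.
  xmlDoc.documentElement.appendChild(oPara);
  return xmlDoc;
}

// In a browser page: var xmlDoc = createXMLDocument(document);
```

Passing the document in as an argument, rather than reaching for the global, is only a convenience so the function can be tried with a stand-in object outside a browser.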
| | 02:16 | Now you can also create an XML document
by directly parsing a text string that
| | 02:22 | contains XML code. In some instances
this may be easier and faster because if
| | 02:27 | you have a small piece of XML code
that you need to parse in, this is just a
| | 02:31 | few lines of code.
| | 02:32 | So the way you do this in Firefox is
by creating a DOMParser object and then
| | 02:37 | you call its parseFromString method
to parse the data. We can see an example
| | 02:41 | for that right here. So here I have a
variable named oParser and it's an object
| | 02:46 | reference to this new DOMParser object
that I'm creating and again this is an
| | 02:50 | example that works in
Firefox, we'll cover IE in a bit.
| | 02:53 | Then we have a variable here named
xmlDoc and xmlDoc is being assigned the
| | 02:57 | result of the Parser's
parseFromString method. So this is another way of
| | 03:01 | creating a document from scratch. You
can see here that I'm passing in the XML
| | 03:06 | code in text form and this is the same
text that we had in our previous example.
| | 03:10 | And the second argument to
parseFromString is the MIME type that you are going
| | 03:15 | to assign to the content and for XML
this is going to be application/xml and
| | 03:19 | the result of this will be an XML
document just like we saw in the previous
| | 03:25 | example. The only difference here is
that we are creating from a string rather
| | 03:29 | than using the DOM methods to create the tags.
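A minimal sketch of the parsing approach, assuming a browser that provides DOMParser. The optional ParserCtor parameter is an addition made here for illustration, so that a stand-in constructor can be supplied where DOMParser does not exist.

```javascript
// Parses XML text into a document object.
// `ParserCtor` is a hypothetical optional parameter; by default the
// browser's own DOMParser is used.
function parseXMLDocument(xmlText, ParserCtor) {
  var Ctor = ParserCtor ||
    (typeof DOMParser !== "undefined" ? DOMParser : null);
  if (!Ctor) {
    throw new Error("DOMParser is not available in this environment");
  }
  var oParser = new Ctor();
  // The second argument is the MIME type of the content being parsed.
  return oParser.parseFromString(xmlText, "application/xml");
}

// In a browser page:
// var xmlDoc = parseXMLDocument("<myroot><p>This is some text</p></myroot>");
```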
| | 03:31 | We are not done yet however. There are
other ways to get hold of XML documents.
| | 03:37 | You can load XML content from a URL and
this can either be from the network or
| | 03:43 | from the local file system, which
is useful when you are building your
| | 03:46 | application and debugging it. And the
way that you do this is by using the load
| | 03:50 | method and XML content can be loaded
either synchronously or asynchronously and
| | 03:55 | we'll look at both examples.
| | 03:56 | So the example here, I have a variable
named xmldoc and I'm creating it using
| | 04:02 | the createDocument method like I did
earlier on. In this case, however
| | 04:06 | I'm going to load it synchronously.
Now the default is to load documents
| | 04:10 | asynchronously. So you have to
explicitly set the async property on
| | 04:15 | the document object to false, if you want
to load things in a synchronous fashion.
| | 04:20 | So once I have done that, I call the
xmldoc.load function and I pass in the URL
| | 04:25 | where I want to load things from. Now
I can just give it a file name and it
| | 04:29 | will load it from the same directory
that this page came from, or I can give it
| | 04:33 | an http address and it will load from
that address as well, subject to all the
| | 04:37 | security restrictions that your browser has.
| | 04:40 | Okay, you can also as I mentioned load
from a URL asynchronously. And what that
| | 04:46 | means is when you call the load function,
the load function is going to go off
| | 04:51 | and start loading the document but
it's going to return immediately, so your
| | 04:55 | script can continue executing. And
when the document has finished loading, an
| | 04:59 | event will be fired by the browser
and you can use that event to call a
| | 05:03 | function that needs to be called
when the document finishes loading.
| | 05:06 | In this example I have got my same
document and I have created it up here using
| | 05:11 | document.implementation.createDocument
and I'm passing in root and a null
| | 05:17 | value. And then I set the async flag
here to be true. Now when I do that I need
| | 05:22 | to do things a little bit differently.
I need to set the onload event handler
| | 05:26 | for the document to be a function
that's going to be called when the document
| | 05:29 | finishes loading. In this case I have
set it to be a function called docLoaded
| | 05:34 | and I'm passing in the xmldoc
as an argument to that function.
| | 05:38 | So now when I call the load function
the browser is going to go ahead and go
| | 05:43 | off and start loading the document,
but any JavaScript statements I have
| | 05:46 | following the load statement here are
just going to keep right on executing. So
| | 05:51 | you shouldn't execute any statements
that depend on the document being loaded
| | 05:54 | until your asynchronous event handler has
been called, and that's this function right here.
| | 06:00 | So in this case docLoaded will be fired
when the document has finished loading.
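The asynchronous pattern described here can be sketched as follows. The async property, the onload handler, and the load method are taken from the description above; load on a document is a non-standard extension, so the sketch checks that it exists before using it. The function name and parameter names are assumptions made for this example.

```javascript
// `xmlDoc` is assumed to be a document created with createDocument that
// supports the async/onload/load behaviour described in this lesson.
// `docLoaded` is the callback to run once loading has finished.
function loadXMLDocumentAsync(xmlDoc, url, docLoaded) {
  if (typeof xmlDoc.load !== "function") {
    throw new Error("This document object has no load method");
  }
  xmlDoc.async = true; // the default, stated explicitly
  xmlDoc.onload = function () {
    docLoaded(xmlDoc); // only here is it safe to read the content
  };
  xmlDoc.load(url); // returns immediately; statements after it keep running
}
```

Anything that depends on the loaded content belongs inside the docLoaded callback, not after the call to loadXMLDocumentAsync.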
| | 06:05 | Now that we have seen how to work with
XML in Firefox, it's time to look at
| | 06:10 | some real life examples.
| Using XML in Firefox| 00:00 | Okay, so here we are in the code.
What I'm going to do now is write a few
| | 00:05 | example functions that exercise some
of the methods we just learned about for
| | 00:10 | working with XML in Firefox.
| | 00:13 | So this is the document that I have here.
I'm at my starting point. Let me just
| | 00:17 | scroll down so you can see the code.
We have got four functions that we need to write:
| | 00:21 | createXMLDocument, loadXMLDocument,
loadXMLDocumentAsync and
| | 00:26 | parseXMLDocument.
| | 00:27 | So this is going to be a little test
harness for us to try out our newfound XML
| | 00:33 | skills in Firefox. So I'm going to
write each one of these functions and you can
| | 00:36 | see that in this script block that is
going to execute right here at the bottom
| | 00:40 | of the script tag. So I'm not going
to do anything fancy like set up event
| | 00:44 | handlers or anything like that.
| | 00:46 | So let's write the createXMLDocument
example first. Remember to create an XML
| | 00:52 | document the first thing that we need
to do is have a variable to hold the document.
| | 00:55 | So I'll write var xmlDoc. And then I'll
write document.implementation.createDocument.
| | 01:09 | Okay. Now also recall that createDocument
takes a few arguments so I'm going to
| | 01:13 | pass in an empty string for the
namespace and I'm going to pass in the
| | 01:16 | string myroot for the root tag name and
pass in null for the last parameter.
| | 01:23 | So this will create the document.
| | 01:25 | Now we are going to use the DOM
functions to create the content like we saw in
| | 01:28 | the slide earlier. So I'm going to
write var oPara = and to create elements,
| | 01:36 | we tell the document to create them.
So we say xmlDoc.createElement and
| | 01:42 | we're going to create a p tag here.
| | 01:47 | Now we are going to create the text to
go inside the paragraph. So we'll say
| | 01:50 | var oText = xmlDoc.createTextNode and
inside the TextNode we are going to put
| | 02:02 | 'This is some text'.
| | 02:05 | Now we need to append that text
into the paragraph. I'm going to write
| | 02:09 | oPara.appendChild and that's going to
append in the TextNode. Now we need to
| | 02:18 | put the paragraph into the
document, so I'm going to say
| | 02:20 | xmlDoc.documentElement, because
the documentElement recall from the DOM
| | 02:27 | is always at the root of the XML
document. And we are going to tell
| | 02:32 | the documentElement to appendChild
and that's going to be the oPara.
| | 02:39 | So now we are going to see a trick
that we have not directly covered in the slides
| | 02:44 | yet. This is how to serialize
an XML document into a string, and we are
| | 02:49 | going to do this so that we can call
the alert function and display the
| | 02:52 | contents. So what I'm going to do now
is write alert and that's going to show
| | 02:56 | the XML code.
| | 02:58 | What I'm going to do now is type in
new XMLSerializer and that creates a new
| | 03:05 | XMLSerializer object and this works in
Firefox and to get the text content of
| | 03:12 | the XML document or more accurately
to get a text representation of all the
| | 03:17 | tags and all the content, I'm going to
call the serializeToString method and
| | 03:28 | I just need to pass
in the XML document node.
| | 03:31 | Now I can do this for any node in the
document; it doesn't have to be the XML
| | 03:35 | document itself. But in this case,
I want the text for the entire document,
| | 03:40 | so I'm going to pass in the
root node of the document.
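The serialization step can be sketched as a small wrapper, assuming a browser that provides XMLSerializer. The optional SerializerCtor parameter is an addition made here for illustration and is not part of the exercise files.

```javascript
// Returns the text form of `node` and everything beneath it.
// `SerializerCtor` is a hypothetical optional parameter; by default the
// browser's own XMLSerializer is used.
function serializeNode(node, SerializerCtor) {
  var Ctor = SerializerCtor ||
    (typeof XMLSerializer !== "undefined" ? XMLSerializer : null);
  if (!Ctor) {
    throw new Error("XMLSerializer is not available in this environment");
  }
  return new Ctor().serializeToString(node);
}

// In a browser page: alert(serializeNode(xmlDoc));
```

Because serializeToString accepts any node, the same call works for a single element as well as for the whole document.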
| | 03:43 | So now we are at a place where we can
try this out. So I'm going to browse this
| | 03:48 | in Firefox. Browse With, and you can do
this using whatever tool you happen
| | 03:54 | to be using if it has some built-in way of
launching a browser. If not just save the file,
| | 03:58 | go out to the file system and
bring it up in your browser.
| | 04:01 | So we're going to launch Firefox here.
You can see that the alert is being called
| | 04:06 | and sure enough there's the root tag,
myroot, and there's the paragraph we created
| | 04:10 | and there's the text content.
| | 04:12 | So it seems to work just fine so let's
move on to next example. So now we are
| | 04:17 | going to create the loadXMLDocument
feature and the loadXMLDocument example
| | 04:22 | what we are going to do is load a
local file and the local file is this right here,
| | 04:27 | the businesscard.xml file. So
this is saved in the same directory as the
| | 04:33 | page we are working on. So I'm going
to go switch back to the code here.
| | 04:36 | So to load a document, remember what we need
to do. First, we need to have our variable to
| | 04:42 | hold the document, so
we'll write var xmlDoc and let's say
| | 04:47 | document.implementation.createDocument.
| | 04:55 | Now in this case, I'm going to pass
empty strings for both the root tag
| | 04:59 | and the namespace, because we are
going to load an entirely finished
| | 05:01 | document, so I don't need to pass a
root tag in. Then I'm going to type in null
| | 05:06 | for the third parameter. So now
I have created an empty XML document.
| | 05:10 | Now I'm going to load this document
synchronously. So to do that I need to say
| | 05:14 | xmlDoc.async = false. Otherwise this
would default to true, so I need to
| | 05:21 | explicitly do this. And it's always a
good idea to explicitly write what your
| | 05:26 | intentions are anyway rather than rely
on implicit behavior, because for all
| | 05:30 | you know, in the future that may change.
| | 05:32 | So once I have done that I just need
to write xmlDoc.load and since the file
| | 05:38 | is in the same folder, all I need to
do is write the name of the file here.
| | 05:43 | So that's businesscard.xml and we are
going to use the same serialization trick in
| | 05:52 | all of our examples. So I'm just going
to go ahead and paste that in down here.
| | 05:55 | So let's go ahead and comment out the
createDocument since we already know that
| | 05:59 | that works and now we are going to
call loadXMLDocument. All right, so let's
| | 06:03 | go ahead and view this in the browser.
| | 06:10 | And you can see that the XML document
| | 06:13 | loaded properly and it got serialized
out to a string and here it is in the alert.
| | 06:18 | All right, so far so good.
We're two for two, let's keep on going.
| | 06:21 | Now we're going to do the same thing.
We're going to load the XML document, but
| | 06:25 | we're going to do it asynchronously. So
once again I'm going to create the document.
| | 06:29 | I'm just going to copy and paste this
line here and now this time we are going
| | 06:33 | to do it asynchronously. So to do
it asynchronously, I need to set the
| | 06:37 | xmlDoc.async = true. Now again this is
the default, but I find it's better to
| | 06:43 | be explicit.
| | 06:45 | Now we are going to set the onload
handler for the xmlDoc. So that's this guy here,
| | 06:49 | xmlDoc.onload, and we are going
to write = function and inside this function,
| | 06:57 | we are just going to say alert.
And we are going to do the same
| | 07:03 | serialization trick that we have been
doing. So I'm going to copy that and
| | 07:08 | I'm going to paste it into my alert here.
| | 07:12 | So now that we have written the asynchronous
version, let's go down and comment out
| | 07:16 | the one we know works. So I need
to add the actual call to load the document.
| | 07:23 | So I'm going to write
xmlDoc.load and it is going to load the
| | 07:32 | businesscard.xml. So let's
go browse this in Firefox.
| | 07:41 | There we go and you can see
that it loaded asynchronously.
| | 07:45 | So that's an example of loading the
file both synchronously and asynchronously.
| | 07:49 | So now I'm going to close the browser.
| | 07:52 | The last example that we are going to
write now is parsing a document from a
| | 07:57 | text string and so what we are going
to do is write the parseXMLDocument
| | 08:04 | function and that's this guy right here.
So let's go comment out the previous
| | 08:10 | example. So what we are going to do
now is write var oParser = new DOMParser.
| | 08:25 | Once we do that we need to call the
parser's parseFromString method and assign
| | 08:31 | that to be an XML document, so we'll
write var xmlDoc equals and now it's
| | 08:38 | just a matter of calling oParser.
parseFromString and we are going to pass in
| | 08:51 | the text example from our earlier slide,
the whole myroot thing. So I need to type
| | 08:58 | myroot and we need to give a closing
tag myroot and we are going to put the
| | 09:05 | paragraph in here along with a closing
paragraph and we are going to write in
| | 09:11 | 'This is some text'.
| | 09:15 | So now we have a text string that
represents a complete XML Document and once
| | 09:21 | we have called the parseFromString
method, once again we are going to use the
| | 09:24 | XML Serializer trick to get a string
that we can alert and that's going to be
| | 09:28 | this guy right here.
| | 09:29 | All right so now we are ready to try
this example in Firefox. Oh, actually
| | 09:33 | before I browse it, I forgot that
there's one more thing: parseFromString
| | 09:37 | actually takes another argument,
which is the MIME type of the content. So I pass in
| | 09:46 | application/xml, since this is XML content.
| | 09:48 | All right, so now we are ready to browse
this in Firefox, so let's go ahead and do that.
| | 09:59 | And you can see that we parsed
this document from the string and
| | 10:02 | we are showing it in the alert here.
| | 10:05 | Okay, so that's using XML in Firefox.
I think we are ready to move on now and
| | 10:11 | look at the same kind of
capabilities that Internet Explorer provides.
| Understanding XML in Internet Explorer| 00:00 | All right, let's take a look at how
XML is handled in Internet Explorer.
| | 00:04 | Now that we have seen how Firefox
handles XML, it's IE's turn.
| | 00:08 | So to create a new XML document in
Internet Explorer, you use a slightly different
| | 00:12 | syntax. You use the ActiveXObject to
create an instance of the DOMDocument
| | 00:17 | object type, and the most recent
version of this is DOMDocument.6.0.
| | 00:22 | The way that you do this is
shown in the example down here.
| | 00:26 | So I have a variable named xmlDoc and
instead of calling the document's
| | 00:31 | implementation create document function
like I do in Firefox, what I do here is
| | 00:36 | create a new ActiveXObject and I
pass in the string MSXML2.DOMDocument.6.0.
| | 00:42 | This will create an XML document the
same as it does in Firefox and in this
| | 00:48 | example, you can see I'm doing pretty
much the same kind of thing with creating
| | 00:51 | document content using the DOM methods.
From this point on, it's pretty much
| | 00:56 | the same method as it is in Firefox.
| | 00:58 | So now here we are creating an element
named rootTag and appending that into
| | 01:02 | the document and creating a paragraph,
creating some text, appending the text
| | 01:07 | into the paragraph and putting the
paragraph into the document. So once you
| | 01:11 | have got the document created, the
DOM API for manipulating document content
| | 01:17 | using the standard DOM functions
is consistent across the browsers.
| | 01:21 | Now just like Firefox, you can
parse XML directly from a text string in
| | 01:27 | Internet Explorer and the way that
you do that is actually really pretty
| | 01:31 | straightforward, you don't have to
create a DOM parser or anything like that,
| | 01:35 | IE makes this pretty easy. All you
need to do is create the XML document like
| | 01:39 | you did in the previous example and
then it's a simple matter of calling
| | 01:44 | the loadXML method on the document object.
And in this case, I'm passing in a
| | 01:50 | string which represents a
functionally complete XML file.
| | 01:55 | So these two lines of code do --
essentially what the Firefox example does,
| | 01:59 | only we don't have to create any
objects other than the document itself
| | 02:02 | because the XML document in IE has a
convenience function that just loads XML
| | 02:07 | right from the string.
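The two lines mentioned above can be sketched as a function, assuming an Internet Explorer environment where ActiveXObject and the MSXML2.DOMDocument.6.0 component are available. The optional ActiveXCtor parameter is an addition made here for illustration only.

```javascript
// Creates an MSXML document and parses `xmlText` into it.
// `ActiveXCtor` is a hypothetical optional parameter; by default
// Internet Explorer's own ActiveXObject is used.
function parseXMLDocumentIE(xmlText, ActiveXCtor) {
  var Ctor = ActiveXCtor ||
    (typeof ActiveXObject !== "undefined" ? ActiveXObject : null);
  if (!Ctor) {
    throw new Error("ActiveXObject is not available in this environment");
  }
  var xmlDoc = new Ctor("MSXML2.DOMDocument.6.0");
  xmlDoc.loadXML(xmlText); // parses the string straight into the document
  return xmlDoc;
}

// In Internet Explorer:
// var xmlDoc = parseXMLDocumentIE("<myroot><p>This is some text</p></myroot>");
```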
| | 02:08 | You can also load XML from a URL in
Internet Explorer, just like you can in
| | 02:13 | Firefox and you use the load method.
Now amazingly enough the code for
| | 02:18 | Internet Explorer is exactly the same
as that for Firefox. I know it's amazing,
| | 02:24 | but they both got this the same. The only
difference is how you create the document.
| | 02:29 | So you see here I have got the
ActiveXObject and in Firefox, you use
| | 02:32 | the createDocument method but
other than that it's exactly the same.
| | 02:35 | The async property is the same
and the load method is the same.
| | 02:39 | Now again the default behavior is to
load documents asynchronously, so I need
| | 02:44 | to explicitly set the async property
to be false before I call the load
| | 02:49 | method if I want this to be handled
synchronously. And you can see that
| | 02:53 | the example that I'm using here is the same
as the example I used in the Firefox example.
| | 02:59 | And of course, you can load a
document asynchronously. It's a little more
| | 03:04 | involved than what you need to do in
Firefox but not much. So here's the line
| | 03:10 | of code where we create the XML document.
In this case, I'm explicitly setting
| | 03:14 | the async property to true because we
want to load the document asynchronously.
| | 03:19 | Now instead of handling the onload
event like you do in Firefox, we have an
| | 03:25 | event here called onreadystatechange
and this is Internet Explorer's way of
| | 03:29 | handling the asynchronous events. So
onreadystatechange takes a function.
| | 03:34 | I'm declaring that here and inside the
function there's a check on a property called
| | 03:38 | readyState, and readyState can be
one of a bunch of different values.
| | 03:43 | I won't go into all of them here because
you can find documentation on that pretty easily.
| | 03:48 | All you need to know is that when
the readystate property is equal to 4
| | 03:51 | that means that the document has been fully
loaded and documents go through various stages.
| | 03:56 | They go through, you know,
the request stage, the data is being
| | 03:58 | downloaded stage and so on. When state
reaches 4, it's been loaded and so
| | 04:03 | we call our docLoaded function with the
document object as an argument and then
| | 04:09 | we go ahead and call the load function.
And in this case, any statements that
| | 04:13 | follow the load function will just
go ahead and keep right on executing.
| | 04:16 | So you don't want to execute any
statements that depend on the document being
loaded until your callback function has been
called and the document has been fully loaded.
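The readyState check can be sketched like this. It is a minimal sketch that assumes xmlDoc is an MSXML document created with new ActiveXObject as described above; the function name and the docLoaded callback parameter are assumptions made for this example.

```javascript
// `xmlDoc` is assumed to be an MSXML document object in Internet Explorer.
// `docLoaded` is the callback to run once the document is fully loaded.
function loadXMLDocumentAsyncIE(xmlDoc, url, docLoaded) {
  xmlDoc.async = true; // the default, stated explicitly
  xmlDoc.onreadystatechange = function () {
    // The handler fires for every state change; 4 means fully loaded.
    if (xmlDoc.readyState == 4) {
      docLoaded(xmlDoc);
    }
  };
  xmlDoc.load(url); // returns immediately; statements after it keep running
}
```

The handler is assigned before load is called so that no state change can be missed.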
| | 04:26 | Okay, so now that we have seen how
Internet Explorer handles loading and
| | 04:30 | creating and parsing XML, let's go take a look
at some real life examples on how to do this.
| Using XML in Internet Explorer| 00:00 | All right, this is the code for the
Internet Explorer examples and just like in
| | 00:06 | the Firefox examples we have got some
functions we need to fill out in order to
| | 00:10 | accomplish the same kind of tasks that
we did in the Firefox example previously.
| | 00:14 | So you can see it's pretty much the
same file. It's got empty content right now
| | 00:19 | and we have the four functions that
we are going to fill out to demonstrate
| | 00:23 | creating, loading, and parsing XML,
the same way that we did in the earlier
| | 00:27 | Firefox example.
| | 00:29 | So let's begin by creating an XML
document in IE, and recall that to do that,
| | 00:33 | it's a little bit different than Firefox.
We write xmlDoc = and now we write new
| | 00:39 | ActiveXObject and the
ActiveXObject we need to instantiate here is
| | 00:44 | MSXML2.DOMDocument.6.0. Okay, so now
that we have created the document,
| | 00:58 | we need to put some content into it.
| | 01:01 | So what I'm going to do here is say
xmlDoc.appendChild and we are going to
| | 01:10 | create the root tag here. So we are
going to say xmlDoc.createElement and
| | 01:18 | we'll call it my root just like in the
previous example. Then we'll create a
| | 01:28 | paragraph, xmlDoc.createElement,
paragraph element and now we'll create our
| | 01:40 | text element. Some text. Okay, and
we'll put the text inside the paragraph and
| | 02:04 | now we'll put the paragraph inside document.
| | 02:15 | Now to serialize XML content to a
string in Internet Explorer, it's incredibly easy.
| | 02:23 | You don't have to create any
XMLSerializer object or anything like that.
| | 02:28 | All you need to do is watch this
because if you blink, you may miss it.
| | 02:31 | I'm just going to say alert (xmlDoc.xml).
So the xml property is a convenience
| | 02:40 | property provided by IE on
every DOM node in an XML document.
| | 02:45 | So just by referencing this property
you can get a string representation of
| | 02:48 | that node. It's a really incredibly
easy way to do it. So we are going to alert
| | 02:54 | the XML content that's in this document
after we finish creating it. All right,
| | 02:58 | so we are ready to go ahead and test
this out. Let's bring it up in IE,
| | 03:01 | see what happens. Okay, so you can see
it worked. Here is the <myroot>, the <p> and
| | 03:08 | 'This is some text', just like we saw
in the previous Firefox example.
| | 03:13 | So now that we have created the create
XML document example, let's look at how
| | 03:19 | we load an XML document. First let me
comment out the create XML document example,
| | 03:25 | because we have already completed
that one. So recall that loading
| | 03:29 | an XML document in IE is the same as in
Firefox. The only difference is the way
| | 03:34 | that we create the document. So we do
that right here. I'm just going to copy
| | 03:37 | that and paste it in. So once I have
done that I'm going to load this document
| | 03:42 | synchronously, which means I have to
explicitly set the XML document's async
| | 03:48 | property to false. Then I call xmlDoc.load,
because this is the same code as for
| | 03:56 | Firefox, recall. So we type in
businesscard.xml. That's the file we are going
| | 04:05 | to load and we are going to use the
same serialization trick that we used in
| | 04:09 | the previous example just alerting
the XML property on the XML document.
| | 04:14 | So now we are ready to test things out
and let me make sure I have commented
| | 04:17 | out the previous example, and I have.
All right, so let's bring this up in IE
| | 04:21 | and see what happens. Okay, you can see
that it worked. So the businesscard.xml file
| | 04:31 | got loaded and here we are looking
at in the alert. Everything seems to be
| | 04:35 | working fine. So let's go back
and go on to the next example.
| | 04:38 | So the next example we are going to
load the same document, but we are going to
| | 04:41 | do it asynchronously. So before I do
that, let me go down here and comment out
| | 04:46 | the previous example. So remember that
loading the document asynchronously is
| | 04:53 | a little different than doing it in
Firefox, but we still need to create the document.
| | 04:58 | So I'll just copy that and paste that
in here and we are going to use the same
| | 05:02 | serialization trick. So I'm going
to copy that and paste that in here.
| | 05:07 | So now in this case, I'm going to set
the xmlDoc.async property to be true
| | 05:14 | because we are doing this
asynchronously. So instead of using the onload
| | 05:18 | handler like in Firefox, I type
in xmlDoc.onreadystatechange and I set
| | 05:27 | that to be a function. Now in this
function, which will be called when
| | 05:34 | the document's readystatechange event
gets fired, I need to check to see if
| | 05:39 | the readystate is equal to 4.
| | 05:41 | So if xmlDoc.readyState == 4, then
we are going to just alert the XML
| | 05:55 | document's XML content and that
pretty much means we don't need to do this here.
| | 06:02 | All right, so now we are pretty
much just about ready to try this.
| | 06:06 | All that's left to do is xmlDoc.load and
this will kick off the loading process.
| | 06:15 | So we'll type in businesscard.xml.
All right, and I should be ready to go.
| | 06:21 | So here we are creating the document,
setting the async to true. We have defined
| | 06:26 | the readystatechange function and then we
load the document. So let's go ahead and
| | 06:31 | try this in the browser.
| | 06:32 | Okay, seems to be working. Let's
refresh just to make sure. Yeah, there it is.
| | 06:41 | It's loading. Everything seems to work.
All right, let's move on to the final example.
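Pulled together, the asynchronous Internet Explorer steps above might look roughly like the sketch below. The function name loadAsyncInIE is hypothetical, and the "Microsoft.XMLDOM" ProgID is an assumption, since the transcript never shows the line that creates the document. It only runs in legacy Internet Explorer, where ActiveXObject is available.

```javascript
// Sketch of the asynchronous load in legacy Internet Explorer.
// loadAsyncInIE is a hypothetical name for illustration.
function loadAsyncInIE() {
  // Assumption: the document is created through the MSXML ActiveX control.
  var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.async = true; // load() returns right away instead of blocking

  // Called every time the document's readyState changes.
  xmlDoc.onreadystatechange = function () {
    // 4 means the document has finished loading.
    if (xmlDoc.readyState == 4) {
      alert(xmlDoc.xml); // the serialization trick: IE's xml property
    }
  };

  // Attach the handler first, then kick off the loading process.
  xmlDoc.load("businesscard.xml");
}
```

Because the alert lives inside the handler, nothing needs to be alerted after the call to load, which is why the earlier alert line is no longer needed.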
| | 06:45 | The final example is parsing
an XML document from a string. To parse
| | 06:53 | a document from a string in Internet
Explorer is a little bit easier than it is
| | 06:58 | in Firefox in that you don't have to
create any objects to do this. There's a
| | 07:03 | convenience function called loadXML,
which does this for you, but we still have
| | 07:07 | to create the document.
| | 07:08 | So I want to copy that line there and
paste it in here and we still want to use
| | 07:13 | the serialization trick to alert the
content. So I want to copy that from here
| | 07:18 | and paste that here. Now the only
thing we need to do is write xmlDoc.loadXML
| | 07:26 | and the loadXML function takes a string,
which contains the XML code that
| | 07:32 | we want to have loaded.
| | 07:33 | I am going to just write the opening myroot tag
and the closing myroot tag, and we are going to put
| | 07:46 | in the paragraph tag and the close
paragraph tag and this is some text.
| | 07:55 | That's all there is to it. So we have commented
out the previous examples, we have got
| | 07:59 | parse XML document ready to be called.
So let's go ahead and view this in
| | 08:03 | the browser to see what happens.
| | 08:09 | And there you have it. So the XML is
being loaded and we are displaying
| | 08:13 | the content in the alert. Okay, so now
you have seen how to work with XML in
| | 08:17 | Internet Explorer and in Firefox.
So let's move on to our next lesson.
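As a recap of the string-parsing example, a minimal sketch might look like this. The function name parseStringInIE is hypothetical, the "Microsoft.XMLDOM" ProgID is an assumption, and the p tag name stands in for the "paragraph tag" mentioned in the narration. Like the previous examples, it only runs in legacy Internet Explorer.

```javascript
// Sketch of parsing XML from a string in legacy Internet Explorer.
// parseStringInIE is a hypothetical name for illustration.
function parseStringInIE() {
  // Assumption: the document comes from the MSXML ActiveX control.
  var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");

  // loadXML parses the string directly into the document;
  // no separate parser object has to be created.
  xmlDoc.loadXML("<myroot><p>This is some text.</p></myroot>");

  // Same serialization trick as before: read the xml property.
  alert(xmlDoc.xml);
}
```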
| | Collapse this transcript |
| Serializing XML to a string| 00:00 | Now as I demonstrated in the
previous examples both Firefox and Internet
| | 00:04 | Explorer have a notion of
serializing XML to a string and serializing is
| | 00:11 | basically the process of converting
an XML document to a format that can be
| | 00:15 | saved like a string.
| | 00:17 | Now serialization can take other forms
as well, but string is the most common,
| | 00:21 | especially in most of the real world
scenarios you will probably run into. And
| | 00:26 | you would like to do this for several
reasons. First, you can use it for saving
| | 00:30 | XML content either to a file or some
other persistent method, whether it's a
| | 00:35 | stream or something like that.
| | 00:36 | You can also use this to interchange
XML content with another system, something
| | 00:41 | that works with text processing
like say regular expressions which we
| | 00:45 | investigated in the practical and
effective JavaScript title, also available
| | 00:50 | here at Lynda.com.
| | 00:52 | And you also might want to do this to
aid in debugging. So if you are working
| | 00:56 | with XML and you've got a whole bunch
of logic, there will probably come times
| | 01:00 | when you want to display some
debugging messages that contain XML content and
| | 01:06 | in order to do that you need to
serialize the XML to a string before you can
write it out to a console or display it in an
alert or whatever debugging method you use.
| | 01:14 | And as I showed, Firefox and Internet
Explorer both have ways of serializing
| | 01:21 | XML to strings. Firefox provides support for
an object called the XMLSerializer and
| | 01:28 | you just call the serializeToString
method on that object. Whereas Internet
Explorer makes things really easy: it
just simply provides an xml property on
| | 01:36 | each XML node in the document, which
provides a text representation of the
| | 01:42 | document tree starting at that node.
| | 01:44 | And just to review, to perform Firefox
serialization to a string, you can see
| | 01:49 | in the top example we've got a
document fragment where we create a paragraph
| | 01:54 | element with some text inside of it
and to serialize it out, we instantiate a
| | 01:59 | new XMLSerializer object and call
the serializeToString function with the
| | 02:04 | document as the argument.
| | 02:05 | And as I mentioned earlier that argument
to serializeToString, this can be any node.
| | 02:10 | It doesn't have to be the document.
You can serialize starting at any
| | 02:13 | node in the tree. So we could have
passed in the paragraph or the text or
| | 02:16 | anything like that.
| | 02:17 | And in the Internet Explorer case we
use the xml property on the XML document,
| | 02:23 | and again you can use this on any node.
It doesn't have to be the document, but
| | 02:28 | this is how you get the string
representation of XML in Internet Explorer.
| | 02:32 | Okay, so now that we have covered using
XML in Firefox and IE and we have seen
| | 02:37 | how to serialize the data, it's
time to move on to our next lesson.
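The two serialization approaches described in this lesson can be folded into one small helper. This is only a sketch: the name serializeNode is hypothetical and is not part of either browser's API. It tries the XMLSerializer object first and falls back to the xml property.

```javascript
// Returns the XML text for a document or any node inside it.
// serializeNode is a hypothetical helper name for illustration.
function serializeNode(node) {
  // Firefox-style: XMLSerializer and its serializeToString method.
  if (typeof XMLSerializer !== "undefined") {
    return new XMLSerializer().serializeToString(node);
  }
  // Internet Explorer-style: every XML node exposes an xml property.
  if (typeof node.xml === "string") {
    return node.xml;
  }
  throw new Error("No XML serialization mechanism is available");
}
```

For debugging, something like alert(serializeNode(xmlDoc)) would then work the same way whether xmlDoc is the whole document or a single element.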
| | Collapse this transcript |
| Understanding cross-browser actions with the Sarissa library| 00:00 | At this point we could take what we
have learned using Firefox and Internet
| | 00:04 | Explorer and the ways that they handle
XML differently and we could sit down
| | 00:10 | and write our own cross browser
library for handling XML code.
| | 00:15 | Now thankfully we don't need to do that,
because someone else already has and
| | 00:20 | that library is known as the Sarissa
Library and that's what we are going to
| | 00:25 | cover in this section.
| | 00:26 | So the Sarissa Library is a free, open
source library that's used for handling XML
| | 00:32 | data and it was written by a guy named
Emmanouil Batsis and it's available for
| | 00:36 | download from the
sarissa.sourceforge.net website.
| | 00:41 | The Sarissa libraries essentially provide
a cross browser way of working with XML.
| | 00:47 | Processing it, loading it,
serializing it, transforming using XSLT,
| | 00:52 | a whole bunch of cross browser support
for common XML functions, and the API that
| | 00:58 | it uses basically takes the best of
both browsers, Firefox and IE, and creates
| | 01:05 | one unified API for
working with the XML content.
| | 01:10 | Now to use the Sarissa Library in your
projects essentially all you need to do
| | 01:16 | is include the sarissa.js file in
the document where you want to use the
| | 01:21 | functions and we'll take a look at
a few examples of that in a few moments.
| | 01:25 | Creating a document using the
Sarissa Library is accomplished by calling
| | 01:30 | the getDomDocument() method with the
namespace and root tag that you want to use
| | 01:36 | and this code here works cross browser.
So unlike Firefox or IE, you don't have
| | 01:42 | to do any browser detection or write
any special code; all you need to do is
| | 01:46 | call the getDomDocument method.
| | 01:48 | So looking at the previous examples
that we have used in both the Firefox and
| | 01:53 | IE sections, what we have done here is
changed the way we instantiate the XML
| | 01:57 | document to just call the Sarissa
Library's getDomDocument method and we are
| | 02:02 | passing in two empty strings here,
and then it's just a matter of using the
| | 02:06 | standard DOM functions to
create the document content.
| | 02:10 | So what Sarissa does is it creates an
XML document that's native to the browser
| | 02:15 | that you are in. But the way that it
creates the document is cross browser. And
| | 02:20 | the reason why this works well is
because recall that the DOM API for
| | 02:24 | manipulating XML content once you have
a document is the same in both browsers.
| | 02:28 | So once we have the document
instantiated using this line, the rest of the code
| | 02:33 | is cross browser just by its very nature.
| | 02:36 | Sarissa also provides support for
loading an XML document. In fact the
| | 02:42 | synchronous version for loading the
document is the same as it is in
| | 02:47 | Internet Explorer. So if we look at
the example here, you can see that the
| | 02:52 | example for loading an XML document is
slightly different because of the way
| | 02:57 | you get the document. So here we are
calling the getDomDocument function on the
| | 03:01 | Sarissa Library and from that point
forward it's pretty much the same in IE and
| | 03:06 | Firefox. You set the async property to false
and you just load the XML you want to load.
| | 03:10 | For the asynchronous case, the Sarissa
Library emulates the Internet Explorer
| | 03:15 | method inside of Firefox for you. So
here we have our loadXMLDocumentAsync
| | 03:22 | function and here we are
instantiating the document by using the Sarissa
| | 03:26 | Library's version and we set the async
property to true. Now Sarissa implements
| | 03:31 | the onreadystatechange and readyState
properties in Firefox, so you can write
| | 03:35 | the same code in IE and Firefox
and it will just work cross browser.
| | 03:40 | The Sarissa Library also implements
the Serializer and Parser objects that we
| | 03:46 | saw in the earlier Firefox example,
only these versions are cross browser and
| | 03:52 | you can call the code the same way that
you would in Firefox. So if we look at
| | 03:56 | our parse XML document example from
earlier, you can see that the example
| | 04:00 | pretty much looks exactly the way it
would in Firefox only now once you have
| | 04:04 | included the Sarissa Library, this
code also works in Internet Explorer.
| | 04:08 | Okay, we have enough to see if we can
go back, rewrite our earlier Firefox and
| | 04:15 | Internet Explorer examples using the
Sarissa Library, and test them in a cross
| | 04:19 | browser environment.
| | Collapse this transcript |
| Creating Sarissa examples| 00:00 | Okay, so here we are in the code.
In this example, we're going to use the
| | 00:05 | Sarissa library to go back and
rewrite our earlier Firefox and IE examples
| | 00:10 | using this cross-browser
library for handling XML.
| | 00:15 | You can see here in the code that this
is pretty much the same document that
| | 00:19 | we started out with both in the Firefox
and IE examples with one major difference.
| | 00:26 | I have included the Sarissa library
here using this script tag. Again, I've
| | 00:33 | provided the URL for downloading
Sarissa in the slides part of the section.
| | 00:38 | So you can go ahead and do that on
your computer and just get the sarissa.js
| | 00:43 | library and you'll be good to go.
Let's go ahead and start rewriting our
| | 00:47 | example to use the Sarissa library and
see if it works cross-browser, like it
| | 00:52 | says it does.
| | 00:53 | For creating a document, what we need
to do is have a variable named xmlDoc.
| | 00:58 | Now remember that what we're going to
do here is use the Sarissa version of
| | 01:02 | getting a DOM document. So I'm going
to type Sarissa.getDomDocument and
| | 01:11 | I'm going to pass in two empty strings.
| | 01:15 | Okay, so now that I have the document,
I'm going to go ahead and start creating
| | 01:19 | the document content using the DOM. So
I'm going to write xmlDoc.appendChild.
| | 01:30 | We're going to create a root element
named myroot. We're going to create the
| | 01:42 | paragraph tag, like we did in the
last examples. We'll create the TextNode.
| | 02:04 | Okay, so far so good. We'll put the
text inside the paragraph and we'll put the
| | 02:17 | paragraph inside the document.
| | 02:29 | Now we're going to do the serialization trick,
| | 02:33 | so that we can display an alert
containing the XML document's content.
| | 02:38 | So we're going to say alert
(new XMLSerializer().serializeToString())
| | 02:52 | and that is going to take the XML document
as an argument. Okay, it looks like we're
| | 03:00 | ready to test this out. We are creating
the document, creating some content and
| | 03:06 | then calling this
XMLSerializer().serializeToString.
| | 03:09 | Now this looks pretty much like the
Firefox example we did earlier, except for
| | 03:14 | this code right here which
instantiates the document using Sarissa.
| | 03:17 | So we'll try it in Firefox
first to see what happens.
| | 03:24 | Okay, you can see that it worked.
| | 03:26 | Here we have the XML document and
it's being displayed in this alert.
| | 03:31 | All right, so now the real test is
does this work in IE? So let's see if that works.
| | 03:37 | Okay, and it does. So now we've
created a document using a cross-browser
| | 03:42 | syntax that works in both IE and Firefox. All
right, let's continue onto the next example.
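For reference, the document-creation example just walked through might look roughly like the sketch below. It assumes sarissa.js has already been included on the page with a script tag. The function name createXMLDocument and the p tag name are assumptions, and the exact order of the appendChild calls in the course file may differ.

```javascript
// Sketch of creating a document with Sarissa and building content via the DOM.
// Assumes sarissa.js is loaded; createXMLDocument is a hypothetical name.
function createXMLDocument() {
  // Two empty strings: no namespace and no root tag requested up front.
  var xmlDoc = Sarissa.getDomDocument("", "");

  // From here on it is the standard DOM API, the same in both browsers.
  var root = xmlDoc.createElement("myroot");
  var para = xmlDoc.createElement("p"); // tag name is an assumption
  var text = xmlDoc.createTextNode("This is some text.");

  para.appendChild(text);    // text goes inside the paragraph
  root.appendChild(para);    // paragraph goes inside the root element
  xmlDoc.appendChild(root);  // root element goes inside the document

  // The serialization trick, using the Firefox-style serializer.
  alert(new XMLSerializer().serializeToString(xmlDoc));
}
```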
| | 03:47 | So in the next example we're going to
use Sarissa to load an XML document. So
| | 03:52 | let me comment out the previous
example right there. Okay, so for loading a
| | 03:57 | document, remember, this is the easy
part. So we need to create the document.
| | 04:01 | So I'm going to go ahead and copy
the line from up here that does that.
| | 04:06 | We're going to use the same serialization
trick to show the content. So I'll copy
| | 04:11 | that line and paste that in down here.
| | 04:13 | Now to load the document, we need to
set the async property to false to make
| | 04:19 | sure that we load it synchronously.
Then it should be a simple matter of saying
| | 04:24 | xmlDoc.load. We're going to load the
same businesscard.xml file that we loaded
| | 04:32 | in the previous examples. Okay, so
let's make sure that we've got that function
| | 04:37 | being called and it is.
| | 04:38 | So let's first try it in Firefox.
Okay, there it is and you can see it's working.
| | 04:50 | Now let's try the same thing
in IE. Okay, so far so good. We're two for two.
| | 04:57 | Let's move on to the
asynchronous example. So I'll comment out
| | 05:03 | the previous one here.
| | 05:05 | The asynchronous example follows IE's
model. So we're going to go ahead and
| | 05:10 | copy the line where we instantiate the
document. In this case, we're going to
| | 05:17 | be doing this asynchronously. So let's
say async = true. Now what we need to do
| | 05:25 | is define the onreadystatechange(). So
I'll write xmlDoc.onreadystatechange = function().
| | 05:40 | In the onreadystatechange, we need to
check to see if the xmlDoc's readyState
| | 05:48 | property is equal to 4. If it is,
we're going to do our little serialization
| | 05:55 | trick to show that it loaded. So we'll
copy that guy there, paste it in down
| | 05:59 | here. Then all we need to do after the
readystatechange is load the document.
| | 06:04 | So I'll say xmlDoc.load and it's the
same businesscard.xml file we've been
| | 06:14 | using in our example so far. Okay, so
that should be pretty much all I need to
| | 06:19 | do in order to test the asynchronous
loading. So let's bring this up in Firefox
| | 06:25 | first. Make sure I've got it running
and I do, okay good. Let's bring up
| | 06:28 | Firefox. All right, there it is. So
far so good. One more time in IE, okay.
| | 06:42 | This is looking pretty good.
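The asynchronous Sarissa example described above could be sketched as follows. It assumes sarissa.js is included on the page and that businesscard.xml sits next to the HTML file. The function name loadXMLDocumentAsync follows the name mentioned in the previous lesson.

```javascript
// Sketch of the asynchronous load using Sarissa (assumes sarissa.js is loaded).
function loadXMLDocumentAsync() {
  var xmlDoc = Sarissa.getDomDocument("", "");
  xmlDoc.async = true; // load in the background

  // IE-style handler; the course notes that Sarissa emulates it in Firefox.
  xmlDoc.onreadystatechange = function () {
    // Only act once loading has completed (readyState 4).
    if (xmlDoc.readyState == 4) {
      alert(new XMLSerializer().serializeToString(xmlDoc));
    }
  };

  // Define the handler before starting the load.
  xmlDoc.load("businesscard.xml");
}
```

The handler may be called several times as the state changes, so the readyState check is what keeps the alert from firing early.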
| | 06:44 | So far we've made it through three of the
examples and the cross-browser promise
| | 06:48 | is holding up. So the final example is
parsing the XML document from a string.
| | 06:55 | Now the Sarissa library implements a
cross-browser version of the DOMParser object.
| | 07:01 | So that's what we're going to use.
We'll write var oParser = new
| | 07:10 | DOMParser(); We'll write var
xmlDoc = oParser.parseFromString();
| | 07:27 | Remember that when we call
parseFromString(), it takes two arguments.
| | 07:30 | We need to pass in the string that we
want parsed as well as the MIME type. So,
| | 07:35 | we'll pass in "application/xml" as the
MIME type. We're going to type in our
| | 07:45 | <myroot>, paragraph tags,
and This is some text.
| | 07:58 | Okay, so now we've rewritten the
example to look pretty much like it did in the
| | 08:04 | Firefox example using the DOMParser
object. The only thing we have left to do
| | 08:08 | here is the serialization alert trick.
So I'll copy that and paste it in here.
| | 08:15 | All right, let's hold our
breath. Try it out in Firefox.
| | 08:25 | All right, so it worked in Firefox.
We were able to parse from a string.
| | 08:28 | That's good news. Finally, let's try it in
Internet Explorer and there it is. So using
| | 08:37 | the Sarissa library you can see how
we were able to implement cross-browser
| | 08:41 | examples of handling XML code.
| | 08:44 | So I strongly encourage you to go
download this Sarissa library. It's free.
| | 08:48 | You can download it from SourceForge.net
and I've provided the URL. It provides
| | 08:52 | pretty robust handling of
XML content across browsers.
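The final parsing example from this lesson might look roughly like this sketch. It assumes sarissa.js is included so that DOMParser and XMLSerializer are also available in Internet Explorer. The function name parseXMLDocument and the p tag name are assumptions.

```javascript
// Sketch of parsing XML from a string with the DOMParser object.
// Assumes sarissa.js is loaded; parseXMLDocument is a hypothetical name.
function parseXMLDocument() {
  var oParser = new DOMParser();

  // parseFromString takes the XML text and the MIME type.
  var xmlDoc = oParser.parseFromString(
    "<myroot><p>This is some text.</p></myroot>",
    "application/xml"
  );

  // Serialize the parsed document back to a string to display it.
  alert(new XMLSerializer().serializeToString(xmlDoc));
}
```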
| | Collapse this transcript |
| Understanding the ECMAScript standard (E4X)| 00:00 | Okay, so the last technology I'm
going to cover in this section is called
| | 00:05 | ECMAScript for XML. It's typically
known by its abbreviated name E4X.
| | 00:10 | ECMAScript for XML or E4X is an
international standard. It's defined by the
| | 00:18 | ECMA-357 specification that was
adopted back in December 2005.
| | 00:23 | The whole purpose behind ECMAScript
for XML is to add support for XML as a
| | 00:31 | first class datatype in ECMAScript,
which forms the basis for languages like
| | 00:36 | JavaScript and ActionScript, if
you've ever used Flash or Flex.
| | 00:40 | The real power of ECMAScript for XML is
that it lets me treat XML as a built-in
| | 00:45 | datatype. So you see the example I've
got here where I've got a line of code
| | 00:49 | declaring a variable named j and
setting it to the numerical value 3 or a
| | 00:54 | variable like myStr and
setting it to a string value.
| | 00:57 | Using E4X, I can do something like this.
I can declare a variable named myXML
| | 01:03 | and just set it to XML code right in
the script without having to do any DOM
| | 01:08 | manipulation or any other kind of
fancy tricks. This is a very powerful
| | 01:12 | feature. It's one of the great things
about ECMAScript as a scripting language
| | 01:16 | is that it provides this kind
of support for working with XML.
| | 01:19 | Its whole purpose, as I said, is to
allow you to work with XML as a native
| | 01:23 | datatype. So where can you find an
implementation of E4X? Well, the known
| | 01:30 | implementations as of this recording,
Firefox 1.5 and later has native support
| | 01:36 | built in for E4X. IE does not
support this technology yet.
| | 01:41 | Adobe ActionScript version 3.0 and
later, which is present in Flash Creative
| | 01:47 | Suite 3.0 and Adobe AIR and Adobe
Flex as well as Acrobat version 8.0 and
| | 01:52 | later, both in the Reader and in the
full Acrobat versions and version 1.6 of
| | 01:59 | the "Rhino" version of the JavaScript
Interpreter engine as well as Aptana's
| | 02:05 | "Jaxer" AJAX Application Server which uses
the Mozilla code on the back end on the server.
| | 02:11 | These are so far some of the better-
known implementations. So if you're using
| | 02:15 | Firefox or if you're writing code in
ActionScript version 3.0 or later, you can
| | 02:21 | use E4X in your code.
| | 02:23 | So there are two main ways of creating
XML using E4X. The first, which we've
| | 02:29 | already seen, is to just assign XML
code directly to a variable in your script.
| | 02:34 | The second way is to use the XML
object as a constructor using the new XML
| | 02:41 | operator. This is what both examples look like.
| | 02:44 | So you've seen the first one already
in the previous slide where I have got a
| | 02:48 | variable and I'm assigning just an XML
code straight to it. The second example
| | 02:52 | down here is using the XML
constructor. Both of these are functionally
| | 02:56 | equivalent, you can use either one of
these. The new XML syntax is obviously a
| | 03:01 | bit more object-oriented. So if that's
your preference, you can go that way,
| | 03:05 | but either one of these is
perfectly fine and valid.
| | 03:08 | What's nice about E4X is the way that it
interacts with JavaScript's native types.
| | 03:13 | So you can evaluate JavaScript
expressions using the brace syntax and
| | 03:20 | I've got an example of that here.
Suppose I had a variable name which contained
| | 03:25 | a text string and I wanted to have
that inserted into XML based on an
| | 03:30 | expression evaluation.
| | 03:32 | I can do that using the brace
syntax as you've seen here. So if I write
| | 03:36 | something like var myXML, and using
E4X I just type out some XML code. I put
| | 03:41 | the name of this JavaScript variable
inside these two braces. Then the result
| | 03:46 | will be as if that variable name gets
substituted using the contents of that
| | 03:50 | variable in the XML. This is a really
powerful way of working with XML code and
| | 03:55 | mixing it with JavaScript logic.
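The slides are not part of this transcript, so here is a small sketch of what the ideas above look like in code. The element names and values are illustrative assumptions. Note that this is E4X syntax: it needs an engine with E4X support (such as the Firefox versions and Rhino 1.6 mentioned above) and is a syntax error in an engine without it.

```javascript
// E4X syntax: requires an E4X-capable engine.
var j = 3;             // a number
var myStr = "Hello";   // a string

// 1) Assign XML code directly to a variable.
var myXML = <greeting>Hello</greeting>;

// 2) Use the XML object as a constructor; functionally equivalent.
var sameXML = new XML("<greeting>Hello</greeting>");

// Brace syntax: the JavaScript expression inside { } is evaluated
// and its value is inserted into the XML.
var name = "Joe Marini";
var card = <BusinessCard><name>{name}</name></BusinessCard>;

alert(card.name); // shows the text content: Joe Marini
```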
| | 03:56 | E4X provides several ways of
accessing content that's in XML. So it's very
| | 04:03 | straightforward and the good news is
you use common JavaScript syntax in order
| | 04:08 | to do it. So for example, the bracket
and dot operators work the same way for
| | 04:14 | E4X content, as they do for objects.
The at operator, which is a syntax that's
| | 04:20 | borrowed from XPath, is used to
access attributes on a tag. We'll be seeing
| | 04:26 | live examples of this in a few moments.
| | 04:28 | You can also use arbitrary selectors.
For example, if you have an element that
| | 04:33 | has attributes on it and you've got
multiple of these elements with attributes,
| | 04:38 | you can do some basic filtering by
checking to see if an attribute is equal to
| | 04:42 | a certain value to filter out the
selection of certain nodes in your E4X XML
| | 04:48 | content which is really cool
and we'll see how that works.
| | 04:51 | You can also use some built-in functions,
like length() to see how many child
| | 04:55 | tags a parent tag has. Creating XML
content in E4X is also really easy and
| | 05:00 | straightforward. In fact, one of the
main sources of the power of E4X comes
| | 05:05 | from the way it starts to blur the line
between what's a JavaScript object and
| | 05:10 | what is XML content. So for example,
you can use JavaScript operators like the
| | 05:14 | brackets I've already shown, you can
also use the += syntax, etcetera, to
| | 05:20 | create new content in the XML.
| | 05:23 | You can also modify the XML data
directly in place by placing an E4X expression
| | 05:29 | on the left-hand side of an assignment
operator. So imagine I had a bunch of
| | 05:33 | XML content in E4X format. If I wrote
myXML.title (imagine there's a title element)
| | 05:39 | and I set it to a string, I would
actually modify the contents of that XML
| | 05:43 | element directly in place by
using common JavaScript notation.
| | 05:48 | Deleting XML content is also really
straightforward. You just use the delete
| | 05:51 | keyword. For example, if I had XML
content and I want to delete the title
| | 05:56 | element, I would just simply type
delete myXML.title. The title element or all
| | 06:02 | of them, if there was
more than one, would be gone.
| | 06:04 | I can also delete individual attributes
by using the at syntax. So for example,
| | 06:09 | to delete the name attribute from
the title, I would simply write delete
| | 06:12 | myXML.title.@name. As I mentioned
earlier, you can delete multiple instances
| | 06:17 | of a given element by just
referring to the name of the tag.
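To make the modify and delete cases concrete, here is a short sketch. The book, title and author elements and their values are invented for illustration. As before, this is E4X syntax and only runs in an engine with E4X support.

```javascript
// E4X syntax: requires an E4X-capable engine.
var myXML =
  <book>
    <title name="main">Old title</title>
    <author>Someone</author>
  </book>;

// An E4X expression on the left-hand side modifies the XML in place.
myXML.title = "New title";

// Delete a single attribute using the @ syntax.
delete myXML.title.@name;

// Delete the element itself; if there were several title
// elements, all of them would be removed.
delete myXML.title;
```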
| | 06:21 | So a couple of things to note about E4X
and the way it inter-operates with the DOM.
| | 06:27 | It's important to note that E4X
content are XML objects, they're not DOM
| | 06:34 | objects. So E4X content and DOM XML
are not the same. The reason for this is
| | 06:39 | because E4X creates its own object
types and they don't directly operate with
| | 06:44 | the DOM API that's provided in the
browser. However, we can be clever about
| | 06:49 | this in a couple of ways.
| | 06:51 | You can achieve some measure of
interoperability by using the toString()
| | 06:55 | method on the E4X content. Then you
can go ahead and pass that to a DOMParser
| | 07:01 | object, which will create a DOM
representation of the XML code for you, because
| | 07:06 | remember using the DOMParser we can
create XML content directly from strings.
| | 07:12 | As long as we have a string that we can
pass to a parser, we can create a fully
| | 07:15 | formed DOM document. Remember going the
opposite direction, you can serialize a
| | 07:20 | DOM document to a string using the
XMLSerializer class and you can pass that to
| | 07:27 | the XML constructor to create E4X content.
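The round trip between E4X and the DOM can be sketched with two small helpers. The names e4xToDom and domToE4X are hypothetical. One deliberate substitution: the sketch calls toXMLString() rather than the toString() mentioned in the narration, because toString() returns only the text for an element with simple content, while toXMLString() always includes the tags. The XML constructor is only defined in an engine with E4X support.

```javascript
// Hypothetical helpers for moving between E4X values and DOM documents.

// E4X value -> string -> DOM document
function e4xToDom(e4xValue) {
  // toXMLString() always returns full markup, tags included.
  var text = e4xValue.toXMLString();
  return new DOMParser().parseFromString(text, "application/xml");
}

// DOM node -> string -> E4X value
function domToE4X(domNode) {
  var text = new XMLSerializer().serializeToString(domNode);
  // XML is the E4X constructor; it parses the string into E4X content.
  return new XML(text);
}
```

Either direction goes through a plain string, so the two object models never have to know about each other.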
| | 07:32 | Okay, so I think we've reached the
point now where we've had enough theory,
| | 07:36 | let's go ahead and look at E4X in action!
| | Collapse this transcript |
| Using E4X| 00:00 | Okay, time for some examples with E4X.
So I'm here in the code and I have got
| | 00:06 | my E4X example file open and let me
just scroll through the document so you can
| | 00:11 | see what we are going to be doing.
| | 00:13 | So right up here at the top, I have
got a couple of variables defined.
| | 00:18 | One of them is a string that contains
my name and the other is XML code that
| | 00:24 | I'm assigning to a variable named myXML.
And we have got some functions that we are
| | 00:29 | going to write that demonstrate some
examples and that's pretty much it for this file.
| | 00:35 | So we are going to write these
functions in order and we are going to exercise
| | 00:39 | some of the capabilities of E4X.
Now, remember as far as browsers are
| | 00:43 | concerned, this only works in Firefox.
E4X works also in the latest version of
| | 00:50 | ActionScript but for purposes of our
demonstrations, I'm only going to be
| | 00:53 | showing you these in Firefox
because it has native support built-in.
| | 00:57 | So the first thing I'm going to do is
you can see I have created an alert down here
| | 01:02 | and the alert is going to show
the contents of the myXML variable once
| | 01:07 | the XML has been loaded. And we are
using the same BusinessCard data that
| | 01:13 | we have been using in the external file
cases for loading XML in Firefox and IE
| | 01:20 | and what I have done is I have
copied that XML data right here into my
| | 01:23 | document. So you can go ahead and
copy it along with me if you don't have
| | 01:27 | access to the sample files.
| | 01:29 | The other thing I want to point out is
that I have got a brace syntax here in
| | 01:34 | the name field, so I have replaced my
name with that syntax and you can see
| | 01:38 | I have assigned that variable up here.
So the first thing I'm going to do is save this
| | 01:42 | and bring it up in Firefox to
see if it works and you can see that it does.
| | 01:49 | So the BusinessCard logic got
parsed correctly and you can see that
| | 01:55 | the name inside the brace syntax got
replaced with the expression that evaluated to
| | 02:00 | the string content that's my name
and the rest of the XML is just fine.
| | 02:04 | Okay, so let's go back to the example.
So now that we know that that's working,
| | 02:09 | I'm going to go ahead and comment that
line out. So the first thing that we are going
| | 02:12 | to do is write a few exercises
to see how we can access E4X content. So,
| | 02:20 | the first thing we are going to do
is write a variable named name and
| | 02:25 | we're going to do something very simple.
We're going to say myXML.name and we'll alert that.
| | 02:32 | So, using the dot syntax in E4X you can
refer to any one of the elements in the
| | 02:39 | XML data. You don't have to be
very fancy about how you access it.
| | 02:43 | Here I'm just referring to the variable and
referring to the name of the element that I want.
| | 02:47 | Let's see how that works. I'm going
to go ahead and bring up the browser and
| | 02:55 | you can see that that's bringing up
the text content of the tag that contains
| | 03:00 | my name. So it's not giving me the
entire tag including the angle brackets and
| | 03:04 | the tag name; it's just the text content
which is pretty useful. And in fact,
| | 03:07 | in most real-world scenarios
that's what you want to have happen.
| | 03:10 | Let's get a little more sophisticated.
I'm going to declare another variable
| | 03:15 | named phones and I'm going to assign
that myXML.phone. And what I'm going to do
| | 03:23 | here is alert phones.length(), plus a
string, plus the phones variable and
| | 03:39 | I'm going to say plus phone tags found
and we are going to alert the contents of
| | 03:47 | the phones and we'll
comment out these two guys, okay.
| | 03:52 | So what I'm doing now is accessing the
phone. Notice that there's more than one phone.
| | 03:58 | So let's see what this does.
I'm going to bring up the browser.
| | 04:06 | So you can see in this case that by referring
to a tag that has more than one instance
| | 04:10 | in the XML data it actually came back
with an array of those tags and you can
| | 04:15 | also see that the length function
returned the number of tags that were in that
| | 04:20 | array. So if you get a little bit more
powerful here, this is really great stuff.
| | 04:24 | Using this kind of information,
I would be able to do things like write loops
| | 04:28 | and that kind of stuff so let's
move on and get a little bit more
| | 04:31 | sophisticated. So now we have seen how
to access an entire array of XML data.
| | 04:37 | Let's write some code that accesses
a specific element in the array.
| | 04:43 | So I'm going to say var phoneType =
myXML.phone. Only this time I'm going to use
| | 04:54 | the bracket syntax to get a specific
phone. So I'm going to use index zero and
| | 05:00 | I'm going to write .@type.
| | 05:03 | So let's take a look at the XML code
to see what's going on here. You can see
| | 05:07 | that phone 0 is going to refer to the
first index in the array of phone tags.
| | 05:13 | The @type syntax refers to the type
attribute that's on that phone.
| | 05:19 | So if all goes well, this should return the
string mobile. So let's test that out.
| | 05:26 | alert (phoneType) and let's see if
that works and let's open the browser.
| | 05:39 | Okay, and you can see that worked.
| | 05:40 | So that's an example of extracting an
attribute from the XML code. Let's keep
| | 05:47 | on going. What I'm going to do now
is do some basic filtering. So suppose
| | 05:53 | I wanted to find the phone that
corresponds to a particular value of the type
| | 06:02 | attribute. Suppose I didn't know
which index I wanted to get the, say, fax
| | 06:07 | phone and I had to find it somehow.
So what I'm going to do is write var phone
| | 06:14 | = myXML.phone. And then in
parentheses I'm going to write @type == "fax".
| | 06:28 | Now, this is where we start getting
really powerful with E4X. You can see that
| | 06:31 | we are doing some pretty interesting
filtering operations here using a very simple
| | 06:35 | JavaScript-like syntax.
So now I'm going to alert the phone.
| | 06:44 | So what this should do is come back with the
phone that matches the phone tag that has the
| | 06:50 | type of fax on it.
So let's go ahead and do that.
| | 06:56 | And sure enough it works.
| | 06:57 | So, okay, let's move on to some more
stuff. E4X introduces a new looping
| | 07:05 | construct into the ECMAScript or in
this case JavaScript syntax and perhaps
| | 07:12 | you have used for loops. You might have
even used for in loops. ECMAScript for XML
| | 07:19 | introduces the concept of the for each
loop. So I'm going to write for each (var
| | 07:27 | tag in myXML), alert(tag). So the 'for in'
construct in JavaScript loops over all
| | 07:40 | the properties of a given object, but
in this case what I'm looping over is
| | 07:45 | I'm looping over for each tag that's in
the XML code, and let's see what happens
| | 07:50 | when I run this.
| | 07:56 | So there's the name, right. That's
the content of the name tag. All right,
| | 08:00 | there's the first phone, there's the
second phone, there's the third phone and
| | 08:06 | there's the email address. That's a
pretty powerful construct to be able to
| | 08:10 | access XML in E4X syntax.
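For anyone following along without the sample files, the access examples above can be sketched like this. The phone numbers, the email address and the third phone's type are placeholders, not the values from the course file. This is E4X syntax and only runs in an engine with E4X support.

```javascript
// E4X syntax: requires an E4X-capable engine.
var name = "Joe Marini";
var myXML =
  <BusinessCard>
    <name>{name}</name>
    <phone type="mobile">(415) 555-0100</phone>
    <phone type="fax">(415) 555-0101</phone>
    <phone type="work">(415) 555-0102</phone>
    <email>joe@example.com</email>
  </BusinessCard>;

// Dot syntax: refer to a child element by name.
alert(myXML.name);

// A tag that occurs more than once comes back as a list.
var phones = myXML.phone;
alert(phones.length() + " phone tags found");

// Bracket syntax picks one item; @ reads an attribute.
var phoneType = myXML.phone[0].@type;
alert(phoneType); // "mobile"

// Filtering: keep only the phone whose type attribute is "fax".
var fax = myXML.phone.(@type == "fax");
alert(fax);

// for each loops over the values rather than the property names.
for each (var phone in myXML.phone) {
  alert(phone);
}
```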
| | 08:14 | So enough with accessing examples.
Let's move on to creating XML using E4X.
| | 08:22 | So what we are going to do now is switch
gears a little bit and we are going to
| | 08:25 | write some example code that creates
new XML content on the fly using this XML
| | 08:32 | as a starting point.
| | 08:36 | So in the create E4X function what I'm
going to do here is I'm going to write
| | 08:40 | myXML.BusinessCard, because that's the
root tag in the XML that we have above.
| | 08:48 | I'm going to write += and I'm
going to write phone type = home
| | 08:58 | and I'll write some sample code in
here and make sure it works okay and
| | 09:05 | we'll write 415-555-1111. Okay, and now
we're going to alert myXML to see what happened.
| | 09:17 | Okay, so if all goes well, this should
be adding a new phone tag to the end of
| | 09:25 | the BusinessCard content, just inside
the closing tag right here. So after
| | 09:30 | this email that's what we are going to
do. We're going to put some new content
| | 09:34 | in there. Let's see what happens
when we run this in Firefox.
| | 09:42 | And you can see that here is the XML that we
had originally defined and there's the new phone
| | 09:47 | that just got added and that's
my new fictitious home phone.
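For comparison, here is a hedged sketch of the same append in Python's ElementTree. SubElement adds the new tag as the last child of the root, which is similar in spirit to the += operation above; the starting markup is assumed sample data.

```python
import xml.etree.ElementTree as ET

card = ET.fromstring(
    "<BusinessCard><name>Joe Marini</name>"
    "<email>joe@example.com</email></BusinessCard>"
)
# Append a new phone element as the last child of the root.
home = ET.SubElement(card, "phone", {"type": "home"})
home.text = "415-555-1111"
result = ET.tostring(card, encoding="unicode")
print(result)
```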
| | 09:52 | So let's move on to another example.
This time we are going to use the new XML syntax.
| | 09:58 | So let me comment these guys
out and this time I'm going to write
| | 10:03 | myXML.BusinessCard += and now I'm going
to write new XML. And in this case,
| | 10:15 | I'm going to write phone type = pager and
I'll put some fictitious pager in there.
| | 10:28 | So like 415-555-6789, and
we'll close off that phone tag.
| | 10:39 | So now I'm using the new XML. This is more
the object-oriented syntax and we'll just
| | 10:46 | copy and paste this alert in there.
Let's see what happens in this case.
| | 10:52 | And lo and behold you can see that
the new XML syntax works just as well.
| | 11:01 | Moving on, let's try one more example
of creating new XML content. What we are
| | 11:07 | going to do is we are going to modify
the existing content of this BusinessCard
| | 11:13 | construct. What we are going to do is
we are going to modify the contents of
| | 11:17 | this phone tag right here. We are
going to change the number. So to do that
| | 11:22 | I write myXML.phone and since it is a
zero-based index I'm going to write 1 and
| | 11:32 | I'm going to write = <phone type="work">
and I'm going to write 800-555-0000.
| | 11:45 | Close off the tag, okay. So this
should replace what's currently in the second
| | 11:53 | phone tag and you can see the current
number is 555-9876. So when this code
| | 12:01 | executes the 9876 should be replaced
with these four zeros. All right,
| | 12:05 | let's run that to make sure.
| | 12:12 | And sure enough you can see
that's exactly what happens.
| | 12:15 | So using E4X it's really easy to
manipulate and change the content of XML using
| | 12:23 | standard scripting constructs.
Let's move on to the next example.
| | 12:29 | I'm going to comment these guys out.
| | 12:31 | So now we are going to look at the
ways to delete content from E4X using the
| | 12:36 | delete keyword. So the first thing we
are going to do is type delete myXML.name
| | 12:44 | and this should remove the name tag
from the XML construct. So when we do the alert,
| | 12:54 | this tag should not be there.
So let's go ahead and browse.
| | 13:03 | And you can see that it is gone. So far,
so good. Let's move on to the next example.
| | 13:08 | We're going to write delete. In this case we
are going to delete a whole series of tags
| | 13:13 | and we are going to
write delete myXML.phone.
| | 13:19 | Now because phone refers to a tag
that has multiple instances, it's going to
| | 13:26 | delete all of them.
Let's go ahead and look.
| | 13:33 | Oops! Uh, there we go.
I left both alerts in there.
| | 13:36 | Okay, so you can see that now the only
thing left is the email tag, all right.
| | 13:41 | So we'll go ahead and click OK. For the
last example we're just going to delete one
| | 13:45 | single attribute. So what I'm going to
do is comment these guys out. So what
| | 13:50 | I'm going to do now is delete a single
attribute and I'm going to delete the
| | 13:53 | type attribute from the very first
phone tag. Type delete myXML.phone[0].@type and
| | 14:02 | that should just simply delete this
attribute right here. It's going to get rid
| | 14:08 | of the type = mobile. Okay, so let's go
back down to the code. All right, save
| | 14:14 | and we are going to bring this up in Firefox.
| | 14:18 | And you can see that the type
attribute is now gone, okay.
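The three kinds of deletion (one attribute, one tag, and every tag with a given name) can be sketched together in Python's ElementTree. This is an illustration under assumed sample data, not the E4X code from the lesson.

```python
import xml.etree.ElementTree as ET

card = ET.fromstring(
    "<BusinessCard>"
    "<name>Joe Marini</name>"
    '<phone type="mobile">415-555-4567</phone>'
    '<phone type="work">800-555-9876</phone>'
    "<email>joe@example.com</email>"
    "</BusinessCard>"
)
# Delete one attribute from the first phone only.
del card.find("phone").attrib["type"]
first_phone_attrs = dict(card.find("phone").attrib)

# Delete a single tag, then every tag with a given name.
card.remove(card.find("name"))
for phone in card.findall("phone"):  # findall returns a list, so removal is safe
    card.remove(phone)
remaining = [child.tag for child in card]
print(first_phone_attrs, remaining)
```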
| | 14:22 | Let's move on to our last E4X example.
What I'm going to do here is show you
| | 14:26 | how you can work with strings and XML
and the DOM to interchange data between
| | 14:34 | E4X and DOM constructs. So a couple of
things we need to do here. What we are
| | 14:39 | going to do is first create a DOM
parser. So we'll type var oParser =
| | 14:46 | new DOMParser, and if you haven't seen
the examples earlier on in this section,
| | 14:53 | you might want to go back and take a
look at the lessons where I cover what
| | 14:57 | the DOMParser does because
we'll be using it here.
| | 15:00 | So now I'm going to write var
xmlDoc = oParser.parseFromString and
| | 15:12 | parseFromString takes two arguments
and the second one is the mime type,
| | 15:18 | so I'm going to type application/xml.
And in the case of the first argument what
| | 15:24 | I'm going to do is get a little bit clever
and I'm going to write myXML.toString.
| | 15:29 | So I'm going to convert the XML code
in the E4X construct into a string.
| | 15:33 | I'm going to serialize it into a string.
I'm going to pass it off to the DOMParser.
| | 15:37 | This will give me back an
honest to goodness DOM document.
| | 15:40 | So now I'm going to do a little bit of
DOM manipulation on the content that we
| | 15:44 | just created. So I'm going to write
var oNode = xmlDoc.createElement and
| | 15:54 | I'm going to create an element named
createdInDom. To make it easier to
| | 16:02 | read there, okay.
| | 16:04 | So, I have got an element now called
createdInDom and now I need to get the
| | 16:09 | BusinessCard root tag and put this
inside the business card.
| | 16:15 | So I'm going to write var oBC = xmlDoc.
getElementsByTagName and I'm looking for
| | 16:28 | the BusinessCard element and there's only
one of those. So I'm going to get the
| | 16:33 | zeroth element that comes back
from that array and I'm going to write
| | 16:38 | oBC.appendChild. So using the DOM,
I'm going to put in the node that we just created.
| | 16:48 | And now I'm going to write alert
| | 16:54 | new XMLSerializer().serializeToString
and we are going to serialize out
| | 17:03 | the XML document. Once we do that, we
are going to write myXML = new XML();
| | 17:15 | So we are going to convert the DOM back
into E4X content now and basically
| | 17:20 | we're going to take the same call that we
just did here. We're going to serialize
| | 17:24 | the DOM out to a string, pass that
back to E4X, and then we'll just alert myXML.
| | 17:32 | Okay, so let's go over quickly what
we were doing here before I run the
| | 17:36 | example. So I have created a DOM parser
and I have converted the E4X XML into a string
| | 17:42 | and now I'm going to create an
XML DOM document out of that. When I have
| | 17:45 | the DOM document created, I'm going to
create a new element called createdInDOM
| | 17:49 | and I'm going to put that at the
end of the business card root tag and then
| | 17:54 | I'm going to alert to make sure it
worked. And then I'm going to take the DOM
| | 17:57 | and convert it back into E4X content
using the new XML construct. All right,
| | 18:04 | drum roll please. Let's see if it works.
| | 18:10 | Bring up Firefox.
| | 18:13 | Here you can see
the business card and there's my
| | 18:16 | createdInDom element. So it looks like
that worked just fine. So we now have a
| | 18:24 | DOM document that we created from our
E4X content and I was able to manipulate it
| | 18:28 | using the DOM. Now when I click OK,
what should happen is the DOM should get
| | 18:33 | serialized back out to a string and
converted back into E4X content and it did.
| | 18:39 | There it is. You can see that it is
the E4X content. There's the createdInDOM
| | 18:44 | element that we created.
| | 18:45 | So this is a way of interchanging
content between the E4X constructs and the DOM.
| | 18:52 | So you can use E4X for what it's
good for, you can use the DOM for what
| | 18:57 | it's good for, and if you need to, you can
transmit information back and forth using
| | 19:01 | serialization. And now you are ready to
go out and use this in your own projects.
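The serialize, edit in the DOM, and parse back round trip is not specific to E4X. As a minimal sketch, Python's standard library has the same split between a lightweight tree API (ElementTree, standing in for E4X here) and a W3C-style DOM (xml.dom.minidom); the sample markup is an assumption.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

card = ET.fromstring("<BusinessCard><name>Joe Marini</name></BusinessCard>")

# 1. Serialize the tree to a string and parse it into a DOM document.
doc = minidom.parseString(ET.tostring(card, encoding="unicode"))
# 2. Manipulate it with DOM calls, as in the lesson.
node = doc.createElement("createdInDom")
doc.getElementsByTagName("BusinessCard")[0].appendChild(node)
# 3. Serialize the DOM and parse it back into an ElementTree element.
card = ET.fromstring(doc.documentElement.toxml())
tags = [child.tag for child in card]
print(tags)
```

The string is the interchange format in both directions, so each API only ever sees well-formed XML text.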
4. Designing and Implementing an XML Format
| Understanding XML formats| 00:00 | We have now reached the point in the
course where we have seen enough and
| | 00:03 | learned enough to design our own XML
format. Before we do that, I'm going to
| | 00:09 | show you an example of where we are
going to be using our XML format and why we
| | 00:14 | are going to be designing it.
| | 00:15 | So let me switch over to the browser
here for a moment. Here I'm in a browser
| | 00:19 | and we are looking at a fictitious
company called Teacloud. Teacloud is a
| | 00:25 | website and company that people go to
to learn about teas and brewing teas.
| | 00:30 | It's basically all about tea.
| | 00:33 | In addition to being a destination
site to learn about tea, Teacloud has a
| | 00:38 | product catalog and they sell products
online. So we are going to switch over
| | 00:41 | to the Our Products section. You can
see here that there are two different
| | 00:45 | categories of products that they have.
They have Kettles & Teapots and they
| | 00:48 | have Teacloud Teas.
| | 00:50 | So here in the Kettles & Teapots
section, you can see that there's a table of
| | 00:54 | products here and each one has a
picture along with the name and a price and
| | 00:59 | there's a description to go along with each.
| | 01:01 | So let's switch over to the Teas for
a moment. Now in the case of the Teas,
| | 01:07 | there's no picture but there's a name
and there's a description and the price
| | 01:12 | is for a given unit of weight,
in this case, per pound.
| | 01:16 | So this is the site that we are going
to be working on. Our job is going to be
| | 01:20 | to implement this product catalog
using XML. Once we have designed the XML
| | 01:25 | format, we are going to look at the
code that's used to read the XML data in
| | 01:30 | and present it here in the webpage
both in Internet Explorer and in Firefox.
| | 01:36 | Okay, so let's go ahead and get started.
| Avoiding common design mistakes| 00:00 | Okay, so before we get started
designing our own XML file format, let's take a
| | 00:04 | look at some common XML design
mistakes. So we can make sure that we don't
| | 00:09 | repeat them.
| | 00:11 | Mistake number 1 is using the word XML
as the document root. I have seen this
| | 00:17 | from time to time. In fact, it's not
just using XML. It's using any word that
| | 00:22 | begins with XML. Now this is not
specifically an error but you shouldn't do this.
| | 00:27 | The reason is because the term
XML along with words that begin with the
| | 00:33 | xml is reserved for use by the
W3C and the XML specifications.
| | 00:39 | Besides root tags in documents should
be descriptive and they should reflect
| | 00:44 | the type of document that they are
representing. If you name your root tag XML,
| | 00:49 | you are not following that principle.
Of course it's XML. What else would it
| | 00:52 | be? In fact, if you were to add the XML
declaration above this, you would make
| | 00:57 | it even more obvious. So don't name your
root tags XML, choose a descriptive name.
| | 01:02 | Common mistake number 2 is including
information in an XML file that is not
| | 01:08 | itself XML. You can see an example of
that here. You can see this is a tag that
| | 01:14 | describes a file and the fileInfo. The
name is pretty clear but then there's
| | 01:18 | this thing called attributes. If you
have ever worked on a file system,
| | 01:22 | you would know that files have got
permissions on who can read, who can write and
| | 01:26 | who can execute them.
| | 01:27 | The problem with this is that you are
forcing the person consuming this XML
| | 01:30 | file to do further processing that
number 1, the XML Parser can't do for them
| | 01:37 | and number 2, is not in a very obvious
format. If I didn't know that files had
| | 01:42 | attributes like this indicating who had
read and write and execute permissions
| | 01:48 | and that these permissions were
grouped into three categories for the world,
| | 01:51 | group and individual, then
I would have no idea what this means.
| | 01:55 | So don't include data in an XML file
that has to be processed further beyond
| | 02:01 | what the XMLParser can already do.
You have got this powerful XMLParser.
| | 02:05 | You should make it do all the hard work.
Don't force the people who consume your
| | 02:09 | document format to do it. This is an example of
including a format inside the XML file
| | 02:14 | that's either proprietary or
specific to one system or another. You could
| | 02:20 | easily rewrite this format using XML and
that's what should have been done here.
| | 02:24 | The idea is to aim for clarity and
ease of use in the XML rather than
| | 02:29 | compacting the syntax down as much as
possible in order to save room. Don't be
| | 02:34 | afraid of being verbose in your XML
files. XML files are supposed to be as
| | 02:39 | self-documenting as they can possibly
be. I should be able to consume data in
| | 02:44 | an XML file without having to worry
about which system things came from or
| | 02:48 | having to use processing
techniques beyond what the parser gives me.
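To make mistake number 2 concrete, here is a small sketch comparing a packed permission string with the same facts written as XML. Both fragments are hypothetical (the slide's exact markup is not in the transcript), including the tag and attribute names.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: permissions packed into a string the parser cannot split.
packed = ET.fromstring(
    "<fileInfo><name>report.txt</name>"
    "<attributes>rwxr-xr--</attributes></fileInfo>"
)
# The consumer has to know the layout and slice the string by hand.
flags = packed.find("attributes").text
owner_can_write_packed = flags[1] == "w"

# Hypothetical rewrite: the same facts expressed as XML.
explicit = ET.fromstring(
    "<fileInfo><name>report.txt</name><permissions>"
    '<owner read="true" write="true" execute="true"/>'
    '<group read="true" write="false" execute="true"/>'
    '<world read="true" write="false" execute="false"/>'
    "</permissions></fileInfo>"
)
# Here the parser does the work and the names document themselves.
owner_can_write = explicit.find("permissions/owner").get("write") == "true"
print(owner_can_write_packed, owner_can_write)
```

The second form is longer, but nothing about it depends on knowing one file system's conventions.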
| | 02:53 | Mistake number 3 involves being too
precise with your tag names. So let's take
| | 02:58 | a look at this example. On the left
hand side, we have a snippet of XML code
| | 03:03 | that encapsulates a furniture order.
You can imagine that this had come from
| | 03:08 | some furniture store or a factory or
something. We have the same thing on the right.
| | 03:13 | Now on the left you notice that each
tag is named individually and they are
| | 03:18 | very specific. So this one here is
a sleepersofa. We have a queenbed,
| | 03:21 | coffeetable and so on. Whereas on the
right, we have got a much looser
| | 03:26 | coupling between tag names.
| | 03:28 | So instead of calling this sleepersofa,
this over here is just sofa, with an
| | 03:32 | attribute of isSleeper. Here we have
bed and beds can have different sizes. So
| | 03:37 | we have size = "queen" and so on down the line.
| | 03:40 | Now on the left, the reason why this
is a problem is because this causes an
| | 03:43 | unnecessarily tight, what we call,
coupling with the underlying processing
| | 03:48 | code. Imagine a situation where I had
some XML processing code that counted up
| | 03:53 | the number of tables in a furniture order.
| | 03:57 | Well, on the left hand side, if I add a
new table type and I call it something
| | 04:02 | else, like nightstand or endtable or
whatever. If I want to do that, I then
| | 04:06 | have to go back and change the
processing code because I have to account for
| | 04:10 | the new table name
because I have added a new tag.
| | 04:13 | In the example on the right hand side,
that code could be left as it is because
| | 04:17 | all I would do is add a new table tag.
I would simply describe it in the
| | 04:22 | attribute that I have here for type.
So whatever code I had that did things
| | 04:28 | like counted up tables or retrieved
all the table tags and added up their
| | 04:32 | prices that were in a different
attribute, all of that would remain the same.
| | 04:35 | The other problem with the code on
the left versus the code on the right is
| | 04:39 | that it makes writing things like
XPath expressions and XSLT templates a lot
| | 04:44 | more complex. Because again I have
gotten so specific with my tag names that
| | 04:49 | every time I add a new one or if I want
to change the code around or if I want
| | 04:53 | to change the order in which tags
appear in the XML, chances are I'm going to
| | 04:57 | have to go back and change the
corresponding XPath or XSLT templates.
| | 05:02 | So this is a bit more of an art than
it is a science. There's never any real
| | 05:07 | right or wrong answer, except maybe
in this case. The idea here is try to
| | 05:12 | choose tag names that represent base
level classes or objects and save
| | 05:18 | descriptions for attributes or child tags.
| | 05:22 | So a good example here is the
sleepersofa. This is clearly an adjective
| | 05:26 | describing a noun. So try to use
nouns as your tag names and save
| | 05:33 | attributes for descriptive
things like adjectives and so on.
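The table-counting argument can be sketched in a few lines. The furniture order below is hypothetical and only follows the naming idea from the lesson (generic tags, descriptive attributes); the root tag name is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical furniture order using generic tags plus descriptive attributes.
order = ET.fromstring(
    "<furnitureOrder>"
    '<sofa isSleeper="true"/>'
    '<bed size="queen"/>'
    '<table type="coffee"/>'
    '<table type="nightstand"/>'
    "</furnitureOrder>"
)
# A new kind of table is just a new type value, so this query never changes.
table_count = len(order.findall("table"))
print(table_count)
```

With tags named coffeetable and nightstand instead, the query would need a new branch for every table type added.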
| | 05:37 | Okay, mistake number 4 involves being
far too compact with your syntax. Take a
| | 05:44 | look at the example here. Okay, there
I have got some XML code and it's really
| | 05:48 | compact, but I have no idea what any of
this means, like what is an f-o? Like
| | 05:53 | what is isSl? Just by looking at
this I have no idea what any of this is.
| | 05:57 | Well, I can make it a lot clearer,
suppose I did this. Now it's a lot clearer.
| | 06:01 | See now my tag names are a lot more
descriptive and we can see that this is an
| | 06:05 | evolution of the furniture order
that we had from the previous example.
| | 06:10 | Once again, we have drastically
improved the readability and usability of this
| | 06:16 | XML file just by being a lot more
descriptive with our tag names. So we can see
| | 06:21 | now that this is a furniture order and
it is grouped by living room and bedroom
| | 06:26 | and there are tags that represent items
that would go in each. We have expanded
| | 06:30 | the tag names and attributes and
we have added some pricing information.
| | 06:34 | This is much easier to understand.
| | 06:36 | So the lesson here is don't worry too
much about making your code compact.
| | 06:40 | You need to aim for readability and
maintainability. Verbosity is not necessarily
| | 06:44 | a bad thing in XML.
| | 06:46 | Okay, so now that we have seen how not
to do things, let's take a look at some
| | 06:50 | design tips for how to make good
XML and that's our next lesson.
| Planning design and development| 00:00 | Okay, let's take a look at some design
and development tips that you can follow
| | 00:03 | when creating an XML format. To begin
with, create the requirement summary that
| | 00:08 | you are trying to implement because
this helps you ferret out the tags and the
| | 00:12 | attributes that you are going to need.
| | 00:14 | What you will find is when you write
things down, nouns will typically become
| | 00:18 | tags and adjectives will typically
become attributes, although they may become
| | 00:23 | tags themselves if they are somewhat
complex. Verbs will typically become functions.
| | 00:29 | Once you have done that, you can
identify the base tags. Now these are tags
| | 00:34 | that will serve as containers and
wrappers for other tags. The reason you want
| | 00:40 | to do this is because this ensures that you
get all of your collection tags in place first.
| | 00:45 | So in the previous example, we would
have things like maybe tables or rooms
| | 00:51 | encompassing things like the living
room or bedroom, sections of the furniture
| | 00:55 | order. Once you have done that, you
define the tags that are going to make up
| | 00:59 | the bulk of the data. These are your
base level objects. These are the things
| | 01:03 | like the sofas and tables that we saw in
the furniture example in the previous lesson.
| | 01:09 | Once you have done that, if you feel
up to it you can create the associated
| | 01:12 | schema or DTD. In fact, you should
probably create the schema or DTD in
| | 01:18 | parallel with your XML data or maybe
even beforehand. That's another tip that
| | 01:21 | we'll look at in a minute. Now we
are not going to do that in this title
| | 01:25 | because that's relatively involved and
we want to get straight to the data. So
| | 01:30 | we are going to skip that part.
| | 01:31 | Then once you have created the data,
you can go ahead and write the associated
| | 01:35 | style sheet or CSS or script that goes
along with your data. One of the common
| | 01:40 | questions that people who are
designing XML tag sets come across is when
| | 01:45 | should I choose between using tags and
using attributes? The short answer is
| | 01:53 | there's really no hard and fast rule.
You need to use your better judgment, but
| | 01:57 | there are some guiding
principles you can follow.
| | 01:59 | Typically, you should prefer tags
when you have to represent data that's
| | 02:04 | relatively complex or data that can
be broken down into multiple parts.
| | 02:10 | You should use attributes as modifiers
for data or containers for simple data.
| | 02:17 | Now another guiding principle that you
should follow on top of that is if data is
| | 02:22 | suitable for an attribute but it
could end up as multiple attributes on the
| | 02:27 | same element, then you should
probably use child tags instead.
| | 02:31 | So let's take a look at an example.
Here I have an XML fragment that defines a
| | 02:36 | movie. You can imagine that this XML
fragment is used in an XML dataset for, say,
| | 02:44 | a video rental store. So here we have
a tag that defines movie. There's an
| | 02:49 | attribute inStock and that
can either be true or false.
| | 02:54 | So in this case inStock is clearly an
attribute. Why? Because it describes a
| | 02:59 | property of the movie. So this is an
adjective and it describes whether the
| | 03:02 | movie is in or out of stock. Now a
movie can either be in stock or out of
| | 03:07 | stock. There's no chance of inStock
appearing more than once on the movie tag.
| | 03:12 | It's a relatively simple attribute.
| | 03:14 | Although, consider a case of something
like title. Now you might be wondering
| | 03:18 | why isn't title an attribute? The
reason is because sometimes movies have
| | 03:22 | different titles in different languages.
So although I could have solved this
| | 03:27 | problem by putting something like
title-us = movietitle and title-fr =
| | 03:35 | theFrenchversion, that kind of defeats
the purpose of using child tags. What
| | 03:40 | I'm doing here is I'm using the base
tag title and then decorating that with a
| | 03:45 | language attribute that
describes which language that title is.
| | 03:49 | Same idea with price.
| | 03:51 | I could have placed price on the movie
as an attribute. In fact, if I only sold
| | 03:57 | the movie in the United States, I might
just do that. However, if you sell your
| | 04:01 | products in multiple countries for
multiple different pricing units or if
| | 04:06 | you have prices that reflect discounts,
that might not be such a great idea.
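Here is a minimal sketch of the movie fragment described above and how a consumer might read it. The title text, the lang and currency attribute names, and the price are all made up for illustration; only inStock and the title and price tags come from the lesson.

```python
import xml.etree.ElementTree as ET

# Hypothetical movie record; titles, price and attribute names are made up.
movie = ET.fromstring(
    '<movie inStock="true">'
    '<title lang="en">The Example</title>'
    '<title lang="fr">Un Exemple</title>'
    '<price currency="USD">14.99</price>'
    "</movie>"
)
# A simple yes/no fact that can appear only once fits an attribute.
in_stock = movie.get("inStock") == "true"
# Data that can repeat fits child tags, each decorated with a modifier.
titles = {t.get("lang"): t.text for t in movie.findall("title")}
print(in_stock, titles["fr"])
```

Adding a third language is one more title tag, with no new attribute names for the processing code to learn.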
| | 04:10 | So follow these principles and you will
usually end up at the right conclusion.
| | 04:14 | You will probably find out pretty
quickly if you got it wrong, but in essence, use
| | 04:19 | tags for complex data, use attributes
for simple data and modifiers for data.
| | 04:25 | Remember that if an attribute could
possibly be used more than once, then break
| | 04:30 | it up into a child tag.
| | 04:32 | We talked about this a little bit
beforehand. You should create your schema or
| | 04:35 | your Document Type Definition, DTD
before or maybe during your design process,
| | 04:41 | not at the end. This is assuming that
you are even going to do this. The reason
| | 04:45 | for this is because it removes the
temptation to try to retrofit the rules of
| | 04:49 | the schema to fit your data, instead of
designing the data correctly right upfront.
| | 04:54 | Now if you follow this process, you
will catch design errors sooner, if you think
| | 04:59 | about your schema beforehand. If you do
this, you can also do data testing and
| | 05:03 | validation in parallel with the
development process. Like I said, we are not
| | 05:08 | going to do this in this example
because it is pretty involved and we want to
| | 05:11 | keep this a little bit
higher level and instructional.
| | 05:14 | Okay, so we have reached the point now
where we are ready to go ahead and start
| | 05:18 | designing our tag sets. So
let's do that in the next lesson.
| Creating the Tag set| 00:00 | Let's create the tag set for our
particular web site here. So as I indicated in
| | 00:05 | the last lesson, it's usually a good
idea to write things out. That will lead
| | 00:08 | you to the tags that you need to
create for your particular tag set.
| | 00:13 | So let's read what I have got here.
So 'In addition to being a place to learn
| | 00:16 | about tea, Teacloud sells products on
our site: kettles and teas. Kettles have
| | 00:23 | an associated product image, name
and price along with a description.
| | 00:26 | Teas have a name, price, and unit for the
given price and weight. They don't have an
| | 00:33 | associated product image.'
| | 00:34 | So if we go ahead and underline the
various nouns that we have here in
| | 00:40 | the paragraph, we can very quickly get an
idea of what our tags are going to need to be.
| | 00:47 | Let's switch over to the code
and take a look at our site and the XML file
| | 00:53 | that we need to build.
| | 00:54 | So I'm here in the code now for the XML
file that we need to build. I have got
| | 01:01 | a little bit of a start here by listing
out all of the products that the site sells.
| | 01:06 | So the first three here are teakettles,
and the next three are specific teas.
| | 01:13 | So we need to turn this
start file into a finished XML file.
| | 01:19 | So recall that when we wrote out our
paragraph describing the various nouns
| | 01:24 | and adjectives and so on. We very
quickly realized that there were products
| | 01:28 | that we sell. So it's
probably a good place to begin.
| | 01:30 | So let's begin by writing first of
all our XML declaration, because that
| | 01:35 | should come at the top of every XML file. So we'll write:
| | 01:40 | <?xml version="1.0" encoding="UTF-8"?>
| | 01:55 | Now this will usually be done for
you by your XML editor, but the default
| | 01:59 | encoding is going to be UTF-8 anyway.
I just want to be explicit about it.
| | 02:03 | So let's begin by making our base
class tag. Now this is the tag that someone
| | 02:08 | reading XML file would look at and
quickly get an idea of what the contents of
| | 02:12 | the file are.
| | 02:14 | So it seems to me that since we are
selling products for the TeaCloud site,
| | 02:18 | as good a name as any is going to
be something like teaCloudProducts.
| | 02:27 | So now that we have our base tag, we
can start putting in our container tags.
| | 02:32 | The container tags are going to be
tags that group together individual tags
| | 02:36 | that are related.
| | 02:37 | So recall back from the example when we
wrote out the paragraph that our website
| | 02:42 | sells kettles and teas. So what I'm
going to do here is create a tag named
| | 02:48 | kettles and I'm going to create
one named teas. Okay, so far so good.
| | 02:57 | Now we have got three of each here.
We have three kettles and we have three
| | 03:01 | teas that are available for sale.
So I'm going to go ahead and inside
| | 03:04 | the Kettle section I'm going to create a
tag named kettle. Each kettle is going to
| | 03:12 | have information associated with it.
So at this point let's stop and take
| | 03:16 | a look at some of the kettles.
| | 03:18 | So we can see that each teakettle has
a name and it has a price and there's a
| | 03:22 | path in the assets folder
that has its related image.
| | 03:27 | So if we scroll over here and look in
the images, we can see under Products,
| | 03:34 | under Kettles... So there are images
that correspond to each one of these guys.
| | 03:39 | Now image paths are not likely to be
different for each one of these tags in
| | 03:43 | this example. In fact, there's only one.
So what we are going to do is
| | 03:46 | we'll make the image an attribute,
and we'll just copy this data right here.
| | 03:55 | We also need a place to put the name.
So we'll make the name an attribute
| | 04:00 | as well and we'll just
put that right up in here.
| | 04:10 | We have got one piece of information
left to go in attributes. That's price.
| | 04:14 | Now in this case, we only sell in the
U.S. let's say and we only have one price.
| | 04:17 | So I'll make that an
attribute. That's 49.95 for this one.
| | 04:25 | Last but not least, we have the
description. So the description is some pretty
| | 04:29 | long text and I'm going to make that
the text content of the kettle tag.
| | 04:37 | So now we just need to do this for each
one of the kettle tags. We'll just copy
| | 04:45 | that information to each one.
| | 04:47 | So here is the Earl's Grey one.
In fact, I need to entity escape that
| | 04:52 | apostrophe right there. So I'm going
to write ampersand, apos, semicolon.
| | 04:57 | You need to do entity escaping when you're
putting things like quotes inside XML files.
| | 05:03 | We'll have that be that price
and the image here is Earl's Grey instead.
| | 05:13 | So I'll replace that.
| | 05:16 | And the description. Okay, that's the
second one. We have got one more to go.
| | 05:25 | So we'll do the description...
| | 05:30 | and we'll do the name...
| | 05:36 | and we'll do the price...
| | 05:40 | and we'll change the image name. Okay.
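As a rough sketch of where the file stands at this point, here is one kettle entry parsed with Python's ElementTree. The price, image path and description are placeholders, not the values from the exercise files. Escaping the apostrophe as &apos; is always safe; strictly speaking it is only required when the attribute value itself is delimited by single quotes.

```python
import xml.etree.ElementTree as ET

# Sketch of one kettle entry; price, image path and description are placeholders.
XML = """<?xml version="1.0" encoding="UTF-8"?>
<teaCloudProducts>
  <kettles>
    <kettle name="Earl&apos;s Grey" price="39.95" image="images/kettle.jpg">
      Placeholder description text.
    </kettle>
  </kettles>
  <teas/>
</teaCloudProducts>"""

# Parse from bytes so the encoding declaration is honored.
root = ET.fromstring(XML.encode("utf-8"))
kettle = root.find("kettles/kettle")
# The parser turns the entity back into a plain apostrophe.
print(kettle.get("name"))
```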
| | 05:48 | Looks like our XML file is taking
shape nicely here. So now we just need to go
| | 05:53 | ahead and do the teas. So inside the
Teas section, I'm going to go ahead and
| | 05:58 | make a tea tag for each one of these guys.
| | 06:02 | Now each tea product has a name and a
price, but there's no image. So we don't
| | 06:07 | have to worry about that. But each one
does have a description. So I'll take
| | 06:10 | the description for each and put that
inside the tea tag. What I'm going to do
| | 06:16 | is make three copies here, because
I've got three teas. I'll copy this one
| | 06:23 | and paste that one in and
then finally this one. Okay.
| | 06:31 | Now I just need to deal with the names and the
prices. So the names are pretty straightforward,
| | 06:36 | because they are the same as
in the kettle case. So I'll just
| | 06:39 | have a name and we'll do some copy
and pasting here and this one too...
| | 06:52 | and finally on this one.
| | 06:59 | Now we need to deal with the prices.
You will notice that the price is not just
| | 07:03 | a price. It's per unit. So we could
just take the approach of saying, well,
| | 07:09 | let's just call this price = and then
26.95 per pound. Now I don't suggest you
| | 07:17 | do this and here's why. What you are
doing is you are combining two different
| | 07:20 | pieces of information in one single attribute.
| | 07:23 | Let's suppose we have some processing
code on the back end that calculated
| | 07:27 | the total price of an order that someone
put together and they ordered a couple of
| | 07:32 | kettles and a couple of teas. Well,
adding up the prices for the kettles is
| | 07:36 | pretty straightforward, because you
have got the numbers in here, but to do
| | 07:39 | that for the teas you have got to go
through some additional processing to
| | 07:42 | strip off this per pound indicator.
| | 07:46 | So it's not a good idea to try to
combine different pieces of information and
| | 07:50 | besides, suppose in the future we
decide to sell teas per ounce for some
| | 07:55 | really expensive teas, or we decide to
switch over to the metric system and
| | 07:59 | sell teas per kilogram.
You want some easy way to change that.
| | 08:04 | So what I'm going to do is leave off
the per pound unit and make a separate
| | 08:10 | attribute named "unit" and that's
going to be pound. I'm going to do that for
| | 08:16 | each one of these guys. price = and
this one is 16.95 and the unit is pound,
| | 08:25 | and then finally this one as well. In
this case the price is 18.95 and the unit
| | 08:35 | is pound. Okay.
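To make the argument for a separate unit attribute concrete, here is a minimal sketch. It assumes the attribute values have already been read from the XML as strings; the item objects, the kettle price, and the orderTotal helper are hypothetical and are not part of the exercise files.

```javascript
// Hypothetical attribute values as getAttribute() would return them:
// always strings, with the unit kept in its own field.
const separated = [
  { kind: "kettle", price: "49.95" },
  { kind: "tea", price: "26.95", unit: "pound" },
];

// The discouraged form: two pieces of information in one attribute.
const combined = [
  { kind: "kettle", price: "49.95" },
  { kind: "tea", price: "26.95 per pound" },
];

// Sum the price attributes, treating each one as a plain number.
function orderTotal(items) {
  return items.reduce((sum, item) => sum + Number(item.price), 0);
}

console.log(orderTotal(separated).toFixed(2)); // 76.90
console.log(orderTotal(combined));             // NaN
```

With the unit split out, the total needs no string clean-up, and switching to ounces or kilograms only touches the unit attribute.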
| | 08:40 | Well, that's pretty much it. We are
now done creating the XML file format.
| | 08:45 | So we'll save it and now it's time
to integrate this with the HTML code.
| | 08:51 | That's the next lesson.
| Integrating XML with design| 00:00 | Okay, so in this lesson we are going to
take some of the concepts that we have
| | 00:03 | learned up until now and specifically
in the previous chapter to integrate
| | 00:08 | the XML file that we have just
built into our TeaCloud website.
| | 00:12 | So let's go ahead and open up our XML
file that we have just built. So recall
| | 00:18 | that this was the XML file that we
created to represent the kettles and teas
| | 00:23 | that our TeaCloud site lists for sale
online. So we go to the index page.
| | 00:30 | This is our TeaCloud site and to see the product
listing, we go to the Our Products section.
| | 00:36 | Now in the Our Products section
there are two individual files, one for
| | 00:41 | teas and one for kettles. So under the
product section here, what we are going
| | 00:46 | to do is open the file for the kettles
and tea pots, and that's this one here,
| | 00:52 | and we are going to open up the HTML file
for the teas and that's this file here.
| | 00:57 | Okay, so you can see that there's
nothing in the page where the product listing
| | 01:00 | is going to appear. So let's switch over
to the Code view and see what's going on.
| | 01:07 | So I'm going to switch to the Split
view here, I'm going to click down here.
| | 01:12 | Okay, now I'm going
to go to the Code view.
| | 01:15 | So you can see that there's an empty
table right here in the teas page that's
| | 01:20 | got an id of products on it and it has
no content, and we are going to take a
| | 01:24 | look over at the kettles page because
this one is also similar. I'm going to
| | 01:29 | click there, go to the code, and you
can see that in the kettles page,
| | 01:32 | same idea, right? We have a table that's
empty and it has an id of products on it.
| | 01:37 | So that's very important for a reason.
Our XML processing code is going to
| | 01:41 | build the content for these tables and then
place it in there when it's done being built.
| | 01:48 | So now let's scroll to the top of the
file and you can see a couple of things.
| | 01:56 | First here, I'm including the cross
platform sarissa.js library, which
| | 02:02 | we covered in the previous chapter, and
this, recall, is the library that allows me
| | 02:06 | to use XML functions across browsers like
Firefox and IE. And the next script line here,
| | 02:13 | this is the code that's going to
be used to build our products list.
| | 02:19 | Let me scroll down a little bit and
you can see that in my window.onload function,
| | 02:25 | in addition to the other stuff
going on, I have a function here
| | 02:28 | called buildProductsList.
| | 02:30 | The story is pretty much the same for
the kettles side. So up here, we have
| | 02:35 | the sarissa library. That's that
right there and here is my products.
| | 02:39 | Now the function called
buildProducts is being called with two arguments.
| | 02:43 | There's this products string and in
this case there's kettles string. Over here
| | 02:48 | it's products and teas.
| | 02:50 | So the buildProducts function is going
to build a list of products depending on
| | 02:55 | whether it's teas or kettles, and
you can see that each one of these
| | 02:59 | corresponds to the name of a tag here in the
XML data and let me switch back over here.
| | 03:08 | So let's take a look at the code
and see how this works.
| | 03:13 | Okay, so I'm just going to go ahead and click
on buildProductsList and that opens up the code.
| | 03:21 | So here we are in the JavaScript code that
is responsible for building the products list.
| | 03:25 | So the first thing that this
buildProductList function does is it retrieves a
| | 03:31 | DOM document reference
using the sarissa library.
| | 03:34 | So recall from our earlier examples in
the previous chapter that this gets us
| | 03:38 | an empty document and what
I'm going to do now is load
| | 03:43 | the teacloudproducts.xml file. So we are
going to set the asynchronous property to false,
| | 03:47 | because I want this to load synchronously
and then on the document I just created,
| | 03:52 | I call the load function, which
loads the teacloudproducts data which
| | 03:56 | we created. Once I have done that,
I get a reference to the table which was
| | 04:02 | supplied as the id argument which is
how we call this function. So this is
| | 04:06 | the products table back in the HTML file.
| | 04:10 | Once I have that, I create a new table body,
because this tbody element is
| | 04:14 | going to hold the created table rows for
the products. Now I check to see which
| | 04:19 | string was passed in and remember the
second argument was the products that
| | 04:24 | we're building a list for. So it's going
to be one of two cases. Either it's going
| | 04:27 | to be teas or it's going to be
kettles and that is this argument here.
| | 04:34 | So in the case of teas, what we do is
we have a variable here that we declare
| | 04:40 | and that is the result of the
getElementsByTagName on tea and this is the
| | 04:48 | set of tea elements that we created in our
XML file and that's going to come back
| | 04:53 | as an array of tea tags. So we
store aside the number of tea tags by
| | 04:58 | calling the length property on the array.
| | 05:01 | Now we are going to loop over each one
of these guys and build up the table row
| | 05:05 | and the table cells that are going to
contain each piece of data. We do this with this line
| | 05:11 | of code here, so we create a new table
row element and then we create a table cell
| | 05:15 | to go inside of it and we append
the table cell into the table row.
| | 05:20 | Now once we have done that, we need
to extract the name of the tea and the
| | 05:24 | price per unit to build up the string
representation of the product. So we are
| | 05:29 | going to create a div that's going to
hold all this information and we do that
| | 05:32 | by calling the create element function.
So we create a div. Then we create
| | 05:36 | a paragraph that's going to hold the
text for the tea node. Then we create a span,
| | 05:40 | because I'm going to wrap the tea
name in bold text to make it stand out.
| | 05:46 | So once we have created these three
elements, we retrieve the information from
| | 05:50 | the XML data. So as we are looping
through this loop here, we are counting over
| | 05:55 | the contents of this array. So each
one of these array elements is a tea tag.
| | 06:00 | So for each tea tag, we retrieve the name,
the price and the units. So this will
| | 06:05 | be the name of the tea, the price and
this is going to be the weight unit in
| | 06:09 | either pounds or
whatever we make it in the future.
| | 06:12 | So we have set the span's className
attribute to be productName and in order
| | 06:17 | to make this work across browsers, we
have to fix this. The name className is
| | 06:21 | used in IE, because class is a
reserved word in JavaScript, whereas Firefox
| | 06:26 | accepts class directly.
| | 06:28 | So what we're going to say is oSpan.setAttribute.
I'm going to quickly modify this to check
| | 06:33 | window.event: if that's not
equal to null, then we know we are in IE and we say className.
| | 06:39 | In a non-IE browser,
we can just say class, and we are going to set
| | 06:46 | that to be the CSS style productName,
and I've defined a CSS style and
| | 06:50 | we can look at that really quickly.
I'm just going to open the CSS style file here.
| | 06:54 | We scroll down to the bottom.
You can see I have defined a productName class
| | 06:59 | that just simply sets the font
weight to be bold and I have also decided
| | 07:04 | to create a style rule for the table cells,
which puts a border around the cells.
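The attribute-name switch described here can be sketched roughly as follows. The function names and the isOldIE flag are hypothetical; in the lesson the flag comes from testing whether window.event is non-null. The second function is an alternative that is not the lesson's approach: it assigns the className DOM property, which avoids the browser check.

```javascript
// Sketch of the attribute-name switch described in the lesson.
// isOldIE is a hypothetical flag; the lesson derives it by testing
// whether window.event is not equal to null.
function setProductNameClass(oSpan, isOldIE) {
  // Older IE expected "className" in setAttribute; other browsers use "class".
  oSpan.setAttribute(isOldIE ? "className" : "class", "productName");
}

// Alternative with no browser check (not the lesson's approach):
// className is the standard DOM property behind the class attribute.
function setProductNameClassViaProperty(oSpan) {
  oSpan.className = "productName";
}
```

Either way, the span ends up styled by the productName rule in the CSS file.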
| | 07:12 | So now that we have done that, we set
the content of that span to be the name
| | 07:18 | of the tea, which we have got right
here, and we put that span into the paragraph
| | 07:23 | and then we add on to the end of
that text, this string right here.
| | 07:27 | So it will be an open parenthesis with
the price and then a forward slash and
| | 07:33 | then the text of the unit attribute
with a closing parenthesis. So that's
| | 07:38 | going to end up looking something
like this, like 49.95/lb. That's what
| | 07:48 | it's going to end up looking like.
| | 07:53 | And then we put
that content also into the paragraph.
| | 07:55 | So now we have a paragraph
containing the tea name and the price.
| | 07:59 | Now we need to retrieve the
description from the interior of the tea tag,
| | 08:03 | the text that we put inside the tea tag.
To do that, we use the firstChild DOM
| | 08:09 | property on the tea tag. That gets us
the text node and then the data property
| | 08:15 | on the text node gets us the actual
text data. So once we have that in the
| | 08:19 | description, we create another
paragraph tag, and add that description into it
| | 08:25 | by calling the appendChild function.
| | 08:29 | Once we have done that, we add that
paragraph to our parent div and put the
| | 08:33 | div inside the table cell. We do that,
we add the completed table row to
| | 08:38 | the table body and then the
loop goes back up and runs again.
| | 08:43 | So this will loop over each one of the
teas and build up table rows for each one.
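The row-building steps just described can be sketched as below. This is a reconstruction under assumptions, not the code from the exercise files: the function names buildTeaRow and appendTeaRows and their parameters are hypothetical, and the class is set through the className property rather than the setAttribute switch shown in the lesson.

```javascript
// Build one table row for a <tea> element from the XML data.
// htmlDoc is the page's document object; oTea is one <tea> element.
function buildTeaRow(htmlDoc, oTea) {
  const oRow = htmlDoc.createElement("tr");
  const oCell = htmlDoc.createElement("td");
  oRow.appendChild(oCell);

  const oDiv = htmlDoc.createElement("div");
  const oPara = htmlDoc.createElement("p");
  const oSpan = htmlDoc.createElement("span");

  // Pull the three attributes off the tea tag.
  const sName = oTea.getAttribute("name");
  const sPrice = oTea.getAttribute("price");
  const sUnit = oTea.getAttribute("unit");

  // Bold product name, followed by "(price/unit)".
  oSpan.className = "productName";
  oSpan.appendChild(htmlDoc.createTextNode(sName));
  oPara.appendChild(oSpan);
  oPara.appendChild(htmlDoc.createTextNode(" (" + sPrice + "/" + sUnit + ")"));
  oDiv.appendChild(oPara);

  // The description is the text node inside the tea tag.
  const oDescPara = htmlDoc.createElement("p");
  oDescPara.appendChild(htmlDoc.createTextNode(oTea.firstChild.data));
  oDiv.appendChild(oDescPara);

  oCell.appendChild(oDiv);
  return oRow;
}

// Loop over every <tea> element and append one row per tea to the table body.
function appendTeaRows(htmlDoc, xmlDoc, oTBody) {
  const teas = xmlDoc.getElementsByTagName("tea");
  const count = teas.length;
  for (let i = 0; i < count; i++) {
    oTBody.appendChild(buildTeaRow(htmlDoc, teas[i]));
  }
}
```

In the browser, htmlDoc would be document, xmlDoc the loaded product data, and oTBody the new table body that is later placed inside the products table.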
| | 08:48 | Okay, so let's save this. I'm going
to go back over to the teaclouds.xml file.
| | 08:53 | Okay, you can see here are all
the kettles and teas. So for each tea,
| | 08:58 | we have the name, the price, we have
the unit, and we have the text inside.
| | 09:02 | So we have now extracted each piece of
information. I need to save this as the
| | 09:07 | teakettles.xml file. So we Save As,
take off the word start, there we go.
| | 09:15 | Let's head back to the index page.
I'm going to preview this in the browser.
| | 09:21 | So we'll go over to the Products page
and you can see here that we have now
| | 09:25 | built the teas and we have built the
teakettle. So you can see the teas are
| | 09:31 | showing up properly. There's
the bold name with the 26.95/lb as
| | 09:36 | the parenthesized string and there's the
description. Let's make sure it works also
| | 09:40 | in Internet Explorer. I'm going to go to the teas.
You can see it's working there as well.
| | 09:48 | Now let's go back to the code and take
a look at how we build up the kettles.
| | 09:55 | The kettles code is pretty much the same.
The only difference in the kettle section
| | 09:58 | is that we have in addition to the
name and the price and the description,
| | 10:04 | we also have an image that needs to be
inserted. So the code is pretty much the same.
| | 10:09 | You can see here where we get the teas
and save aside the number of items,
| | 10:14 | we have a loop. What we are doing
here is the same thing only in this case
| | 10:17 | we are doing it for
the kettle tag, not the teas.
| | 10:21 | So now the loop goes through and it's
the same idea. We create a table row and
| | 10:28 | a table cell. Now the difference here
is we have two table cells because one
| | 10:32 | holds the image whereas one holds the
product information. So the first table cell,
| | 10:36 | we create an image to hold the
image that's going to represent the kettle.
| | 10:41 | We then get the path for that
image and that's going to correspond to
| | 10:49 | this attribute right here. So we are
retrieving the image path and that's this.
| | 10:55 | So we also retrieve the name attribute.
| | 10:58 | Once we have the image, we set the
source of the image to be the path that we
| | 11:02 | retrieved and to be nice and accessible,
we set the alt attribute of the image
| | 11:08 | to be the name of the product as well.
So now that we have created the image
| | 11:12 | and we have gotten the attributes,
we then append the image into the table cell.
| | 11:16 | Okay, that's the first part of it.
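The image cell described above might look something like this sketch. The function name is hypothetical, and the attribute name "image" is an assumption: the lesson points at the attribute on screen without naming it.

```javascript
// Build the image cell for one <kettle> element.
// Assumption: the image path is stored in an attribute named "image".
function buildKettleImageCell(htmlDoc, oKettle) {
  const oCell = htmlDoc.createElement("td");
  const oImg = htmlDoc.createElement("img");

  // The path becomes the image source; the product name doubles as alt text.
  oImg.setAttribute("src", oKettle.getAttribute("image"));
  oImg.setAttribute("alt", oKettle.getAttribute("name"));

  oCell.appendChild(oImg);
  return oCell;
}
```

The returned cell would be appended to the table row before the second cell that holds the product information.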
| | 11:18 | The second part of it is to create the
second table cell now which will hold
| | 11:22 | the product information and this is
pretty much the same as in the tea case.
| | 11:25 | We create a div and a paragraph and
a span and these are going to hold
| | 11:29 | the pieces of data. The div
is going to wrap everything up.
| | 11:31 | So we retrieved the price using
getAttribute on the kettle that we are
| | 11:36 | currently looping over and once again
referring back to XML data, you can see
| | 11:41 | in the kettles each one has a price
right here. So we retrieved the price,
| | 11:48 | we do our little setAttribute and in this
case, we have to do the same thing again
| | 11:52 | for both IE and Netscape. So if we're in IE,
we use className; otherwise we'll use class.
| | 12:03 | Once we have set the class so
that things show up properly, we then
| | 12:07 | append the name and then once again,
we do our little string trick. Only this time
| | 12:13 | we don't have a unit of weight.
We just have the price inside parentheses.
| | 12:19 | So we put that inside the span,
we put the span inside the paragraph,
| | 12:23 | we put the paragraph inside the div.
| | 12:26 | Now we get the description.
The description is the same process.
| | 12:35 | We get the firstChild of the kettle tag and
that gets us the text node and the data gets us
| | 12:39 | the text out of the text node.
We then create a paragraph with that
| | 12:40 | description inside of it and we put
that inside the div, put the div inside
| | 12:47 | the second table cell, and put the table row
into the body, and then the whole loop completes.
| | 12:54 | Once we are done, we tell the table
to put the table body inside the table.
| | 13:00 | Okay, so let's save, let's go back to
the index, and I'm going to browse this in IE.
| | 13:08 | Go to the Products section and
you can see here that the image is being set
| | 13:12 | and you can see that the alt text is
showing us the name of each image as we
| | 13:16 | mouse over. Here is the name and the
price and it's showing up as bold text
| | 13:22 | inside that span and here is the
description. Switch over to Teas,
| | 13:27 | yup, all that's working fine.
| | 13:30 | Now let's browse in Firefox. Okay, here
we go. There's the name and the price,
| | 13:39 | the image and the description and let's
switch over to Teas and everything worked.
| | 13:45 | Okay, so now you have seen an end-to-end
example of building an XML tag set
| | 13:50 | from scratch and integrating it
into your web pages in a cross browser
| | 13:55 | fashion. That brings us to the close of
this lesson. Let's move on to our next one.
5. Real-World DOM Algorithms
| Understanding the uses of DOM algorithms| 00:00 | During the course of working with XML
in the real world there will be certain
| | 00:04 | situations that you see coming up
again and again, and in this section we'll
| | 00:09 | talk about ways that you can deal with those
situations using some real-world DOM algorithms.
| | 00:15 | So as I was saying, when you're
working with XML there are a number of common
| | 00:18 | processing tasks that you are going to
have to perform and you will come across
| | 00:22 | these fairly regularly. There are some
common algorithms that can be written as
| | 00:27 | some standard functions which you
can reuse in your specific projects.
| | 00:32 | So I have got a few here. I'm going to
provide about a half dozen of these. The code
| | 00:36 | I'm going to show you is code that you can
just take and re-use in your own projects.
| | 00:41 | So I'm going to start off by talking
about the concept of node traversal and
| | 00:45 | what this basically means is there
are common situations in processing XML
| | 00:49 | where your code will need to visit
nodes in an XML document, whether it's an
| | 00:56 | XML data document or an XHTML file,
whatever. It's common practice that
| | 01:01 | you will have to write some code that
visits either all the nodes or some subset of
| | 01:06 | nodes in the XML file and node
traversal is the way that you do that. The word
| | 01:11 | traversal means you visit nodes in a
certain order and we'll look at both
| | 01:15 | depth-first and breadth-first node traversal.
| | 01:18 | The rest of the algorithms I provide
here are small functions that you
| | 01:22 | can use in your code to perform some
useful utility tasks. For example,
| | 01:28 | there's the isContainedBy function and
you can use that function to see if a
| | 01:32 | given node is contained within a node of a
given type or within a specific node.
| | 01:39 | The containsNode function is the opposite.
You can use that to see if a certain node
| | 01:44 | or an element in a document
contains another node, either of a given type
| | 01:49 | or a specific instance.
| | 01:51 | The hasSibling function is used to
see if a node has a sibling of either a
| | 01:57 | specific type or a specific instance.
Finally, the getElementsByAttr function
| | 02:03 | can be used to get elements if they
have an attribute that matches a specific
| | 02:08 | value. Let me begin by
talking about document traversal.
| | 02:11 | Document traversal is a process by
which you process the nodes in an XML
| | 02:16 | document and this is a fairly common
task in just about any real-world setting
| | 02:22 | and there are two common ways of
traversing an XML document. One of them is
| | 02:25 | called depth-first and one of
them is called breadth-first.
| | 02:29 | The depth-first version refers to the
fact that each node in a document is visited
| | 02:33 | from the top down to the bottom.
Actually, it might be more accurate when
| | 02:38 | you see the example to call it bottom to top.
The idea is that for any node before
| | 02:42 | we process it, we first visit all of
its child nodes. So in the most extreme
| | 02:47 | example, we would start with the
document root, go all the way down to the
| | 02:50 | leftmost leaf node and then work
our way back up. Now in breadth-first
| | 02:55 | traversal, all of a node's siblings
are processed in order before the child
| | 03:00 | nodes of each one.
| | 03:01 | Okay, so now that I have described
what document traversal is, let's take a
| | 03:05 | look at an example in action.
| Understanding depth-first document traversal| 00:00 | Okay, let's start by looking at a
depth-first traversal pattern and how it
| | 00:06 | looks when we are operating on an XML
file. So let's imagine that the structure
| | 00:10 | you see here is the node structure
of an XML file and at the top of the
| | 00:15 | document you have the A tag as the
root and then underneath A you have got B
| | 00:20 | and then underneath B you have C, D
and E and so on down to F and G here and
| | 00:25 | then on the other side of A,
you have got H, I and J.
| | 00:28 | So if these are all tags in an XML
document and we wanted to do a depth-first
| | 00:34 | traversal of this document starting at
the A tag, this is pretty much how it
| | 00:39 | would look. First, we would pass the A
tag to the function that started off the
| | 00:45 | traversal and it turns out that A has
two child nodes. So we would visit the B
| | 00:50 | node first and then from B we would
see that B has child nodes as well. So,
| | 00:56 | before we process B we would travel
down to C and it turns out that C also has
| | 01:00 | two child nodes. So before we do any
processing on node C, we would first
| | 01:03 | travel down to node F.
| | 01:05 | Now node F is the bottom of the tree
and that's called a leaf node and it has
| | 01:10 | no child nodes. So, we would do
whatever processing we have for node F,
| | 01:15 | we would travel back up to node C and
we're not done here yet. There's one more
| | 01:19 | child to go so we would travel down to
node G, we would process the other leaf
| | 01:24 | node which is node G here,
then we go back up to C.
| | 01:27 | Now we would process node C. We would
do whatever we would have to do on the node,
| | 01:31 | if we needed to do anything at
all and then from C we would go back up to
| | 01:34 | B. So this would go on. We would go
back down to the next child and then back
| | 01:38 | up to B and then down to E, back up to
B, process B, and then we'll head back up to A.
| | 01:43 | Then we would do the other side of the tree,
down to H, down to I, process I, back up to
| | 01:49 | H, down to J, process J, back up to H.
Now all the child nodes are done so
| | 01:55 | we would process H and then
we would finally process A.
| | 01:59 | So if you look at the order in which
the nodes were visited, the order would
| | 02:04 | look something like this. We would
first process F and then G because those are
| | 02:09 | the two leaf nodes in the far left.
Then we would do C, because at that point
| | 02:13 | all of its child nodes would be done.
Then we would do D and E because those
| | 02:17 | would be the remaining child nodes of B.
Then we would travel back up to B and
| | 02:22 | then the whole process would start on
the other side of the tree. We would go
| | 02:25 | all the way down to do I and J and
then back up to H, and then back up to A.
| | 02:30 | To implement this as a function using
code, we would do something like this.
| | 02:36 | Now this is a function called depthFirstTraversal
and it takes as an argument
| | 02:39 | the node in the document that you want
to start on. Now this is a pretty bare
| | 02:44 | bones example, it doesn't do anything
like checking for a node type or a node
| | 02:48 | name or anything like that. It just
handles visiting all of the child nodes in
| | 02:53 | order from left to right,
starting at the very bottom of the tree.
| | 02:57 | So the way it works is you call this
function on the node where you want to
| | 03:01 | start processing and it doesn't
need to be the document root; it can be
| | 03:03 | anywhere in the tree. What happens is
that the very first thing we do is check to see if
| | 03:07 | the node that we were given is not
equal to null, and if it's not, then we need
| | 03:11 | to check to see if it has any child nodes.
| | 03:13 | So we declare a temporary variable
and we get the first child of the node
| | 03:18 | we were given, and we have the for loop
to check the condition where the node is
| | 03:22 | not equal to null to make sure we can
keep going. Then to advance, we would get
| | 03:26 | the node's next sibling. This is how
we would travel left to right across a
| | 03:29 | node's child nodes.
| | 03:31 | Before we do anything however, you
notice that we're calling the depth-first
| | 03:34 | traversal again inside this for loop.
This is called recursion. It's a function
| | 03:39 | that calls itself and this is what's
going to get us all the way down to the
| | 03:44 | bottom of the tree. Because you see
what we were doing essentially is each time
| | 03:47 | we come through here, we get the first
child, then we call this function again,
| | 03:52 | only now the node has
been set to the first child.
| | 03:54 | So we come in here and then we get that
node's first child and so on and so on
| | 03:59 | as long as it has child nodes until we
reach null. And if it's null, then we
| | 04:05 | can fall out of this loop and if we
reached the bottom of the tree and there
| | 04:09 | are no further children to process,
then this falls out and we do any leaf
| | 04:14 | node processing in here. Actually, it's
really processing for any node but it's
| | 04:18 | going to start with the leaf nodes.
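The function described above can be sketched as follows. The visit callback and the makeNode helper are hypothetical additions: the callback stands in for whatever processing a node needs, and makeNode builds plain objects with the same nodeName, firstChild and nextSibling properties as DOM nodes so the sketch runs outside a browser.

```javascript
// Children-first walk: recurse into every child, then process the node itself.
function depthFirstTraversal(oNode, visit) {
  if (oNode != null) {
    for (let theNode = oNode.firstChild; theNode != null; theNode = theNode.nextSibling) {
      depthFirstTraversal(theNode, visit);
    }
    // All children are done, so process this node now.
    visit(oNode);
  }
}

// Stand-in for DOM nodes: links each child to its next sibling.
function makeNode(nodeName, children = []) {
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return { nodeName, firstChild: children[0] || null, nextSibling: null };
}

// The sample structure from the slide.
const tree = makeNode("A", [
  makeNode("B", [makeNode("C", [makeNode("F"), makeNode("G")]), makeNode("D"), makeNode("E")]),
  makeNode("H", [makeNode("I"), makeNode("J")]),
]);

const order = [];
depthFirstTraversal(tree, (node) => order.push(node.nodeName));
console.log(order.join(" ")); // F G C D E B I J H A
```

Because the same three properties exist on real DOM nodes, the traversal function itself should work unchanged on a parsed XML document.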
| | 04:20 | Perhaps, it's most illustrative to see
this in action as a live code example,
| | 04:25 | so let's jump over to the code
and take a look at that now.
| Filling out the depth-first function| 00:00 | Okay, so this is the code for the
depth-first traversal. What we are going to
| | 00:04 | do here is fill out this function,
the depthFirstTraversal function.
| | 00:09 | So before we do that, let's take a
look at the rest of the file. It's an HTML file.
| | 00:12 | A couple of things to point out.
I have included the Sarissa library here
| | 00:17 | so that I can do things that will
work across browsers with the XML. I have
| | 00:22 | declared a function here called loadXMLData.
| | 00:26 | The loadXMLData function essentially
sets up the test XML for this exercise.
| | 00:32 | It defines a string and you can see here,
<a><b><c>, this is the sample document
| | 00:37 | that we saw back in the slide. So
this essentially constructs the same XML
| | 00:43 | structure that we just looked
at during the lesson portion.
| | 00:46 | So I'm using the DOMParser object
and again, this is going to work
| | 00:50 | cross-browser now because I have got
the Sarissa library included. The parser
| | 00:55 | creates an XML document, which I store
here in this global variable which is
| | 00:59 | parsed from the TestData string.
| | 01:02 | Then down here, I have my window.onload
function which loads the XML data and
| | 01:08 | then does the depthFirstTraversal and
then shows an alert, which is this global
| | 01:14 | variable string which we are going to
be alerting on. That will be built up
| | 01:18 | over time in a moment. What it's going
to do is it's simply going to record the
| | 01:22 | document nodes in the order that we visit them.
| | 01:25 | Now if this were a more practical
example, we would be processing each node as
| | 01:30 | we visited it for some reason, but
since this is an illustrative example,
| | 01:34 | we are just going to build up the string
in the order that we visit the nodes.
| | 01:39 | So this is the function we need to
write. So what we are going to do is first
| | 01:43 | check to see if the node that we were
given is equal to null. Because if the
| | 01:48 | node is equal to null, then we can't
really operate on it. So we'll say if
| | 01:51 | (oNode != null). Okay,
then we can do our operation.
| | 01:58 | So we are going to write that for loop.
The purpose of the for loop, remember,
| | 02:01 | is to get us down to the lowest level
of the tree first and then work our way
| | 02:06 | back up and process each node after
we have already visited all of its children.
| | 02:14 | So I'm going to write
for (var theNode = oNode.firstChild;)
| | 02:24 | Then we need to make sure that theNode
is not equal to null because if theNode
| | 02:31 | is equal to null, then we've reached
the end of the child list. To advance it,
| | 02:36 | we say theNode = theNode.nextSibling.
| | 02:44 | So all we are going to do inside
this loop is call this function again,
| | 02:48 | depthFirstTraversal, only this time we
are going to call with theNode. So after
| | 02:54 | this for loop completes, because
we've run out of child nodes, what we are
| | 02:58 | going to do is in the VisitOrder string,
we are going to say g_sVisitOrder.
| | 03:04 | I'm going to append a value to it. We are
going to append the node name to indicate
| | 03:11 | that we were here, +, and we'll put
some space in there to make it look good.
| | 03:19 | So we have reached the point now where
we can try this out in the browser.
| | 03:22 | You can see that after this function
completes, we are going to just alert whatever
| | 03:27 | this string is. So let's go
ahead and bring this up in IE.
| | 03:34 | So you can see that what happened was,
we visited the nodes in the same order
| | 03:38 | as we indicated back in the slide. So
we went all the way down to the child
| | 03:41 | nodes f and g, then we went back up to
c, d and e and then up to b, and then
| | 03:46 | all the way down the right-hand side
we did i and j, back up to h, back up to a.
| | 03:50 | Then we finally visited the
document parent node. That's the #document
| | 03:55 | element right there.
| | 03:58 | Now let's try the same thing in
Firefox to make sure that works there.
| | 04:05 | It's the same result. You can see that we
have visited all the child nodes first even
| | 04:09 | though we passed in a, a is the very
last tag that we visited, followed by
| | 04:14 | the document node, which is the parent
node in the DOM tree of the root tag.
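To show why #document turns up last, here is a sketch of the function as dictated in this lesson, run against a stand-in for the parsed document. The node helper is hypothetical and replaces the DOMParser step so the sketch runs outside a browser; the tree mirrors the sample structure from the slide with a #document node on top.

```javascript
// Global string recording the visit order, as in the lesson.
let g_sVisitOrder = "";

function depthFirstTraversal(oNode) {
  if (oNode != null) {
    for (let theNode = oNode.firstChild; theNode != null; theNode = theNode.nextSibling) {
      depthFirstTraversal(theNode);
    }
    // Record the node only after all of its children have been visited.
    g_sVisitOrder += oNode.nodeName + " ";
  }
}

// Hypothetical stand-in for parsed XML nodes.
function node(nodeName, ...children) {
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return { nodeName, firstChild: children[0] || null, nextSibling: null };
}

// The document node is the parent of the root tag <a>.
const xmlDoc = node("#document",
  node("a",
    node("b", node("c", node("f"), node("g")), node("d"), node("e")),
    node("h", node("i"), node("j"))));

depthFirstTraversal(xmlDoc);
console.log(g_sVisitOrder.trim()); // f g c d e b i j h a #document
```

Starting the traversal at the root tag instead of the document node would produce the same order without the final #document entry.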
| | 04:19 | That's the example of doing depth-first
node traversal. Let's move on now and
| | 04:24 | take a look at how we would do
a breadth-first node traversal.
| Understanding breadth-first document traversal| 00:00 | Now that we have seen how to do a
depth-first traversal of the nodes in a document,
| | 00:05 | let's now do a breadth-first
traversal. Here in this diagram that
| | 00:09 | you see we have the same XML node
structure that we had in the previous example,
| | 00:13 | only now we are going to traverse the
nodes in a different order. This is called
| | 00:17 | breadth-first and in breadth-first
traversal you pick a node to start at
| | 00:22 | whether it's A here or B here,
whichever one, and the idea is you visit all of
| | 00:28 | the node's siblings and child siblings in order.
| | 00:31 | So for example if we were to start off
with node A, we would first visit node B
| | 00:36 | and we would process B right at this
moment. We wouldn't wait to come back to it.
| | 00:39 | Then we travel over to H and we'll
process node H. Now we have processed
| | 00:44 | all of the nodes in sibling order under A.
Then we'll travel all the way over to C
| | 00:49 | because now we are going to go do all
of these child siblings and from C we
| | 00:53 | would go to D and then over to E and
from E we would go right back to the left,
| | 01:00 | go down to F and over to G. And when
that's done we would travel all the way
| | 01:04 | back over to I and then to J.
| | 01:06 | So in this case the node visit order
would be A and then B and then H and then
| | 01:12 | we would do C, D and E as the siblings
underneath B and then we would do F and
| | 01:16 | G as the siblings underneath C and
then we would go over to I and J which are
| | 01:23 | the sibling nodes underneath H.
| | 01:26 | The algorithm that does this is here.
This is called the breadth-first
| | 01:31 | traversal and it takes the node that
we want to start at. And the first thing
| | 01:35 | we do is check to see if the node is
not equal to null and that it actually has
| | 01:40 | some child nodes that we can process.
| | 01:43 | So we have our For loop and we say for
theNode = oNode.firstChild and we make
| | 01:50 | sure that the node is not equal to null
and then we advance the node
| | 01:53 | to get to the next sibling. And now
what we are doing is we are changing the
| | 01:56 | visit order to process it right here
upfront before we go down to the next
| | 02:01 | level and this is my string that I'm
building up to illustrate the order in
| | 02:06 | which we are visiting the nodes but
the code you would write in here is
| | 02:09 | whatever processing you want to do
for the node when it gets visited.
| | 02:12 | Okay. So that For loop is going to
execute. It's going to do all the sibling nodes.
| | 02:16 | Now we need to go down to the
next level, which is where we get the
| | 02:20 | first child for the node. We have the
same kind of loop only now we do that
| | 02:24 | recursive function call right here.
| | 02:26 | So we call this function back into
itself only this time we are going to
| | 02:29 | process all of the child node siblings.
So once again it's probably best you see
| | 02:34 | this in action in order to understand it.
| | 02:37 | Okay so here we are in the code for
the breadth-first example and you can see
| | 02:41 | it's the same code that we looked at in
the depth-first example. The difference is
| | 02:46 | that the function name has changed but
everything else is the same, the sample
| | 02:51 | document structure is the same, and I'm
including the Sarissa Library so I can
| | 02:55 | do this cross platform. So this is
the function you need to write here.
| | 02:59 | The loadXMLData is going to load our
sample document structure using the parser
| | 03:04 | and after this function gets finished,
we are going to display an alert with
| | 03:09 | this string in it. So let's write
the code and the code is going to look
| | 03:14 | something like this. We are going to write
if oNode != null and oNode.hasChildNodes
| | 03:26 | then we are going to do a loop and
the loop will say for var theNode =
| | 03:34 | oNode.firstChild; theNode != null
and theNode=theNode.nextSibling.
| | 03:53 | Okay, so inside this loop we are
going to write g_sVisitOrder +=
| | 04:07 | theNode.nodeName+ and some pretty
printing to make it look good. So that's what
| | 04:17 | takes care of visiting the nodes and
processing them. Now we need to process
| | 04:21 | all of the sub-child nodes. So we are
going to do for and essentially it's the
| | 04:27 | same loop. So I'm just going to copy
this and paste that in and I'm going to
| | 04:36 | put the braces in and I'm going to
call breadthFirstTraversal(theNode).
| | 04:45 | That's the function. So let's see how it works.
| | 04:49 | I am going to view this in the browser
and you could see that the nodes were
| | 04:55 | visited in the sibling order. So we did
node A and then we did node B and H and
| | 05:01 | then C, D and E, which were all
underneath B, then we did F and G, which are
| | 05:07 | underneath C, and then we did I and J,
which are underneath H. So things executed
| | 05:13 | in the order that we did them. Before
I go any further let's make sure it works
| | 05:19 | in Firefox. That way I can see that it did.
| | 05:23 | Okay the same results. A, B, H then
| | 05:26 | C, D, E then F, G and then I, J.
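The function dictated above can be sketched roughly as follows. This is a minimal sketch, not the exercise file itself: the makeNode helper is a hypothetical stand-in that builds plain objects with the same firstChild, nextSibling and hasChildNodes members that DOM nodes expose, so the code runs without a browser or the Sarissa library, and the tree shape is assumed from the visit order described in the video. Note that this sibling-first recursion visits F and G before I and J, so it is not a strict level-by-level traversal.

```javascript
// Hypothetical stand-in for a DOM node (not part of the course files).
function makeNode(nodeName, children = []) {
  const node = {
    nodeName,
    firstChild: children[0] || null,
    nextSibling: null,
    hasChildNodes() { return this.firstChild !== null; },
  };
  // Link each child to the sibling on its right.
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return node;
}

// Assumed sample structure: a > (b > (c > (f, g), d, e), h > (i, j)).
const doc = makeNode("#document", [
  makeNode("a", [
    makeNode("b", [
      makeNode("c", [makeNode("f"), makeNode("g")]),
      makeNode("d"),
      makeNode("e"),
    ]),
    makeNode("h", [makeNode("i"), makeNode("j")]),
  ]),
]);

// String that records the order in which nodes are visited.
let g_sVisitOrder = "";

function breadthFirstTraversal(oNode) {
  if (oNode != null && oNode.hasChildNodes()) {
    // First loop: visit every child of oNode, left to right.
    for (let theNode = oNode.firstChild; theNode != null; theNode = theNode.nextSibling) {
      g_sVisitOrder += theNode.nodeName + " ";
    }
    // Second loop: only now go down a level, recursing into each child.
    for (let theNode = oNode.firstChild; theNode != null; theNode = theNode.nextSibling) {
      breadthFirstTraversal(theNode);
    }
  }
}

breadthFirstTraversal(doc);
console.log(g_sVisitOrder.trim()); // a b h c d e f g i j
```

In real code, the line that appends to g_sVisitOrder is where whatever per-node processing you need would go.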
| | 05:28 | Okay, so you are probably wondering if
you can maybe change the order in which
| | 05:32 | things operate. Instead of going left
to right, can you go right to left? And
| | 05:36 | the answer is yes you can. If you
wanted to go right to left for example,
| | 05:40 | instead of doing firstChild you
would do lastChild and then instead of
| | 05:44 | nextSibling you do previousSibling
and so on, and that would give you
| | 05:47 | the reverse order.
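As a rough illustration of that right-to-left variation, the sketch below swaps firstChild for lastChild and nextSibling for previousSibling. The names makeNode, reverseTraversal and aVisited are hypothetical; makeNode only mimics the DOM members the loop needs so the example runs outside a browser, and the tree is the same assumed sample structure.

```javascript
// Hypothetical stand-in for a DOM node (not part of the course files).
function makeNode(nodeName, children = []) {
  const node = {
    nodeName,
    lastChild: children[children.length - 1] || null,
    previousSibling: null,
    hasChildNodes() { return this.lastChild !== null; },
  };
  // Link each child to the sibling on its left.
  children.forEach((child, i) => { child.previousSibling = children[i - 1] || null; });
  return node;
}

// Assumed sample structure: a > (b > (c > (f, g), d, e), h > (i, j)).
const doc = makeNode("#document", [
  makeNode("a", [
    makeNode("b", [
      makeNode("c", [makeNode("f"), makeNode("g")]),
      makeNode("d"),
      makeNode("e"),
    ]),
    makeNode("h", [makeNode("i"), makeNode("j")]),
  ]),
]);

// Same two-loop shape as the left-to-right version, walking right to left.
function reverseTraversal(oNode, aVisited) {
  if (oNode != null && oNode.hasChildNodes()) {
    for (let theNode = oNode.lastChild; theNode != null; theNode = theNode.previousSibling) {
      aVisited.push(theNode.nodeName);
    }
    for (let theNode = oNode.lastChild; theNode != null; theNode = theNode.previousSibling) {
      reverseTraversal(theNode, aVisited);
    }
  }
  return aVisited;
}

const order = reverseTraversal(doc, []);
console.log(order.join(" ")); // a h b j i e d c g f
```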
| | 05:49 | Now that we have seen how to do
breadth-first traversal and depth-first traversal
| | 05:53 | let's move on to
the rest of our DOM algorithms.
| Using the isContainedBy() algorithm| 00:00 | Another really useful DOM algorithm
is the isContainedBy function, which
| | 00:05 | determines if a node is contained
within another node and this is another
| | 00:09 | example that happens all the
time in real-world DOM usage.
| | 00:13 | The way the algorithm works,
the isContainedBy function is given two
| | 00:17 | parameters. The first parameter is
the node that we want to check for
| | 00:21 | containment, and the second, oTestNode,
is either a specific instance of a
| | 00:26 | node that we want to test as
the container or a string that
| | 00:31 | gives the name of a type of node
that we want to test as a container of
| | 00:35 | the first argument.
| | 00:36 | For example, we could pass in an
instance of a node here and then for oTestNode
| | 00:41 | we can give it a string like body or
div and this function will return true if
| | 00:47 | the first argument is contained within
the node of that type. Or we can pass it
| | 00:51 | a specific instance of a node in which
case, the function will return true if
| | 00:56 | the first argument was contained
within the specific node specified by
| | 01:00 | the second argument.
| | 01:01 | So the way it works is, regardless of
whether we are testing for strings or
| | 01:04 | objects, we start off by declaring
a TmpNode and we set that to be the
| | 01:08 | parentNode of the node that we are
checking, the first argument. And then while
| | 01:13 | that node is not null, we in the case
of strings check to see if the nodeName
| | 01:19 | is equal to the string that we were
given, and if they match we return true
because we found a match in that case.
Otherwise, we just set the TmpNode to
| | 01:27 | be the TmpNode's parent and we
keep doing that inside the while loop.
| | 01:31 | Now eventually parentNode is going to
be null because there's no more parents
| | 01:34 | left and that will cause this to be
set to null, which will cause the while
| | 01:38 | loop to terminate, and if that happens
then this return false statement
| | 01:42 | gets executed. In the case of a
specific object instance, the same thing is done,
| | 01:47 | except instead of comparing the
node name of the temporary node, we just
| | 01:52 | compare the two object instances
to see if they match each other.
| | 01:56 | Okay, so let's take a look at a live
example in the code to see how it works.
| | 02:01 | Here we are in the code. This is the
same code that we have been using for the
| | 02:07 | previous examples. Here is my string
representing the sample document and it's
| | 02:11 | the same as the examples we have been
using so far. And this is the function we
| | 02:15 | need to write here. This is called
isContainedBy and when my window loads,
| | 02:22 | there's a couple of tests
that we are going to run.
| | 02:24 | First, we are going to get a reference
to the g tag, which if you look in the
| | 02:28 | string up here, you will see that g
is contained inside c, which is
| | 02:32 | contained inside b, which is contained
inside a, which is
| | 02:37 | the root of the document. So we'll get
| | 02:41 | the g tag and then we'll see if it's
contained by a node of type b. We'll test it
| | 02:47 | against a node of type h and then we'll
test it against a specific instance,
| | 02:51 | in this case, the documentElement itself.
| | 02:53 | So this one should evaluate to true
because g is in fact inside an element of
| | 02:58 | type b. We are not comparing a
specific instance. This one should evaluate to
| | 03:02 | false because g is not inside a node of
type h. h is way over here, so it's not
| | 03:10 | containing g. This one should return
true as well because the g tag is in fact
| | 03:17 | inside this document. So let's
go ahead and write the function.
| | 03:21 | So we'll start off by checking to see
if we are doing strings or objects. I'll write
| | 03:24 | if (typeof (oTestNode) == "string").
Then we are comparing node types, else if
| | 03:38 | (typeof (oTestNode)== "object"), then
comparing specific object instance.
| | 03:51 | All right, let's do the string case first.
So we are going to write var oTmpNode.
| | 03:57 | So we start off by getting the
parent of the node we were given and while
| | 04:03 | that's not null, so while we
have a node to test against,
| | 04:11 | we are going to keep on doing these
comparisons. So we'll see in the case of
| | 04:14 | string, if the name of this node matches
the name that we were given to look for,
| | 04:21 | then congratulations, we have got
a match and we return true. Otherwise,
| | 04:29 | we just get the next parent in the chain
and that's eventually going to run out of parents.
| | 04:38 | So if that happens, then we return false.
| | 04:42 | Now for the object case, it's pretty
much exactly the same algorithm except
| | 04:47 | for the name comparison. We are not
going to be comparing names; what we are
| | 04:51 | going to be comparing is the specific
instance. So we'll take the node name off.
| | 04:56 | Let's go back and take a look. So
this should return true, and then false,
| | 05:02 | and then true. See what happens.
| | 05:08 | Okay, so the first one is true because
g is inside b and it's false because g
| | 05:13 | is not inside h and that's true
because g is inside the document. So let's go
| | 05:18 | and check the Firefox case.
| | 05:24 | And there it is true,
and false, and true, same result.
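The parent-walking logic described in this movie might look something like the sketch below. It is an approximation under stated assumptions, not the exercise file: the string and object cases are folded into a single while loop for brevity, and the hypothetical makeNode helper builds plain objects carrying only nodeName and parentNode so the example runs without a browser. The variable a plays the role of the document element.

```javascript
// Hypothetical stand-in for a DOM node (not part of the course files).
function makeNode(nodeName, children = []) {
  const node = { nodeName, parentNode: null };
  children.forEach((child) => { child.parentNode = node; });
  return node;
}

// Assumed sample structure: a > (b > (c > (f, g), d, e), h > (i, j)).
const g = makeNode("g");
const c = makeNode("c", [makeNode("f"), g]);
const b = makeNode("b", [c, makeNode("d"), makeNode("e")]);
const h = makeNode("h", [makeNode("i"), makeNode("j")]);
const a = makeNode("a", [b, h]);
const doc = makeNode("#document", [a]);

function isContainedBy(oNode, oTestNode) {
  // Start at the parent and climb until we run out of ancestors.
  let oTmpNode = oNode.parentNode;
  while (oTmpNode != null) {
    if (typeof oTestNode === "string") {
      // String case: compare node names.
      if (oTmpNode.nodeName === oTestNode) return true;
    } else if (typeof oTestNode === "object") {
      // Object case: compare the specific instance.
      if (oTmpNode === oTestNode) return true;
    }
    oTmpNode = oTmpNode.parentNode;
  }
  return false;
}

console.log(isContainedBy(g, "b")); // true
console.log(isContainedBy(g, "h")); // false
console.log(isContainedBy(g, a));   // true
```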
| | 05:29 | So, now you know how to check to see
if a node isContainedBy another node,
| | 05:34 | either a type or a specific instance.
Let's move on to the next example.
| Using the containsNode() algorithm| 00:00 | The containsNode algorithm is also a
really useful algorithm to use in real
| | 00:05 | world XML processing. The containsNode
algorithm takes a node and sees if it
| | 00:11 | contains another node of a given type
or a specific instance and the way it
| | 00:16 | works is it's given two parameters,
this node here and the TestNode and we are
| | 00:22 | going to check to see if the first
parameter contains a node either of the
| | 00:26 | given type in the case of TestNode
being a string or if TestNode is an object,
| | 00:32 | the specific node referred to by TestNode.
| | 00:35 | So for example, we can pass in for
TestNode the string div and see if Node
| | 00:41 | contains a type of tag named div or
paragraph or table or whatever. Or we can
| | 00:46 | give it a specific node and we can ask,
hey does oNode contain the specific
| | 00:52 | node that we are talking about over
here in oTestNode? Let's take a look at how
| | 00:56 | the algorithm works. We default our
local variable bFound to false
| | 01:01 | because we assume that
nothing is going to be found.
| | 01:03 | And then we check to see if TestNode
is of type string. And if it is, we are
| | 01:08 | comparing the node name of the node
that we were given to the string to see if
| | 01:13 | they match and if they do, we return
true because we have the match. On the
| | 01:18 | other hand, if oTestNode is an object
then we are checking to see if oNode
| | 01:23 | contains the specific oTestNode. And
in this case, we don't compare the node
| | 01:28 | name, we compare the node itself
against the TestNode we were given and if they
| | 01:32 | match then we return true. If
there's no match then we need to do our
| | 01:36 | containsNode function over again.
Only this time we need to process all the
| | 01:41 | children contained within oNode.
| | 01:43 | And this is where we see the appearance
of our real-life depth-first traversal
| | 01:49 | algorithm, which we talked about at the
beginning of this chapter. For each one
| | 01:53 | of the child nodes contained underneath
oNode, we are going to check to see if
| | 01:57 | it contains the node that we are looking for.
| | 02:00 | Let's see the code in action
because it's probably a little bit easier to
| | 02:04 | understand that way. So I'm going to go
ahead and switch over to the code. Here
| | 02:09 | we are in the code. It's the same
example code I have been using through all
| | 02:12 | the examples up until now. So very
small HTML files you can see. Up here,
| | 02:18 | I have got my test data and the test
string is the XML file we'll be executing
| | 02:24 | against and this is the function
that we need to write, this containsNode
| | 02:27 | function right here.
| | 02:29 | What we are going to do is execute a
couple of test cases. We are going to get
| | 02:33 | a reference to the b tag right here.
You can see the b tag is right below the
| | 02:39 | a. It's near the top of the document.
And then we are going to check to see if
the b tag contains a tag of type g, which you can see that it does. It's right
which you can see that it does. It's right
| | 02:47 | there. We are going to check to see if
it contains a tag of type h, which it
| | 02:52 | does not. You can see that the h is
all the way over here outside the b and
| | 02:56 | then we are going to check to see if
the document element contains the b tag,
| | 03:03 | the specific one, not the type.
| | 03:05 | So this one should return true because
g is inside b. This one should return
| | 03:10 | false and this one should return true
because the document does contain that
| | 03:14 | specific b tag. All right, so let's go
ahead and write the code. We are going
| | 03:18 | to start off with our local variable
bFound and we'll set it to false because
| | 03:23 | we are going to assume that no matches
exist and that's what we are going to
| | 03:27 | return from the function.
| | 03:29 | So the first thing we are going to do
is check to see if the type that we were
| | 03:34 | given for the TestNode is a string.
And if it is, we are doing the string
| | 03:41 | comparison. Otherwise, if the type
that we were given for oTestNode is an
| | 03:50 | object we are doing specific object
comparisons. And if that doesn't work,
| | 03:56 | we are going to execute our loop,
which does the depth first traversal, and
| | 04:06 | we are going to call containsNode again.
| | 04:11 | So let's write the comparisons.
First one is in the case of the string,
| | 04:15 | I'm going to check the node name which
all nodes have to see if it matches the
| | 04:21 | TestNode we were given and if it does,
we return true. And in the case of the
| | 04:29 | object, we check to see if the object
itself matches and if it does, we return
| | 04:38 | true, otherwise we have to do this loop
here. So we'll say var theNode, get the
| | 04:47 | firstChild. Okay, then we need to make
sure that the node is not null, because
| | 04:58 | we have to have something to compare against.
| | 05:01 | And we need to make sure that we
haven't found anything yet. So if we have a
| | 05:07 | node to search and we need to keep
on searching because we haven't found
| | 05:10 | anything yet, then we are going to do
the function call and we are going to
| | 05:14 | proceed to the next sibling down the
line if we have to. Let me throw on the
| | 05:23 | parameters here. This is theNode and oTestNode.
| | 05:28 | Okay, so we have our recursive function
call, we have got our test in place and we
| | 05:33 | have our comparisons in place and we
are returning the ultimate result right
| | 05:37 | here. So once again we are going to see
if b contains a g and an h and then we
| | 05:47 | are going to see if the document
element contains this specific b tag. So we
| | 05:51 | should have true, false and true. So
let's go ahead and view that in the
| | 05:56 | browser. true, false and true. That
works. Let's do it in Firefox and there we
| | 06:11 | go true and false and true.
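Put together, the containsNode function walked through above could be sketched like this. It is a sketch under assumptions: makeNode is a hypothetical helper that mimics only the nodeName, firstChild and nextSibling members of DOM nodes so the code runs outside a browser, and the tree is the assumed sample structure. Because the comparison is made on oNode itself before recursing, a node also "contains" itself under this logic.

```javascript
// Hypothetical stand-in for a DOM node (not part of the course files).
function makeNode(nodeName, children = []) {
  const node = { nodeName, firstChild: children[0] || null, nextSibling: null };
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return node;
}

// Assumed sample structure: a > (b > (c > (f, g), d, e), h > (i, j)).
const b = makeNode("b", [
  makeNode("c", [makeNode("f"), makeNode("g")]),
  makeNode("d"),
  makeNode("e"),
]);
const a = makeNode("a", [b, makeNode("h", [makeNode("i"), makeNode("j")])]);

function containsNode(oNode, oTestNode) {
  let bFound = false; // assume nothing will be found
  if (typeof oTestNode === "string") {
    // String case: compare node names.
    if (oNode.nodeName === oTestNode) return true;
  } else if (typeof oTestNode === "object") {
    // Object case: compare the specific instance.
    if (oNode === oTestNode) return true;
  }
  // No match here, so search the children depth-first until something is found.
  for (let theNode = oNode.firstChild; theNode != null && !bFound; theNode = theNode.nextSibling) {
    bFound = containsNode(theNode, oTestNode);
  }
  return bFound;
}

console.log(containsNode(b, "g")); // true
console.log(containsNode(b, "h")); // false
console.log(containsNode(a, b));   // true
```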
| | 06:16 | Okay, so now you know how to find out
if a node contains another node, let's
| | 06:22 | look at our next example.
| Using the hasSibling() algorithm| 00:00 | This real-world XML algorithm is
called hasSibling and you basically use this one
| | 00:05 | to determine if a node has a sibling
node that either has a name of a given
| | 00:10 | type or a specific sibling. That is,
a node that exists at the same level as the
| | 00:16 | node we are interested in comparing on
either side of it. So let's see how it works.
| | 00:21 | So hasSibling takes two arguments:
oNode and oTestNode. So oNode is the one
| | 00:26 | that we are interested in testing
against and oTestNode is either going to be a
| | 00:31 | string, which indicates a type of node
that we are interested in finding, or
| | 00:36 | it could be an object, which indicates a
specific instance of a node that we are looking for.
| | 00:41 | So for example, using this function we
can check to see if a paragraph has a
| | 00:46 | sibling of another type, like another
paragraph or a div or something, or we can
| | 00:50 | check to see if a node has a specific
sibling in mind. You might want to check
| | 00:56 | to see if a button control has an
adjoining edit field, for example.
| | 01:02 | Let's take a look at how it works.
We start off declaring a temporary variable
| | 01:05 | that holds the previous sibling of
the node that we are looking at. So
| | 01:10 | the first thing we are going to do is
search to the left, then we are going to
| | 01:13 | search to the right if we don't have
any matches. So while we have a node to
| | 01:18 | compare against, we check to see if the
type of argument that we were given for
| | 01:22 | test node is a string and if it is, then we
compare the node name against the test node.
| | 01:27 | Otherwise, if it's an object, we are
looking for a specific instance.
| | 01:30 | In that case, we compare the two objects
together and return true if they match.
| | 01:34 | If we don't have a match, then we simply
get the previous sibling and this will
| | 01:39 | eventually become null if we run out of
siblings and we have no matches, which
will cause oTmpNode to be null and then
this while loop will fall through.
| | 01:47 | When that happens, we look the other way.
We start looking at nextSibling. Now we are
| | 01:50 | going to search to the right and the
whole process starts all over again. So,
| | 01:54 | let's look at this in the
code and see how it works.
| | 01:56 | Okay, so here we are in the code and we
need to write the hasSibling function.
| | 02:03 | You can see that I'm using the same
example file that I have been using all
| | 02:08 | along. There's the sample XML code up
there. So what we are going to do in this
| | 02:12 | example is get a reference to the c tag,
which is right here. We are going to
| | 02:17 | check to see if the c tag has a
sibling of type e, of type i and
| | 02:24 | then we're going to see if it has a
specific instance of e next to it.
| | 02:30 | So we are going to get a reference to
the e tag that's up here and check to see
| | 02:35 | if this e tag is a sibling of it. So
you can see that the c tag here does in fact
| | 02:42 | have a sibling. It has two of them actually.
There's a c, a d and an e, they are all
| | 02:45 | at the same level. i is all the way
over here so it's not a sibling.
| | 02:49 | So that should return false and since we
are getting the specific instance of e that
| | 02:55 | we compared up here, this one should
also return true. So in this case it says,
| | 03:01 | hey, does my node have a sibling of type e?
And this one says, hey, does my node
| | 03:05 | have this specific node as a sibling?
| | 03:08 | So let's go ahead and write the
hasSibling function. What we are going to do is
| | 03:12 | declare our temporary variable and
this is going to hold the node that we do
| | 03:18 | our comparing against. And we are going
to set it to be the previous sibling to
| | 03:23 | start with and while we have an
oTmpNode to compare against, we are going to do
| | 03:36 | the comparisons. So if the type of
oTestNode that we were given is a string,
| | 03:44 | we are going to do a string comparison.
Otherwise if we were given an object...
| | 03:58 | we are going to do an object comparison.
And if no match happens, then we'll just
| | 04:05 | simply get the next node.
| | 04:13 | Okay, and let me close this
off right there and move
| | 04:20 | this up to the right level.
| | 04:25 | All right, now if this does not result
in a search match, then we start on the
| | 04:31 | other side looking at next siblings.
So in that case, we are going to set
| | 04:35 | oTmpNode = oNode.nextSibling and in
this case, the logic of the top half just
| | 04:46 | repeats again. So I'll copy that,
paste it down here and in this case,
| | 04:52 | we are not looking for the previousSibling
anymore. We are looking for the nextSibling
| | 04:56 | and in the case of strings we
are going to compare the temp name.
| | 05:01 | So if (oTmpNode. nodeName == oTestNode)
then return true and in the case of objects,
| | 05:14 | we are not comparing the nodeName.
It's just the object itself and the same thing
| | 05:22 | down in the nextSibling case.
Just copy and paste those guys
| | 05:33 | and if this while loop falls to the bottom,
then we just need to return false.
| | 05:41 | Looks like we are ready to give this a
spin. Let's go ahead and bring this up
| | 05:44 | in the browser. So remember, we are
looking for true, false and true.
| | 05:54 | There's true, there's false and there's true.
Let's try it with Firefox.
| | 06:03 | There's true, false and true. Now, you know
how to check to see if a node has a sibling of a
| | 06:10 | given type or a specific instance. So now
we are going to move on to our last example.
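The left-then-right search described in this movie could be sketched as below. This is an illustrative approximation: makeNode is a hypothetical helper that mimics only the nodeName, previousSibling and nextSibling members of DOM nodes so the example runs without a browser, and the tree is the assumed sample structure.

```javascript
// Hypothetical stand-in for a DOM node (not part of the course files).
function makeNode(nodeName, children = []) {
  const node = { nodeName, previousSibling: null, nextSibling: null };
  children.forEach((child, i) => {
    child.previousSibling = children[i - 1] || null;
    child.nextSibling = children[i + 1] || null;
  });
  return node;
}

// Assumed sample structure: a > (b > (c > (f, g), d, e), h > (i, j)).
const c = makeNode("c", [makeNode("f"), makeNode("g")]);
const e = makeNode("e");
const b = makeNode("b", [c, makeNode("d"), e]);
const a = makeNode("a", [b, makeNode("h", [makeNode("i"), makeNode("j")])]);

function hasSibling(oNode, oTestNode) {
  // Search to the left first.
  let oTmpNode = oNode.previousSibling;
  while (oTmpNode != null) {
    if (typeof oTestNode === "string") {
      if (oTmpNode.nodeName === oTestNode) return true;
    } else if (typeof oTestNode === "object") {
      if (oTmpNode === oTestNode) return true;
    }
    oTmpNode = oTmpNode.previousSibling;
  }
  // No match on the left, so repeat the same logic to the right.
  oTmpNode = oNode.nextSibling;
  while (oTmpNode != null) {
    if (typeof oTestNode === "string") {
      if (oTmpNode.nodeName === oTestNode) return true;
    } else if (typeof oTestNode === "object") {
      if (oTmpNode === oTestNode) return true;
    }
    oTmpNode = oTmpNode.nextSibling;
  }
  return false;
}

console.log(hasSibling(c, "e")); // true
console.log(hasSibling(c, "i")); // false
console.log(hasSibling(c, e));   // true
```

One thing to keep in mind with a real parsed document is that whitespace between tags can show up as text nodes in the sibling chain; the loops above simply step over anything that does not match.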
| Using the getElementsByAttrVal() algorithm| 00:00 | Okay. So for this last example we are
going to write another algorithm that
| | 00:04 | uses a depth-first traversal, and
this is a really useful function.
| | 00:10 | It retrieves all the elements that have
an attribute that matches a given attribute
| | 00:14 | value and it's called, oddly
enough, getElementsbyAttrVal.
| | 00:19 | It works by taking three arguments.
There's the node that we are interested in,
| | 00:24 | that's the starting point, and we are
going to look for elements inside oNode
| | 00:28 | here that have an attribute the same
as sAttName and a value that's equal to
| | 00:35 | sAttVal right here.
| | 00:37 | The way it works is we have an
internal array here named aNodes and we also
| | 00:42 | have a locally defined function
inside our outer function here. The locally
| | 00:47 | defined function is called
processNodes. This is our depth-first traversal
| | 00:51 | algorithm that's going to look
through all of the nodes and check to see if
| | 00:56 | they in fact have an attribute that
matches the value we are looking for.
| | 00:59 | Each time we find a match, we are going
to add that node to this internal array
| | 01:05 | right here, and when we are all done, we are
going to return that array back to the caller.
| | 01:10 | This is a really powerful function. You
can use it to quickly build up lists of
| | 01:15 | nodes that have attributes that have
a given value. It also illustrates
| | 01:19 | the concept of having a nested
function inside of another function.
| | 01:23 | Let's have a look at the code in
action. So here we are in the sample code.
| | 01:29 | This is pretty much the same sample
code that I have been using up until now
| | 01:32 | with a major difference. You can see
I have gone up to the testXML data here
| | 01:39 | and I have added some
attributes to some of the elements.
| | 01:44 | So I have added attribute type='test'
to the b, the g, the i elements, and on
| | 01:54 | the j element I have added type='blah'.
So for testing, we are going to be
| | 02:00 | looking for all the
elements that have type='test'.
| | 02:06 | So down here in the testing area,
you can see that what I'm doing is
| | 02:11 | I'm getting a reference to the top level
a tag, that's the root of the document,
| | 02:17 | and I'm calling getElementsbyAttrVal
on the top level tag and I'm looking for
| | 02:23 | elements that have a type
attribute equal to the string test.
| | 02:28 | I am going to have a string variable
down here. It says, "Nodes with attribute
| | 02:32 | 'test': "; and then I'm going to have a
loop that's going to build up a list of
| | 02:36 | nodes that match the result. So this
result right here that's returned by
| | 02:41 | getElementsbyAttrVal, this is going
to be an array of all the elements that
| | 02:45 | match this criteria right here.
| | 02:48 | Let's go ahead and write the code. So
I'm going to declare my local array and
| | 02:56 | initialize it to being empty, and then
I'm going to write my internal function.
| | 03:01 | This is my internal processNodes
function. I'm going to call this from within
| | 03:08 | my getElementsbyAttrVal function. So
I need to call processNodes and this is
| | 03:14 | what kicks off the process here.
| | 03:17 | At the end of the day, we are going
to return the aNodes array. And for
| | 03:23 | processNodes, we are going to pass in
oNode and we are going to pass in the
| | 03:27 | aNodes array, because that's going to
be modified and then we are going to pass
| | 03:32 | in the name and attribute
values that we are looking for.
| | 03:39 | So the processNodes function, that's
going to take these arguments right here.
| | 03:45 | So let me go and Copy those guys onto
this function right here, and I'm going
| | 03:52 | to change the names slightly just to avoid
confusion while we are reading the code.
| | 03:56 | So I'm going to call this aNodeList,
and that should be good enough. So inside
| | 04:04 | processNodes, we are going to check
to see if (oNode.nodeType; and this is
| | 04:11 | something that all nodes have. They
all have a nodeType. We are going to
| | 04:14 | compare that against the
constant value on the Node class,
| | 04:18 | Node.ELEMENT_NODE, and the reason we
are going to do that is because only
| | 04:22 | element nodes can have attributes.
| | 04:25 | So there's really no point in comparing
other kinds of nodes, like textNodes or
| | 04:29 | comments or CDATA sections. All of
these are XML data types, but they can't have
| | 04:35 | attributes, so we might as well exclude them.
| | 04:37 | Then we are going to have a local
variable called sAttr and we are going to get
| | 04:44 | the attribute that we are looking for
from the node, and we do that by calling
| | 04:49 | the getAttribute DOM function with
the sAttName. If this is not equal to null,
| | 04:56 | then we need to check to see if it's
equal to the value that we are looking for.
| | 05:02 | We do that by saying if (sAttr ==
sAttVal). And if they match, then we say
| | 05:14 | aNodeList.push() and we are going
to push the node onto the array list.
| | 05:24 | Now that we have done that, we have to
recursively call the function to do our
| | 05:29 | depth-first traversal. So we are going
to say for (var n = oNode.firstChild;
| | 05:42 | n!= null; n = n.nextSibling). And
inside this for loop, we are going to call
| | 05:57 | processNodes again, and we'll give
processNodes the n argument, because that's
| | 06:04 | the child node that we are going to process.
| | 06:07 | We have to pass in the NodeList so
that we can keep on adding results and
| | 06:11 | we have to pass in the attribute we are
looking for and the value that we are looking for.
| | 06:19 | At this point, we should be in a place
where we can test this out. Let's look
| | 06:24 | down here. So remember, starting at
the a tag, we are looking to build a list
| | 06:30 | of all the elements that have the type
attribute set to 'test'. So if we look
| | 06:35 | back up in the sample XML, that should
be the b tag, the g tag, the i tag, and
| | 06:48 | it should exclude the j, because it has
a type attribute but it's not equal to 'test'.
| | 06:52 | It's equal to 'blah'.
| | 06:53 | Let's go ahead and run this in
the browser and see what happens.
| | 07:04 | So you can see nodes with attribute 'test'
b, g, and i were all found, and j was excluded
| | 07:11 | like it should have been.
| | 07:13 | So let's try it in Firefox.
| | 07:19 | Okay, same result.
You can see nodes with attribute
| | 07:22 | test b, g, and i, and j was
excluded the way it should be.
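The whole function, with its nested processNodes helper, might be sketched along these lines. It is a sketch under assumptions rather than the exercise file: makeElement is a hypothetical helper that builds plain objects with just the nodeType, firstChild, nextSibling and getAttribute members the algorithm touches, the local constant ELEMENT_NODE stands in for Node.ELEMENT_NODE (whose value is 1 in the DOM), and the function name follows the spelling in the movie title.

```javascript
const ELEMENT_NODE = 1; // same numeric value as Node.ELEMENT_NODE in the DOM

// Hypothetical stand-in for a DOM element (not part of the course files).
function makeElement(nodeName, attrs = {}, children = []) {
  const node = {
    nodeName,
    nodeType: ELEMENT_NODE,
    firstChild: children[0] || null,
    nextSibling: null,
    // Like the DOM method, returns null when the attribute is absent.
    getAttribute(name) {
      return Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null;
    },
  };
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return node;
}

// Assumed test data: type='test' on b, g and i, and type='blah' on j.
const a = makeElement("a", {}, [
  makeElement("b", { type: "test" }, [
    makeElement("c", {}, [makeElement("f"), makeElement("g", { type: "test" })]),
    makeElement("d"),
    makeElement("e"),
  ]),
  makeElement("h", {}, [
    makeElement("i", { type: "test" }),
    makeElement("j", { type: "blah" }),
  ]),
]);

function getElementsByAttrVal(oNode, sAttName, sAttVal) {
  const aNodes = []; // collects every matching element

  // Nested depth-first traversal that fills aNodeList as it goes.
  function processNodes(oCurrent, aNodeList, sName, sVal) {
    if (oCurrent.nodeType === ELEMENT_NODE) {
      const sAttr = oCurrent.getAttribute(sName);
      if (sAttr != null && sAttr === sVal) {
        aNodeList.push(oCurrent);
      }
    }
    for (let n = oCurrent.firstChild; n != null; n = n.nextSibling) {
      processNodes(n, aNodeList, sName, sVal);
    }
  }

  processNodes(oNode, aNodes, sAttName, sAttVal);
  return aNodes;
}

const result = getElementsByAttrVal(a, "type", "test");
const names = result.map((node) => node.nodeName).join(", ");
console.log("Nodes with attribute 'test': " + names); // Nodes with attribute 'test': b, g, i
```

Passing the array into processNodes keeps the helper free of hidden state, although a nested function could also reach aNodes directly through its closure.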
| | 07:26 | Okay. Once again, we have built a
real-world DOM function that you can use in
| | 07:31 | your real-world work. So please feel free
to go ahead and use that code in your projects
| | 07:36 | and that concludes this chapter.
Conclusion
| Goodbye| 00:00 | Okay, that concludes Real-World XML.
I hope you enjoyed working along through
| | 00:04 | the examples with me and I hope you
learned a lot. You should have a good
| | 00:08 | foundation now to go out and work with
XML in the real world, especially since
| | 00:11 | we saw how to use XML and
JavaScript in the browsers.
| | 00:15 | We walked through designing and
implementing our own XML format. We took a look
| | 00:20 | at some of the real-world XML formats
that are out there and used today.
| | 00:23 | We took a look at some real-world DOM
algorithms that you can use in your pages
| | 00:28 | and projects to make you more
productive in your work environment.
| | 00:31 | Hope you enjoyed yourself. Thanks for listening.