Join Gabriel Powell for an in-depth discussion in this video What's inside an EPUB file?, part of Creating Ebooks with InDesign CS4 or CS5.
In this lesson, I'm going to show you what's inside an EPUB file, and I'll explain how the various components work. An EPUB file is actually a package that contains a number of files that work together to create the eBook experience. Here's an overview of the components that make up an EPUB file. As you can see, an EPUB file contains quite a few files and a few folders as well. All of these components are packaged together into a single ZIP archive that's given the .epub file extension. Alright, let's take a deeper look at the various components that make up an EPUB file, starting with the mimetype file.
This file identifies the package as an EPUB file. It's stored as a plain text file, it has to be located at the top level of the EPUB package, and it cannot be compressed. All of the other files in the EPUB package are compressed, but if this file is compressed, the EPUB file won't be valid and you won't be able to open it on an eBook reader. I currently have an EPUB file open in an XML editor called Oxygen. The contents of this EPUB file are showing over here on the left side, within the Archive Browser panel. Here you can navigate through the package.
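The rules for the mimetype file can be checked with a short script. What follows is only a sketch: the function names `check_mimetype` and `make_archive` are made up for illustration, and the exact contents `application/epub+zip` and the first-entry requirement come from the Open Container Format specification rather than from this lesson.

```python
import io
import zipfile

# The single line the mimetype file is expected to contain (per the OCF spec).
EPUB_MIMETYPE = b"application/epub+zip"


def check_mimetype(epub_bytes: bytes) -> list[str]:
    """Return a list of problems found with the mimetype entry (empty if none)."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        infos = archive.infolist()
        # The mimetype file sits at the top level and is the first entry.
        if not infos or infos[0].filename != "mimetype":
            return ["mimetype must be the first entry at the top level"]
        first = infos[0]
        # It has to be stored, not compressed.
        if first.compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype must not be compressed")
        if archive.read(first) != EPUB_MIMETYPE:
            problems.append("mimetype must contain application/epub+zip")
    return problems


def make_archive(compress_mimetype: bool) -> bytes:
    """Hypothetical helper: build a tiny in-memory archive to test against."""
    method = zipfile.ZIP_DEFLATED if compress_mimetype else zipfile.ZIP_STORED
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", EPUB_MIMETYPE, compress_type=method)
        archive.writestr("META-INF/container.xml", "<container/>",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


print(check_mimetype(make_archive(compress_mimetype=False)))
print(check_mimetype(make_archive(compress_mimetype=True)))
```

To check a real file, you would read its bytes from disk and pass them to `check_mimetype` instead of using the sample archive.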
When you want to open a file for viewing or editing, simply double-click it. So, here's the mimetype file. I'll go ahead and open it up. It contains just one line of code, which identifies this package as an EPUB file. Alright, I'll go ahead and close this file. So, the next file that I'm going to introduce you to is the container.xml file. This XML file is stored within the META-INF folder.
It simply directs the eBook reader to the content.opf file, which in turn references all the files that make up the content of an EPUB file. I'll go ahead and open that up inside of Oxygen. That file is located within this META-INF folder. I'll flip the triangle to open the folder, and here it is. I'll double-click it to open it up. So, as you can see, there isn't a lot of code in this file. Its main purpose is to reference the content.opf file, which it's doing right here.
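As a rough illustration of what an eBook reader does with container.xml, here is a sketch that pulls out the path to the content.opf file. The sample XML is a simplified stand-in, the `OEBPS` folder name is an assumption, and `find_package_path` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by container.xml in the Open Container Format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Simplified sample; the OEBPS folder name is an assumption for illustration.
SAMPLE_CONTAINER = """<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def find_package_path(container_xml: str) -> str:
    """Return the full-path attribute of the first rootfile element."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no rootfile element")
    return rootfile.attrib["full-path"]


print(find_package_path(SAMPLE_CONTAINER))
```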
This is the file path to the content.opf file. All right. So, the next file that I'd like to introduce you to is the encryption.xml file. This file is located within the META-INF folder. It's actually an optional file, so you won't see it in every EPUB file that you create. It's only included in the EPUB files that you export from InDesign when you choose to embed the fonts into the EPUB file as you export it. And it's used to encrypt the fonts that have been embedded into the eBook.
So, if you don't embed the fonts when you export an EPUB file, the encryption.xml file won't be created. I'll go ahead and open this file up in Oxygen. Now, all the code is currently located on one line. So, to make this more reader-friendly, I'm going to click this Format and Indent button up here, and that'll make it much easier to read. As you can see here, the encryption file is referencing the fonts that are embedded into the package. Since there are three fonts, there are three references, and the fonts are located over here in the Fonts folder.
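To see those font references outside of an editor, something like the following sketch could list them. The sample is heavily simplified (a real encryption.xml carries more detail per entry), the font file names and the `OEBPS` folder are hypothetical, and `embedded_font_paths` is a name made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C XML Encryption elements used inside encryption.xml.
XMLENC = "http://www.w3.org/2001/04/xmlenc#"

# Simplified sample with three hypothetical fonts in a Fonts folder.
SAMPLE_ENCRYPTION = """<encryption
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <EncryptedData xmlns="http://www.w3.org/2001/04/xmlenc#">
    <CipherData><CipherReference URI="OEBPS/Fonts/FontA.otf"/></CipherData>
  </EncryptedData>
  <EncryptedData xmlns="http://www.w3.org/2001/04/xmlenc#">
    <CipherData><CipherReference URI="OEBPS/Fonts/FontB.otf"/></CipherData>
  </EncryptedData>
  <EncryptedData xmlns="http://www.w3.org/2001/04/xmlenc#">
    <CipherData><CipherReference URI="OEBPS/Fonts/FontC.ttf"/></CipherData>
  </EncryptedData>
</encryption>"""


def embedded_font_paths(encryption_xml: str) -> list[str]:
    """Return the URI of every CipherReference, in document order."""
    root = ET.fromstring(encryption_xml)
    return [ref.attrib["URI"]
            for ref in root.iter("{%s}CipherReference" % XMLENC)]


for path in embedded_font_paths(SAMPLE_ENCRYPTION):
    print(path)
```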
Alright. So, the next file that I'd like to introduce you to is the content.opf file. This file is actually an XML file. This is the root file of an EPUB file because it contains the eBook's metadata, it identifies all the components of an eBook, and it describes the reading order for the contents of an eBook. So, this file plays an important role. Let's take a closer look at this file inside of Oxygen.
Here is the content.opf file. I'll go ahead and double-click it to open it. You'll find three important sections within this file. There is the Metadata section, there is the Manifest, and at the very bottom is the Spine. The metadata element is used to provide information about the publication as a whole. Metadata is an important part of every eBook because it describes an eBook and makes it searchable. So, here you can see the title of my book is Sample eBook, and the creator is Gabriel Powell.
The subject is eBooks, and there's a description, a date, a copyright, and an identifier. The identifier is used for the ISBN of your eBook. And then there's a language element as well. The Manifest references all the files that are part of the publication, including the NCX file, all of the XHTML files, the images, and the CSS file.
And then the Spine element is used to determine the linear reading order of the publication. The order of these itemref elements is very important, because they determine the reading order of the content in the eBook. So, if I were to move this chapter one itemref to the top of the list, chapter one would appear first in the book when I open the eBook on an eBook reader. Alright, so that's how the content.opf file works. Now I'd like to introduce you to the toc.ncx file.
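Here is a small sketch of how the Manifest and the Spine in content.opf work together to produce the reading order. The sample is a cut-down stand-in for a real content.opf file. The title and creator come from the book shown in this lesson, while the item ids, the XHTML file names, and the function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

OPF_NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Cut-down sample; ids and XHTML file names are hypothetical.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0"
    unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample eBook</dc:title>
    <dc:creator>Gabriel Powell</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="cover" href="cover.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="template.css" media-type="text/css"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="cover"/>
    <itemref idref="chapter1"/>
  </spine>
</package>"""


def book_title(opf_xml: str) -> str:
    """Return the text of the dc:title element in the metadata section."""
    root = ET.fromstring(opf_xml)
    return root.findtext("opf:metadata/dc:title", default="", namespaces=OPF_NS)


def reading_order(opf_xml: str) -> list[str]:
    """Resolve each spine itemref to the href of its manifest item, in order."""
    root = ET.fromstring(opf_xml)
    hrefs = {item.attrib["id"]: item.attrib["href"]
             for item in root.findall("opf:manifest/opf:item", OPF_NS)}
    return [hrefs[ref.attrib["idref"]]
            for ref in root.findall("opf:spine/opf:itemref", OPF_NS)]


print(book_title(SAMPLE_OPF))
print(reading_order(SAMPLE_OPF))
```

Swapping the two itemref lines in the sample would swap the order of the returned list, which mirrors what happens on an eBook reader when you reorder the Spine.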
The toc.ncx file is also an XML file. It provides an eBook reader with detailed navigation information. Its main purpose is to serve as a navigation map that's generally displayed as a menu in an eBook reader, enabling you to jump directly to any of the major sections in the eBook. It's essentially a table of contents for an eBook. Let's open that file inside of Oxygen. Here it is. I'll double-click it to open it up. When you export an EPUB file from InDesign, this NCX file is generated in one of two ways.
If you specify a TOC style when you export the EPUB file, the NCX file is based on that TOC style. But if you don't specify a TOC style, then this NCX file is based on the names of your InDesign documents. The section of this file that contains the navigation guide is right here, within the navMap element. This element contains several navPoint elements. Each navPoint element is an entry in the navigation guide.
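To make the navMap structure concrete, here is a sketch that reads each top-level navPoint and returns its label and target file. The sample NCX is a simplified stand-in with hypothetical labels and file names, and `nav_entries` is a name invented for this example. Nested navPoint elements are ignored.

```python
import xml.etree.ElementTree as ET

NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}

# Simplified sample; labels and file names are hypothetical.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="navpoint1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.xhtml"/>
    </navPoint>
    <navPoint id="navpoint2" playOrder="2">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="chapter2.xhtml"/>
    </navPoint>
  </navMap>
</ncx>"""


def nav_entries(ncx_xml: str) -> list[tuple[str, str]]:
    """Return (label, target file) for each top-level navPoint in the navMap."""
    root = ET.fromstring(ncx_xml)
    entries = []
    for point in root.findall("ncx:navMap/ncx:navPoint", NCX_NS):
        label = point.findtext("ncx:navLabel/ncx:text", default="",
                               namespaces=NCX_NS)
        content = point.find("ncx:content", NCX_NS)
        src = content.attrib["src"] if content is not None else ""
        entries.append((label, src))
    return entries


for label, src in nav_entries(SAMPLE_NCX):
    print(label, "->", src)
```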
So, this is the entry for chapter one. And when you click on that entry, you're taken to this file right here, chapter two. So, that's how the toc.ncx file works. Now I'll introduce you to the template.css file. This file contains the CSS style sheets, which are used to format the content of an EPUB file. It's stored as an external style sheet, so when you change one of the CSS rules, all the text in the publication that that rule is applied to gets updated.
Let's open that file up inside of Oxygen. These are all the CSS rules that determine the formatting for the text in this eBook. Now, the current EPUB specification defines a style language based on CSS 2, but not all CSS 2 properties are actually supported. So, for a complete list of all the supported CSS properties, you'll want to go to the website www.idpf.org and then click on this Specifications tab. Locate the Open Publication Structure, right here, and then navigate to Section 3.3.
This section gives you detailed information about the CSS properties that are supported. All right, let's take a look at the next component of an EPUB file, which is the XHTML files. An EPUB file can contain any number of XHTML files. Together, the XHTML files contain the actual content of an EPUB file. When you export a single InDesign document as an EPUB file, only one XHTML file is generated. But when you export an InDesign Book file, an XHTML file is generated for each document in the Book file.
So, this EPUB file contains quite a number of XHTML files: one for the cover, one for the table of contents, and another for each chapter and section. And then there's the Images folder. This folder contains the actual image files for the eBook. When you export an EPUB file from InDesign, the images are copied to this folder according to the image export options that you've specified. If your publication doesn't contain any images, of course, the Images folder isn't created.
The EPUB specification supports GIF, JPEG, PNG and SVG image types. However, InDesign only exports GIF or JPEG files, unless you're using InDesign CS4 and you choose to copy the original images when you export the EPUB file. And the last folder in the EPUB package would be the Fonts folder. This folder contains the font files for the eBook, if you chose to embed the fonts when you exported the EPUB file from InDesign. Keep in mind that only OpenType fonts and supported TrueType fonts are copied into this folder.
PostScript fonts cannot be embedded within an EPUB file. So, here within the EPUB file that I have open, there is an Images folder. It contains three images: one is a JPEG, and the two others are GIF images. And then there's a Fonts folder, which contains three fonts that have been embedded into this EPUB package. Alright. So, now that I've introduced you to the various components of an EPUB file, and I've explained how they're used in an eBook, I'd like to go back to the website of the International Digital Publishing Forum.
I'll go back to their homepage. The EPUB file format is based on three open standards: the Open Publication Structure, the Open Packaging Format, and the Open Container Format. If you want to dive deeper into the structure of an EPUB file and learn even more, you can read the documentation for each of these specifications. Just click over here on the Specifications tab, and then check out each one of these documents.
The one that you should be most concerned with is the Open Publication Structure.
- Exporting an EPUB file from InDesign CS4 or CS5
- What's inside an EPUB file?
- Editing an EPUB file in Mac OS X or Windows
- Laying out pages
- Working with text
- Exporting graphics
- Creating a table of contents or navigation guide
- Inserting metadata
- Creating scalable images
- Validating an EPUB file