Join Anne-Marie Concepción for an in-depth discussion in this video Looking at the components of an EPUB file, part of InDesign CC: EPUB Fundamentals.
Now that we know how to crack open an ePub, I want to take a look at the components inside one. Again, we are not going to spend a lot of time in this fundamentals course editing the components of an ePub. We are going to do everything in InDesign. But it's a lot easier to understand why we need to do different kinds of prep to the InDesign file if we understand how InDesign is going to convert it into an ePub. How is InDesign going to convert, for example, all the chapter starts to an ePub's chapters? What's it going to do with the images? What's it going to do with all the paragraph and character styles, the fonts, and so on when you export to ePub?
What happens to all that stuff? So here is the ePub. You can see it's got all the images and the text. It has the same formatting. It's got links. Some of the formatting is not quite what we want, but we're going to be working on that in the title. It's got a table of contents that brings us to the different chapter starts. How does it do that? I had this great question from somebody who called me up last week and said: in an ePub, when I'm at the end of one chapter, say Painting in France, how does it know that it's supposed to open up the next chapter, Painting in England? How does it know how to go to the next chapter? And the answer is that it all depends on how things are written out inside the ePub.
So in the Finder here, I've got that History of Art ePub on my desktop, along with Alice's Adventures in Wonderland, which we looked at in the first video. And I've cracked them both open by dragging and dropping them on top of the ePub unzip AppleScript that I talked about in the previous video. So let's take a look inside History of Art. Inside every ePub is actually a website. And that comes as a surprise to a lot of people who are new to ePubs: an ePub is actually a website.
If we just dive right into this folder, you can see that each one of the chapters that you see behind the Finder window corresponds to a matching HTML file. So, History of Art, hyphen one: if I double-click it, it's going to open up in Safari. Painting in Flanders, Holland, and Germany. It's an actual HTML file, and its formatting is being applied by CSS. So inside this CSS folder inside this ePub folder, we have the CSS file.
And if you know anything about CSS (let me open this up in my favorite text editor, TextWrangler), you know that CSS is essentially like paragraph and character styles. Here's a paragraph style for a website. So InDesign is converting paragraph styles and character styles, all these settings here, into CSS. And it's converting the text into HTML. So again, I've cracked open History of Art, and inside the History of Art folder we've got two folders and a single file called mimetype.
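As a rough illustration of that top-level layout, here is a minimal Python sketch that treats an ePub as an ordinary zip archive and lists what sits at its root. The archive is built in memory so the example runs on its own; the chapter and style sheet names inside it are hypothetical stand-ins, not the actual files in History of Art.

```python
import io
import zipfile


def build_sample_epub() -> bytes:
    """Build a tiny stand-in ePub in memory (file names are hypothetical)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", "<container/>")
        archive.writestr("OEBPS/content.opf", "<package/>")
        archive.writestr("OEBPS/chapter-1.xhtml", "<html/>")
        archive.writestr("OEBPS/css/styles.css", "p { margin: 0; }")
    return buffer.getvalue()


def top_level_entries(epub_bytes: bytes) -> list[str]:
    """Return the distinct top-level files and folders inside the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return sorted({name.split("/")[0] for name in archive.namelist()})


if __name__ == "__main__":
    # Two folders plus the single mimetype file
    print(top_level_entries(build_sample_epub()))
```

To try it on a real book, read the .epub file's bytes from disk and pass them to `top_level_entries` instead of the sample.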
We're going to come back to the folder with the HTML and CSS in a bit. Let's take it from the top. Inside the META-INF folder, there's one simple file called container.xml, and if I do a quick preview here in the Finder, all it does is give the eReader device the path to this all-important file called content.opf. This file is kind of like the brains of the ePub. It tells the Kindle or the iPad or Adobe Digital Editions what to open when, and what to open next as the reader swipes back and forth.
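To make the role of container.xml concrete, here is a small Python sketch of what a reading system does with it: read the file and pull out the path to the OPF file. The sample XML is a typical minimal container.xml written for this example, not a copy of the one in the video, and `find_opf_path` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by container.xml in the EPUB container format
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Illustrative sample; real files usually start with an XML declaration too
SAMPLE_CONTAINER = """<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def find_opf_path(container_xml: str) -> str:
    """Return the path of the package (OPF) file named in container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no rootfile element")
    return rootfile.get("full-path")


if __name__ == "__main__":
    print(find_opf_path(SAMPLE_CONTAINER))
```

The path is relative to the root of the ePub, which is why it includes the OEBPS folder name.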
We'll look at that in a second. Some META-INF folders also contain other files. In this book, for example, there were some fonts embedded, and so they got encrypted. If you didn't embed fonts and you're just going to use the device's fonts, as in Alice's Adventures in Wonderland, then we don't have an encryption file there. The mimetype, this special file that cannot be compressed when you rezip it, sounds like it would be some hugely complicated thing.
All it says is, hey, I'm an ePub file that's been zipped up. That's all. This is part of the IDPF specifications, and there you go. Let's just go ahead and close those. The main event, though, happens right here in the OEBPS folder, which actually stands for Open eBook Publication Structure. I finally learned that after all these years. I used to just call it the content folder, because essentially all the good stuff, the guts of the ePub, is in here. All the HTML files are here: every chapter is broken up into its own separate HTML file, or XHTML file.
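Going back to the mimetype file for a moment, the rule that it cannot be compressed can be checked in a few lines of Python. This is a sketch, not a full validator. It also checks that mimetype is the first entry in the archive, which the EPUB container spec expects but which is not mentioned in this video. `make_archive` and `has_valid_mimetype` are hypothetical helper names.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def make_archive(entries) -> bytes:
    """Zip (name, data, compress_type) tuples, in the given order, in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data, compress_type in entries:
            archive.writestr(name, data, compress_type=compress_type)
    return buffer.getvalue()


def has_valid_mimetype(epub_bytes: bytes) -> bool:
    """True if mimetype is the first entry, stored uncompressed, with the right text."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        entries = archive.infolist()
        if not entries:
            return False
        first = entries[0]
        return (
            first.filename == "mimetype"
            and first.compress_type == zipfile.ZIP_STORED
            and archive.read(first.filename) == EPUB_MIMETYPE
        )


if __name__ == "__main__":
    good = make_archive([
        ("mimetype", EPUB_MIMETYPE, zipfile.ZIP_STORED),
        ("META-INF/container.xml", b"<container/>", zipfile.ZIP_DEFLATED),
    ])
    print(has_valid_mimetype(good))
```

This is the reason a plain "compress this folder" command tends to produce an ePub that fails validation: it compresses every file, mimetype included.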
Here is the all-important content.opf file, which we'll look at in a second. All of the CSS, the Cascading Style Sheets, is put into a folder here. In this document, there are three different style sheets that all these files link to in one way or another. Some books are just text, but if there are images in the book, they are usually segregated into an images folder, maybe with some subfolders, and all the images are in web-friendly formats. They've all been converted to JPEG, GIF, or PNG files.
Then there are some XML control files that you really don't need to know too much about. The good thing to know is that InDesign creates these guys for you. For example, the table of contents file. This is an XML control file that creates the table of contents that you see right here. It's called the navigational TOC. Every eReader device or eReading program has some sort of button or menu command that will show you the internal navigation so you can jump around. Some files also have an HTML TOC, meaning an HTML table of contents that's part of the file. Not every one does.
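As a sketch of how an eReader might turn that control file into the navigation list, the Python below reads the labels and targets out of a table of contents file. It assumes the file is an EPUB 2 style NCX file (commonly named toc.ncx), which the video does not state explicitly. The sample content and file names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by NCX navigation files (assumption: EPUB 2 style TOC)
NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}

# Illustrative sample with hypothetical chapter file names
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="nav-1" playOrder="1">
      <navLabel><text>Painting in France</text></navLabel>
      <content src="painting-in-france.xhtml"/>
    </navPoint>
    <navPoint id="nav-2" playOrder="2">
      <navLabel><text>Painting in England</text></navLabel>
      <content src="painting-in-england.xhtml"/>
    </navPoint>
  </navMap>
</ncx>"""


def read_nav_toc(ncx_text: str) -> list[tuple[str, str]]:
    """Return (label, target file) pairs in the order they appear in the navMap."""
    root = ET.fromstring(ncx_text)
    entries = []
    for point in root.iterfind(".//ncx:navPoint", NCX_NS):
        label = point.findtext("ncx:navLabel/ncx:text", default="", namespaces=NCX_NS)
        content = point.find("ncx:content", NCX_NS)
        target = content.get("src") if content is not None else ""
        entries.append((label, target))
    return entries


if __name__ == "__main__":
    for label, target in read_nav_toc(SAMPLE_NCX):
        print(label, "->", target)
```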
But the file that governs this part here on the left is that little XML file that InDesign creates. And we have a video on how to create this file. Then we have that all-important content.opf file that I mentioned earlier, the one that the file in META-INF points to. Let's open that up in the text editor. I know it looks very intimidating. Again, remember, InDesign creates this. You do not have to create it.
After all of the preamble up here, saying which version and so on, there are three main sections in the OPF file. The first one is all the metadata. What's the title of this work? Who's the publisher? When was this created? And so on. After all the metadata, we have a section called the manifest. Just like a shipping label that lists everything inside the box, the manifest of the shipment, the manifest section of the content.opf file lists every single file that is inside this ePub, inside the OEBPS folder. So all the HTML files are listed, and they're given a unique ID.
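To illustrate the metadata and manifest sections just described, here is a short Python sketch that reads a title and builds a lookup from each manifest ID to its file. The sample OPF is a stripped-down illustration written for this example. The publisher name, IDs, and file names in it are hypothetical, and a real InDesign export contains many more entries.

```python
import xml.etree.ElementTree as ET

OPF_NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative sample; IDs, file names, and publisher are hypothetical
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0"
    unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>History of Art</dc:title>
    <dc:publisher>Example Press</dc:publisher>
  </metadata>
  <manifest>
    <item id="chapter-1" href="chapter-1.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter-2" href="chapter-2.xhtml" media-type="application/xhtml+xml"/>
    <item id="styles" href="css/styles.css" media-type="text/css"/>
  </manifest>
</package>"""


def read_title(opf_text: str) -> str:
    """Return the book title from the metadata section."""
    root = ET.fromstring(opf_text)
    return root.findtext("opf:metadata/dc:title", default="", namespaces=OPF_NS)


def read_manifest(opf_text: str) -> dict[str, str]:
    """Map each manifest item's unique ID to the file it points to."""
    root = ET.fromstring(opf_text)
    return {
        item.get("id"): item.get("href")
        for item in root.iterfind("opf:manifest/opf:item", OPF_NS)
    }


if __name__ == "__main__":
    print(read_title(SAMPLE_OPF))
    print(read_manifest(SAMPLE_OPF))
```

The paths in the manifest are relative to the OPF file itself, so `css/styles.css` here means a css folder next to content.opf.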
And this answers that person's question about how an eReader knows that when I get to the end of this chapter and just swipe, it should open up the next HTML file. That is because it's going according to the spine. The spine section says that when the person is done with chapter one, the next thing that should open is chapter two. And it refers to the chapters by their unique IDs, which were called out up here in the manifest.
There is a fourth, optional section called the guide that some eBooks have. It's really not too common these days, but if you're creating an eBook for Kindle, they really want you to include a guide section that says things like: where's the cover, where's the first page with text on it that the book should open at, and, if there's a glossary or an index, sometimes you can link directly to those. It's not really required, but if there is a guide section, it would go at the end of content.opf.
Now, that's probably more than you ever needed to know about what's inside the ePub. But as long as you understand that an ePub is essentially a website, that the pages and chapters are HTML files, that the formatting is done with CSS, and that there are a few other files inside here that manage the whole thing, then it's a lot easier to understand what it is that we need to do in InDesign in order to get a beautiful ePub at the end.
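To make the answer to the "how does it know what chapter comes next" question concrete, here is a minimal Python sketch of the spine lookup: read the spine's ID references in order, resolve each one through the manifest, and step to the following file. The sample OPF and its file names are hypothetical. The manifest is deliberately listed in a different order from the spine to show that the spine, not the manifest, decides the reading order.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Illustrative sample; IDs and file names are hypothetical
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="england" href="painting-in-england.xhtml" media-type="application/xhtml+xml"/>
    <item id="france" href="painting-in-france.xhtml" media-type="application/xhtml+xml"/>
    <item id="styles" href="css/styles.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="france"/>
    <itemref idref="england"/>
  </spine>
</package>"""


def reading_order(opf_text: str) -> list[str]:
    """Return content files in spine order, resolved through the manifest IDs."""
    root = ET.fromstring(opf_text)
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.iterfind("opf:manifest/opf:item", OPF_NS)
    }
    return [
        hrefs[ref.get("idref")]
        for ref in root.iterfind("opf:spine/opf:itemref", OPF_NS)
    ]


def next_file(opf_text: str, current_href: str):
    """Return the file that follows current_href in the spine, or None at the end."""
    order = reading_order(opf_text)
    position = order.index(current_href)
    return order[position + 1] if position + 1 < len(order) else None


if __name__ == "__main__":
    print(reading_order(SAMPLE_OPF))
    print(next_file(SAMPLE_OPF, "painting-in-france.xhtml"))
```

Note that the style sheet appears in the manifest but not in the spine: it is part of the package, but it is never a page the reader swipes to.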
- Understanding the difference between an ebook and EPUB
- Creating an EPUB workspace in InDesign
- Managing the sequence of content
- Creating a table of contents for navigation
- Adding metadata
- Cleaning up text formatting
- Optimizing images for EPUB export
- Previewing and validating EPUB files
- Converting EPUBs to Kindle MOBI format