Zipping ePub files

An ePub file is just a zip file, but you have to be careful how you construct it.

The ePub file format can be an intimidating beast. But in essence it's quite simple - a bunch of XHTML pages plus accompanying metadata in XML files all bundled up in an ordinary zip file.

That's good news because it means you can unzip the file, hack away at the contents (as we'll be seeing in the next post, about fixing Adobe InDesign CS3's ePub shortcomings) and then just zip it up again.

But there is a gotcha. The e-book readers that work with ePub files can be very picky about how the zip file is put together. And the ePub specification itself is highly specific. So here are a couple of quick tips on how to avoid problems.

I use OS X and Linux, but Windows users should be able to adapt these comments to their own environment.

Unzip the file

Clearly, your first step is to unzip the file. Linux users, being the geeks they are, will naturally head straight for the command line to do this, and quite rightly so. Mac users are accustomed to unzipping files by double-clicking on them, but this is unlikely to work with a file that has the .epub extension - you're more likely to end up opening the file in your favourite e-reader. Ditto for Windows users.

You could use a utility like StuffIt Expander. But given that much of the rest of what I'm discussing here will be done from the command line, you might as well go ahead and open a terminal window.

Let's assume your ePub file is called MyEbook.epub. For the sake of simplicity, you might want to have this sitting in a folder by itself.

From the command line, cd to the folder holding the file. Then simply:

unzip MyEbook.epub

The zip file will spew its contents into the folder. Hack away to your heart's content until you're ready to zip up the file again.

Zip the ePub file

One of the files you'll have found in the ePub document is called mimetype. This is the critical one.

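E-book readers require that the mimetype file is the first one in the zip document. What's more, to be fully compliant, it has to sit at a very specific point: the file name mimetype begins at a 30-byte offset from the beginning of the zip file (straight after the fixed-length zip header), so that the mimetype text itself, application/epub+zip, starts at byte 38. The specification also expects this file to be stored uncompressed.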

If this sounds intimidating, don't worry. It's actually quite easy to achieve if you're careful.

At this point, you might want to move the original MyEbook.epub file out of the way (or delete it, if you're working with a copy, which is a sensible thing to do). To start creating your ePub file, use the following:

zip -X MyNewEbook.epub mimetype

I've given the new ePub file a different name in case you ignored my advice about moving the original out of the way.

The key element here is that -X flag. It tells zip not to save file 'extras' - additional metadata such as user and group IDs and extended timestamps. If you don't use this flag, that extra data is written between the file name and the file's contents, so the contents of the mimetype file will be placed at the wrong position in the zip file. E-book readers may complain that the file contains formatting errors. And tools such as epubcheck (more of that in the next post) will tell you that the ePub file has the wrong mimetype - even when it has the correct mimetype, just in a slightly incorrect position. That can lead to all sorts of confusion.

You can then go ahead and add the rest of the files to the MyNewEbook.epub zip/epub file you've just created. Which files you need to add will depend on how the ePub file was put together in the first place.

I use InDesign CS3 for creating ePub files. These contain the mimetype file plus two directories - META-INF (containing one metadata file) and OEBPS (containing the book files themselves, images, more metadata etc).

So I use the commands:

zip -rg MyNewEbook.epub META-INF -x \*.DS_Store

zip -rg MyNewEbook.epub OEBPS -x \*.DS_Store

Some explanation is necessary here. Each line uses two flags, followed by an exclude option at the end:

-r    (recursive)

This means move down through any directories/folders recursively, ensuring that everything in the folders specified gets included.

-g    (grow file)

This means append to the existing zip file in place rather than creating a new one. Without it, zip will still add the new files, but it does so by writing out a fresh copy of the archive; with -g it simply grows the file you started with the mimetype file, above.

-x \*.DS_Store    (exclude)

This is just for Mac users. It tells zip to ignore the .DS_Store hidden file that is found in most Mac OS X folders. The backslash stops the shell from expanding the * itself, so the pattern is passed to zip intact.

And that's it. To make things easier, I've put these commands into a shell script which is in my PATH for Bash sessions.

If you're uncomfortable with the command line, and still prefer to use GUI-based zip utilities, the above should give you enough information on which to make sensible choices about settings.

 

Comments (25)

Tags: self-publishing publishing books e-books ePub technical zip

Showing comments 1 to 10 of 25
Sven
Posts: 20
Comment
Great
Reply #25 on : Tue April 27, 2010, 15:34:57
great article though it did not work for me.. I'm using terminal for the first time. I got to unzip but zipping eludes me.. could you do a step by step screen shot maybe.. thank you
sven
admin
Posts: 8
Comment
Re: Zipping ePub files
Reply #24 on : Wed April 28, 2010, 01:27:47
Sven, I'll add some screen shots when I get the time, but that's not going to be soon - things are very busy right now! In the meantime, there's tons of stuff on the web about using zip...
Rohit
Posts: 20
Comment
Good article
Reply #23 on : Sun May 02, 2010, 06:51:53
This is a really good article to start with ePub file creation.
keep up the good work :)
Sambodhi Prem
Posts: 20
Comment
Terminal says: "-bash:"
Reply #22 on : Sat July 03, 2010, 17:06:29
Hi

I'm trying to unzip my .epub file, but I'm getting stuck with Terminal.

Terminal refuses to unzip my .epub file, instead it says:

-bash: cd/Users/sambodhiprem/Desktop/Yes tests: No such file or directory

Could this perhaps be a permissions issue? I would very much appreciate your help.

The complete Terminal session is as follows:

Last login: Sun Jul 4 10:17:44 on console

Inner-Sky-7:~ sambodhiprem$ cd/Users/sambodhiprem/Desktop/Yes\ tests unzip MyEbook.epub

-bash: cd/Users/sambodhiprem/Desktop/Yes tests: No such file or directory

Inner-Sky-7:~ sambodhiprem$


Sincerely,

Sambodhi Prem
Titirangi, New Zealand
admin
Posts: 8
Comment
Re: Zipping ePub files
Reply #21 on : Sun July 04, 2010, 02:32:52
Your main problem here is that you need to issue the cd and unzip commands as separate commands, not all on one line. You also need a space between cd and the path.
Sambodhi Prem
Posts: 20
Comment
Re: Zipping ePub files
Reply #20 on : Sun July 04, 2010, 18:21:17
Ah, that worked! Thank you very much!

best regards

Sambodhi

ps here's some of my music for you :-)
http://sambodhiprem.bandcamp.com/album/seven-waves-of-knowing
Sambodhi Prem
Posts: 20
Comment
Re: Zipping ePub files
Reply #19 on : Tue July 13, 2010, 02:00:25
Hi

You wrote:

"To make things easier, I've put these commands into a shell script which is in my PATH for Bash sessions."

Would you be able to point me in the right direction how to make a shell script?

At the moment I'm pasting the three command lines into Terminal, that works great, but it would be good if there's a way to automate this.

Would making a small apple script be a good way to go about it?

Sincerely

Sambodhi Prem
admin
Posts: 8
Comment
Re: Zipping ePub files
Reply #18 on : Tue July 13, 2010, 02:07:07
Sambodhi, you could start here:

http://bit.ly/google-shell-scripting

That's a Google search on shell scripting
Soumen Das
Posts: 20
Comment
PROBLEM
Reply #17 on : Fri November 19, 2010, 22:05:53
I have found a problem in using Epubcheck Version 1.0.5

ERROR: s1.epub: length of first filename in archive must be 8, but was 22
Red
Posts: 20
Comment
Re: Zipping ePub files
Reply #16 on : Thu January 27, 2011, 13:07:24
Soumen Das,
it is basically about that MIME type issue - it is not in the right place, as far as I recall.