Express Delivery by Kamyar Adi/flickr.com. Used through a Creative Commons license.

6 great file formats to send to ebook designers (and 2 awful ones)

So you’ve decided to have an ebook designer convert your book into ePub (iBooks/Nook/Kobo) and mobi (Kindle) formats. Great! What formats are best to send?

In a rant a couple of weeks ago, I covered how not to do it. So what should you do?

The designer or conversion house will hopefully tell you how they want the book delivered. An ebook is essentially web pages in a box, so if you hand the designer clean HTML, she or he will want to kiss you; it’s very simple to turn that into an ebook. Beyond that, a good ebook designer should be able to work with whatever format you give them, whether it’s an InDesign or Quark file, a Word or Pages or OpenOffice doc, or a PDF (see more below, however). Heck, I’ve managed to work from copies of the print book, and from typed or handwritten manuscripts. Still, it’s good to be prepared.
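To make the "web pages in a box" idea concrete: an ePub file is a ZIP archive with (X)HTML pages packed inside. Here is a minimal Python sketch that builds a toy archive in memory and lists what is in it. The function names and the file names inside the archive are my own illustrations, and the toy file is deliberately incomplete (a real ePub also needs a package document and more), so treat it as a picture of the box rather than a working ebook.

```python
import io
import zipfile


def build_toy_epub():
    """Build a tiny, NOT fully valid, ePub-like ZIP archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The 'mimetype' entry comes first and is stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # Placeholder contents; a real container.xml points at the package file.
        archive.writestr("META-INF/container.xml", "<container/>")
        # Hypothetical chapter file: this is the 'web page' part.
        archive.writestr(
            "OEBPS/chapter1.xhtml",
            "<html><body><p>Chapter one.</p></body></html>",
        )
    return buffer.getvalue()


def list_epub_contents(epub_bytes):
    """Return the names of the files packed inside the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    for name in list_epub_contents(build_toy_epub()):
        print(name)
```

Pointing the same listing function at the bytes of a real ePub should show a similar mix of XHTML, CSS and image files.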

What are the best formats to send, in terms of keeping your cost low and your quality high?

I gave you my preferences, in order, in the list above: HTML, InDesign/Quark, Word/Pages/OpenOffice/etc., PDF from the print designer, PDF made from a scanned copy of the book, hard copy of the book, longhand. Whew.

Whether another ebook designer wants the InDesign or Quark file depends on whether she or he also works on print — and has a recent copy of the software (not an inexpensive piece of software). The last couple of iterations of ID in particular (CS6 and CC) have gotten the ePub export feature to the point that it’s actually worth using. It also depends on whether the formatting of the book is complicated, including lots of images and tables, say.

Word is useful in that it’s the lingua franca of text documents: almost everyone works in Word (or can export to .doc or .docx files), so the workflow is predictable. For me, Word is quite acceptable, but less so than InDesign, because turning a Word doc into clean HTML that retains the formatting takes more steps. OpenOffice and Pages both have their own ePub export functions. The files they create are messes from a coding point of view, but no worse than the HTML exported by Word.

By the way, a piece of advice for folks working in Word (or Pages or…) who are submitting to either a print or ebook designer: don’t mess around too much with the formatting of your manuscript. Italics and boldface are fine, but if you’re going to have repeated stylistic elements, such as chapter heads, epigraphs (quotes at the beginning of a book or chapter), extracts (extended quotations within a chapter) or whatever, use the Styles feature in the app to label those elements consistently. It doesn’t matter what those styles look like in the manuscript; the designer’s going to change that. But if you apply styles consistently so that all of the body text is one style, all of the chapter heads another, etc., then it’s simpler to import the file into InDesign, or to convert it into HTML that can then be imported easily into an ePub document. It will make your life and the designer’s much, much easier, and will save you headaches and money in the long run.
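Here is a rough Python sketch of why consistent styles help so much. Assume the manuscript has already been read into a list of (style name, text) pairs; the style names, the CSS class names and the function name are all made up for illustration, and this is not the export routine of Word, InDesign or any other app. Once every paragraph carries a reliable label, turning it into clean HTML is little more than a lookup.

```python
from html import escape

# Hypothetical style names a writer might have applied in the manuscript,
# mapped to an HTML tag and a CSS class the designer can restyle later.
STYLE_TO_TAG = {
    "Chapter Head": ("h1", "chapter-head"),
    "Epigraph": ("blockquote", "epigraph"),
    "Extract": ("blockquote", "extract"),
    "Body Text": ("p", "body"),
}


def paragraphs_to_html(paragraphs):
    """Convert (style_name, text) pairs into one line of HTML per paragraph."""
    lines = []
    for style, text in paragraphs:
        # Paragraphs with an unknown style fall back to a flagged <p>,
        # so the designer can find and fix them by hand.
        tag, css_class = STYLE_TO_TAG.get(style, ("p", "unstyled"))
        lines.append(f'<{tag} class="{css_class}">{escape(text)}</{tag}>')
    return "\n".join(lines)


if __name__ == "__main__":
    manuscript = [
        ("Chapter Head", "Chapter One"),
        ("Epigraph", "Call me Ishmael."),
        ("Body Text", "It was a bright, cold day."),
    ]
    print(paragraphs_to_html(manuscript))
```

A manuscript formatted by hand (bigger font here, extra tabs there) gives the lookup nothing to hold on to, which is where the extra hours come from.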

I think you can understand why I don’t love having to start with a print copy of the book — or a raw manuscript. It takes a lot of time and effort to turn paper and ink into bits that a computer can use. But sometimes, that’s what you’ve got.

So what is my least favorite file format to work from? PDFs from the print designer. Unless the PDFs are properly tagged, it’s difficult to convert them into HTML that’s clean and looks nice, and even if they are tagged you’ll often end up with the kinds of print-only weirdness that I mentioned in my earlier rant. I spent about thirty hours last month cleaning up extra spaces and line breaks from a PDF that had created a beautiful print edition — but made a mess out of creating the ebook. It was a job that I’d expected to take about two hours, because I’d been told I’d be getting the InDesign file. Eesh.

Actually, my LEAST favorite files to work from are PDFs scanned from the print book, which I’ve had to deal with on a few occasions when I was converting an out-of-print paper-and-ink book into digital format. The fun and games in tracking down OCR artifacts — “clay” instead of “day” or “fim” instead of “firm,” for example — and in teasing out the difference between hard line breaks and paragraph breaks is only the beginning.
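For the line-break problem, here is the kind of heuristic you end up reaching for, sketched in Python. It is a guess and nothing more: it assumes a paragraph's last line both ends in sentence-ending punctuation and is noticeably shorter than the widest line on the page. The function name and the 0.8 threshold are my own assumptions, it ignores words hyphenated across lines, and it will get some paragraphs wrong, so every result still needs a human read.

```python
import re

# A line "ends a sentence" if it finishes with . ! or ?,
# optionally followed by closing quotes or a parenthesis.
TERMINAL = re.compile(r'[.!?]["”’)]*$')


def rejoin_paragraphs(lines, short_ratio=0.8):
    """Guess paragraph breaks in hard-wrapped text extracted from a PDF."""
    lines = [line.strip() for line in lines if line.strip()]
    if not lines:
        return []
    full_width = max(len(line) for line in lines)
    paragraphs, current = [], []
    for line in lines:
        current.append(line)
        ends_sentence = bool(TERMINAL.search(line))
        is_short = len(line) < short_ratio * full_width
        # Only a short line that also ends a sentence closes the paragraph.
        if ends_sentence and is_short:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        # Whatever is left over becomes a final paragraph.
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    page = [
        "It was a dark and stormy night, and the",
        "rain fell in torrents.",
        "The next morning was clear and bright as",
        "anyone could wish.",
    ]
    for paragraph in rejoin_paragraphs(page):
        print(paragraph)
```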

Does that make any sense? Any questions? Any thoughts?


0 thoughts on “6 great file formats to send to ebook designers (and 2 awful ones)”

  1. “I spent about thirty hours last month cleaning up extra spaces and line breaks from a PDF that had created a beautiful print edition — but made a mess out of creating the ebook.”

    To remove extra spaces – search/replace ” ” (2 spaces) with ” ” (1 space). Repeat until done.

    To remove extra line breaks – copy into WordPerfect, because WordPerfect can search/replace line breaks.

    1. Thanks so much, Rod!

      The problem, unfortunately, wasn’t double spaces — I search and replace those and a bunch of other habitual errors when I first get a manuscript. The problem was that the designer had added single spaces in the middle of words. Not searchable.

      And as I said, if I’m exporting from an InDesign file and don’t remember to search-and-replace the manual line breaks, they’ll show up in HTML as <br> tags, which I can in fact replace. Unfortunately, I was working from a PDF here, and there was no difference between a line break and a full paragraph break. Ergo my frustration.

      1. Yes, that’s a much harder problem. My next thought is, of course, spell check. After that, to replace legitimate spaces with something else — if possible, to automatically go thru the dictionary and replace all known words — say, replace ” apple” with “zxyapple” — then the illegitimate spaces would be a much higher proportion of total spaces, maybe 50% (because ” cen ter” would be unaffected).

        1. Yes, that was a solution that I definitely looked at. The problem is that there are a lot of words that won’t show up on a spell check. The example that I used in the article (because it’s one I ran into) was “because” becoming “be” and “cause.” Most of the errors would have been caught by spellcheck, but since I was going to have to go through every @#$@@$ line anyway… :-p
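The ideas in the thread above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the process actually used on that book: the function names are invented, and the word list is a tiny stand-in for a real dictionary. The first helper collapses runs of spaces in one pass. The second looks at each pair of neighboring tokens and flags pairs that form a known word when joined, keeping a separate pile for cases like "be cause" where both halves are also real words and a person has to decide.

```python
import re


def collapse_spaces(text):
    """Replace every run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", text)


def find_split_words(text, dictionary):
    """Return (certain, ambiguous) lists of token pairs that may be one split word.

    'certain'   : joined form is a known word and at least one half is not.
    'ambiguous' : joined form is a known word, but so are both halves.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    certain, ambiguous = [], []
    for left, right in zip(tokens, tokens[1:]):
        if (left + right).lower() not in dictionary:
            continue
        both_are_words = (left.lower() in dictionary
                          and right.lower() in dictionary)
        if both_are_words:
            ambiguous.append((left, right))
        else:
            certain.append((left, right))
    return certain, ambiguous


if __name__ == "__main__":
    # Tiny stand-in word list; a real run would load a full dictionary file.
    words = {"i", "left", "be", "cause", "because", "of", "the", "center", "town"}
    sample = collapse_spaces("I left  be cause of the cen ter of town")
    certain, ambiguous = find_split_words(sample, words)
    print("probably split:", certain)
    print("needs a human:", ambiguous)
```

With that sample, "cen ter" lands in the first list and "be cause" in the second. A script like this shortens the list of places to look, but it does not replace reading every line.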
