Jump in the Convertible: Ebook Conversion Tools

This is the fifth installment in my series of posts about ebook creation. Like the others, it was originally posted on Joel Friedlander’s wonderful resource for indie publishers, TheBookDesigner.com.

Over the last few months I’ve discussed preparing your manuscript and your images for conversion into ebook form. This month, I’m going to look more closely at a subject that I’ve touched on: choosing an ebook conversion tool. Just to review, I suggested that there were four basic ways to convert your manuscript into ebook format:

  1. From scratch
  2. Saving from a word-processing or page-layout application into ePub format
  3. Using a conversion app or online service
  4. Hiring a designer

We’re going to ignore option #1: if you’re the kind of person who wants to dig that deep into the guts of ebook creation, I don’t think that you’re going to be patient with this process. I’m not going to dwell on option #4 (or the second half of option #3), since the emphasis of this series is how to create your own ebooks. Using a conversion service or ebook designer is always an option, and I’ll discuss later how to choose one. But for now, we’re going to look at choosing the software that you can use to create a book yourself. Here’s the list of software that I will look through with you:[1]

  • Apache’s OpenOffice with the Writer2ePub plug-in, which allows you to save files as ebooks (open-source office suite)
  • Scrivener (commercial writing app)
  • Apple’s Pages for OS X or iOS (consumer word-processing/page-layout app)
  • Adobe’s InDesign (professional page-layout app)
  • QuarkXPress (professional page-layout app)
  • Jutoh (ebook design and creation app)
  • iBooks Author (fixed-format ebook design app)
  • Calibre (ebook library app with file-editing utility)
  • Sigil (dedicated ePub file editor)

I’ll also talk briefly about the possibility of using one of these tools for conversion to the ePub ebook file, and then a text editor or web-design app to do any post-conversion editing.

For each app, I’m going to start with a formatted Word document from a book that I’m working on (my own novel Risuko, actually). Here’s the chapter head that we’re going to be looking at a lot:

The Music Lesson screenshot 1

I chose this file because it had just enough complexity to test these conversion tools, but not so much that it will break all of them. The fonts are part of what makes this file challenging; the drop cap at the beginning of the first paragraph is another part.

The manuscript has been prepared in the manner that I suggested in an earlier post: all of the styles are applied globally and have names in Word’s Styles palette. The style for the chapter header above is called “Chapter-Head”; the first paragraph with the drop cap is called “Body-First.” Italics and boldface are applied simply with Word’s keyboard formatting (command-I and command-B on my Mac). The “Body-First” style has a built-in drop-cap rule, so I didn’t have to style those letters separately; I simply applied the “Body-First” style to the first body paragraph in each chapter.

Writer’s Tools

The first three apps all fall into the category of word-processing/writing tools.

OpenOffice logo

OpenOffice (free)

is an open-source office suite. OpenOffice opens and edits just about every kind of everyday file, and it can save into a number of file formats, including Word docs (.doc and .docx). Versions will run on just about every kind of computer you can think of.

Writer2ePub is a plugin that allows you to save your word-processing file into ePub format. You add the plugin in OpenOffice by using the Extension Manager from the Tools menu. Once the plugin is installed, you’ll get a floating palette that allows you access to the ePub tools:

OpenOffice screenshot1

This file was imported from a PDF rather than a Word doc. Nice!

That green E is the ePub logo, by the way. You have to save your file in OpenOffice’s native ODT format. Once you’ve done that, you can click on those buttons. The right-hand button opens the preference panel. The center button allows you to edit the ebook metadata, adding the title, author, cover image, etc. to the file.

The most important button for our purposes is the one on the left. You can export quickly, but unless you add “w2e_” to the beginning of every one of your text style names, they won’t export. This will leave all of your text unformatted.

If you do add the tag at the beginning of the style names, however, the ePub file that Writer2ePub exports will do an okay job of converting the styles you specified:

OpenOffice screenshot2

And the HTML won’t have a lot of extra coding added. (We’ll see soon enough why extra coding is a problem.)

Scrivener logo

Scrivener ($45)

is a Swiss Army Knife of writer’s tools. It gives an author a single place to gather research, images, notes, outtakes — all in one file. It imports a wide variety of files — including the standard DOC, DOCX, and RTF word processing files. Scrivener doesn’t handle fancy text formatting, like the drop capital at the beginning of the chapter:

Scrivener screenshot1

Still, Scrivener does a wonderful job of getting everything where you want it, and it exports to the ePub format.

Scrivener’s conversion feature is full-featured. In Scrivener, the process is called “compiling,” since you can export multiple files (chapters, say) into a single ebook. There’s a whole battery of compilation options, including metadata:

Scrivener screenshot2

There’s a way of adding a cover, a way of converting a table to an image so that it doesn’t break your ebook (another topic for another day!), and a lot more.

There’s also a wildly complicated formatting panel that actually gives you relatively little control over how text is going to look when it exports. There are a number of check boxes and options, but you don’t apply them with the text in front of you; you apply them in the Compile dialog box:

Scrivener screenshot3

It will export italics and boldface, and will handle chapter heads and such, but it won’t allow you to create, for example, a special style for a quoted poem or a sidebar or whatever.

Not wanting to recreate all of my styling, I selected all of my text and applied Preserve Formatting to my entire manuscript. This is what it produced:

Scrivener screenshot4

Not very good.

And it applied its own local format names to each paragraph (i.e., “p16” instead of “Body-Text”), but at least it applied the same name to all paragraphs in the same style — “p16” equals “Body-Text” throughout.

If I hadn’t already committed to formatting the text in Word, Scrivener wouldn’t have given me many tools to apply consistent, attractive styling to the exported ePub file. I can add the formatting later using CSS styling rules — but if I hadn’t already given the paragraphs styles in Word (before importing into Scrivener), I’d have had to start from scratch.
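Because a generated name like “p16” maps to the same original style throughout, restoring the meaningful names can be scripted. What follows is only a minimal Python sketch, not anything Scrivener provides: the function names are made up for illustration, and the mapping holds just the one pair mentioned above. It assumes the class names appear in ordinary class="..." attributes and as simple .name selectors in the stylesheet.

```python
import re

# Hypothetical mapping from generated class names back to the Word style names.
# Only the "p16" -> "Body-Text" pair comes from the example in the text.
CLASS_MAP = {"p16": "Body-Text"}


def rename_in_html(html: str, class_map: dict[str, str]) -> str:
    """Rename classes inside class="..." attributes."""
    def fix_attr(match: re.Match) -> str:
        names = [class_map.get(name, name) for name in match.group(1).split()]
        return 'class="' + " ".join(names) + '"'

    return re.sub(r'class="([^"]*)"', fix_attr, html)


def rename_in_css(css: str, class_map: dict[str, str]) -> str:
    """Rename .old selectors to .new, leaving longer names such as .p160 alone."""
    for old, new in class_map.items():
        css = re.sub(r"\." + re.escape(old) + r"(?![\w-])", "." + new, css)
    return css
```

Run both functions with the same mapping so that the stylesheet and the chapter files stay in step, and work on a copy of the ebook until you trust the result.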

Apple Pages logo x200

Apple Pages (Mac OS X: $19.99 | iOS: $9.99)

is Apple’s answer to Word (for both Macs and iOS devices), and it does most of the things Word does, plus some. It’s a serviceable piece of page layout software, though no one would confuse it with professional tools like InDesign or Quark. Of course, it’s only available on Apple devices. Still, it’s a nice piece of software.

Like the other apps, it can open a Word doc or RTF file natively. Alas, once again, I couldn’t get it to handle drop caps:

Apple pages screenshot1

The style names, however, came through perfectly.

The export function is simple: go to the File menu, select Export -> ePub, and you’ll get a straightforward dialog box that offers you the opportunity to add a few bits of metadata (title, author, category).[2]

The exported ePub file was divided up by chapters, which was nice, and retained much of its formatting:

Apple pages screenshot2

The style names were once again changed: paragraphs now have names like “s12,” while local character changes have styles named “c9” and such. That’s not a problem in itself.

The style names are not consistent, however, and this is a problem.

Rather than add <i> for italics, for example, Pages created <span> tags. The span tag that Pages created for italics seems to be <span class="c15"> most of the time.[3] If the paragraph is in a different style (e.g., “Body-Text” instead of “Body-First”), then the character style is different as well. What this means is that I can’t globally change the style of all of the italics or boldface characters, or of those drop-capped initial letters. For those initial letters, for example, different instances are given different style names (“c14,” “c23,” “c28,” etc.) even though these were all styled identically, with the same paragraph style in Word.

This means that any changes that I make to a style will apply only to that style on that page; they won’t apply throughout the document. Once again, this is not good.
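One possible cleanup, if you are comfortable with a little scripting, is to ask the stylesheet which generated classes are italic and then swap those spans for plain <i> tags. The Python below is a rough sketch under several assumptions: the function names are invented, the CSS uses simple rules like span.c15 { font-style: italic; }, and the spans contain no nested tags. A class that is italic and something else (bold italic, say) would lose its other styling, so check the results.

```python
import re


def italic_classes(css: str) -> set[str]:
    """Return class names whose CSS rule sets font-style: italic."""
    found = set()
    # Walk through simple "selector { declarations }" rules.
    for selector, body in re.findall(r"([^{}]+)\{([^}]*)\}", css):
        if re.search(r"font-style\s*:\s*italic", body):
            found.update(re.findall(r"\.([\w-]+)", selector))
    return found


def spans_to_i(html: str, classes: set[str]) -> str:
    """Replace <span class="X">text</span> with <i>text</i> when X is italic.

    Only handles spans with a single class and no nested tags.
    """
    def swap(match: re.Match) -> str:
        if match.group(1) in classes:
            return "<i>" + match.group(2) + "</i>"
        return match.group(0)

    return re.sub(r'<span class="([\w-]+)"\s*>([^<]*)</span>', swap, html)
```

Once every italic run carries the same tag, a single CSS rule for i restyles all of them at once.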

All of these writers’ tools can export an ePub file. But none of them will export one that you’d want to sell — not without further cleaning up.

Page Layout Tools

QuarkXPress and InDesign logos

The next section is a shorter one, dedicated to professional-grade print layout software, a field that’s dominated by Adobe’s InDesign (part of the Creative Cloud suite: $19.99/month), with Quark ($849) continuing to hold its own.

These apps are intended for designing books, magazines, and other paper-and-ink documents, and they’re very sophisticated. You can control the placement and style of every letter on every page of a book, and you can work with a team of designers to create beautiful works of the printer’s art.

Both Quark and InDesign have had ebook export functions for some time, but until a couple of years ago, they were pretty awful — certainly no better than what we just saw from the consumer-level writing apps.

Recently, the ebook authoring capabilities of Quark and InDesign have gotten to the point where they’re actually useful — but the ebooks that they produce still aren’t quite usable out of the box (especially for more complicated formatting).

Both apps offer a huge amount of control over how styles export. InDesign allows you to say what HTML tag and class you’d like slapped on a particular style on export, and whether you want a page break before it. It allows you to leave in fine adjustments that you’ve made for the print edition or ignore them (usually the right idea). It can export ePub2 or the shinier (though not universally adopted) ePub3 format, can export a fixed-format ebook, and allows you to add custom CSS — style instructions — on the export. It even allows you to add JavaScript files.

Here’s the ePub file created by InDesign CC 2015:

InDesign screenshot1

You’ll notice, of course, that the fonts aren’t great. This is in part because Adobe is in the business of licensing fonts, and is very careful about what it adds where. The fonts are embedded in the ebook file, but they’re encrypted so that no one can go into the file and steal them. That’s great, but if I try to upload this file to Amazon or Apple, it will be rejected.

Still, the drop caps are perfect, and the formatting looks, for the most part, nice. When I look at the code, there’s some of the same silliness with renaming, but nowhere near as much. Again, italics show up as <span> tags. This time, though, the naming is fairly consistent, so I don’t have to go searching all over for variations: CharOverride-10 = italic, so if I want to change the styling of all the italics for some reason, I can change it in one place.

So, the files that the professional software exports are great — but still need to be futzed with.

Ebook Design Software

This is our third category, and it’s the software that does the futzing: software that is designed specifically for creating ebooks.

Jutoh logo

Jutoh ($39)

Jutoh is intended to be a page-layout app with a focus on digital distribution: format the file once and export it to a wide variety of formats. Jutoh can import Word docs, OpenOffice ODT files, HTML, and even ePub files. (The styling for ePub and HTML doesn’t all import, however.)

It can export into ePub2 and ePub3 formats, as well as mobi (Kindle) files, ODT files that can be used to create a PDF, and even a text-to-speech engine that will create a computer read-aloud track.

Like Scrivener, Jutoh allows you to pull together all of the files that you need to create the book into a single package.

You have to create a project — you can then import your manuscript:

Jutoh screenshot1

Like Pages, the formatting mostly looks good, but once again, drop caps aren’t supported.

As with every other app so far, Jutoh allows you to edit and format the text, add and move images, and do a lot of the other prep work.

The exported file is clean:

Jutoh screenshot2

The fonts once again don’t export. However, all of the global styles are intact — and italics are given simple <i> tags. So adding a customized style is simply a matter of creating (or adding) a CSS stylesheet that makes the files look the way we want them to.

iBooks Author logo x200

iBooks Author

When I first started this series, Apple’s iBooks Author (free) wasn’t a viable ebook editing app. It could only create fixed-format ebooks, and could only export directly to Apple’s iBooks Store.

Since then, Apple announced that iBooks Author would both import and export reflowable ePub files, and so it became possible to consider this app as an ebook editing tool.

Like most Apple products, iBooks Author is beautiful. It takes the interface from Pages and adds some unparalleled tools for ebook authoring. There’s no easier app for adding widgets: small, self-contained chunks of code that add very non-book-like functions to an ebook, whether that’s an internet video, a quiz, or custom HTML or JavaScript. Adding enhancements like embedded video or audio is a breeze. It’s a great tool for creating beautifully designed fixed-layout books: children’s picture books, for example, or cookbooks, or textbooks that require a particular relationship between the text and the images and other media.[4]

On importing my Word file, iBooks Author did a nice job of maintaining the text style:

iBooks Author screenshot1

The downside? As with Pages, there are some text styles (here, our old friend drop caps) that don’t work. Also, the app expects each chapter to be loaded separately; otherwise, all of the chapters in the Word doc will be listed as sections in a single “chapter.”

The exported file does a pretty good job of reproducing what made it through import:

iBooks Author screenshot2

That could almost be the same screenshot, except for the line-height of the first paragraph — look at the space between the first two lines.

When I look under the hood of the reflowable ePub file that iBooks Author creates, there are plusses and minuses.

First of all, iBooks Author doesn’t export in the older, more widely adopted ePub2 format, but only into Apple’s version of the more feature-rich ePub3 format. The app adds a few non-standard files that (while perfectly valid) may keep the file from passing validation when you upload it to some retailers’ sites: while I was able to upload the un-edited ePub file to Apple, Kobo, and Amazon, Barnes and Noble and Smashwords both rejected the file.

Also, not surprisingly, the same style name changes that made Pages a problem pop up again in the file created by iBooks Author. So making global styling changes won’t be easy.

While I’m very excited about the opportunity to use iBooks Author as a single tool to create fixed-format ebooks for the three major retailers that accept them (Apple, Kobo, and Amazon), I’m less sanguine about using the app to create reflowable ebooks — unless I’m creating a separate version for use on B&N, Smashwords, and the rest.

Calibre logo

Calibre (free)

Calibre was created not for ebook designers but for ebook readers. Think of it as iTunes for ebooks — the open-source ebook library app can download ebooks directly from many retailers, can sync with a number of ereader devices, and can organize your ebook library in a number of ways that make it much easier to find the book you’re looking for. And like iTunes, Calibre has the ability to convert between one ebook format and another: between ePub and mobi (old Kindle) or AZW3 (newer Kindle), say, or between LIT format files and ePub, so that you can read that old Microsoft Reader book on the iBooks app of your iPad or Mac. It can also convert between Word DOCX files[5] and ePub or any of a number of other digital text formats.

That conversion ability made Calibre a very handy tool for ebook designers as well, and so an ePub-editing utility was added a couple of years back. Together, they make Calibre the one app that allows you to go all of the way from importing a Word doc to converting the file to ePub to cleaning up the file that results.

Opening the conversion dialog once again gives us the chance to add metadata. It also gives a bewildering range of settings with which to fine-tune the conversion. (If you’re interested in mastering those settings, visit Calibre’s excellent help pages.)
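Calibre also ships with a command-line converter, ebook-convert, which takes an input file and an output file and picks the output format from the output file’s extension. If you convert the same manuscript over and over, a small script can save some clicking. This is only a sketch: the function names are my own, the file names and metadata in the usage note are placeholders, and it assumes Calibre’s command-line tools are installed and on your PATH.

```python
import subprocess


def build_convert_command(source: str, target: str, title: str, author: str) -> list[str]:
    """Build an ebook-convert command line with basic metadata options."""
    return ["ebook-convert", source, target, "--title", title, "--authors", author]


def convert(source: str, target: str, title: str, author: str) -> None:
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    subprocess.run(build_convert_command(source, target, title, author), check=True)
```

A call might look like convert("manuscript.docx", "manuscript.epub", "My Title", "A. N. Author"). Everything else is left at the defaults, the same base settings the dialog starts with.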

Using the base settings, which are often pretty good, I got great results:

Calibre screenshot1

Notice that the initial drop cap imported properly, and that (most of) the fonts are intact — and in fact were embedded within the ebook file. The numbers dropped their font, but that’s not a big deal.

Unfortunately, the underlying styling got rewritten once again. Instead of every paragraph containing a drop cap having the same style name (“Body-First” in the Word doc), each <p> (paragraph) tag is given a different class (style) name (“block_33,” “block_39,” etc.). On the other hand, the styling for the drop cap itself seems to have been given a consistent class name throughout, so if I wanted to make the drop caps bigger or turn them red, I could do that by changing a single CSS rule.

Most of the “Body-Text” paragraphs are given the same class name, “block_19,” but I’d have to go through the whole ebook to make sure that was true.[6] Standard formatting like italics and boldface are given standard HTML tags (<i> and <b>), which is nice.
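Another way to check how consistently the paragraph classes were assigned is to count them. The following Python sketch is an illustration rather than a Calibre feature: the function name is invented, and it assumes the chapter files inside the ePub end in .xhtml, .html, or .htm and are UTF-8 encoded.

```python
import re
import zipfile
from collections import Counter


def paragraph_class_counts(epub_path: str) -> Counter:
    """Count the class attribute of every <p> tag in the ePub's (X)HTML files."""
    counts = Counter()
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                text = book.read(name).decode("utf-8", errors="replace")
                # Capture the class value of each paragraph tag.
                counts.update(re.findall(r'<p\b[^>]*\bclass="([^"]*)"', text))
    return counts
```

Calling paragraph_class_counts("book.epub").most_common() lists the classes from most to least used; one dominant name with a long tail of one-off names suggests that the tail needs a closer look.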

In terms of editing and validating ePub files, Calibre’s editing utility has a pretty good set of features; it’s also extensible through a number of plugins. If you know the Python programming language, you can even write a plugin yourself! The main editing window shows the raw HTML, but there is a preview window open to the right that allows you to see how the changes you are making to the code affect the way that the ebook displays. Another window shows what CSS rules are in effect in a particular chunk of text, which helps figure out why those letters are displaying orange, for example. (Oops! Forgot to change the rule back after I searched for all of those class names!) You can also open and edit the CSS stylesheets and the specialized XML documents that are embedded in every ePub file.[7]
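One of those XML documents is the OPF package file, which carries a manifest listing everything in the ebook. If you’re curious what it contains, the Python sketch below reads it without any editing app. The function name is hypothetical; the code relies on the standard ePub layout, in which META-INF/container.xml points to the OPF file.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the container file and the OPF package file.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def list_manifest(epub_path: str) -> list[tuple[str, str]]:
    """Return (href, media-type) pairs from the ebook's OPF manifest."""
    with zipfile.ZipFile(epub_path) as book:
        # container.xml names the location of the OPF file inside the archive.
        container = ET.fromstring(book.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf = ET.fromstring(book.read(rootfile.get("full-path")))
    items = opf.findall("opf:manifest/opf:item", OPF_NS)
    return [(item.get("href"), item.get("media-type")) for item in items]
```

Each pair is a file path (relative to the OPF file) and its type, which is a quick way to see whether a font or image actually made it into the ebook.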

The community that supports Calibre is very active, and so new features and fixes are being added every few weeks. Support is terrific — when I’ve been stumped by what appeared to me to be a bug in the conversion software, I got multiple responses within hours; when the problem turned out in fact to be a bug, a fix was posted within a week.

Sigil logo

Sigil (free)

Sigil is the tool that I use most often for ebook editing, more than Calibre or an HTML editor like Dreamweaver. This is for a couple of reasons:

  1. I’ve been using it for a long time (since 2011 — a lifetime in ebook development terms.)
  2. It can open ePub files directly, rather than having to import them into a library (as Calibre does) or unZIP them (as editing in an HTML editor requires) and then re-ZIP them when finished.
  3. I can view the formatted text in the preview window, as in Calibre, or I can switch the main window to “Book View” and edit the file in WYSIWYG (“what you see is what you get”) mode — I don’t have to look at the HTML unless I want to.
  4. Its use of Regular Expressions (a.k.a. RegEx, GREP, or wildcard searches) is more powerful and more familiar to me than Calibre’s, which allows me to make mass changes quickly.
  5. Have I mentioned that I’ve used Sigil for a long time?

Like Calibre, Sigil is open source; unlike Calibre, which has a huge community of consumers who use it just for its library functions, Sigil is aimed at designers only, and so it has occasionally gone through periods when development has essentially stopped.

However, Sigil development has been steady for the past couple of years, including the addition of Calibre-like plugins that add all sorts of nifty functions, from in-app validation to importing Kindle files to converting the file into ePub3 format.[8]

The main disadvantage to Sigil as an ebook authoring tool is that it will only import valid HTML files.[9] It’s the first piece of software we’ve looked at that won’t directly open or import the DOCX file we’ve been using. You can copy and paste from a Word or other document, and much of the formatting will (probably) translate over. However, for a complex file (like the one we’re using to test against), you’re going to have to convert the file into a format that Sigil is happy with.

Unfortunately, Word’s Save to Web command won’t create an HTML file that Sigil will import — Microsoft’s HTML conversion is notoriously idiosyncratic and results in incredibly bloated code that’s optimized for Internet Explorer — a browser even Microsoft no longer supports.

There are, however, a number of ways to get the manuscript into the format we can work with:

  • Use one of the apps above to convert the file into ePub format. (I frequently use InDesign, since I’m also working with a print document, but I’ve also used each of the others.)
  • Use a utility to convert the file into valid XHTML.[10]
  • Convert the document, and then copy and paste the code, rather than using the import function.

Any of those routes has disadvantages. However, for the purposes of this exercise, I opened the file that had been created by Jutoh, since it was the one that was the least problematic. Remember, this is what it looked like:

Sigil screenshot1

I played with the CSS for about ten minutes and this is what I got (not just in this chapter, but throughout the ebook):

Sigil screenshot2

Not bad! I’d probably play with the color, and the spacing between the chapter head and the body paragraphs is too large, but certainly a lot closer to what I was looking for.

Sigil, then, is a great tool for getting under the hood of your ebook and editing it intact. It’s got some quirks, it’s definitely not aimed at consumers (Autosave? We laugh at your autosave!), and its most powerful features take a while to learn how to use, but it’s the only app designed just for the purpose of editing ePub files.[11]

Now, as I mentioned in the first post in this series, an ePub file is simply a (very carefully constructed) ZIP archive with a different extension (the three or four letters at the end of the file name). Change the extension to ZIP, double-click the archive, and you’ll get a folder full of all of the things in your ebook: XHTML files, image and other media files, stylesheets, and all of those exotic XML files that I talked about above.
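If you’d rather not rename the file by hand, a few lines of Python can look inside the archive directly, since Python’s zipfile module doesn’t care what the extension is. This is a minimal sketch with made-up function names; the file and folder names in the usage note are placeholders.

```python
import zipfile


def list_epub_contents(epub_path: str) -> list[str]:
    """Return the names of the files stored inside an ePub archive."""
    with zipfile.ZipFile(epub_path) as book:
        return book.namelist()


def unpack_epub(epub_path: str, target_dir: str) -> None:
    """Extract everything in the ePub into target_dir for editing."""
    with zipfile.ZipFile(epub_path) as book:
        book.extractall(target_dir)
```

For instance, unpack_epub("book.epub", "book_unpacked") produces the same folder of XHTML, image, stylesheet, and XML files that double-clicking the renamed archive would.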

The fun part is that you can use Dreamweaver or another HTML or text editor to edit those XHTML files. It will give you basically the same power to edit every part of the ebook. If you’re using Dreamweaver or another app with HTML previewing, you can see the edits’ effects as you make them, just like in Sigil. If you’re using a bare-bones text editor, you can load the ebook file into a web browser and it will display just fine, since most ereaders are based on web browsers. (Apple’s iBooks, for example, is a specialized version of Mobile Safari.) Finish the editing and ZIP the directory back up, change the extension back to ePub, and you should be all set.[12]
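The “very carefully constructed” part matters when zipping back up: the ePub container format expects the file named mimetype to be the first entry in the archive and to be stored without compression, which general-purpose ZIP tools don’t guarantee. Here is one way the repacking step might be scripted in Python. It’s a sketch with an invented function name; it assumes the unpacked folder has a mimetype file at its top level and that the output file is written somewhere outside that folder.

```python
import os
import zipfile


def pack_epub(source_dir: str, epub_path: str) -> None:
    """Zip source_dir into an ePub, writing mimetype first and uncompressed."""
    with zipfile.ZipFile(epub_path, "w") as book:
        # The mimetype entry goes in first, stored rather than compressed.
        book.write(os.path.join(source_dir, "mimetype"), "mimetype",
                   compress_type=zipfile.ZIP_STORED)
        for folder, _dirs, files in os.walk(source_dir):
            for filename in sorted(files):
                full_path = os.path.join(folder, filename)
                # Archive paths are relative to source_dir and use forward slashes.
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                book.write(full_path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

It’s still worth running the result through a validator before uploading it anywhere, since the script only handles the packaging and not the contents.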

 


[1]There are more tools out there — but these are the most commonly used and recommended ones. If you have a favorite tool and I haven’t mentioned it, please leave a comment!

[2]You can export an ePub file from a Pages doc on your iPad or iPhone as well, and then open it straight in iBooks. Never tried it on an Apple Watch, but I suppose that might work too.

[3]If the sight of HTML makes you break out in hives…. Well, I’ll be giving a quick lesson in HTML tags and CSS styles over the next few months.

[4]Amazon’s Kid’s Book Creator and Kindle Comic Creator have some of the same features, but where those will only allow you to create books for the Kindle, iBooks Author now has the ability to create ebooks that will work on Apple products, but also Kobos and — in theory — the newer Kindles at which the Amazon software is aimed.

[5]Though not DOC files; DOCX are based on XML, which is the file structure underlying HTML and many other modern file formats, while DOC files use a proprietary format.

[6]Easy way to do this? Change the color of the “Body-Text”/”block_19” style to orange or something equally garish, and look for body paragraphs that didn’t change color.

[7]For example, the OPF “manifest” file that tells ereaders what is in the ebook and where to find it within the ebook’s internal structure, and the NCX file in ePub2 and some ePub3 files that gives ereaders the navigation/table of contents info. I’ll talk about those (but not too much) in coming months.

[8] Neither Sigil nor Calibre is yet fully set up for ePub3, though they are both moving in that direction, and you can edit ePub3 files, even if not all of the features work properly.

[9]Strictly speaking, valid XHTML files.

[10]I use Apple’s TextEdit. Here are directions from Jane Friedman. The important thing is that the exported file be valid XHTML 1.1 — if your utility knows how to save a file in that format, you’re all set.

[11]In case you’re wondering: there isn’t really a tool for directly editing mobi files. That’s one of many reasons that I start with ePub and convert to Kindle formats later.

[12]If you’re using a Mac, you’ll need to use a utility like ePub ZIP/UnZIP for Mac.

Photo: bigstockphoto.com.
