Indexes in ebooks
This is the first installment in a 3-part series on Indexes in ebooks, written by Stephen Ingle, president of WordCo Indexing Services. We’ll publish the final 2 pieces over the next few weeks. Steve will also be a guest on #eprdctn hour on September 9. Please comment below, send in questions, and participate on September 9! Now to Steve:
I love digital non-fiction books, especially history and politics. I love downloading them on the Kindle app on my iPad or my iPhone. But I have a gripe: why is the index often missing in the digital version of the book? I feel like I’ve been shortchanged. I like to see what topics come up in the book, how often they come up, and where the discussion is (i.e., at the beginning, middle, or end of the book). The index (even just a non-hyperlinked image of the print index) provides this. The index also breaks down major topics into subtopics so I can get an overview of how the author treats that topic.
I guess the usual (dare I say ignorant?) response from the publisher is something like this: “The page numbers don’t show up in the digital version, so the index is unusable,” or “Readers can just use Search.” Search works great for simple things (assuming you remember how to spell the name you’re looking for or the author’s particular terminology). But why not include the index, at the very least to help the reader know what to search for?
What exactly is an index anyway? A list of every name or term that comes up in a book? Not really. That would be more of a concordance. Essentially, an index provides an organized overview of the book’s contents. It is not just a search tool to “look something up.” It’s a meta-presentation of what’s in the book. Think about what that means: with a well-constructed and complete index, and enough time, you could pretty much reconstruct the gist of the book, in page order. Just add water!
An index is a beautiful thing. It adds value to the book. I would argue that indexes are not only still relevant, but that there are all kinds of ways they could take advantage of the digital milieu to improve the reader’s experience. At a minimum, index headings would be hyperlinked to the text. But why not use colors and typefaces to indicate relative importance or different categories of headings? And, since we’re dreaming, why not have collapsible headings that can be expanded with a touch? Why not enable some kind of user input, perhaps “searching” the index for certain types of headings (all people, perhaps, or all companies in a business-related book)?
Clearly, there’s lots of “potential.” Then why do the ebooks we see with indexes either have a “dead” (i.e., non-hyperlinked) index, or, at best, a hyperlinked version of the print index?
The answer is simple. It’s not just a question of technical feasibility (there certainly are issues with devices and platforms that render ebook files differently). There are two real reasons we don’t see a push for more and better ebook indexes: production costs and time. And since time is money, it’s really all about costs. So that raises the next question: how can workflows be adjusted to reduce the cost of creating decent (hyperlinked) ebook indexes?
First of all, let’s start with the basics. A dead index is better than no index. If the print book has an index, at the very least the ebook should include it, even if it’s not hyperlinked. That doesn’t cost anything. And it doesn’t leave the reader (me) feeling cheated.
But can we have even a basic hyperlinked index that’s not going to cost more, or at least not more than a nominal amount? An index, where if you touch the page locator, you are magically transported to the relevant location in the text? The answer is: “Yes, we’re getting there!” If the ebook file contains page markers (page list in EPUB3; read the IDPF’s guidelines here), it’s really a simple matter to hyperlink the page locators in the index to the actual page locations in the book. More on this next time.
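To make the idea concrete, here is a minimal sketch in Python of what linking page locators to page markers could look like. It is an illustration under stated assumptions, not a description of any particular production tool: the `PAGE_TARGETS` mapping, its file names and anchors, and the `link_locators` function are all hypothetical, and the sketch assumes each index entry is a single line ending in a comma-separated list of page numbers or page ranges. In a real EPUB 3 file, the mapping would come from the page list in the navigation document.

```python
import re
from typing import Mapping

# Hypothetical mapping from print page number to a location in the ebook.
# In practice this would be read from the EPUB 3 page list; the file names
# and anchor IDs below are invented for illustration.
PAGE_TARGETS = {
    "12": "chapter01.xhtml#page12",
    "13": "chapter01.xhtml#page13",
    "14": "chapter01.xhtml#page14",
    "87": "chapter04.xhtml#page87",
}

# An entry is assumed to be "heading, locator, locator, ...", where a
# locator is a page number or a range such as 12-14 (hyphen or en dash).
ENTRY = re.compile(r"^(.*?)((?:,\s*\d+(?:[\u2013-]\d+)?)+)$")
LOCATOR = re.compile(r"(\d+)(?:[\u2013-]\d+)?")


def link_locators(entry: str, targets: Mapping[str, str]) -> str:
    """Wrap each page locator in an index entry with a link to its page marker.

    A range links to its first page. Locators with no known page marker
    are left as plain text, so the entry degrades to a "dead" locator
    rather than a broken link.
    """
    match = ENTRY.match(entry)
    if match is None:
        return entry  # no trailing locators found; leave the entry untouched
    heading, tail = match.groups()

    def to_link(loc: re.Match) -> str:
        href = targets.get(loc.group(1))
        if href is None:
            return loc.group(0)
        return f'<a href="{href}">{loc.group(0)}</a>'

    return heading + LOCATOR.sub(to_link, tail)


if __name__ == "__main__":
    print(link_locators("Adams, John, 12\u201314, 87", PAGE_TARGETS))
```

Because only the trailing run of locators is rewritten, numbers inside the heading itself (a year in parentheses, for example) are left alone. A cross-reference such as "See also" would need separate handling that this sketch does not attempt.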
Coming in Part II: Changing the Ebook Workflow
Stephen Ingle is the president and CEO of WordCo Indexing Services (www.wordco.com), located in Norwich, Connecticut. He created his first index (8 lines) at the age of 10. After graduating from Yale University with a degree in German literature, he went on to earn master’s degrees in German and Russian Area Studies. In 1988, Steve began freelance indexing part time while also working at the Modern Language Association (MLA) in New York. He began indexing full time in 1991. Steve has served on the national board of the American Society for Indexing. His company now employs a team of indexers and completes about 500 projects annually for a diverse group of clients. His interests include indexing as a business and indexes for digital publications.
It’s the labor and expense of creating that index that’s the issue. This article perhaps should have mentioned that, when the print version has an index, InDesign can create an index for the EPUB version using the page numbers of that print version as markers for the links. That makes the expense of an ebook index trivial.
Michael, you’re right about that, but your point assumes the indexer is working within the InDesign document. That’s not always possible. A couple of issues come to mind: not all indexers are familiar with InDesign; and if the indexer does do the work in the InDesign mechanical, that step needs to be added to the production schedule (and the designer / typesetter has to be comfortable letting an indexer manipulate the document).
And Michael, I neglected to mention that Steve addresses making InDesign indexes in his next post. Stay tuned!
Michael, thanks for your comment. Labor and expense of creating the index are certainly major issues. Digital production workflows can actually cut the cost of index creation: for example, by automatically including indexable items such as bolded glossary terms in the text. I haven’t seen how InDesign, on its own, can create a hyperlinked index from an existing print index that is imported. Unfortunately, creating a useful index within InDesign is nigh impossible because of the clumsiness of InDesign’s indexing feature. It is generally time- (and hence cost-) prohibitive. I have found it more efficient to do the indexing outside of InDesign.
Very well-written article! As a back-of-the-book indexer, working in InDesign would be a deal-breaker for me. My understanding is that InDesign’s indexing feature is no better than Word’s. When you’re used to dedicated indexing software, such as Cindex, using InDesign’s indexing tool would feel like slumming. On top of that, the program is freakin’ expensive! As a business expense, it’s not the sort of thing you want to invest in unless you’re going to index a lot of books that way.
Michael, are you suggesting I could create an index in Cindex in such a way that the locators become markers (using tags?) and hand off the index to someone who would then link those markers to the text? If so, I’m in!
I’m an editor, not an indexer, but I’m very interested in your piece.
Re your second paragraph, something I’ve always remembered: Walking into a bookshop many years ago with a new acquaintance who immediately flipped to the end of the first book, then looked up and explained sheepishly, “It’s because I’m a historian. We always look at the index first.”
Before Creative Cloud, InDesign did indeed create live indexes for PDFs, but not for EPUBs. If the publisher is using one of the earlier versions, the index is cut from the book file during export. There is no way to force it to export, either.
A couple of comments here:
First> InDesign CC does produce an epub-hyperlinked index if you embed the index in InDesign. One commenter noted that PDF indexes are hyperlinked, and now so are EPUB indexes.
Second> The Kerntiff KPS tools allow indexers to write an index in their own database-like software, and then import it into InDesign and place the markers. It is so much faster than trying to write an index in InDesign itself. Placing the tags requires knowledge of InDesign, but KPS shows the entries page by page, so you just need to work on one page at a time. I’ve used it several times, and would never go back to indexing in InDesign again. It also means the indexer can develop the index without the live files, until the end when the index is edited and ready to insert. It drastically reduces the amount of time the indexer needs the files in her sole hands.
Third> Steve’s idea of unique IDs and working outside the files is the key to indexing for many reasons. Indexers need to have the time and space to think about access, usability, terminology, and best practices of their craft, which is hard when everyone NEEDS the files right now. We don’t work sequentially; we work all over the place, from A to Z, checking, comparing, tweaking terms. We first used unique ID indexing at Visio and Microsoft in software documentation, so that translators could work on an index, transforming it into Spanish or French as one file, not as scattered terms in code. Tools that use unique IDs have had several iterations. There is a set of scripts posted on http://www.wrightinformation.com that work with pre-CC InDesign and common indexing software. They aren’t perfect, but they provide an approach to getting the work done more efficiently.
And fourth> as book technology goes further, having a database that is linked to unique IDs means lots of ways to leverage that “aboutness” information. We imagine an index term lighting up a whole passage. We imagine an index term telling you the most important places in the text, not just every place. We imagine using that metadata to merge with other books’ metadata to lead you out into other publications. It’s data. This is one format that can free it to tell readers more than just a page.
Great set of articles, Steve!
Oh, one thing I forgot: the time delay in InDesign’s user interface. It is a problem, and it exists. It’s not our imaginations. It can take 90 seconds to five minutes for an entry to update in some projects. We have tried so many things to figure out why, and have not been able to identify the cause. This is one thing the KPS add-ons really help with, as that delay disappears. My thinking is that it has to do with internal UNDO functions. Editing indexes in InDesign is a lot of editing. So KPS allows you to edit outside, before the index goes in, and avoids that horrendous, ridiculous delay.