Indexes in ebooks: Part 2

Steve Ingle of WordCo continues his deep dive into indexing for ebooks. He offers insight into how he works, and why InDesign isn’t suited to his methodology.


Publishing non-fiction books in digital form often presents one unique challenge that does not exist with fiction: providing a usable index. In my previous post (“Indexes in ebooks”) I argued that indexes should be included in digital non-fiction books, and that they don’t need to be cost-prohibitive.

The index is an organized, highly structured and detailed summary of the book’s contents; it’s there to help the user assimilate information (aka “learning”). Of course the reader can use the index as a search tool to “look up” topics, but it’s much more than that: users can browse the index to get an overview of the book’s contents.

While a non-hyperlinked (i.e., “dead”) index reproduced from the print version of the book is better than no index, readers of digital non-fiction books have a right to expect, at the very least, a hyperlinked index, where headings or page locators are linked to specific locations within the digital book.

Ah, the hyperlinked index . . . There’s the rub.

It all sounds great, but how do we get there from here? As an indexer, I could manually insert tags into the EPUB file, but this is not a realistic option. It would essentially double or triple my work time. It would also mean holding up the digital release until tagging was completed.

Idea: there must be a way to do the tagging BEFORE the book is exported to EPUB.

I will address a viable solution to this quandary in my next post, but first, let’s look at one potential solution — or impediment — to embedded indexing. I’m talking about Adobe InDesign.

InDesign is a wonderful application that has enabled anyone with a workstation and a little knowledge to create professional-looking books. It has helped revolutionize the publishing industry. It even has an indexing feature that allows the publisher to turn to the indexer and say:

“Here are the InDesign files. Now just please create the index. Maybe we can even pay you slightly more for your trouble.”

This actually happened to me. I took the workshop on indexing with InDesign, and made the substantial outlay to purchase the program in anticipation of lots more InDesign indexing. However, when I started doing the work, I realized there was a problem: indexing with any kind of tagging takes much, MUCH longer than indexing a print book. And the end product is not nearly as good.

To understand the scope of the problem, and the limitations of indexing in InDesign, publishers and software designers need to know something about how an indexer works. (We need better communication and understanding all around, but I’ll address that in my next post.)

How an indexer works

The indexer doesn’t just feed a book into an automated indexing program that then spits out the index. He or she makes several passes through the book, getting a general overview of the subject, deciding how to structure useful headings and subheadings, all the while keeping in mind the probable audience of the book, length constraints on the index, as well as the deadline.

Indexers typically use database programs focused on index creation, such as Cindex, Macrex, or Sky. A good index is not produced sequentially (I don’t start with the “A” entries, and I don’t necessarily start with Chapter 1); it EVOLVES. The indexer uses his or her judgment to include certain categories of entries and exclude others. When the data is entered, the indexer edits the index, deciding where to add, delete, or consolidate entries. And that’s just the beginning: indexing software helps me perform complex operations on my file. For example:

  • as I create the index, at any time I can see the formatted index taking shape. If I want to see how I picked up a certain topic from a previous chapter, I can easily perform an instantaneous search to display all such entries.
  • as I enter an index heading, the software will auto-complete the entry based on previous headings.
  • if I want to copy a block of entries under one heading, but spanning several chapters, I can easily duplicate the selected records and add the new heading.
  • if I want to ensure that all terms beginning with “St.” (as in “St. Louis”) sort as “Saint”: again, it’s easy to search and replace with the proper sort codes.
  • I can visualize the index in various forms: draft vs. formatted, sorted letter-by-letter vs. word-by-word.
  • I can sort the index by page number, or search for records by page span, to ensure that Chapter 2 was covered as well as Chapter 10.
  • I can have the program alert me when an entry is followed by too many page locators (I can set the number).
  • I can label index records for future reference if I want to revisit that concept later.
  • As I near the end of the indexing process, I might have an inkling that a certain concept that didn’t seem important at the beginning comes up in several places under two different terms. I want to consolidate all instances of these terms under one index heading. I simply perform a Boolean search for both variants, save the results to a separate area, make any necessary edits, and add them to my index. With the right software, this takes just a few seconds.
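
To make a few of these operations concrete, here is a minimal Python sketch. It is not how Cindex, Macrex, or Sky work internally: the Record class, the function names, and the sample records are all made up for illustration, and the sorting rules are greatly simplified compared with what real indexing software applies.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Record:
    """One index record: a heading plus its page locators (hypothetical model)."""
    heading: str
    pages: list = field(default_factory=list)


def sort_key(heading, letter_by_letter=False):
    """Build a simplified sort key for a heading."""
    # Forced sort: file "St. Louis" as if it were spelled "Saint Louis".
    key = re.sub(r"^St\.\s+", "Saint ", heading).lower()
    # Drop punctuation; keep spaces so word-by-word sorting can use them.
    key = re.sub(r"[^a-z0-9 ]", "", key)
    if letter_by_letter:
        key = key.replace(" ", "")  # ignore word breaks entirely
    return key


def sorted_headings(records, letter_by_letter=False):
    """Headings in index order under the chosen sorting convention."""
    ordered = sorted(records, key=lambda r: sort_key(r.heading, letter_by_letter))
    return [r.heading for r in ordered]


def too_many_locators(records, limit):
    """Headings followed by more than `limit` page locators."""
    return [r.heading for r in records if len(r.pages) > limit]


def consolidate(records, variants, new_heading):
    """Gather every record filed under any variant term (a Boolean OR)
    and merge the locators under one new heading."""
    wanted = {v.lower() for v in variants}
    pages = sorted({p for r in records if r.heading.lower() in wanted
                    for p in r.pages})
    return Record(new_heading, pages)


# Sample data, invented for this sketch.
records = [
    Record("Newark", [12]),
    Record("New York", [3, 40]),
    Record("San Diego", [7]),
    Record("St. Louis", [5, 9, 21, 22, 30, 41, 57]),
    Record("tagging", [16, 26]),
    Record("embedded indexing", [20, 26, 59]),
]

print(sorted_headings(records))                         # word-by-word
print(sorted_headings(records, letter_by_letter=True))  # letter-by-letter
print(too_many_locators(records, limit=6))
print(consolidate(records, ["tagging", "embedded indexing"], "embedded tags"))
```

In this sketch, word-by-word sorting files “New York” ahead of “Newark,” letter-by-letter sorting reverses them, and in both cases “St. Louis” files ahead of “San Diego” because it sorts as “Saint Louis.”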

Recap: 1) indexing is way more complicated than most editors assume, and 2) a skilled indexer can utilize indexing software to make the job much more cost-effective, with superior results.

Unfortunately, InDesign in its present form is not up to the task. I realized this as soon as I started working with it on an actual project. Here are just some of the problems:

  • Not only do I need to click and drag with the mouse to highlight a word or passage, I have to indicate whether the reference continues for a specific number of pages, or until the occurrence of a specific style. If I am using the style option, I have to scroll through several screens of styles to find the right one.
  • It is very difficult to edit the index as it is taking shape. Because a book in InDesign consists of separate chapters, I need to generate the index anew from all of the chapter files every time I want to view it. This takes a lot of time.
  • There is no (easy) way to search and replace for entries using Boolean searches or patterns. Inserting forced sort codes (as in “St. Louis”) takes a lot of time and effort.
  • There is no way to quickly view the index, or selected entries, in various formats and sorts (alphabetically, by page).

In short, indexing in InDesign takes far too long to be justified even by a significantly higher rate.

Reflowable vs hyperlinked indexes

Even IF indexing with InDesign were as easy as working with dedicated indexing software, its indexing capability was designed not with digital (hyperlinked) indexes in mind, but rather REFLOWABLE indexes. That is, if the publisher removes a chapter, the index page locators automatically update. Reflowable indexes have their place, such as with a book that is regularly published every year with relatively minor updates. While a reflowable index also requires tags, it’s not the same thing as a hyperlinked index.
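
The difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not InDesign’s data model or any EPUB tool’s output: the tag ids, file names, page numbers, and function names are hypothetical, and using simple ordinals as link text is just one possible choice. Both kinds of index start from the same tags embedded in the text; the reflowable index looks up a page number for each tag in the current layout, while the hyperlinked index links to the tagged spot itself.

```python
from html import escape

# Hypothetical tags embedded in the text: tag id -> (content file, heading).
TAGS = {
    "idx-0007": ("chapter02.xhtml", "St. Louis"),
    "idx-0031": ("chapter05.xhtml", "St. Louis"),
}

# Hypothetical pagination of one print layout: tag id -> page number.
PAGE_OF = {"idx-0007": 42, "idx-0031": 118}


def reflowable_entry(heading, tags, page_of):
    """Print-style entry: page numbers are looked up from the current layout."""
    pages = sorted(page_of[tag] for tag, (_, h) in tags.items() if h == heading)
    return heading + ", " + ", ".join(str(p) for p in pages)


def hyperlinked_entry(heading, tags):
    """Ebook-style entry: each locator is a link to the tagged spot itself."""
    hits = [(tag, file) for tag, (file, h) in sorted(tags.items()) if h == heading]
    links = [
        '<a href="{}">{}</a>'.format(escape(file + "#" + tag, quote=True), n)
        for n, (tag, file) in enumerate(hits, start=1)
    ]
    return "<li>" + escape(heading) + ", " + ", ".join(links) + "</li>"


print(reflowable_entry("St. Louis", TAGS, PAGE_OF))

# If a chapter is removed, only the layout changes; the tags stay put.
print(reflowable_entry("St. Louis", TAGS, {"idx-0007": 42, "idx-0031": 96}))

print(hyperlinked_entry("St. Louis", TAGS))
```

The reflowable entry changes whenever the pagination does, whereas the hyperlinked entry depends only on where the tags sit in the content files.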

What if . . .

While a few publishers have totally dispensed with InDesign in favor of a digital-first workflow, the current reality is that the majority are still using it as the foundation of their production process. So maybe what is needed is a totally different approach to indexing. What if the indexer didn’t have to do tagging at all, and what if the creation of digital indexes did not have to disrupt the publisher’s InDesign-based workflow? And what if we could use a new method to start building really useful indexes that took advantage of digital formats in a way that print indexes never could? To be continued….


Stephen Ingle is the president and CEO of WordCo Indexing Services (www.wordco.com), located in Norwich, Connecticut.  He created his first index (8 lines) at the age of 10. After graduating from Yale University with a degree in German literature, he went on to earn master’s degrees in German and Russian Area Studies.  In 1988, Steve began freelance indexing part time while also working at the Modern Language Association (MLA) in New York.  He began indexing full time in 1991. Steve has served on the national board of the American Society for Indexing. His company now employs a team of indexers and completes about 500 projects annually for a diverse group of clients.  His interests include indexing as a business and indexes for digital publications.

4 Responses to “Indexes in ebooks: Part 2”

  1. Quote: “There is no (easy) way to search and replace for entries using Boolean searches or patterns. Inserting forced sort codes (as in “St. Louis”), takes a lot of time and effort.”

    Oh please, don’t be silly. I’ve laid out and indexed scientific books 300+ pages long with InDesign. What you call “takes a lot of time and effort” requires about five seconds for a cut-and-paste and a quick edit to change the as-printed term to its as-sorted form. InDesign can even flip John Smith in a book’s text to Smith, John. And since InDesign maintains a database of index entries, those changes only need to be done once for each entry, not for each page reference.

    I respect professional indexers. They have skills that I lack when I create less sophisticated indexes. But InDesign does fill a need. It enables publishers under budget restraints to create useful indexes for their print and digital editions. The result may not be up to the standards of an index done by a professional with dedicated indexing software, but when the alternative is no index at all, it’s a good choice.

    And yes, I admit that repetitive tasks like indexing are clumsier than they should be with InDesign. A UI using small panels, long scrolling lists, and mouse clicks is a pain to use. The text is tiny and the need for fine-motor controls makes everything slow. Meeting with their design team, I suggested that Adobe create iOS and Android apps that’ll use touch-screen tablets and task-centered screens to speed up tasks such as indexing and applying paragraph styles to raw text. For indexing, that’d mean clicking within ID on a Mac or PC where an index entry is to be placed, and then scrolling quickly to the proper index entry on an attached iPad and tapping on that entry. There might even be a way to automate the start and stop points for an entry that’d be easier than counting paragraphs or pages.

    Quite a few common, repetitive tasks could be speeded up that way, tasks such as a search and replace that allows choosing one of several replacements. That’d be great when mere hyphens need to be replaced with N-dashes or M-dashes, or left unchanged.

    One advantage of ID is that the developers are well aware of user needs for time-saving functions. It doesn’t take much time saved to recover the $50 cost of being a full Creative Cloud member and even less for the $20 single-app membership. And I’m hoping that, in a few years, Adobe will offer a $10 author plan like their $10 Photography plan with Photoshop. Adobe could include book templates to ease InDesign’s initially steep learning curve.

    Looking forward to hearing your ideas for a “totally different approach to indexing.” I believe we’ve only begun to tap the potential of digital publishing.

    –Michael W. Perry, Inkling Books, http://inklingbooks.prosite.com

  2. Meghan Jones says:

    Let me sing to you of an 800-page book in InDesign (I think it was around 40 files). It was published online in batches, and the index was also done in batches of around 200 pages each. Then came the fun part: bringing all those indexes together and making them match for the print edition.

    I timed it. It took a full 90 seconds just to open the dialog box. To edit a subhead took a full 3 minutes beginning to end. Adding an ‘(s)’ to a main heading took more than 6. The problem seems to be the way the index is stored in the file itself. Rather than a single object with the entry and a pointer to the text, it is apparently stored as pointers to the entries throughout the book. Every time you edit an entry, it has to re-build the entire index. I even called Adobe at one point and asked if I was doing it wrong (Tech’s opinion: I wasn’t). I haven’t indexed a book in InDesign since.

    Now, AsciiDoc and HTML5 are a joy. Hopefully that’s what’s up next.

  3. Steve Ingle says:

    Indexing in InDesign has its place. I have indexed a number of books in InDesign. Michael Perry writes that “[t]he result may not be up to the standards of an index done by a professional with dedicated indexing software, but when the alternative is no index at all, it’s a good choice.” Agreed.

    The real question, however, is: does Adobe InDesign provide a convenient solution for ALL projects, of any complexity and size, involving hyperlinked indexes? Until InDesign offers the indexer the same flexibility that s/he can get with dedicated indexing software like Cindex, the answer is clearly No. Or perhaps “not yet.”

  4. Ella Lucero says:

    Thanks for the useful stuff you share here.