Why Specs Change: EPUB 3.2 and the Evolution of the Ebook Ecosystem

This is a guest post from Dave Cramer, a co-chair of the W3C EPUB 3 Community Group, an ebookcraft favourite, and an #eprdctn old-timer.

It takes much more than a village to make an ebook. Authors, publishers, developers, distributors, retailers, and readers must all work together. EPUB* requires authoring and validation tools as well as reading systems. The EPUB standard depends on the HTML and CSS standards, among others. There are millions of existing EPUB 2 and EPUB 3 files out there. Change anywhere is felt everywhere.

As this ecosystem evolves, the EPUB standard itself sometimes has to change to keep up. When the Web moved to HTML5, enabling better semantic markup and better accessibility, it was clear that EPUB could benefit. EPUB 3.0, which was released in October 2011, supported HTML5 as well as scripting and multimedia. EPUB could now be used for more kinds of books, better books, more accessible books. EPUB 3 was a big deal, significantly different from, and better than, EPUB 2. Today there’s no reason to use EPUB 2, and yesterday is the best day to start producing EPUB 3.

Sometimes the need for change comes from innovation inside the ebook world. As Apple and Amazon developed fixed-layout ebooks in the early 2010s, the IDPF knew they had to create a standard, to avoid fragmenting the marketplace. Sometimes specs just have bugs, or implementations discover an ambiguity. Some changes are large, like moving to HTML5, and some changes are small, like allowing multiple dc:source elements in EPUB 3.0.1. EPUB 3.0.1 was ultimately a maintenance release, incorporating the fixed-layout spec, slightly expanding what sorts of attributes were valid in EPUB, and fixing various bugs. Existing EPUB 3s didn’t need to change to support 3.0.1.

In 2016, the IDPF’s EPUB Working Group started working on a more substantive revision, which would become EPUB 3.1. The goal was to bring EPUB closer to the rest of the Web Platform, and make the spec simpler and easier to read. The former was done partly by trying to remove seldom-used features in EPUB that were not part of the larger Web, such as the epub:switch and epub:trigger elements. The Group also clarified the relationship with CSS, moving from an explicit profile of supported properties (which had little bearing on what was actually supported) to using the W3C’s own official definition of CSS, which evolves. It did the same with HTML, referring to the latest version of HTML5, whatever version that might be. But most of our ambitious ideas were scaled back or dropped, such as allowing the regular HTML serialization of HTML5 in EPUB. EPUB 3.1 was officially finished in January 2017, before the IDPF became part of the W3C.
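For anyone wondering whether their own content uses the seldom-used features mentioned above, here is a minimal Python sketch that counts epub:switch and epub:trigger elements in an XHTML content document. The sample markup and the function name `count_removed_elements` are hypothetical, made up for illustration; the only things taken as given are the element names and the standard `epub` namespace URI.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "epub:" prefix in EPUB content documents.
EPUB_OPS_NS = "http://www.idpf.org/2007/ops"

# The two elements the article names as removed in EPUB 3.1.
REMOVED_IN_31 = ("switch", "trigger")

# Hypothetical content document, for illustration only.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Sample</title></head>
  <body>
    <epub:switch>
      <epub:case required-namespace="http://www.w3.org/1998/Math/MathML">
        <p>MathML would go here.</p>
      </epub:case>
      <epub:default><p>Fallback text.</p></epub:default>
    </epub:switch>
  </body>
</html>"""


def count_removed_elements(xhtml):
    """Return a dict mapping each removed element name to its occurrence count."""
    root = ET.fromstring(xhtml)
    counts = {}
    for name in REMOVED_IN_31:
        # ElementTree spells namespaced tags as "{namespace}localname".
        tag = "{%s}%s" % (EPUB_OPS_NS, name)
        counts[name] = sum(1 for _ in root.iter(tag))
    return counts


print(count_removed_elements(SAMPLE_XHTML))
```

A real check would run this over every XHTML file in the publication; the sketch only handles a single well-formed document passed in as a string.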

But remember that a standard is only a part of the ecosystem. Two factors proved fatal to EPUB 3.1. First, there are hundreds of thousands of EPUB 3.0.X files already out there. EPUB 3.1 changed the value of the version attribute in the package file, and so those existing files would need to be edited to comply with the new spec, even if they didn’t use any of the removed features.
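To make the version problem concrete, here is a minimal Python sketch that reads the version attribute from a package document and reports whether it would have to be edited to match a target version. The sample package document and the names `package_version` and `needs_version_edit` are hypothetical; the sketch also assumes that EPUB 3.0.X files declare version="3.0" and that EPUB 3.1 expected version="3.1".

```python
import xml.etree.ElementTree as ET

# Namespace of the EPUB package document (OPF).
OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical, heavily trimmed package document, for illustration only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf"
         version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample</dc:title>
    <dc:language>en</dc:language>
  </metadata>
  <manifest/>
  <spine/>
</package>"""


def package_version(opf_xml):
    """Return the value of the version attribute on the root <package> element."""
    root = ET.fromstring(opf_xml)
    if root.tag != "{%s}package" % OPF_NS:
        raise ValueError("root element is not an OPF <package>")
    version = root.get("version")
    if version is None:
        raise ValueError("<package> has no version attribute")
    return version


def needs_version_edit(opf_xml, target_version):
    """True if the file's declared version differs from the target version."""
    return package_version(opf_xml) != target_version


print(package_version(SAMPLE_OPF))
print(needs_version_edit(SAMPLE_OPF, "3.1"))
```

Under those assumptions, every existing file fails the "3.1" check on this one attribute alone, regardless of what the rest of the publication contains.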

Second, the validation tool EpubCheck was never updated to support EPUB 3.1. Unlike the web, the ebook ecosystem is highly dependent on formal validation. EpubCheck is the gatekeeper of the digital publishing world, the tool that verifies compliance with EPUB standards. But EpubCheck is in trouble. It’s maintained by a handful of volunteers, and has almost no resources. There’s a backlog of maintenance work and bug fixes to do. Fifteen months after the release of EPUB 3.1, it is still not supported by EpubCheck, and thus no one can distribute or sell EPUB 3.1 through the major retailers. The Publishing Business Group is currently working to ensure EpubCheck’s future. Stay tuned!

EPUB 3.1 was a good spec—better-organized, easier to understand, clearer about the relationship between EPUB and the underlying web technologies. The EPUB 3.0.1 features it removed were seldom used, and often unsupported. But after 3.1 was completed, many people decided that, even if almost no existing EPUB 3 content was rendered incompatible with the new spec (aside from the version attribute), the price was too high. Better to live with some obsolete features, and guarantee compatibility, than require too much change. EPUB was having its “don’t break the Web” moment.

Early this year, Makoto Murata and Garth Conboy proposed that we roll back some of the changes in EPUB 3.1. This updated spec would be known as EPUB 3.2. The goals were:

  1. Guarantee that any EPUB 3.0.1 publication conforms to EPUB 3.2.
  2. Ensure that EPUB 3.0.1 reading systems would accept and render any EPUB 3.2 publication, although graceful fallback may sometimes be required.

If you already have EPUB 3 files, you don’t need to make any changes to existing content or workflow to adopt the forthcoming 3.2 spec. You just have a few more options, much like the change from 3.0 to 3.0.1. If you don’t already have EPUB 3 files, start now (making 3.0.1)! There’s no reason to wait.

EPUB 3.2 will still be based on EPUB 3.1, and keep many of the changes in 3.1 that don’t affect compatibility, such as referring to the latest versions of HTML5 and SVG, and using the official CSS Snapshot rather than the old profile. 3.2 will also continue to include WOFF2 and SFNT fonts as core media types. Perhaps most importantly, making EPUB 3.2 closer to EPUB 3.0.1 will require much less work to upgrade EpubCheck.

The W3C EPUB 3 Community Group has started to work on EPUB 3.2, with the explicit goal of remaining compatible with all existing EPUB 3.0.1 files, while retaining the best features of EPUB 3.1. I expect this work to take six months or so; others are more optimistic. When final, EPUB 3.2 will become a W3C Community Group Report, as Community Groups do not create W3C Recommendations.

We need your help! Join the EPUB 3 Community Group at https://www.w3.org/community/epub3/. It’s free, you don’t have to be a W3C member, and everyone is welcome. Much of the discussion of technical issues will happen on GitHub; our repository is at https://github.com/w3c/publ-epub-revision/.

You can look at the early drafts of our spec, too:

  1. EPUB 3.2 Overview
  2. EPUB 3.2 Specification
  3. EPUB Packages 3.2
  4. EPUB Content Documents 3.2
  5. EPUB Media Overlays 3.2
  6. EPUB Open Container Format

*EPUB® is an interchange and delivery format for digital publications, based on XML and Web Standards. An EPUB Publication can be thought of as a reliable packaging of Web content that represents a digital book, magazine, or other type of publication, and that can be distributed for online and offline consumption.

4 Responses to “Why Specs Change: EPUB 3.2 and the Evolution of the Ebook Ecosystem”

  1. Section 2.6 of the EPUB specs aptly describes what’s at the heart of EPUB’s ills and why the adoption of new versions has been so sluggish.

    “A key concept of EPUB is that content presentation adapts to the user, rather than the user having to adapt to a particular presentation of content.”

    That makes no sense at all. In the roughly half a millennium of movable-type publishing, that has never been the case. The move to digital has not altered that necessity in the slightest. Only its implementation has failed.

    When I read an ebook, I no more want its “presentation of content” to adapt to my wishes than I want the content itself to do so. The basic nature of a book is that the author determines its content and the publisher determines its presentation, and both in ways that readers are most likely to enjoy. That means well-written words in an attractive format. As a reader, I don’t want either to “adapt” to my wishes. I want to be able to enjoy the skilled contributions of both the author and the publisher. They are the creators and I am the consumer.

    Once, visiting the Library of Congress, I came across two interesting displays across a hall from one another. One was a beautifully written and illustrated medieval manuscript of the Bible. The other was one of the first printed Gutenberg Bibles. The two were so beautifully alike that if the placards had been switched few would have noticed. Note what had happened. The first printed book virtually matched the “particular presentation of content” of its predecessor. It wasn’t a step back.

    Is that true of EPUB books some twenty years after they began? No, it’s almost impossible to recreate anything like the “presentation of content” of even a simple printed book with an ebook. The standard for EPUB apparently remains the ugliness of Palm Pilots circa 1998.

    If those responsible for EPUB want its evolving standards to be adopted, they need to create aspects of those standards that are worth taking up. That means letting the creators create and the consumer enjoyably consume. It does not mean forcing the users into a role they never sought and don’t want.

    –Michael W. Perry, medical writer

  2. […] Why Specs Change: EPUB 3.2 and the Evolution of the Ebook Ecosystem (EpubSecrets)  […]

  3. Abiatha says:

    Adaptation of presentation makes sense on digital platforms for a number of reasons, not least of which is that digital platforms and what presentation features they support vary widely. Also, we now have to be concerned about accessibility. Presentation must be flexible to accommodate different modes of access as much as technically possible.

    And yes, when Gutenberg made his first Bible, his type and design largely copied the existing models of hand-written books. He had no other models to go on. However, it didn’t take very long for that to change.

  4. […] Why Specs Change: EPUB 3.2 and the Evolution of the Ebook Ecosystem – EPUBSecrets […]