epubcheck 101: Just What is Epubcheck?

Parts of ebook creation can be confusing to those coming over from the print world. One of the tools people commonly ask me about is epubcheck. It’s a powerful tool that unfortunately isn’t the most human-friendly. Today we’ll look at why it’s important to use it, and in future posts we’ll look at how to use it.

What is Epubcheck?

The EPUB spec provides a set of rules for the creation of EPUB files. These rules define what elements are required, what elements are optional, and what elements are explicitly forbidden from being in EPUB files. And while you could read the entire spec if you wanted to…well, it’s better to have a tool that checks your file and tells you whether it meets the specification.

And that’s what epubcheck does. When you run epubcheck (essentially a command-line tool with no graphical user interface), it compares your file to the set of rules that define the EPUB spec and tells you whether or not your file meets the requirements. When a file meets its spec, we say the file is “valid.”
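To make "comparing a file to a set of rules" a little more concrete, here is a minimal Python sketch. It is not how epubcheck itself is implemented, and it covers only a tiny corner of the spec: an EPUB is a ZIP archive whose first entry must be an uncompressed file named mimetype containing application/epub+zip, and it must include META-INF/container.xml. The function names check_epub_container and build_sample are made up for this illustration.

```python
import io
import zipfile


def check_epub_container(data: bytes) -> list[str]:
    """Toy validator (hypothetical, NOT epubcheck): returns a list of problems.

    An empty list only means these few container rules passed,
    not that the file is a valid EPUB.
    """
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a ZIP archive"]

    problems = []
    with archive:
        entries = archive.infolist()
        # Rule 1: the first entry in the archive must be the 'mimetype' file.
        if not entries or entries[0].filename != "mimetype":
            problems.append("first entry must be named 'mimetype'")
        else:
            # Rule 2: 'mimetype' must be stored without compression.
            if entries[0].compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' must be stored uncompressed")
            # Rule 3: 'mimetype' must contain exactly this media type.
            if archive.read("mimetype") != b"application/epub+zip":
                problems.append("'mimetype' must contain application/epub+zip")
        # Rule 4: the container file that points at the package document must exist.
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("missing META-INF/container.xml")
    return problems


def build_sample(include_container: bool) -> bytes:
    """Build a tiny in-memory archive so the checks can be tried without a real book."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        if include_container:
            archive.writestr("META-INF/container.xml", "<container/>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(check_epub_container(build_sample(include_container=True)))
    print(check_epub_container(build_sample(include_container=False)))
```

The real tool checks far more than this (the package document, the content files, the navigation, and so on), but the shape is the same: a list of rules, a file, and a report of which rules were broken.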

Why is Epubcheck important?

Obviously, making a file that meets the rules of EPUB files just seems like a good idea. But that’s not the sole reason why you should use it.

Similar to Epubcheck, there are validators out there for HTML and XHTML files. But a lot of web developers don’t use them anymore. Why? Well, part of the legacy of web development is that multiple specs (including the current HTML5, which took years to codify) have left a lot of different rules that can get really confusing. This has pushed a lot of the validation to web browsers. Today’s modern browsers will often take invalid files and find a way to display them anyway.

I probably don’t need to tell you that modern e-reader software does a poor job of interpreting an invalid EPUB file :). And that’s why Epubcheck is extremely important for EPUB creators. If your EPUB is valid, there’s a much higher chance that it will be read correctly by an e-reader. Maybe someday (…) e-reader software will be as good as modern web browsers and will interpret invalid files correctly, but until that day epubcheck is your safety net.

Equally important: many of the major retailers (Apple, B&N, Kindle to some degree) use epubcheck when you submit your EPUB to them. iBooks, for example, won’t allow you to submit an EPUB that doesn’t pass epubcheck. So if you’re looking to sell a title, it had better pass epubcheck.

 

In the next post we’ll look at a couple of ways to run epubcheck on your file and how to view the results of your test.

5 Responses to “epubcheck 101: Just What is Epubcheck?”

  1. I blog quite often and I really thank you for your information. Your article has really piqued my interest. I’m going to take a note of your site and keep checking for new information about once a week. I opted in for your RSS feed too.

  2. […] Part 1 we looked at why Epubcheck is important. In Part 2 we looked at the various ways to use it. Here […]

  3. […] The EPUB spec provides a set of rules for the creation of EPUB files. These rules define what elements are required, what elements are optional, and what elements are explicitly forbidden from being in EPUB files. And while you could read the entire spec if you wanted to…well it’s better if there were just a tool that could check your file and tell you if your file met this specification. And that’s what epubcheck does. When you run epubcheck (essentially a software tool with no Graphical User Interface) it compares your file to the set of rules that define the EPUB spec and tells you whether or not your file meets the requirements. When a file meets its spec, we say this file is “valid.” (epubcheck 101: Just What is Epubcheck?) […]

  4. […] ebook is finished! You did your homework, fixed all the nasty errors that epubcheck threw at you, and your ebook is done. The only remaining step is to send it to Apple’s review […]

  5. skreutzer says:

    No, it would be a disaster if e-reader software and distributors started to interpret invalid EPUB files just as web browsers do with websites. This would weaken the standard (standard-incompatible “extensions” would emerge), increase the amount of code needed to cope with badly constructed files for no benefit (except allowing developers and tools to be more sloppy), and make invalid publications quite useless for machine readability (that is, small programs/scripts reading the EPUB for a specific, limited purpose), just as we have a web full of documents today, most of which are accessible to human readers only, not to machines trying to make sense of them.