eBook Publishers Learn A Lesson: The Markup Is Not The Book…

RocketBook: Less of a rocket, more a damp firecracker…

I joined the fledgling eBook team at Microsoft eleven years ago this month. When we began work on eBooks back in 1998, it was clear from the outset that no publisher wanted to sign up to supporting a single eBook format.

At that time, there were two eBook contenders with devices in the marketplace – RocketBook and SoftBook – both of which are now either gathering dust in the closets of “early adopters”, or taking up space in landfill somewhere…

SoftBook: Turned out to be just too soft…

Each used its own format. Microsoft Reader was a third. Soon, there were eBook readers from Adobe and a host of others. They proliferated like weeds.

Very understandably, no publisher wanted to bet its entire eBook future on a single format. It was a problem everyone recognized. Out of that grew a markup standard called Open eBook. Eventually, the Open eBook organization morphed into the International Digital Publishing Forum, and Open eBook became ePub.

It was clear to everyone that there would be different eBook formats for a long time to come – perhaps forever. The problem is that if the publisher wants any kind of Digital Rights Management (DRM) or protection, the raw markup somehow has to be wrapped in a secured software package.

In the case of Microsoft Reader, Open eBook markup was converted to .lit. If the book’s being sold for Adobe’s Digital Editions, then it’s wrapped by Adobe’s Content Server and served up as .epub.

Amazon has entered the eBook fray in a spectacular way with its Kindle, which uses its own .azw format (again, with digital rights protection).

Since Amazon has its own “closed” device, its DRM can be a lot more transparent to its customers than DRM which has to protect content in an open software environment like a Windows PC or a Macintosh. Both MS-Reader and Adobe Digital Editions require the reader to Activate the reader software they’re using.

Many people hate DRM, and suggest that publishers are trying to hang onto the past. “It hasn’t worked for music, or anything else,” goes the litany, “so why do publishers believe it will work for books?” These are often the same people who insist all old business models are dead, and just don’t know it yet, and that all content will eventually be free.

Personally, as a writer, I’m not ready just yet to give up on being paid for my work. I’m writing a book right now, and if it takes me a year, I’d like to hope I’ll be able to pay the mortgage at the end of it.

Unlike many of the litanists, I believe that publishers and their editors perform a vital function in improving the quality of material which gets published. They need to get real, though: accept that the removal of the requirement to print and distribute physical copies of books has driven publishing costs down dramatically, and re-work their business models accordingly.

eBooks should be a lot cheaper, and could be a lot cheaper, without harming publishers’ or writers’ incomes. But a world of nothing but free content is like free cable TV – 500 Channels, and you can spend hours searching it to find there’s nothing you want to watch.

However, the issue here isn’t the different ways of wrapping standard markup. The issue is what happens to that markup when it gets rendered.

It’s exactly the same problem that I wrote about yesterday on this blog in relation to the Web. I created a page which used entirely standards-compliant HTML markup and a standard CSS3 stylesheet – both verified as such by the World Wide Web Consortium’s validation tools.

Yet the final rendering worked the way I wanted it to on only one Web browser. On three others it broke. One just made my page slightly ugly – the others hit it with a truck.

And there’s the issue for eBook publishers. Even though they all standardize on ePub as their markup, what happens to it when the reader sees it is out of their control.

I’m not talking here about the reader’s “personal comfort” decisions – like making the text larger, for instance. Readers have to be able to do that.

I’m talking about what happens at a lower level, in the rendering engine and its text and page composition engine.

Take Microsoft Reader, for example. At its heart is a text composition engine called Page and Table Services – the result of hundreds of man-years of effort by one of the best teams in the company (I can say that – I never worked in that team).

Microsoft Reader – still a pretty comfortable read…

At the heart of Adobe’s Digital Editions is that company’s terrific text composition expertise. Adobe (and Aldus, which it acquired in 1994) has been doing this for decades, for professional printers and publishers with the highest possible standards.

Both will compose to their own metrics.

Huckleberry Finn in PDF: Nice! (But you can’t change the text size)

In Adobe’s Digital Editions, I looked at two free ePub books, and one free book in Adobe’s PDF format. Huckleberry Finn, in PDF, was beautifully set; nice line-length, great word and line-spacing, hyphenation and justification. Just about perfect. The two ePubs, though, were a bit rough – justification but no hyphenation, and lots of other problems…

ePub in Digital Editions: Not such a good read…

Amazon clearly has its own engine in Kindle. It’s not bad – but it does do weird stuff that it shouldn’t, and doesn’t do other things that it should. It’s readable, but could be improved.

(Disclosure here: I have to earn a living outside Microsoft now, and there’s a limit to how much free advice I’m going to give…)

I’m clearly not the only one who’s spotted this problem. On the IDPF’s own homepage is an Open Letter from the Association of American Publishers, which says:
“We encourage the IDPF to provide support to facilitate implementation industry-wide. We recognize that a number of issues remain, and we encourage the IDPF to work with its member organizations to develop guidelines/plans for addressing:
  • Quality assurance of any other formats which are created based on the EPUB version
  • Conversion to .LIT and eReader
  • How to handle books that do not have reflowable text and are not appropriate for EPUB”

Well over a decade on the road to eBooks, and we’re clearly not there yet…


18 thoughts on “eBook Publishers Learn A Lesson: The Markup Is Not The Book…”

  1. Anonymous

    Hello! We believe that ebooks and ebook readers are the future. We recently created a website (sunbookr.com), where you can post your ebook and sell it!

  2. Hadrien

    Interesting post: the real problem though is flexibility. With Microsoft Reader, it's the software that takes most of the decisions about the layout, in ePub & Digital Editions it's the content provider who is responsible for some of them (linespacing, paragraph spacing, margins, indenting). The reading software could of course support hyphenation and use better algorithms for the justification, but to create a great experience with every book it would have to override some CSS properties.

  3. Bill Hill

    @ Hadrien: Great comment! It crystallizes an important issue that exists both in eBooks and on the Web. I’ll talk more about this in my next blog entry.

  4. bowerbird

    any e-book experience that doesn’t allow the end-user to be the final authority on variables like “linespacing, paragraph spacing, margins, indenting” and the like, will ultimately be rejected by the end-user, and for good reason.

    -bowerbird

  5. Bill Hill

    @ bowerbird

    Nonsense! (my second, more polite, choice of word…)

    Text in an eBook should be served up as well-composed for readability as possible. The reader needs to be allowed to adjust, of course. But are you really saying readers know better than typographers and designers who’ve spent years studying how to do this most effectively?

    The real challenge is to get that knowledge in the right place – in the individual’s reading software on the individual’s machine, and not making design choices blindfolded, as designers are forced to do today.

  6. Richard Fink

    Is it not axiomatic that decisions should be made by the people who are the most knowledgeable about, and in the most advantageous position to assess, the problem at hand? Would this not also hold true for typography and layout?

    hadrien and bill are right. bower_bird is… well, we’ve been through this all before, eh?

    BTW – Full-Screen mode in Safari (on the Mac, at least) can be had with either of two plug-ins: Saft and Glims. No answer for Windows as yet.

  7. bowerbird

    is there anyone who is more qualified as an expert on what i can and cannot read than me? who dares to claim that? you guys have a lot to learn…

    -bowerbird

  8. Bill Hill

    @bowerbird:

    No-one is suggesting that anyone’s an “expert on what you can and cannot read”.

    But there are people who have spent a lifetime (and indeed, built on the expertise learned during the lifetimes of many who preceded them) figuring out how to present text in the most optimal way to humans who’re reading it.

    The vast majority of readers would prefer not to even have to think about these issues – they just want the best reading experience possible. And they’ll entrust it to the “experts” to make it as good as it can be.

    Of course, they should be free to alter anything they want, even though it creates a less-than-optimal experience. But most won’t change it, or even want to. Times New Roman happens to be the font used in billions of documents because for many years it was the default font in Word, and more than 98% of Word users never changed it.

    So it’s up to the “experts” to try to create the best initial experience they possibly can, and not just chuck the content at the reader and say “OK – now you work out how to make it readable…”

  9. bowerbird

    bill- please. you haven't said anything there that i didn't already know, long ago… nothing that isn't obvious. think about it.

    > Of course, they should be free to alter anything they want

    ok, here's a little newsflash. nobody was sitting around waiting for your permission. i mean, thanks for giving it. but this is something that users claimed for ourselves a very, very long time ago…

    > Of course, they should be free to alter anything they want even though it creates a less-than-optimal experience.

    ok, and here's where the arrogance really shows up. you've specifically said here that any change a user makes will worsen the experience, make it "less-than-optimal". what utter hogwash!

    because an "expert" said that 12-point type was "optimal", i'm making their precious "expert design" less-than-optimal by making the text larger so i can actually read it? total and absolute crap. i mean, surely, you cannot really _mean_ to say that?

    so do please tell me, bill, exactly what i'm missing. do you really actually believe that you can make one design which will be "optimal" for _everyone_ in the whole world? and that _anyone_ who makes any kind of change to it will "worsen the experience"?

    > But most won't change it, or even want to.

    perhaps. and if they are not bothered enough to change the display, then it must be "good enough" for them… which is totally fine by me.

    > Times New Roman happens to be the font used in billions of documents because for many years it was the default font in Word, and more than 98% of Word users never changed it.

    i guess it was "good enough" for them. i'm sure glad that microsoft made such a wise choice with their default font. after all, they did have "experts" who were doing their "design", didn't they? who am i to argue? who am i to say that _i_ find a sans serif font more readable?

    > So it's up to the "experts" to try to create the best initial experience they possibly can

    well, of course that's true… did you expect anyone to question that? seriously? where, in what i have said, do you get any kind of idea that _i_ was questioning it? i work hard to make sure that my defaults are wise. but i'm not so arrogant to believe that they are "right" for every darn single user…

    > and not just chuck the content at the reader and say "OK – now you work out how to make it readable…"

    i never said _anything_ to that effect. just read above, where i clearly wrote that "the final authority" would have to rest with the user. what does "final authority" mean to you? i think that it's perfectly clear that it's _the_last_word_, and not the darn starting default…

    and, on a more practical basis, i do believe you'll find that if you do _not_ give the end-user the ability to tweak a design when they need to do so, that they'll totally _reject_ your content because of this design straitjacket… monopolies don't work any more, not like the past.

    -bowerbird

  10. Richard Fink

    @Hadrien

    If you’re still checking this thread: I have the iRex 1000 E-Reader, and Feedbooks.com has really made that device a lot more usable for me. Thanks!

    And you are exactly right about ePub and CSS. In a comment on a previous thread on this blog there is a download link to an experimental eBook – one that takes an approach similar to ePub and uses hyphenation and the kinds of CSS techniques (plus font-linking) – that you are describing. You might find it interesting and worth a look. (Check the comment for installation instructions and notes.)

    More elaborate demonstrations and experimentations coming soon.

  11. Hadrien

    @Richard: Glad that you enjoy using Feedbooks on your iRex device. You’re probably using a custom PDF, right? We plan on improving typography in our PDF files in the upcoming months, and I’d like to create several high-quality templates too.

    @Bill: Sure, I’ll send you an e-mail.

