CCTV camera by Mike Fleming @flickr.com CC attribution license

Is Big Brother watching what you read?

Earlier this year, at the Digital Book World conference, a company called JellyBooks announced what it called “Google Analytics for ebooks.”

For some readers, this raised the specter of Big Brother — or his corporate brethren — reading over their shoulders. Should we be worried about what we’re reading being tracked?

It depends on how paranoid you are, but at this point, probably not.

The only tracking of commercial ebooks that’s currently done is by the retailers: when you ask them to sync the “furthest read” point between devices, à la Amazon’s Whispersync, you give them permission to check that much. And they do keep track, not because they’re interested in you, particularly, but because it’s helpful for them to know what makes people stop reading, or keep reading, and what kind of thing they keep reading over and over.

So no, no one at Amazon or Apple has noticed that you keep going back to that particular chapter of Fifty Shades of Grey. I promise.

Another reason that some retailers need to keep track of how much of a book has been read (if not who’s read it) is that subscription services like Oyster and Scribd pay the publisher based on the percentage of the book that was read, while Kindle Unlimited pays about half a penny per “page.” No one is watching who reads what, but everyone needs to know how much of the book was read (or not).
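To make the arithmetic concrete, here is a minimal Python sketch of the two payout styles just described. The names (`ReadingRecord`, `per_page_payout`, `threshold_payout`) and the threshold and fee figures are my own illustrations, not anything a retailer publishes. The half-a-penny rate is the approximate figure quoted above, and since I don't know exactly how the percentage-based services turn a percentage into a payment, the threshold rule is only one plausible assumption. Notice that neither calculation needs to know who the reader is.

```python
from dataclasses import dataclass

# Assumed rate: the "about half a penny per page" figure, in US dollars.
PER_PAGE_RATE_USD = 0.005


@dataclass
class ReadingRecord:
    """One anonymous reading record. No reader identity is stored."""
    book_id: str
    pages_read: int
    total_pages: int


def fraction_read(record: ReadingRecord) -> float:
    """Share of the book that was read, from 0.0 to 1.0."""
    if record.total_pages <= 0:
        raise ValueError("total_pages must be positive")
    # Rereads should not push the figure above 100%.
    return min(record.pages_read, record.total_pages) / record.total_pages


def per_page_payout(record: ReadingRecord, rate: float = PER_PAGE_RATE_USD) -> float:
    """Per-page model: the payout grows with every page read."""
    counted_pages = min(record.pages_read, record.total_pages)
    return round(counted_pages * rate, 2)


def threshold_payout(record: ReadingRecord, threshold: float, fee: float) -> float:
    """Hypothetical percentage model: pay a flat fee once the
    fraction read reaches the threshold, and nothing before that."""
    return fee if fraction_read(record) >= threshold else 0.0


# Illustrative record: 120 pages read of a 300-page book.
record = ReadingRecord(book_id="example-book", pages_read=120, total_pages=300)
```

For that 300-page book with 120 pages read, `fraction_read(record)` gives 0.4 and `per_page_payout(record)` gives 0.6, that is, sixty cents.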

Having said that, it is theoretically possible to place JavaScript code in an ePub3 ebook file; this is probably how JellyBooks’ (pre-retail) system works. However, most retailers don’t yet accept enhanced ebooks (that is, those with added audio, video, and/or scripting), and even those that do are very picky about the kinds of data that can be collected and sent. In fact, they currently allow none of the latter: scripts may pull data in from the Internet or keep it within the book itself, but cannot transmit to anyone but the retailer. So in practical terms, no one is tracking you.

Anyone who’s run a website knows how helpful analytics can be — telling you where people came in, where they left, what paths they took, what pages were “sticky” and encouraged visitors to check out some more, and which were dead ends. It isn’t about tracking individuals; it’s about looking at aggregate data (“big data”) that tells web publishers what does and doesn’t work, and that then gives them an idea of why.

Ebook stores (like Amazon’s Kindle Store, Apple’s iBooks Store, etc.) have long gathered that kind of data about ebook reading habits, and publishers have long wanted it for themselves. Books are far more linear creatures than websites, but, after all, they’re just boxed-up web pages, and it would be helpful to know, for example, what points in a book cause readers to stop reading, or which sections readers reread over and over again. In a collection of essays or stories, or a cookbook, what sections garnered the largest number of views, and which led to the most subsequent views (in web terms, which pieces had the lowest bounce rate)? And then of course there’s demographic info: does a particular book do better with women in their thirties? men in their fifties? teens?
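For the curious, here is a rough Python sketch of how those aggregate questions might be answered from anonymous reading sessions. Everything in it is an assumption made for illustration: the function names, the idea that a session is simply the ordered list of sections a reader opened, and the sample cookbook data. It is not how any retailer or JellyBooks actually does it.

```python
from collections import Counter


def section_views(sessions):
    """Count how many times each section was opened across all sessions."""
    return Counter(section for session in sessions for section in session)


def bounce_rates(sessions):
    """For each section that opened a session, the share of those
    sessions that ended without the reader opening anything else."""
    entries = Counter()
    bounces = Counter()
    for session in sessions:
        if not session:
            continue  # ignore empty sessions
        first = session[0]
        entries[first] += 1
        if len(session) == 1:
            bounces[first] += 1
    return {section: bounces[section] / entries[section] for section in entries}


def stopping_points(furthest_chapters):
    """For a linear book: count how many readers stopped at each chapter,
    given the furthest chapter each anonymous reader reached."""
    return Counter(furthest_chapters)


# Made-up sessions for a cookbook; there are no reader identities here.
sample_sessions = [
    ["soup", "bread"],
    ["soup"],
    ["bread", "soup", "cake"],
    ["cake"],
    ["soup", "cake"],
]
```

With this sample, `section_views(sample_sessions)` counts four views for "soup", and `bounce_rates(sample_sessions)` gives "bread" a bounce rate of 0.0 and "cake" a bounce rate of 1.0. The input contains only section names and chapter numbers, which is the sense in which this kind of analysis is about aggregates rather than individuals.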

This information would be helpful in making editorial, design, and marketing choices for the next book — or for a new version of the current book. (Digital publishing means never having to say you’re finished!)

There are all sorts of privacy and security issues that need to be addressed before this kind of data gathering can become a reality. The JellyBooks platform counts on JavaScript running within the ePub files, both accepting and transmitting outside data, which is a major security loophole in any tablet, phone, or ereader. Still, there are ways of making it work. The big ebook sellers already have this sort of analytics because we ask them to keep track of how far we’ve read. None of us wants anyone else watching what we read, and we certainly don’t want that data floating over the internet without any precautions! JellyBooks’ idea, as I understand it, is to launch their system for use in “beta testing” ebooks before commercial release. The readers would be informed of the presence of information-gathering scripts, and could opt out. For a large-scale release, that actually makes a lot of sense. Whether it’s something that will work for run-of-the-mill independent publishers is another question.

But the future of ebook analytics is wide open.

Image (“CCTV camera”) by Mike Fleming.
Used through a Creative Commons license.
