Friday, October 2, 2009

Amazon's Whispernet: the door swings both ways

Amazon's Whispernet and Whispersync have been touted as a revolution, an innovation in the content pipeline that will propel eBooks from a niche market into the mainstream. If you have a Kindle, Whispernet means that you have access to all of Amazon's Kindle content from anywhere there is Sprint coverage, and that your user-generated content is automatically backed up and synced across multiple devices by Amazon. For free. Great! It also means that Amazon pulls data from your Kindle; obviously, Amazon retrieves the content that it backs up, but perhaps it gets other information as well. To co-opt an old adage: if you can see Amazon, Amazon can see you. And what Amazon sees (and how often) affects your data privacy and ownership. Given Amazon's track record with transparency, Kindle users should be proactive in learning more about features of their device that are not advertised.

So just what information is sent to Amazon by your Kindle? According to Amazon:
"Information Received. The Device Software will provide Amazon with data about your Device and its interaction with the Service (such as available memory, up-time, log files and signal strength) and information related to the content on your Device and your use of it (such as automatic bookmarking of the last page read and content deletions from the Device). Annotations, bookmarks, notes, highlights, or similar markings you make in your Device are backed up through the Service. Information we receive is subject to the Privacy Notice."
This description is very vague. Just what kind of information is logged on the device, and how much of it is sent to Amazon? What information related to content is sent? Does Amazon only receive information related to content that is downloaded from Amazon, or will they receive information about, for instance, the fan-fiction you downloaded? How often is this information sent to Amazon?

If the Kindle communicated with Amazon via WiFi, you could set up your access point or router to intercept traffic from the Kindle and find the answers to the above questions that way. However, the Kindle communicates with Amazon over a Sprint 3G network via an account you have no access to, so you'd have to somehow intercept and interpret the cell signal. My understanding of wireless technology is extremely limited, but I'm guessing that you'd have to either passively intercept the signal (a federal crime in the United States) or somehow execute a man-in-the-middle attack against your Kindle (for instance, by setting up or impersonating a cell station, which would also probably constitute at least one federal crime).

Without being able to intercept traffic from the Kindle, the only way to determine exactly what information is sent by a Kindle is hacking it, which is against the Amazon Kindle License and Terms of Use:
"No Reverse Engineering, Decompilation, Disassembly or Circumvention. You may not, and you will not encourage, assist or authorize any other person to, modify, reverse engineer, decompile or disassemble the Device or the Software, whether in whole or in part, create any derivative works from or of the Software, or bypass, modify, defeat or tamper with or circumvent any of the functions or protections of the Device or Software or any mechanisms operatively linked to the Software, including, but not limited to, augmenting or substituting any digital rights management functionality of the Device or Software."

Luckily, there are people far more tech-savvy, more willing to risk bricking their Kindles, and more willing to risk getting banned by Amazon than I am. They have figured out what information gets sent to Amazon. Since the sample logs from that forum post are just snippets, my interpretation may not be entirely accurate, but what seems to get sent to Amazon is:
  • the times at which you switch screens (e.g. from the list of books to a particular book)
  • the details of the book you are reading (i.e. the title, authors, the Amazon Standard Identification Number, content type, publisher, publication date, display title and authors, length, when you last accessed the book, your last location in the book, whether it's encrypted, whether it's a sample, whether it is newly downloaded to your Kindle, the path to the file on the Kindle system, whether there is text-to-speech metadata)
  • the details of your device - I'm not sure how to interpret all the data, but it seems like: your EVDO network information, signal strength, your latitude and longitude, and more
Obviously, because the Kindle has a 3G radio, Sprint knows where you are. But why is this information sent to Amazon as well? The terms of use did not mention receiving information regarding your location coordinates.

From the various complaints I've read about Whispersync not applying to PDFs on the Kindle DX, and the phrasing in various reviews regarding Whispersync, I assume Amazon does not sync your non-Amazon content across your devices. However, it's possible that Amazon retrieves the same kind of metadata about your non-Amazon content as it does about your Amazon content (e.g. title, filename, and the other details listed above).

Furthermore, when you use the web browser, all traffic goes through an Amazon web proxy (which is understandable - the proxy can optimize pages for display on the Kindle, for example by filtering out large images and/or video that the Kindle can't display anyway). The existence of this proxy is not disclosed to the Kindle user (although perhaps Amazon did not mention it in the Kindle terms of use because the browser is experimental). So if you use the Kindle for web browsing, Amazon also receives information about which sites you visit.

There is an obvious reason why Amazon wants all this data, a reason that doesn't involve tinfoil hats: market research. The wealth of information regarding users' reading habits provides accurate data that I doubt companies could buy at any price, because the usual alternative, self-reporting, is notoriously subject to recall bias. Setting up a market research study at such a scale, with a sample size numbering in the hundreds of thousands, if not millions, would probably be prohibitively expensive. The data the Kindle collects is commercially valuable to Amazon, possibly valuable enough to offset what it pays to Sprint for connectivity.

While Amazon's motivations are likely benign, there may still be negative externalities associated with the data collection.  For instance, although this information is generated by their users, the users cannot access it and have no control over how long it is stored or how it is used.  Is losing control over personal data a reasonable price for a user to pay for the convenience of Whispernet? Maybe so - it depends on the preferences of each Kindle user.  However, the price is likely one that most users aren't even aware of and do not know how to evaluate. Which brings me to another cliché I was considering as a title for this blog post: there is no such thing as a free lunch.

Tuesday, April 28, 2009

Testing your Android applications

For future reference, Můj will probably be a good blog to look at for test-driven Android development. The linked article is the only one so far though, and the rest of the blog seems to be in Czech (?). But this quote is particularly pertinent:
"There are basically three levels of tests/unit-test you can use in your application:
  1. Unit testing of logic which does not depend on android at all. This is similar to every usual unit testing. Running tests directly from Eclipse requires some configuration changes which we will cover later.
  2. Unit testing of business logic which depends on Android but does not depend on Android application elements and ui. These logic does not require activity to be running with complete context and it can be tested in isolation from ui. These tests usually require something from context (e.g. resources, configuration, logger, some native classes)
  3. Unit/funcional testing of Android application elements. These tests are fully instantiated activities, services, content providers and applications. Via instrumentation it is possible to send keyboard and touch events to the activities and check response in ui. It is possible to test lifecycle of a service and to test databases changes made by content provider."
I've been trying to figure out how to test the second case, since I've been getting "java.lang.RuntimeException: Stub!" errors for a while. As it turns out, you cannot just use JUnit to test code that calls on Android APIs: "android.jar in the SDK is only stubbed methods/classes, and contains no code." Even if you are not testing an Activity or any UI elements, you will still need to use Instrumentation. Once I figure this all out, and if I am not too lazy, I hope to write a very simple tutorial. Unless someone beats me to it, in which case I won't. :)
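In the meantime, one partial workaround is to move whatever logic doesn't really need Android into plain Java classes, which puts it in the first category from the quote. Here is a minimal sketch of what I mean. TemperatureFormatter is a made-up class for illustration (it is not from the linked article or from any SDK sample); the point is only that it imports nothing from android.*, so a test running on a desktop JVM never touches the stubbed android.jar.

```java
import java.util.Locale;

// Hypothetical example class: pure logic with no android.* imports,
// so it can be tested on a plain desktop JVM without hitting "Stub!".
final class TemperatureFormatter {
    private TemperatureFormatter() {
        // Static helpers only; no instances needed.
    }

    /** Converts a temperature in Celsius to Fahrenheit. */
    static double toFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    /** Builds the text an Activity could later put into a view. */
    static String label(double celsius) {
        // Locale.US keeps the decimal separator stable across machines.
        return String.format(Locale.US, "%.1f F", toFahrenheit(celsius));
    }
}
```

An Activity would then just call TemperatureFormatter.label(...) and hand the string to its UI. Anything that really needs a Context (resources, preferences and so on) still falls into the second category and still needs Instrumentation.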

Monday, April 27, 2009

Android ApiDemos in Eclipse

I've been learning how to write Android apps lately, and I've been trying to figure out how to use TDD. Diego Torres Milano's blog has been pretty helpful to me, especially his post about how to get the Android ApiDemos tests to work. As a newbie to Eclipse and Android development though, I had some trouble with his instructions. Here are a few things I did that I didn't think were very clear on Diego's blog:
  • Create an ApiDemos project first (New Android project -> Project from source -> use the source from [android SDK directory]/samples/ApiDemos).
  • Don't bother to delete the Dummy class or remove the source directory you created (when you follow his instructions to create the ApiDemosTest project).
  • The emulator has ApiDemos installed by default. You should uninstall the default installation (using adb uninstall) and run your ApiDemos project in Eclipse. If you don't, you'll get errors when you run ApiDemosTest because the default ApiDemos installation is in a separate security domain (I think).
Other than these points, his post (and the comments) is pretty clear. This entry from Zhao's Weblog is also helpful.

Friday, February 6, 2009

Intrepid Ibex on MSI Wind

I upgraded my MSI Wind laptop from Hardy Heron to Intrepid Ibex. It didn't really fix any problems: my wifi drivers don't seem to be packaged with Intrepid and I have to re-download the kernel module every time my kernel gets upgraded. My webcam still does not work. And my hard drive is clicking again. Editing /etc/hdparm.conf doesn't seem to work anymore. So I edited the following files:
  • /etc/acpi/resume.d/
  • /etc/acpi/battery.d/
I set DO_HDPARM=n. Basically, I'm saying "don't park my drive heads." I could also install laptop-mode-tools, but all I really want to do is to stop the drive heads from parking. This is an inelegant solution, but simple.