Imagine the (legitimate) outcry if a local municipality, a State government, or the Federal government in the US deployed an infrastructure that would systematically identify and track people as they went about their daily lives, without a viable option to opt out. While the US has laws that govern when and how data about individuals could be used, the mere availability of such data would lead to temptations that would be irresistible in practice, yet not necessary for the functioning of this society.
Ever since Pete Warden and Alasdair Allan published their discovery at the Where 2.0 conference, the popular press has been abuzz with sensational articles on how iPhones and iPads are recording your location in a secret file. The article itself misstates some key technical details. For one thing, the database is “hidden” only in the sense that all internal files in iOS are hidden and are visible only on a jailbroken phone; the file itself is accessible only to the root user. For users who make unencrypted backups of their iPhones using iTunes, this location data is exposed on their desktops. One hopes that users do not make unencrypted backups of their iPhone contents on a stranger’s desktop. If, on the other hand, an intruder had control over my account, they could access far more private data than just my location history.
Bill van Melle, Thea Turner, and Eleanor Rieffel contributed to this post
FXPAL’s work on the MyUnity Awareness Platform has received considerable attention from the popular press and the Internet blogosphere in recent weeks, following a nice write-up in MIT’s Technology Review. That article, despite its misleading headline, correctly relays the core motivation for the work: to improve communication among workers in an increasingly fragmented workplace. However, some writers who picked up on that article focused instead on the sensational aspects of having technology monitor people’s behaviors and activities while they are working. They incorrectly described some of the platform’s technical details, overstated what the platform does and what it is able to do with the data it collects, and failed to mention the numerous options we offer users to control their privacy. We thought we should clear up some of these misconceptions and clarify the technical details.
A while ago I wrote about the general threats to one’s privacy posed by search engine histories. It appears that the threat is more than theoretical, as researchers at INRIA and UCI have shown recently. They were able to exploit security weaknesses in Google Web History, the service used to generate personalized suggestions, through what they termed a “Historiographer” attack.
Google appears to be taking the researchers’ warnings seriously, and has modified some of its services to use HTTPS. Not all aspects have yet been secured, however.
The success of de-anonymization efforts, as discussed here, suggests that older anonymization methods no longer work, especially in light of the large amount of publicly available data that can serve as auxiliary information. The quest to find suitable replacements for these methods is ongoing. As one starting point in this broader quest, we need useful definitions of privacy.
It has proven surprisingly difficult to find pragmatic definitions of privacy, definitions that capture a coherent aspect of privacy, are workable in the sense that it is possible to protect privacy defined in this way, and are sufficiently formal to provide means for determining if a method protects this type of privacy and, if so, how well.
On Friday Netflix canceled the sequel to its Netflix prize due to privacy concerns. The announcement of the cancellation has had a mixed reception from both researchers and the public. Narayanan and Shmatikov, the researchers who exposed the privacy issues in the original Netflix prize competition data, write “Today is a sad day. It is also a day of hope.”
The Netflix prize data example is probably the third most famous example of de-anonymization of data that was released with the explicit claim that the data had been anonymized. These examples differ from the privacy breaches discussed by Maribeth Back in her post on ChatRoulette or the issues with Google Buzz discussed as part of Gene Golovchinsky’s post “What’s private on the Web?”. Those examples made sensitive information available directly. In the case of the following three de-anonymization attacks, the data itself was “anonymized,” but researchers were able, with the addition of publicly available auxiliary information, to de-anonymize much of the data.
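To make the mechanism concrete, here is a minimal sketch of a linkage attack of the kind these attacks generalize. All the data and field names below are invented for illustration: an “anonymized” release strips names but keeps quasi-identifiers (ZIP code, birth year, gender), and an attacker joins it against a public auxiliary dataset that still carries names. The real attacks are far more sophisticated (the Netflix de-anonymization matched noisy, partial movie-rating histories), but the core idea of combining released data with auxiliary information is the same.

```python
# Toy linkage attack: re-identify an "anonymized" dataset by joining
# it with public auxiliary data on shared quasi-identifiers.
# All records and field names here are hypothetical.

anonymized_release = [
    {"zip": "94304", "birth_year": 1975, "gender": "F", "diagnosis": "flu"},
    {"zip": "94041", "birth_year": 1982, "gender": "M", "diagnosis": "asthma"},
]

public_records = [  # e.g., a voter roll or social-network profile dump
    {"name": "Alice", "zip": "94304", "birth_year": 1975, "gender": "F"},
    {"name": "Bob", "zip": "94041", "birth_year": 1982, "gender": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")

def key(record):
    """Project a record onto the quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)

# Index the public data by quasi-identifiers, then look up each
# "anonymous" row: a unique match links a name to sensitive data.
index = {key(r): r["name"] for r in public_records}

reidentified = {
    index[key(row)]: row["diagnosis"]
    for row in anonymized_release
    if key(row) in index
}
print(reidentified)  # each name now linked to a "private" diagnosis
```

In real datasets the quasi-identifier combinations need not be unique, but Sweeney-style results showed that a few innocuous attributes identify a surprisingly large fraction of the population, which is why stripping names alone is not anonymization.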