The abstract of my Computers, Freedom, Privacy 2014 conference presentation.
Synopsis
Many Big Data and online businesses proceed on a naive assumption that data in the "public domain" is up for grabs; technocrats are often surprised that conventional data protection laws can be interpreted to cover the extraction of PII from raw data. On the other hand, orthodox privacy frameworks don't cater for the way PII can be created in future from raw data collected today. This presentation will bridge the conceptual gap between data analytics and privacy, and offer new dynamic consent models to civilize the trade in PII for goods and services.
Abstract
It's often said that technology has outpaced privacy law, yet by and large that's just not the case. Technology has certainly outpaced decency, with Big Data and biometrics in particular becoming increasingly invasive. However, the OECD data privacy principles set out over thirty years ago still serve us well. Outside the US, rights-based privacy law has proven effective against today's technocrats' most worrying business practices, based as they are on taking liberties with any data that comes their way. For example, regulators in Australia, the Netherlands, Korea and elsewhere found that when Google's StreetView cars collected Personal Information from unencrypted Wi-Fi networks, there was a privacy breach, regardless of the fact that the data was in the 'public domain'. And in Europe, Facebook was forced to shut down its photo tagging service and delete all facial recognition templates, because users were not even aware of the network's automatic biometric identification, much less had they consented to it.
So to borrow from Niels Bohr, it appears that technologists who are not surprised by data privacy do not understand it.
The cornerstone of data privacy in most places is the Collection Limitation principle, which holds that organizations should not collect Personally Identifiable Information beyond their express needs. It is the conceptual cousin of security's core Need-to-Know and Least Privilege principles, and the best starting point for "Privacy-by-Design" (that is, ICT architecture should begin with an analysis of what PII is really needed for the mission, and then restrict itself accordingly). The Collection Limitation principle is technology neutral and thus blind to the manner of collection. Whether PII is collected directly by questionnaire or indirectly via biometric facial recognition or data mining, data privacy laws apply.
It's not for nothing that we refer to "data mining". But few of those unlicensed data gold diggers seem to understand that the synthesis of fresh PII from raw data (including the identification of anonymous records like photos) is merely another form of collection. The real challenge in Big Data is that we don't know what we'll find as we refine the techniques. With the best will in the world, it is hard to disclose in a conventional Privacy Policy what PII might be collected through synthesis down the track. The age of Big Data demands a new privacy compact between organizations and individuals. High-minded organizations will promise to keep people abreast of new collections and will offer ways to opt in, and out, and in again, as the PII-for-service bargain continues to evolve.