What’s going on when people spill their guts on social media sites? Or on the other hand, when they loudly reject the public health advice of their own governments?
It’s risky to generalize, but I tend to agree that there are stereotypical differences across the world in how the public rations out its precious trust to governments versus businesses.
The cliché goes that Americans typically distrust government but have faith in the invisible hand of the markets, and so they “trust” businesses more (the shudder quotes are deliberate when we’re using such broad generalizations). Conversely, continental Europeans tend to “trust” governments more than business. Australia, Canada and the United Kingdom fall somewhere between the poles.
Now, trust is a moveable feast, but this article isn’t the place to dissect how trust in social institutions is degrading. Rather, I’d like to debunk “trust” in social media. Little or no conscious trust actually enters into the decision to share data with these platforms. In fact, it’s not so much a decision as a reflex.
Very few users comprehend what’s going on in social networking. We should suspect a priori that the relationship between the giant digital properties and their members is not exactly balanced. Many of the brightest data scientists in the world are employed by Facebook, Google and Twitter, and for the most part they’re not there to find a cure for cancer but to figure out cleverer ways to target advertising.
Facebook is notoriously clever at tricking users into divulging personal data. It’s pretty widely accepted now (isn’t it?) that Facebook takes deliberate steps to get users addicted; see Shoshana Zuboff’s epic The Age of Surveillance Capitalism. My own favorite example of Facebook’s trickery is how they gamified the training of their face recognition algorithms, by getting users first to tag each other, and later to confirm the algorithm’s tag suggestions. Before that, facial recognition routines were calibrated laboriously against relatively small image galleries. Facebook changed the game, not only by capitalizing on the billions of pictures innocently entrusted to it by its members, but also by crowd-sourcing the matching of names to faces. It was genius.
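To make the mechanism concrete, here is a minimal sketch of that crowd-sourced labeling loop. It is emphatically not Facebook's pipeline; the face "embeddings", the names and the nearest-centroid matcher are all invented for illustration. The structural point is that every tag a user supplies or confirms becomes a free labeled training example.

```python
# A minimal sketch of the crowd-sourced labeling loop. This is NOT Facebook's
# actual pipeline: the 128-dimensional "embeddings", the names and the
# nearest-centroid matcher are all invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Pretend each uploaded photo has already been reduced to a face embedding.
# A user's tag supplies the label ("who is this?") for free.
tagged_faces = [
    (rng.normal(loc=0.0, size=128), "alice"),
    (rng.normal(loc=0.0, size=128), "alice"),
    (rng.normal(loc=5.0, size=128), "bob"),
]

def centroids(labeled):
    """Average the embeddings per person: a crude one-template-per-person model."""
    by_name = {}
    for emb, name in labeled:
        by_name.setdefault(name, []).append(emb)
    return {name: np.mean(embs, axis=0) for name, embs in by_name.items()}

def suggest_tag(embedding, model):
    """Propose the nearest known person for a new, untagged face."""
    return min(model, key=lambda name: np.linalg.norm(embedding - model[name]))

model = centroids(tagged_faces)
new_face = rng.normal(loc=5.0, size=128)
suggestion = suggest_tag(new_face, model)  # "bob"

# One click on "yes, that's Bob" confirms the suggestion, and the confirmed
# pair flows straight back into the training set as free labeled data.
tagged_faces.append((new_face, suggestion))
model = centroids(tagged_faces)
```

Each confirmation costs the user a single click, yet grows the labeled gallery; scaled to billions of photos, that is the calibration resource no laboratory image collection could match.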
So that’s one of the best examples of Facebook’s direct (albeit covert) collection of personal data. But they’re also supremely adept at mining and refining data from the matrix they have cultivated. The closely guarded People You May Know (PYMK) algorithm is the state of the art in indirect data collection. Some of the best digital experts in the world ― like investigative journalist Kashmir Hill ― have been stumped by how Facebook manages to make its connections and generate friend suggestions.
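How might such indirect collection work, even in principle? Nobody outside Facebook knows PYMK's actual inputs, so the sketch below shows only the textbook baseline, ranking strangers by mutual friends (triadic closure), not Facebook's algorithm. The graph and the names are invented.

```python
# A toy common-neighbors link predictor. PYMK's real signals are a closely
# guarded secret; this is only the textbook graph baseline.
from collections import Counter

friends = {
    "ann": {"bob", "cho"},
    "bob": {"ann", "cho", "dee"},
    "cho": {"ann", "bob", "dee"},
    "dee": {"bob", "cho"},
}

def people_you_may_know(user, graph, top_n=3):
    """Rank non-friends by how many friends they share with `user`."""
    scores = Counter()
    for friend in graph[user]:
        for candidate in graph[friend]:
            if candidate != user and candidate not in graph[user]:
                scores[candidate] += 1
    return scores.most_common(top_n)

print(people_you_may_know("ann", friends))  # [('dee', 2)]
```

Even this naive two-hop scan manufactures new personal data: "ann probably knows dee" is an inference about ann that she never volunteered. That PYMK's results baffle experts suggests its real inputs go far beyond the friendship graph alone.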
What’s to be done about all this? I am not a lawyer, but I have two regulatory suggestions. I’d welcome feedback from legal and regulatory experts.
First, most places in the world (and increasingly also some U.S. states) have technology-neutral data privacy laws that put limits on the collection of personal data. These laws are largely blind to the manner of collection. If social media companies extract insights about identifiable users, then that’s personal data and it’s being collected. Therefore privacy laws apply, especially when individuals are unaware of the collection.
So in my view, legal action could be taken by Data Protection Authorities (DPAs) in any number of jurisdictions against the unbridled use of algorithms in social networks (including Friend Suggestions, Tag Suggestions and People You May Know) to surreptitiously collect personal data. Anna Johnston and I have written at length about Facebook’s particular clash with the Collection Limitation principle; see our chapter in the Encyclopedia of Social Network Analysis and Mining.
Social media’s algorithmic collection of personal data is but one example. An increasingly prevalent feature of digital businesses is the use of machine learning and analytics to extract insights, such as financial scores and insurance risks, from our digital exhaust. The synthesis of personal data ― or what the Office of the Australian Information Commissioner calls Collection by Creation ― falls within the scope of most data privacy law. If a business uses algorithms instead of questionnaires to get to know you, then privacy principles still apply.
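To see why "collection by creation" is the right frame, consider a deliberately toy example. Nothing below reflects any real scoring model; the signals, categories and weights are all invented. What matters is that the number coming out is personal information the individual was never asked for.

```python
# Illustrative only: a made-up "risk score" synthesized from digital exhaust.
# The signals, categories and weights are invented; no real scoring model is
# being described. The point is that the output is new personal information.
def synthesize_risk_score(events):
    """Derive an attribute the user never disclosed from behavioral signals."""
    late_night = sum(1 for e in events if 1 <= e["hour"] <= 4)
    gambling = sum(1 for e in events if e["category"] == "gambling")
    # Arbitrary weighting, capped at 100, purely for the sake of the example.
    return min(100, 10 * gambling + 5 * late_night)

clickstream = [
    {"hour": 2, "category": "gambling"},
    {"hour": 14, "category": "news"},
    {"hour": 3, "category": "gambling"},
]
score = synthesize_risk_score(clickstream)  # 30: a fact about the user that
                                            # was created, not asked for
```

A questionnaire asking "do you gamble late at night?" would plainly be collecting personal data; running the same question through an algorithm should not change the legal analysis.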
My second regulatory suggestion is good old-fashioned consumer law. Social media magnates like to claim that today’s users are sophisticated enough to “know” that there’s no free lunch on the Internet. Users are supposed to “know” that their data is being traded with businesses as part of a deal for services. But what if typical users don’t actually know how their data is used? Given the opacity of PYMK, very few people can have figured this out.
Moreover, what if the social media bosses know that users don’t know? Being aware of ― nay, exploiting ― the information asymmetry while claiming that users are aware of what they’re getting into would be deceptive conduct, wouldn’t it? The purported services-for-data deal is a one-sided sham if users don’t enter into it freely and knowingly.
Some of the richest people in history have arguably made their fortunes on the back of users’ ignorance about how data flows in the Digital Economy. The most lucrative undercover digital economy isn’t the Dark Web; it’s social media, with its tricky personal data practices and synthetic collection algorithms hiding in plain sight.