Ethical Ways to Take Data from Users

Open Netflix and you’ll be spoiled for choice. According to Flixable, a searchable Netflix database, the platform has nearly six thousand titles. They aren’t yet all about murder or cakes, so guiding users to the content they actually want to see has always been one of the platform’s biggest challenges. Blockbuster used to divide its shelves by genres, and each of those shelves could show hundreds of DVDs at a time. Netflix is limited to the dozen or so thumbnails that can fit comfortably onto a screen before the user has to scroll. It needs to be able to choose those thumbnails carefully.

The solution has been to use data. Netflix lets individual users on an account create their own profiles. It keeps track of the shows watched by each of those profiles and measures how many minutes they watched them.

That gives Netflix some idea of a user’s taste, but the information is limited. The streaming data only tells Netflix what a user has watched. It doesn’t tell the company how much they enjoyed watching it. Time spent streaming is a poor proxy for levels of enjoyment.

So Netflix turned to another way of gathering data: it invited users to award up to five stars to each show they watched. The more stars they awarded to a show, the more they liked it. The company could then use those reviews to suggest other content the user might like. Give five stars to a political comedy, for example, and Netflix would offer another political comedy. Until April 2017, users could see how well a suggestion matched their tastes by the number of stars associated with the show. A show that carried a five-star ranking was a good match; a show with a one-star ranking was a poor match.

That system, though, was confusing. Users didn’t understand that the star ranking they saw under shows was a measure of fit rather than a measure of quality based on the votes of other users, as on Amazon or TripAdvisor. So Netflix changed the process. Instead of showing stars to indicate a match, it used a percentage, and instead of asking users to award stars, it asked them to give shows a simple thumbs up or a thumbs down. The result wasn’t just clearer matching. It was also a 200 percent rise in ratings activity.
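Netflix hasn’t published the details of its matching algorithm, but as a rough sketch of the idea, a percent match can be derived from nothing more than thumbs-up and thumbs-down history. Everything in the snippet below—the genre tags, the sample ratings, and the scoring rule—is an illustrative assumption, not Netflix’s actual method:

    # A minimal sketch: estimate a "percent match" from thumbs-up/thumbs-down history.
    # The genre tags, sample ratings, and scoring rule are illustrative assumptions,
    # not Netflix's actual recommendation algorithm.

    def percent_match(user_ratings, show_genres, catalog):
        """Score a show 0-100 by how often the user liked shows sharing its genres."""
        liked = seen = 0
        for title, thumbs_up in user_ratings.items():
            if not catalog.get(title, set()) & show_genres:
                continue  # ignore rated shows with no genre overlap
            seen += 1
            if thumbs_up:
                liked += 1
        if seen == 0:
            return 50  # no signal yet, so report a neutral match
        return round(100 * liked / seen)

    catalog = {
        "Veep": {"comedy", "politics"},
        "House of Cards": {"drama", "politics"},
        "The Crown": {"drama", "history"},
    }
    ratings = {"Veep": True, "The Crown": False}  # True means thumbs up
    print(percent_match(ratings, {"politics"}, catalog))  # prints 100

However crude, a score like this is easier for users to read as “how well does this fit me” than a row of stars that looks like an average review.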

For businesses wondering how to ethically collect and use data from their users, Netflix’s move provides a useful case study. The company needs customers to tell it what sort of movies and shows they like. By offering a simple tool for contributing that data—and few things are simpler than a basic thumbs up or thumbs down—it was able to encourage users to provide that personal information. The use of that data has been clear too. The company hasn’t sold its customers’ preferences to other firms; it has only used them to make recommendations and help its customers get more out of the platform.

Netflix’s data collection and usage stands in stark contrast to that of Facebook, the biggest collector and seller of data on the Internet. While Netflix makes money by selling access to content on a subscription basis, Facebook’s business model is to offer the content for free but sell access to the data generated by visitors. It’s as though Netflix were letting people watch movies for free but selling the thumbs-up-thumbs-down data to movie studios to help them improve their chances of making a hit or better target the marketing of their movies.

With more than 1.5 billion users visiting the site every day and telling the company where they are, what they’re doing, what they like, and who they know, Facebook has massive amounts of personal, identifiable data to sell. It also has a reputation for taking that data surreptitiously and not guarding it closely.

How Steve Bannon Exposed Facebook’s Unethical Data Use

The extent of Facebook’s sloppiness in both collecting and protecting user data became clear in March 2018, when Christopher Wylie, a former Cambridge Analytica employee, blew the whistle on the company’s use of Facebook data.

Cambridge Analytica was a political consulting firm. Trump advisor Steve Bannon was a vice-president, and the company was funded by conservative donors Rebekah and Robert Mercer. The company’s work was performed by SCL Group, a British PR firm that claimed expertise in “influence operations” and “psychological warfare.” To conduct those operations, the group needed data. It needed to know exactly who would be susceptible to which kinds of messages—and it had no data of its own.

The data it used came from a Cambridge University data scientist called Alexandr Kogan. Kogan had created an app in 2014 called “This Is Your Digital Life.” A personality quiz distributed on Facebook, the app appeared to users as little more than a piece of entertainment. There was no shortage of similar quizzes at the time, inviting users to answer a series of questions that would tell them what house they would belong to at Hogwarts or which kind of animal they were. What wasn’t clear to the people who took those quizzes was that they were also tools for collecting data. In effect, the users were completing data marketing questionnaires—and the publisher of the app would have access to all of the other Facebook data associated with each user’s profile.

Alexandr Kogan’s app, though, went even further. In addition to collecting answers and sucking up user data, the app was also able to burrow into the profiles of the contacts of the people who took the quiz. Those users might never have heard of the “This Is Your Digital Life” quiz. They hadn’t consented to Kogan accessing their data, but because someone they knew on Facebook had taken it, all of their personal information was now in the hands of a third party.

Altogether, some 270,000 people took Kogan’s quiz, but they gave him access to the data of some 87 million people—an average of more than 300 contacts for every person who answered the questions. When Kogan gave Cambridge Analytica access to his app, the company would have had access to all of that personal data—enough for it to run its misinformation campaigns.

None of that should have been possible. Kogan had told Facebook that the app was for academic purposes. It wasn’t, but Facebook didn’t check. Selling that data to a third party was a breach of Facebook’s terms, but there was little Facebook could do to prevent the sale from taking place.

When Christopher Wylie told The Guardian newspaper how Cambridge Analytica had come by the data that allowed it to push political messages during the 2016 presidential election, two things became clear: first, users were reminded of the vast amount of data that Facebook knew about them; and second, they were shown just how loosely Facebook guarded that data. Almost anyone could access it and, through consenting users, access their friends’ data too, even when those friends hadn’t provided consent.

It was a huge betrayal of trust.

Facebook Makes Things Better

After the Cambridge Analytica scandal broke, users started wondering what exactly Facebook knew about them. What they found surprised them. Facebook confirmed to Bloomberg that it scans content in Messenger, although that appears to be more to check for questionable content such as child pornography than to collect data. More troubling was that Facebook’s Android app allowed the company to track phone data. It couldn’t listen in on phone calls or read texts, but it could record which numbers a user called or texted, how frequently, and for how long. Users might once have given Facebook permission to read their contact lists, but they were more surprised to learn that Facebook was also checking what they were doing on their phones when they weren’t using the app.

The result of all these scandals was that so many people chose to download their data from Facebook to see what the company knew about them—and possibly delete their accounts—that there were backlogs and delays in receiving that data.

Faced with an angry reaction from Congress and with European governments threatening new regulations, Facebook responded to the Cambridge Analytica scandal by trying to plug the gaps in its data protection. The company rescinded the ability of third parties to take data from the contacts of consenting users without those contacts’ permission (though companies that had taken that data before 2015 would still have it).

It also took a stricter attitude towards data collected by third parties. The company wound down the “partner categories” in its advertising programs that allowed marketers to supplement Facebook’s data with their own. Facebook couldn’t control how third parties collected their data, so it excluded that data from its site. It also made it harder for advertisers to use “custom audiences”—to upload their own email lists and match them to the user data on Facebook. The company was worried that those firms might not have the consent of all the owners of the email addresses.

Those moves protected Facebook from data unethically collected by other companies being used on its platform. But they also punished marketers who had collected data ethically, and they made advertisers on Facebook more dependent on the company’s own data. In other words, Facebook’s changes punished other companies for doing what it had itself been doing, while doing little to change its own habit of taking more information from users than they wanted to give or were aware of giving.

How Other Industries Collect Data

Until the Cambridge Analytica scandal, users might have been willing to shrug away social media’s data excesses. People knew that if they weren’t paying for the product, they were the product. As long as they knew what data they were supplying, and as long as that data was only being used by the company they were giving it to—and only to serve better ads or direct them towards content or offers they might enjoy—that was a deal users could accept. Millions have accepted that deal.

There have been few objections to Amazon using browsing and purchasing data to suggest more books or products. Google’s habit of showing ads from a site you’ve visited and left feels creepy, like a salesman who follows you out of the store and down the street, but the company hasn’t come under the same pressure as Facebook.

The problem isn’t just that Facebook knows far more about greater numbers of identifiable individuals than other companies, and it isn’t just that it doesn’t protect that data very well. It’s that Facebook has been helping itself to more data than it was entitled to take and has allowed everyone else to help themselves to it too. That was a betrayal of trust, and it felt highly unethical.

So how could Facebook and other social media companies act? What could they do that would allow them to collect the data they need to help their advertisers while still supplying access for free?

One place to look for ideas—other than Netflix—is the market research industry.

Data marketing depends on collecting identifiable information; advertisers want to be able to target their marketing to the level of individual tastes. Market research, by contrast, keeps the data it collects unidentifiable. Rather than learning about individuals, market research learns about the behaviors and preferences of groups.
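The distinction is easy to see in practice. As a rough sketch—using invented survey data and field names, not any real research panel—a market-research style report keeps only aggregate counts per segment, drops anything that could identify a respondent, and even suppresses segments too small to stay anonymous:

    # A rough sketch of group-level reporting, using invented survey data.
    # Names, emails, and other identifying fields never leave the aggregation step,
    # and segments too small to stay anonymous are suppressed.

    from collections import Counter

    responses = [
        {"name": "A. Smith", "email": "a@example.com", "age_band": "25-34", "prefers": "comedy"},
        {"name": "B. Jones", "email": "b@example.com", "age_band": "25-34", "prefers": "drama"},
        {"name": "C. Lee", "email": "c@example.com", "age_band": "35-44", "prefers": "comedy"},
    ]

    def aggregate(rows, min_group_size=2):
        """Count preferences per age band, dropping bands with too few respondents."""
        counts = Counter((row["age_band"], row["prefers"]) for row in rows)
        band_sizes = Counter(row["age_band"] for row in rows)
        return {key: n for key, n in counts.items() if band_sizes[key[0]] >= min_group_size}

    print(aggregate(responses))
    # {('25-34', 'comedy'): 1, ('25-34', 'drama'): 1} -- the lone 35-44 respondent is suppressed

A data marketer, by contrast, would keep the row-level records, names and all, because the whole point is to reach the individual.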

In an article published on The Verge last year, Alexandra Samuel, a tech writer and former data marketer, explained that the ability to grab data from the contacts of consenting users on social media platforms was common knowledge in the marketing industry. A company pitched her firm a tool called “Wisdom” that was described as a data source based on 17.5 million anonymous, opted-in Facebook users. In fact, when questioned, the company admitted that it had managed to accumulate just 52,600 installs. Each of those installs, though, gave it access to an average of 332 friends—roughly 17.5 million people in all. What Alexandr Kogan had done with his app was neither new nor surprising.

Nor were data collection “quizzes” and apps that grabbed friends’ data the only underhand methods that marketing companies were using to take data without consent. In her article, Samuel quotes Mary Hodder, a privacy consultant, describing an idea that came up during a project her company performed for the Hard Rock Café in Las Vegas.

“They wanted to put wands in the ceiling to collect the IMEI [identification] numbers of every phone that went by, map everywhere they went in the casino or on the property, and map them in the hallways up to their rooms. And then they could do a reverse lookup on IMEI numbers because there are companies that aggregate IMEI numbers, and as soon as they figured out who the person was, they could send them offers, text them offers, and the people had not opted in. So they were basically just intercepting your phone, and figuring out how to send messages to you in one form or another.”

Hodder explains that her colleagues at the meeting in which this surreptitious data collection was discussed saw nothing wrong with it. “That’s how normal it was to harvest data and use it to target individual ads, long before Cambridge Analytica got in on the action,” Samuel writes.

Her own company, however, did object to these techniques. Vision Critical had come from the market research industry, which is much older than the data marketing industry. When the question of taking data from friends of users first came up, the company’s founder dismissed it immediately. At each stage of the development of the company’s Facebook app, they made clear that they were using it to gather data rather than to tell someone which common room they would have used at Hogwarts or which enclosure would be theirs at the zoo.

ESOMAR, an international association for market research, social research, and data analytics, dates back to 1947 and drew up its first Code of Marketing and Social Research Practice the following year. The latest version dates to 2008 and has been combined with the code used by the International Chamber of Commerce. It makes clear that successful market research depends on public confidence, “that it is carried out honestly, objectively and without unwelcome intrusion or disadvantage to its participants.” The “key fundamentals” of the code include following all relevant national and international laws; behaving ethically and not doing anything that might damage the reputation of market research; and taking “special care” when conducting research among children and young people.

Most importantly, though, the cooperation of respondents must be “voluntary and must be based on adequate, and not misleading, information about the general purpose and nature of the project when their agreement to participate is being obtained and all such statements shall be honoured.” Market researchers, the code also states, “shall never allow personal data they collect in a market research project to be used for any purpose other than market research.”

Regarding data security, market researchers are required to “ensure that adequate security measures are employed in order to prevent unauthorised access, manipulation to or disclosure of the personal data. If personal data are transferred to third parties, it shall be established that they employ at least an equivalent level of security measures.”

The market research code, in other words, covers just about all of the topics in which Facebook and other digital firms have run into trouble. It requires researchers to be honest and transparent, and to safeguard participants’ data. The digital data collection used in data marketing currently has no similar standards.

“The whole time, it felt like we were swimming against the tide by following old-school standards for transparency and accountability in how we handled data,” Alexandra Samuel wrote on The Verge. “I hate to admit how many times I pitched my colleagues on some clever way of incentivizing people to connect to Facebook, based on some scheme or app I’d just stumbled across, only to be reminded that it would violate our data or privacy policies.”

How Data Marketing Needs to Change

It may well be too much to expect data marketers to come up with a code of conduct similar to that followed by market research firms. The essential benefit of data marketing is its granularity, the ability of sellers to personalize their messages to the level of individuals. Users will give up some of that data willingly in return for free access to a platform and its content. They’ll balk at giving up other data, and they’re likely to hesitate to allow third parties they don’t know to learn which television shows they like or how often they call their mother. The ability to gain that information surreptitiously—through quizzes, games, monitoring of digital behavior, and connections to contacts—is likely to remain too easy, too tempting, and too valuable for data marketing firms to give up voluntarily.

Social media firms could copy practices used in other industries. Retailers have the ability to keep track of their customers’ purchasing habits through both purchase history and loyalty cards. Customers give up a certain amount of privacy in return for the chance of landing targeted discounts. If advertising on social media platforms made clear to users why they were being targeted, and included a reward in return for allowing that targeting, users would be likely to find the use of their data more acceptable.

Ultimately, though, data collection and protection will become more ethical only as a result of pressure from two directions.

The first is governmental. The European Union’s General Data Protection Regulation already requires companies to obtain informed consent for the use of personal data. The threat of fines as high as €20 million or 4 percent of annual global revenue has led firms to adhere to the regulations strictly. It has also led to a large decline in the effectiveness of email marketing. The EU is now threatening to treat Internet communications firms such as Google and Facebook like traditional telecoms companies.

“It cannot be right that a company providing traditional telecommunications services has to meet certain regulatory requirements, like those concerning data protection, while a company providing comparable services over the web does not,” Jochen Homann, the head of Germany’s telecoms regulator told the Financial Times last year. That regulation would force companies to be more transparent and to follow stricter regulations regarding data security. Currently, the EU is waiting to see whether the industry can police itself. If it fails to do so, it’s likely that tight regulations will follow.

Alternatively, users could continue drifting away. Facebook’s growth has already plateaued. Between July 20, 2018 and December 21, 2018, as the data scandals deepened, Facebook’s stock price fell from nearly $210 to just under $125. (It’s since risen to around $167, still a long way off its peak.) If users no longer trust the platform, they’ll leave. And without those users’ data, the company will suffer.

They’ll also leave faster if they have somewhere to go. It’s never easy to start a new social media platform. Incumbents have the advantage of both familiarity and a network of contacts already in place. But just as ICQ and Instant Messenger had millions of users and are now Internet history, so Facebook could be in trouble too if an alternative platform offers similar networking benefits but better data safeguards. As Alexandra Samuel told Sarah Steimer of the American Marketing Association:

“At this point, the real check [would be] the availability of competitive platforms that behave better. There certainly is a market opportunity for a social media platform to build a user base on the strength of respect for privacy, though so far none of those efforts have really taken off. I do believe it’s just a matter of time before that happens. The fear of that alternative may motivate Facebook and others to make some real changes.”

Companies like Netflix, with their subscription models and no-advertising policies, have it easy when it comes to data protection. The question is whether a future Facebook will take a similar approach and change the way data is collected and safeguarded.
