‘We can’t un-FLoC ourselves’: Google’s cookieless ad targeting proposal under fire from ethics researchers, lawmakers for discriminatory potential

Google’s proposed method for enabling personalized ads without targeting specific individuals, FLoC, is up against privacy hurdles delaying trials in Europe. But, as testing of the AI-based technique gets underway here in the U.S. and elsewhere, concerns within tech, academic and even U.S. government circles have grown regarding the potential discriminatory and harmful impacts of FLoC.

Google on March 30 launched trials of FLoC, or Federated Learning of Cohorts, among a relatively small portion of people using its popular Chrome browser in Australia, Brazil, Canada, India, Indonesia, Japan, Mexico, New Zealand, the Philippines and the U.S.

As the origin trial gets underway, data ethics researchers, privacy advocates and even some in ad tech fear that FLoC data could be combined with personally identifiable information to expose people’s webpage visits and interests to nefarious actors (and, yes, advertisers). They also worry that cohort-based targeting could be used to deliberately harm or discriminate against particular groups of people, and that FLoC will only exacerbate the problems inherent in algorithmic categorization of people rather than achieve the ethically sound mission the method is intended to fulfill.

It’s worth noting that ad tech firms and others in the digital ad industry that have fought Google’s dominance for years have an incentive to sabotage its efforts to devise replacements for the tracking and targeting once enabled by the very third-party cookies Google plans to render obsolete.

When Digiday asked Google about its internal efforts to assess the potentially harmful impacts of FLoC algorithms, the company said it has worked on the issue for the past several months. To ensure that sensitive categories are blocked, the company said it has been extensively testing cohort algorithms and reconfiguring them to remove correlations with sensitive categories. Multiple teams inside Google are involved, including its research and machine-learning modeling teams.

Scrutiny from academics and government

During a joint House Energy and Commerce Committee hearing on disinformation held in late March, U.S. Rep. Yvette Clarke asked Google CEO Sundar Pichai specifically about FLoC and how the company has addressed potential bias and disparate impact in its machine learning algorithms. “The longer we delay in this, the more these systems that you’ve created bake discrimination into these algorithms,” said the Democrat from New York. When questioned by Rep. Clarke, Pichai said Google will apply its artificial intelligence principles, which prohibit categorization based on sensitive categories, in developing FLoC.

The FLoC method is one of many ad targeting and measurement approaches Google has pushed through its Privacy Sandbox initiative. It relies on an algorithmic process inside the browser to generate cohorts composed of thousands of people based on the sites they have visited in recent days, the content of pages they viewed and other factors. In an effort to preserve privacy, Chrome assigns FLoC IDs to a cohort, or group of people, without including any individual-level data.  For example, if there is a FLoC for people interested in home design, then everyone in that cohort would be given the same FLoC ID, and those people can also be given IDs for other cohorts into which they have been grouped.
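
As a rough illustration of what that looks like from a web page’s perspective: the FLoC proposal exposes the cohort through an interestCohort() method on the document object, which a participating site could read as sketched below. The exact shape of the returned object and the opt-out behavior shown here are assumptions drawn from the public explainer, not a stable API.

```typescript
// Minimal sketch: reading the browser's FLoC cohort during the origin trial.
// Assumes the document.interestCohort() method described in the FLoC explainer;
// the field names below are illustrative, not a stable API contract.
export {};

interface InterestCohort {
  id: string;      // the cohort ID shared by thousands of browsers
  version: string; // which clustering algorithm/version produced it
}

declare global {
  interface Document {
    interestCohort?: () => Promise<InterestCohort>;
  }
}

async function readCohort(): Promise<InterestCohort | null> {
  // The method is absent when the browser is not in the trial or FLoC is disabled.
  if (typeof document.interestCohort !== "function") return null;
  try {
    return await document.interestCohort();
  } catch {
    // The promise rejects when the page or user has opted out
    // (e.g. via a Permissions-Policy: interest-cohort=() header).
    return null;
  }
}

readCohort().then((cohort) => {
  if (cohort) console.log(`FLoC cohort ${cohort.id} (version ${cohort.version})`);
});
```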

Because FLoC IDs will be assigned to reflect people’s interests based on web pages they have visited during a period of a few days, the IDs will change regularly. Google will allow advertisers to use their own data, machine learning models or predictive analytics to evaluate what cohorts of people they are interested in based on what FLoC IDs imply; however, it is not clear how advertisers will be able to access IDs.
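
For illustration only, here is a hypothetical sketch of the advertiser-side analysis that description implies: counting conversions observed alongside each cohort ID in an advertiser’s own first-party data and ranking cohorts by conversion rate. Every name and number in it is invented, and, as noted above, how advertisers would actually obtain the IDs is still unclear.

```typescript
// Hypothetical sketch of advertiser-side cohort evaluation using first-party data.
// Nothing here is a Google API; it only illustrates ranking cohorts by how often
// visits carrying a given cohort ID led to a conversion on the advertiser's site.

interface Observation {
  cohortId: string;   // FLoC ID observed alongside a first-party page view
  converted: boolean; // whether that visit ended in a purchase or sign-up
}

function rankCohorts(
  observations: Observation[],
  minVisits = 500 // ignore cohorts with too little data to judge
): Array<{ cohortId: string; rate: number }> {
  const totals = new Map<string, { visits: number; conversions: number }>();
  for (const o of observations) {
    const t = totals.get(o.cohortId) ?? { visits: 0, conversions: 0 };
    t.visits += 1;
    if (o.converted) t.conversions += 1;
    totals.set(o.cohortId, t);
  }
  return [...totals.entries()]
    .filter(([, t]) => t.visits >= minVisits)
    .map(([cohortId, t]) => ({ cohortId, rate: t.conversions / t.visits }))
    .sort((a, b) => b.rate - a.rate);
}
```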

Academic researchers at universities including Princeton University and University of Southern California have begun digging into FLoC to understand how the system works and its potential for discrimination and unintended privacy breaches. Privacy advocates have also criticized FLoC.

“What Google gets to say now is we do not have any individual data, but it’s an ephemera,” said Pam Dixon, executive director of nonprofit research group World Privacy Forum. One of the concerns is that it is not clear how people will be able to opt out of FLoC targeting, short of simply not using the Chrome browser. “We can’t un-FLoC ourselves from those categories,” said Dixon, who suggested the method is in some ways more “unfair” than the third-party cookie-based targeting it’s intended to replace because, as others also warn, it could be employed to categorize and target people in discriminatory ways.

The nonprofit Electronic Frontier Foundation has been particularly vocal in its criticism of FLoC, arguing that cohort-based categorization makes existing tracking methods more powerful. “A FLoC identifier is just going to be a new chunk of information that can be appended to what [advertisers] already have,” Bennett Cyphers, a staff technologist at EFF, told Digiday.

Ad tech firms or other entities employing fingerprinting techniques, which Google Chrome itself prohibits, could actually benefit from FLoC IDs, according to Cyphers. Put simply, fingerprinting uses a variety of individual pieces of data about someone’s device to decipher their identity. “The FLoC ID just becomes another data point in their profile and a very powerful one,” he said. In a March EFF post, Cyphers wrote, “If a tracker starts with your FLoC cohort, it only has to distinguish your browser from a few thousand others (rather than a few hundred million).”
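
Cyphers’ arithmetic can be made concrete. Treated as one more fingerprinting signal, a cohort ID shared by a few thousand browsers out of a few hundred million carries roughly 16 to 17 bits of identifying information, leaving a tracker only about 12 bits to recover on its own. The figures below are illustrative, not measurements.

```typescript
// Back-of-the-envelope illustration of the EFF argument: how much identifying
// information a cohort ID hands to a fingerprinter. Numbers are illustrative.

const browserPopulation = 300_000_000; // "a few hundred million" browsers
const cohortSize = 3_000;              // "a few thousand" browsers per cohort

// Bits of identifying information contributed by knowing the cohort.
const bitsFromCohort = Math.log2(browserPopulation / cohortSize);

// Bits a fingerprinter still needs to single out one browser within the cohort.
const bitsStillNeeded = Math.log2(cohortSize);

console.log(`Cohort ID contributes ~${bitsFromCohort.toFixed(1)} bits`);      // ~16.6
console.log(`Only ~${bitsStillNeeded.toFixed(1)} bits left to single someone out`); // ~11.6
```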

Under consideration at the W3C

As Google’s origin trials give researchers more insight into how FLoCs work, their findings will be taken into consideration if the World Wide Web Consortium (W3C) moves in the direction of creating standards for FLoC, Wendy Seltzer, strategy lead and counsel at the W3C, told Digiday. “I think that’s important research, and I would like it to help inform the W3C considerations and questions as work moves to standards work, because we are concerned with the social impact of our technologies,” she said. Seltzer declined to provide further detail, in part because the Privacy Sandbox process hosted by the W3C remains in the early stages.

Not only are researchers concerned that FLoC will supercharge existing identifiable profiles on people; some also worry that people with malicious intent could use FLoC targeting to attack vulnerable groups.

In a January W3C post about FLoC, Basile Leparmentier, a senior machine learning engineer at ad tech firm Criteo, referenced an extreme scenario in which LGBTQ youth in Ireland were assaulted by people posing as LGBTQ on dating apps. “The attacker can easily emulate the internet browsing history of a member of the group he is willing to harm and see the FLoC he/she has been added to,” he wrote, noting that bad actors could combine that FLoC ID with personally identifiable information about anyone who shares it. “The ‘attacker’ can then target this FLoC ID in any specific way they wish, even though they don’t have access to any specific user.”

Google details approach to blocking sensitive categories

The week of March 28, Google submitted a paper to the W3C group focused on development and testing of its Privacy Sandbox ad technologies, detailing the company’s approach to ensuring that cohorts produced by FLoC cannot be correlated with any sensitive attribute. For example, the paper explained how Google plans to use a sensitivity threshold intended to block as few cohorts as possible while still providing strong privacy protections.

Google’s definition of what web pages are too sensitive to be used to create cohorts is based on the company’s existing ad policies, which prohibit ad targeting related to race or ethnicity, religious belief, political affiliation, sexual interest, or to categories reflecting personal hardship such as health or medical issues or criminal record.
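
Taken together, those two paragraphs describe a check that can be sketched roughly as follows. This is an illustrative simplification, not Google’s actual algorithm: it assumes each cohort can be scored by how much more often its members visited pages in a given sensitive category than the general population did, and blocked when that gap exceeds a threshold.

```typescript
// Illustrative sketch (not Google's implementation) of a sensitivity-threshold
// check: block a cohort if its members visit sensitive-category pages markedly
// more often than the general population, so the cohort ID can't act as a proxy
// for that category.

interface CohortStats {
  cohortId: string;
  // Fraction of the cohort's members whose recent browsing touched a page
  // classified under the given sensitive category (e.g. health, religion).
  sensitiveVisitRate: Record<string, number>;
}

function shouldBlockCohort(
  cohort: CohortStats,
  populationRate: Record<string, number>, // same rates measured over all browsers
  threshold = 0.1                         // hypothetical tolerance
): boolean {
  return Object.entries(cohort.sensitiveVisitRate).some(
    ([category, rate]) => rate - (populationRate[category] ?? 0) > threshold
  );
}

// A blocked cohort would simply not be exposed; affected browsers could be
// handed an empty or generic cohort ID instead (again, an assumption).
```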

Google’s own approaches to AI ethics have been fraught with controversy over the years. Following outcries over certain people it chose to sit on its artificial intelligence ethics board in 2019, Google shut it down. More recently, Google sparked a backlash when the company fired AI ethics researcher Timnit Gebru after demanding she withdraw a research paper critical of Google’s AI technologies.

Google’s paper describing its approach to sensitive categories “sidesteps the most pressing issues,” Cyphers wrote in a March 30 post on the EFF site. “It’s highly likely that certain demographics are going to visit a different subset of the web than other demographics are, and that such behavior will not be captured by Google’s ‘sensitive sites’ framing,” he said.

“Meanwhile, tracking companies are well-equipped to gather traffic from millions of users, link it to data about demographics or behavior, and decode which cohorts are linked to which sensitive traits. Google’s website-based system, as proposed, has no way of stopping that.”
