
Here’s the File Clearview AI Has Been Keeping on Me, and Probably on You Too


Edward Ongweso Jr February 28, 2020

Photo via Clearview AI report produced under the CCPA

After a recent, extensive, and rather withering bout of bad press, the facial recognition company Clearview AI has changed its homepage, which now touts all the things it says its technology can do, and a few things it can’t. Clearview’s system, the company says, is “an after-the-fact research tool. Clearview is not a surveillance system and is not built like one. For example, analysts upload images from crime scenes and compare them to publicly available images.” In doing so, it says, it has the power to help its clients—which include police departments, ICE, Macy’s, and the FBI, according to a recent BuzzFeed report—stop criminals: “Clearview helps to identify child molesters, murderers, suspected terrorists, and other dangerous people quickly, accurately, and reliably to keep our families and communities safe.”

What goes unsaid here is that Clearview claims to do these things by building an extremely large database of photos of ordinary U.S. citizens, who are not accused of any wrongdoing, and making that database searchable for the thousands of clients to whom it has already sold the technology. I am in that database, and you probably are too.

If you live in California, under the rules of the newly enacted California Consumer Privacy Act, you can see what Clearview has gathered on you, and request that they stop it.

Do you work at Clearview or one of its clients? We'd love to talk to you. From a non-work device, contact Anna Merlan at [email protected], or Joseph Cox securely on Signal at +44 20 8133 5190, on Wickr at josephcox, via OTR chat at [email protected], or by email at [email protected].

I recently did just that. In mid-January, I emailed [email protected] and requested information on any of my personal data that Clearview obtained, the method by which they obtained it, and how it was used. (You can read the guidelines they claim to follow under the CCPA here.) I also asked that all said data be deleted after it was given to me and opted out of Clearview's data collection systems in the future. In response, 11 days later, Clearview emailed me back asking for “a clear photo” of myself and a government-issued ID.

“Clearview does not maintain any sort of information other than photos,” the company wrote. “To find your information, we cannot search by name or any method other than image. Additionally, we need to confirm your identity to guard against fraudulent access requests. Finally, we need your name to maintain a record of removal requests as required by law.”

After a moment of irritation and a passing desire not to give these people any more of my information, I emailed Clearview a photo of my work ID badge and a redacted copy of my passport. About a month went by, and then I got a PDF, containing an extremely curious collection of images and an explanation that my request for data deletion and opt-out had been processed. “Images of you, to the extent the [sic] we are able to identify them using the image that you have shared to facilitate your request, will no longer appear in Clearview search results,” the “Clearview Privacy Team” wrote.

The images themselves are indeed all photos of me, ones that I or friends have put on social media, and they are exceedingly odd. (The source of them is odd, not my face, although, that too.)

The images seen here range from around 2004 to 2019; some are from my MySpace profile (RIP) and some from Instagram. What’s curious is that, according to Clearview, many of them weren't scraped from social media directly, but from a collection of utterly bizarre and seemingly random websites.

"You may have forgotten about the photos you uploaded to a then-popular social media site ten or fifteen years ago... but Clearview hasn't," Riana Pfefferkorn, associate director of surveillance and cybersecurity at the Stanford Center for Internet and Society, wrote in an email. "A lot of data about individuals can quickly become 'stale' and thus low-value by those seeking to monetize it. Jobs, salaries, addresses, phone numbers, those all change. But photos are different: your face doesn't go stale."

The “Image Index” lists where the photos were obtained; the sites include Insta Stalker—one of dozens of sketchy Instagram scrapers available online—an enraged post someone wrote accusing me of yellow journalism, and the website of an extremely marginal conspiracy theorist who has written about me a handful of times.

Nicholas Weaver, a senior researcher at the International Computer Science Institute at UC Berkeley, said that the response "gives you an insight into the various sources being scraped." He noted that Clearview is not just obtaining images from social media sites like Instagram themselves, but also from other sites that have already scraped Instagram, like Insta Stalker.

The data presented here don’t necessarily confirm that Clearview is able to accurately do what it claims to: allow someone to upload a photo of a subject and return publicly available photos of that person. But I do know, thanks to the CCPA, who Clearview plans to share photos of my face with: “Clearview users: law enforcement, security, and anti-human trafficking professionals,” as they write in their explanation of how they intend to comply with the CCPA.

There’s also this baffling addendum, which seems to suggest that Clearview is going through a security penetration test at the moment: “Occasionally and for limited purposes and durations, third party service providers can use Clearview’s search tools to assess their accuracy and verify our cybersecurity performance.”

What is clear is that this information is available to far more people than Clearview likes to acknowledge, and that they have future, as-yet-unannounced plans for their photos of your face. Reporters at Gizmodo were recently able to download a version of Clearview’s app, which they found, they report, “on an Amazon server that is publicly accessible.”

“Other bits of code appear to hint at features under development,” the Gizmodo reporters wrote, “such as references to a voice search option; an in-app feature that would allow police to take photos of people to run through Clearview’s database; and a ‘private search mode,’ no further descriptions of which are available through surface-level access.”

Adam Schwartz, senior staff attorney at the Electronic Frontier Foundation (EFF), wrote in an email to VICE: "EFF is disturbed that Clearview AI has made faceprints of people without their consent, and is selling personal information based on these faceprints to police departments and others. This is why we need privacy laws like the Illinois Biometric Information Privacy Act, which requires consent before collection of biometrics, and the California Consumer Privacy Act, which empowers consumers to access the personal information a business has collected about them."

Jeramie D. Scott, senior counsel at the Electronic Privacy Information Center, wrote in an email: "The face search results show exactly why we need a moratorium on face surveillance. In a democratic society, we should not accept our images being secretly collected and retained to create a database to be used, disclosed, and analyzed at the whim of an unaccountable company. The threat to our Constitutional rights and democracy is too great. Our participation in society should not come with the price tag of our privacy."

Clearview has still declined to release a full list of the agencies who use their product. It’s also claimed that the app has been “tested for accuracy and evaluated for legal compliance by nationally recognized authorities,” without citing who those authorities are. And it, of course, represents a breach of privacy more extreme than anything any technology company has ever produced. But at least, if you live in California, you can see what they’ve got about you, and take their word for it that they’ll stop.

Clearview's lawyer and PR spokesperson did not immediately respond to questions asking how many requests the company has received, or how many records it has deleted under applicable laws.

Update: This piece has been updated to include comment from Jeramie D. Scott and Riana Pfefferkorn.

Additional reporting by Joseph Cox.


More like this

On Wednesday, Senators Jeff Merkley (D-OR) and Cory Booker (D-NJ) introduced legislation to place a moratorium on the use of facial recognition by the federal government or with federal funds—unless Congress passed regulations for the technology.

The Ethical Use of Facial Recognition Act aims to create a 13-member congressional commission representing interested parties—including law enforcement, communities subjected to surveillance, and privacy experts.

“Facial recognition technology works well enough that it’s a huge temptation for the government to try and track Americans,” Merkley told Motherboard. “This is Big Brother on steroids. I don’t want America to become a police state that tracks us everywhere we go.”

“Facial recognition technology has been demonstrated to be often inaccurate—misidentifying and disproportionately targeting women and people of color,” Booker said in a statement. “To protect consumer privacy and safety, Congress must work to set the rules of the road for responsible uses of this technology by the federal government.”

There is a notable exception in the legislation, however: law enforcement may still use the technology with court warrants.

“While a good first step, the bill’s exceptions—including allowing police to use this technology with only a warrant—fails to fully account for the realities of this mass surveillance tool,” ACLU Senior Legislative Counsel Neema Singh Guliani said in a press release.

That exception is a huge one, especially given that the close relationships between police departments and commercial vendors will only grow more intimate once the moratorium closes the door on federal funds. Civil liberties organizations have said that Amazon’s Rekognition and Clearview AI both feature incredibly powerful facial recognition technologies that pose a threat to marginalized groups, whether or not they are accurate. But this hasn’t stopped their adoption by police departments or sales pitches making false claims about accuracy. In the case of Clearview AI, these claims come despite David Scalzo, whose firm was an early investor in Clearview, acknowledging that databases of billions of faces paired with racist and sexist algorithms "might lead to a dystopian future or something.”

"I'm very concerned about the databases that are being compiled. Our faces are being sold. I personally would like—if I could wave a magic wand right now, I'd put a complete stop on that as well,” Merkley said. “But I think tackling the Big Brother government aspect and using that as an opportunity for people to come up to speed with how dangerous this technology is, is probably a good way to go about it."

One notable consequence of the moratorium could be the undermining of some operations by Palantir, the surveillance firm that recently admitted its role in helping Immigration and Customs Enforcement separate families and deport migrants. While the company does not itself offer facial recognition technology, its software is used to analyze that data and further flesh out surveillance networks used by local, state, and federal authorities.

“I think it’d be an overreach at this point, both a political overreach and probably a policy overreach, to try to dictate what states do,” Merkley said. “But having the conversation at the national level will hopefully stimulate a lot of other conversations.”

Bloomberg / Getty Images

Researchers have created what may be the most advanced system yet for tricking top-of-the-line facial recognition algorithms, subtly modifying images to make faces and other objects unrecognizable to machines.

The program, developed by researchers from the University of Chicago, builds on previous work from a group of researchers exploring how deep neural networks learn. In 2014, they released a paper showing that “imperceptible perturbations” in a picture could force state-of-the-art recognition algorithms to misclassify an image. Their paper led to an explosion of research in a new field: the subversion of image recognition systems through adversarial attacks.

The work has taken on new urgency with the widespread adoption of facial recognition technology and revelations that companies like Clearview AI are scraping social media sites to build massive face databases on which they train algorithms that are then sold to police, department stores, and sports leagues.

The new program—named “Fawkes,” after the infamous revolutionary whose face adorns the ubiquitous protest masks of Anonymous—“cloaks” a picture by subtly altering a small number of pixels in the image. While the changes are imperceptible to the human eye, if the cloaked photo is used to train an algorithm—by being scraped from social media, for example—it will cause the facial recognition system to misclassify an image of the person in question. Fawkes-cloaked images successfully fooled Amazon, Microsoft, and Megvii recognition systems 100 percent of the time in tests, the researchers reported in a new paper.
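
At its core, this kind of cloaking is a small constrained optimization: change pixel values within a tiny budget so that a recognition model's internal representation of the photo moves, while the photo still looks the same to a person. The sketch below is a toy illustration of that idea, not the Fawkes algorithm itself (which steers features toward a different identity); the stand-in encoder, the random placeholder image, and the 0.03 pixel budget are all assumptions invented for the example.

```python
# Toy illustration of image "cloaking": nudge pixels by a tiny, bounded amount so
# an encoder's embedding of the photo moves far from the original, while the photo
# itself looks unchanged to a person. NOT the Fawkes algorithm; the encoder below
# is a random stand-in, not a real face recognition model.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in "face encoder": any differentiable model mapping an image to a feature
# vector. Real systems use large networks trained on face datasets.
encoder = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 16 * 16, 64),
)
for p in encoder.parameters():
    p.requires_grad_(False)  # only the perturbation is optimized

photo = torch.rand(1, 3, 64, 64)   # placeholder for a real face photo, values in [0, 1]
epsilon = 0.03                     # max per-pixel change, kept small so it stays invisible
delta = torch.zeros_like(photo, requires_grad=True)

with torch.no_grad():
    original_embedding = encoder(photo)

optimizer = torch.optim.Adam([delta], lr=0.01)
for _ in range(100):
    cloaked = (photo + delta).clamp(0, 1)
    # Push the cloaked photo's embedding away from the original embedding.
    loss = -torch.norm(encoder(cloaked) - original_embedding)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        delta.clamp_(-epsilon, epsilon)  # keep the change imperceptibly small

final = (photo + delta.detach()).clamp(0, 1)
print("max pixel change:", delta.abs().max().item())
print("embedding shift:", torch.norm(encoder(final) - original_embedding).item())
```

The researchers' actual system optimizes against real face feature extractors rather than a toy model, which, as reported above, is part of why the cloaks carried over to commercial recognition systems in their tests.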

Amazon did not respond to a request for comment. Microsoft and Megvii declined interview requests.

“[Fawkes] allows individuals to inoculate themselves against unauthorized facial recognition models at any time without significant[ly] distorting their own photos, or wearing conspicuous patches,” according to the paper. Shawn Shan, one of the lead authors, said the team could not comment further on their research at the time because it is undergoing a peer review process.

The team acknowledged that Fawkes is far from a perfect solution. It relies on a recognition system being trained on cloaked images, but most people already have dozens, if not hundreds, of photos of themselves posted online that the systems could have already drawn from. Fawkes’ success fell to 39 percent when an algorithm’s training set included less than 85 percent cloaked photos for a particular person, according to the paper.

“It’s a very, very nice idea, but in terms of the practical implications or widespread use it’s not clear how widely it will be adopted,” Anil Jain, a biometrics and machine learning professor at Michigan State University, told Motherboard. “And if it is widely adopted, then the face recognition systems will do some modifications to their algorithms so it will be avoided.”

There is a definite, and growing, demand for anti-surveillance tools that can defeat image recognition, as demonstrated by the wide range of glasses, hats, t-shirts, and patterns created for the purpose. And apart from its general creepiness, facial recognition poses a direct physical threat to activists, minority groups, and sex workers, often harming them in unexpected ways.

Liara Roux, a sex worker and activist, told Motherboard that porn performers have experienced a crackdown from companies that use facial recognition for identity verification when attempting to use the services under their real names for purposes unrelated to their professional work.

“A lot of performers are having this issue where they’re being denied Airbnb rentals,” they said. “It’s such a pervasive thing for people in the industry, to kind of out of the blue have their accounts shut down.”

Fawkes is not the first attempt at an anti-facial recognition tool for the public. In 2018, a group of engineers and technologists working in Harvard’s Berkman Klein Center for Internet and Society published EqualAIs, an online tool that deployed a similar technique. The method of subtly altering inputs, like photos, to subvert a deep neural network’s training is known as a poison attack.
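
To make the poison-attack idea concrete, here is a deliberately simplified sketch: it assumes a face search boils down to nearest-neighbor lookup over embedding vectors, and it models the cloak as a consistent shift in that embedding space. The names, the 128-dimensional random vectors, and the shift are all invented for the illustration; real attacks like Fawkes and EqualAIs compute the perturbation with neural networks rather than assuming it.

```python
# Deliberately simplified sketch of why poisoning scraped photos matters.
# Assumption: the search system matches faces by nearest-neighbor lookup over
# embedding vectors, and a cloak shows up as a consistent shift in that space.
import numpy as np

rng = np.random.default_rng(0)
dim = 128

# Each person's "true" appearance, as the embedding a face encoder would produce.
true_embeddings = {name: rng.normal(size=dim) for name in ["alice", "bob", "carol"]}

# Alice only ever posts cloaked photos, so every scraped photo of her is shifted.
cloak_shift = 3.0 * rng.normal(size=dim)

def scraped_embedding(name):
    """Embedding of a photo scraped from the web (cloaked for Alice only)."""
    noise = rng.normal(scale=0.1, size=dim)
    if name == "alice":
        return true_embeddings[name] + cloak_shift + noise
    return true_embeddings[name] + noise

# The company builds its reference gallery from scraped photos: the poisoned data.
gallery = {name: scraped_embedding(name) for name in true_embeddings}

def identify(query_embedding):
    """Return the gallery identity whose embedding is closest to the query."""
    return min(gallery, key=lambda name: np.linalg.norm(gallery[name] - query_embedding))

# A new, uncloaked photo of Alice, e.g. one taken by a camera rather than scraped.
clean_alice = true_embeddings["alice"] + rng.normal(scale=0.1, size=dim)
print("clean photo of alice matched to:", identify(clean_alice))
# Because the gallery only ever saw cloaked photos of Alice, her clean photo tends
# to land closer to someone else's entry, and the lookup fails.
```

The same dilution effect is behind the drop in protection reported above: once enough clean photos of a person end up in the scraped data, the cloak's shift no longer dominates and matching starts to succeed again.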

“In theory, on a small scale, it should be a functioning shield,” Daniel Pedraza, the project manager for EqualAIs, told Motherboard. The creators, whose primary goal was to start a conversation over the power of biometric data, haven’t been updating the tool, so it may no longer be as effective as it was in 2018. “For us, the win was people talking about it. I think the issue is on the forefront of more people’s minds today than it was back in 2018, and it will continue to be so. The fact that people are learning the machines are fallible is super helpful.”

Fawkes isn’t available for public use yet, and if and when it is, experts say it likely won’t take long before facial recognition vendors develop a response. Most of the research into new poison attacks, or other forms of adversarial attacks against deep neural networks, is as focused on ways to detect and defeat the methods as it is on creating them.

Like many areas of cybersecurity, experts say facial recognition is poised to be a cat-and-mouse game.

“Sooner or later, you can imagine that as the level and sophistication of the defense increases, the level of the attackers also increases,” Battista Biggio, a professor in the Pattern Recognition and Application Lab at the University of Cagliari, in Italy, told Motherboard. “This is not something we are going to solve any time soon.”
