The Ethical Questions That Haunt Facial-Recognition Research
Feature

A collage of images from the MegaFace data set, which scraped online photos. Images are obscured to protect people's privacy. Image visualization by Adam Harvey (https://megapixels.cc), based on the MegaFace data set by Ira Kemelmacher-Shlizerman et al., itself based on the Yahoo Flickr Creative Commons 100 Million data set and licensed under Creative Commons Attribution (CC BY) licences.

Journals and researchers are under fire for controversial studies using this technology. And a Nature survey reveals that many in this field think there is a problem.

In September 2019, four researchers wrote to the publisher Wiley to "respectfully ask" that it immediately retract a scientific paper. The study, published in 2018, had trained algorithms to distinguish faces of Uyghur people, a predominantly Muslim minority ethnic group in China, from those of Korean and Tibetan ethnicity1.

China had already been internationally condemned for its heavy surveillance and mass detentions of Uyghurs in camps in the northwestern province of Xinjiang, which the government says are re-education centres aimed at quelling a terrorist movement. According to media reports, authorities in Xinjiang have used surveillance cameras equipped with software attuned to Uyghur faces.

As a result, many researchers found it disturbing that academics had tried to build such algorithms, and that a US journal had published a research paper on the topic. And the 2018 study wasn't the only one: journals from publishers including Springer Nature, Elsevier and the Institute of Electrical and Electronics Engineers (IEEE) had also published peer-reviewed papers that describe using facial recognition to identify Uyghurs and members of other Chinese minority groups.
By Richard Van Noorden (Nature's news team is editorially independent from its publisher, Springer Nature.)

The complaint, which launched an ongoing investigation, was one foray in a growing push by some scientists and human-rights activists to get the scientific community to take a firmer stance against unethical facial-recognition research. It's important to denounce controversial uses of the technology, but that's not enough, ethicists say. Scientists should also acknowledge the morally dubious foundations of much of the academic work in the field, including studies that have collected enormous data sets of images of people's faces without consent, many of which helped hone commercial or military surveillance algorithms. (A feature on page 347 explores concerns over algorithmic bias in facial-recognition systems.)

An increasing number of scientists are urging researchers to avoid working with firms or universities linked to unethical projects, to re-evaluate how they collect and distribute facial-recognition data sets, and to rethink the ethics of their own studies. Some institutions are already taking steps in this direction. In the past year, several journals and an academic conference have announced extra ethics checks on studies.

"A lot of people are now questioning why the computer-vision community dedicates so much energy to facial-recognition work when it's so difficult to do it ethically," says Deborah Raji, a researcher in Ottawa who works at the non-profit Internet foundation Mozilla. "I'm seeing a growing coalition that is just against this entire enterprise."

This year, Nature asked 480 researchers around the world who work in facial recognition, computer vision and artificial intelligence (AI) for their views on thorny ethical questions about facial-recognition research. The results of this first-of-a-kind survey suggest that some scientists are concerned about the ethics of work in this field, but others still don't see academic studies as problematic.

Data without consent

For facial-recognition algorithms to work well, they must be trained and tested on large data sets of images, ideally captured many times under different lighting conditions and at different angles. In the 1990s and 2000s, scientists generally got volunteers to pose for these photos, but most now collect facial images without asking permission.

For instance, in 2015, scientists at Stanford University in California published a set of 12,000 images from a webcam in a San Francisco café that had been live-streamed online2. The following year, researchers at Duke University in Durham, North Carolina, released more than 2 million video frames (85 minutes) of footage of students walking on the university campus3.

The biggest collections have been gathered online. Microsoft, for instance, issued the world's largest data set, MSCeleb5, consisting of 10 million images of nearly 100,000 individuals, including journalists, musicians and academics, scraped from the Internet.

In 2019, Berlin-based artist Adam Harvey created a website called MegaPixels that flagged these and other data sets. He and another Berlin-based technologist and programmer, Jules LaPlace, showed that many had been shared openly and used to evaluate and improve commercial surveillance products. Some were cited, for instance, by companies that worked on military projects in China. "I wanted to uncover the uncomfortable truth that many of the photos people posted online have an afterlife as training data," Harvey says. In total, he says he has charted 29 data sets, used in around 900 research projects. Researchers often use public Flickr images that were uploaded under copyright licences that allow liberal reuse.

After The Financial Times published an article on Harvey's work in 2019, Microsoft and several universities took their data sets down. Most said at the time, and reiterated to Nature this month, that their projects had been completed or that researchers had requested that the data set be removed.

Computer scientist Carlo Tomasi at Duke University was the sole researcher to apologize for a mistake. In a statement two months after the data set had been taken down, he said he had got institutional review board (IRB) approval for his recordings, which his team made to analyse the motion of objects in video, not for facial recognition. But the IRB guidance said he shouldn't have recorded outdoors and shouldn't have made the data available without password protection. Tomasi told Nature that he did make efforts to alert students by putting up posters to describe the project.

The removal of the data sets seems to have dampened their usage a little, Harvey says. But big online image collections such as MSCeleb are still distributed among researchers, who continue to cite them, and in some cases have re-uploaded them or data sets derived from them. Scientists sometimes stipulate that data sets should be used only for non-commercial research, but once they have been widely shared, it is impossible to stop companies from using them. One analysis found that, even after MSCeleb was taken down, studies still used it or data derived from it (see go.nature.com/3nlkjre). The authors urged researchers to set more restrictions on the use of data sets and asked journals to stop accepting papers that use data sets that had been taken down.

Legally, it is unclear whether scientists in Europe can collect photos of individuals' faces for biometric research without their consent. The European Union's vaunted General Data Protection Regulation (GDPR) does not provide an obvious legal basis for researchers to do this, reported6 Catherine Jasserand, a biometrics and privacy-law researcher at the Catholic University of Leuven in Belgium, in 2018. But there has been no official guidance on how to interpret the GDPR on this point, and it hasn't been tested in the courts. In the United States, some states say it is illegal for commercial firms to use a person's biometric data without their consent; Illinois is unique in allowing individuals to sue over this. As a result, several firms have been hit with class-action lawsuits.

The US social-media firm Facebook, for instance, agreed this year to pay US$650 million to resolve an Illinois class-action lawsuit over a collection of photos that was not publicly available, which it used for facial recognition (it now allows users to opt out of facial-recognition tagging). The controversial New York City-based technology company Clearview AI, which says it scraped three billion online photos for a facial-recognition system, has also been sued for violating this law in pending cases. And the US tech firms IBM, Google, Microsoft, Amazon and FaceFirst were also sued in Illinois for using a data set of nearly one million online photos that IBM released in January 2019; IBM removed it at around the time of the lawsuit, which followed a report by NBC News detailing photographers' disquiet that their pictures were in the data set.

Microsoft told Nature that it has filed to dismiss the case, and Clearview says it "searches only publicly available information, like Google or any other search engine". Other firms did not respond to requests for comment.

"Conferences should avoid sponsors who are accused of enabling abuses of human rights."

Vulnerable populations

In the study on Uyghur faces published by Wiley1, the researchers didn't gather photos from online, but said they took pictures of more than 300 Uyghur, Korean and Tibetan 18–22-year-old students at Dalian Minzu University in northeast China, where some of the scientists worked. Months after the study was published, the authors added a note to say that the students had consented to this. But the researchers' assertions don't assuage ethical concerns, says Yves Moreau, a computational

354 | Nature | Vol 587 | 19 November 2020
©2020 Springer Nature Limited. All rights reserved.