Dossier
[email protected]

0. Introduction (Standards)
1. Recognition
2. Classification
3. Bias & Noise
4. Unknown Known (New Teapots)

0. Introduction (Standards)

I’ve been thinking about the problems of standards and categories in computer image recognition, how their abstractions result in a hegemonic form that averages and normalizes, and how drawing and modeling might be used to critique and resist these tendencies.

[Image: The Utah Teapot rendered four ways by Martin Newell]

To talk about standards, I’m first going to return to some work I did earlier in the semester on the Utah Teapot. The Utah Teapot was created in 1975 by Martin Newell, a computer graphics researcher in the Graphics Lab at the University of Utah Computer Science Department. Newell needed an object for testing 3D scenes, and his wife suggested their Melitta teapot. It was useful to computer graphics researchers primarily because it met certain geometric criteria. “It was round, contained saddle points, had a genus greater than zero because of the hole in the handle, could project a shadow on itself, and could be displayed accurately without a surface texture” (per Wikipedia). But it was also useful because it met some contextual or cultural criteria: it was a familiar, everyday object. The Utah Teapot thus became a standard in the computer graphics world.
[Image: Original drawing of the Utah Teapot by Martin Newell]
[Images (4): Outline of the Utah Teapot rotating about the y-axis, Alex Bodkin]
[Image: Dancers in the Wings, Edgar Degas]
[Image: The Pink Dancers, Before the Ballet, Edgar Degas]
[Images (4): Outline of the Utah Teapot rotating about the z-axis, Alex Bodkin]
[Image: Outline of the Utah Teapot zooming towards “camera”, Alex Bodkin]
[Image: Outline of the Utah Teapot rotating about the z-axis and y-axis, Alex Bodkin]
[Image: The Utah Teapot in Toy Story]

The Utah Teapot is no longer a useful standard for computer graphics. Instead, it has become a cultural object representing the idea of a standard. In this project, I am using the teapot to look at issues surrounding standards as they relate to how computers see and understand the world.

[Image: The Utah Teapot in The Simpsons]

1. Recognition

[Image: Landmark Fitting used by FaceTracker]

Let’s start with recognition. How do computers recognize stuff in the world? It depends on what you’re trying to recognize. For example, with faces, one way is to have an abstracted “Reference Face” with landmarks, which are typical facial features that can be tracked, such as eyes, nose, and mouth. This is the case with FaceTracker, created by Kyle McDonald based on the work of Jason Saragih. FaceTracker will try to deform this Reference Face (technically called an Active Shape Model) onto the actual face it is tracking, using the landmarks as guides. The Active Shape Model of the Reference Face was generated by training the FaceTracker software on a bunch of faces in an image dataset.
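To make the Reference Face idea concrete, here is a minimal Python sketch. It is not FaceTracker’s actual code. It assumes each training face is just a short list of (x, y) landmarks given in the same order, averages them into a reference shape, and then fits that reference onto a new face using a single scale, rotation, and shift. The function names and the three-landmark toy faces are hypothetical. A real Active Shape Model also aligns the training shapes before averaging and lets the shape deform non-rigidly, which this sketch leaves out.

```python
from statistics import fmean


def mean_shape(shapes):
    """Average corresponding landmarks across all training shapes."""
    n_landmarks = len(shapes[0])
    return [
        (fmean(s[i][0] for s in shapes), fmean(s[i][1] for s in shapes))
        for i in range(n_landmarks)
    ]


def fit_similarity(reference, observed):
    """Least-squares fit of z -> a*z + b mapping reference onto observed.

    Points are treated as complex numbers, so the complex factor `a`
    encodes scale and rotation together and `b` is the translation.
    """
    ref = [complex(x, y) for x, y in reference]
    obs = [complex(x, y) for x, y in observed]
    ref_center = sum(ref) / len(ref)
    obs_center = sum(obs) / len(obs)
    num = sum((r - ref_center).conjugate() * (o - obs_center)
              for r, o in zip(ref, obs))
    den = sum(abs(r - ref_center) ** 2 for r in ref)
    a = num / den
    b = obs_center - a * ref_center
    return a, b


def apply_similarity(a, b, shape):
    """Move every landmark of `shape` through the fitted transform."""
    moved = [a * complex(x, y) + b for x, y in shape]
    return [(z.real, z.imag) for z in moved]


# Hypothetical toy "dataset": left eye, right eye, mouth for two faces.
training_faces = [
    [(0, 0), (2, 0), (1, 2)],
    [(0, 0), (4, 0), (2, 2)],
]
reference_face = mean_shape(training_faces)

# A new face to track: here simply a bigger, shifted copy of the reference.
new_face = [(2 * x + 10, 2 * y + 5) for x, y in reference_face]
a, b = fit_similarity(reference_face, new_face)
fitted_face = apply_similarity(a, b, reference_face)
```

Note that the reference shape only knows about the landmarks it was given. Anything between them, or anything unusual about a single training face, is averaged away.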
The FaceTracker software learned what faces in the dataset looked like, and generated its Reference Face model based on characteristics these faces shared.

[Image: Reference Face used by FaceTracker]

“And you can see, it kind of matches up with what you would expect, but it’s kind of weird in some ways. It looks weirdly elongated. This [the model’s cheeks] should be out more, right along here, this [the model’s right cheekbone] should be protruding more, but because it’s not important, there’s no features here to track, so there’s no reason for the 3D model to represent that. All of the points here are in a good place, but the overall structure can be strange sometimes. Also here across the nose, it makes it look like there’s a giant triangle right here. But again, that’s because there’s no features to track there.”
Kyle McDonald, creator of FaceOSC

[Images:]
Rafael Lozano-Hemmer, Levels of Confidence, 2015
Antonio Daniele, This is Not Private, 2015
Apple, ARFaceAnchor, 2017
Andreas Refsgaard, Eye Conductor, 2015
Dan Williams, Nick Clegg Looking Algorithmically Sad
Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang, Facial Landmark Detection by Deep Multi-task Learning, 2014
Justus Thies, Face2Face, 2016
Facebook Research, DeepFace, 2014
Greg Borenstein, Pareidolia, 2012
Affectiva, AFFDEX SDK, 2016

2. Classification

The same strategy used to train FaceTracker also applies more generally to the training of any Image Recognition and Classification algorithm. Image recognition algorithms are trained on image datasets, which contain images with objects that are already tagged and categorized by humans. The algorithm learns what the characteristics of an object are, which allows it to recognize that object in images it has never seen before. It learns categories and classes and then reinforces those categories and classes.

[Image: ImageNet image dataset]
[Image: CIFAR-10 image dataset]
[Image: Caltech 101 image dataset]

3. Bias & Noise

There are problems with image datasets that then cause issues with recognition and classification.

• Image datasets flatten, average, and normalize by making judgements about what is signal and what is noise, what is valuable and what is not
• There aren’t enough image datasets to train image recognition algorithms
• Image datasets contain hidden biases which are only understood once an algorithm has been trained on the dataset
• Categories allow for object recognition but not necessarily for an understanding of image context. The loss of context and narrative around an object or set of objects flattens them and removes meaning and value.

Image recognition is not enough, Ken Ryu
“As with language, photos need contextual intelligence.”

“Current datasets, however, offer a somewhat limited range of image variability... The problems with such restrictions are two fold: (i) some algorithms may exploit them, yet will fail when the restrictions do not apply; and, related to this, (ii) the images are not sufficiently challenging for the benefits of more sophisticated algorithms to make a difference.”

Article on object detection in image recognition algorithms
“Datasets play a very important (and sometimes underrated) role in research. Every time a new dataset is released, papers are released, and new models are compared and often improved upon, pushing the limits of what’s possible. Unfortunately, there aren’t enough datasets for object detection. Data is harder (and more expensive) to generate, companies probably don’t feel like freely giving away their investment, and universities do not have that many resources.”

A Sea of Data: Apophenia and Pattern (Mis-)Recognition, Hito Steyerl
“Jacques Rancière tells a mythical story about how the separation of signal and noise might have been accomplished in Ancient Greece.
Sounds produced by affluent male locals were defined as speech, whereas women, children, slaves, and foreigners were assumed to produce garbled noise... Those identified as speaking were labeled citizens and the rest as irrelevant, irrational, and potentially dangerous nuisances. Similarly, today, the question of separating signal and noise has a fundamental political dimension.”

When Discrimination Is Baked Into Algorithms, Lauren Kirchner
“From retail to real estate, from employment to criminal justice, the use of data mining, scoring software, and predictive analytics programs is proliferating at an exponential rate. Software that makes decisions based on data like a person’s zip code can reflect, or even amplify, the results of historical or institutional discrimination. “[A]n algorithm is only as good as the data it works with,” Solon Barocas and Andrew Selbst write in their article “Big Data’s Disparate Impact,” forthcoming in the California Law Review. “Even in situations where data miners are extremely careful, they can still affect discriminatory results with models that, quite unintentionally, pick out proxy variables for protected classes.””

There is also one big problem inherent in the neural nets used for image recognition: they are black boxes to both the public and their designers. That means that people don’t really understand how they do what they do. I have no idea if this will remain the case, but at present, it’s still something we can confront.

Whose black box do you trust?, Tim O’Reilly

4. Unknown Known (New Teapots)

Mechanical Turk Human Intelligence Task 1

On white paper, please create a black line drawing of a shape that meets all of the criteria listed below. Upload a photo or scan of the drawing as a JPEG file to http://imgbb.com/ and submit the link to me.

List of criteria:
• Use a black pen or marker on white paper
• The shape must be round (or rounded in some way).
• The shape must have at least one visible hole in it.
• The shape must be smooth.
• The shape must have concave and convex curvature.
• The shape must not be symmetrical.

Feel free to be creative! The result will be part of a student project within the Department of Architecture at MIT. Thank you :)

[Image: Drawing, Mechanical Turk Worker ID A229SEF35EGTSF]
[Image: Drawing, Mechanical Turk Worker ID A7JS3BROKU6S1]
[Image: Drawing, Mechanical Turk Worker ID A381XM5ZPMVMDK]
[Image: Drawing, Mechanical Turk Worker ID ADIAEAPYRI8CK]
[Image: Drawing, Mechanical Turk Worker ID AEZY44DUJFT42]
[Image: Drawing, Mechanical Turk Worker ID AJF954955YWZG]
[Image: Drawing, Mechanical Turk Worker ID AGK30V341UN88]
Drawing,