Content Modeling for Social Media Text

by

Christina Sauper

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2012

© Massachusetts Institute of Technology 2012. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, May 14, 2012

Certified by: Regina Barzilay, Associate Professor, Electrical Engineering and Computer Science, Thesis Supervisor

Accepted by: Leslie A. Kolodziejski, Chairman, Department Committee on Graduate Theses

Abstract

This thesis focuses on machine learning methods for extracting information from user-generated content. Instances of this data such as product and restaurant reviews have become increasingly valuable and influential in daily decision making. In this work, I consider a range of extraction tasks such as sentiment analysis and aspect-based review aggregation. These tasks have been well studied in the context of newswire documents, but the informal and colloquial nature of social media poses significant new challenges. The key idea behind our approach is to automatically induce the content structure of individual documents given a large, noisy collection of user-generated content. This structure enables us to model the connection between individual documents and effectively aggregate their content.
The models I propose demonstrate that content structure can be utilized at both document and phrase level to aid in standard text analysis tasks. At the document level, I capture this idea by joining the original task features with global contextual information. The coupling of the content model and the task-specific model allows the two components to mutually influence each other during learning. At the phrase level, I utilize a generative Bayesian topic model where a set of properties and corresponding attribute tendencies are represented as hidden variables. The model explains how the observed text arises from the latent variables, thereby connecting text fragments with corresponding properties and attributes.

Thesis Supervisor: Regina Barzilay
Title: Associate Professor, Electrical Engineering and Computer Science

Acknowledgments

I have been extremely fortunate throughout this journey to have the support of many wonderful people. It is only with their help and encouragement that this thesis has come into existence.

First and foremost, I would like to thank my advisor, Regina Barzilay. She has been an incredible source of knowledge and inspiration; she has a sharp insight about interesting directions of research, and she is a strong role model in a field in which women are still a minority. Her high standards and excellent guidance have shaped my work into something I can be truly proud of.

My thesis committee members, Peter Szolovits and James Glass, have also provided invaluable discussion as to the considerations of this work applied to their respective fields, analysis of medical data and speech processing. Working with my coauthor, Aria Haghighi, was an intensely focused and amazing time; he has been a great friend and collaborator to me, and I will certainly miss the late-night coding with Tosci's ice cream.
I am very grateful to Alan Woolf, Michelle Zeager, Bill Long, and the other members of the Fairwitness group for helping to make my introduction to the medical world as smooth as possible.

My fellow students at MIT are a diverse and fantastic group of people: my groupmates from the Natural Language Processing group, including Ted Benson, Branavan, Harr Chen, Jacob Eisenstein, Yoong Keok Lee, Tahira Naseem, and Ben Snyder; my officemates, including David Sontag and Paresh Malalur; and my friends from GSB and other areas of CSAIL. They have always been there for good company and stimulating discussions, both related to research and not. I am thankful to all, and I wish them the best for their future endeavors.

Finally, I dedicate this thesis to my family. They have stood by me through good times and bad, and it is through their loving support and their neverending faith in me that I have made it this far.

Contents

1 Introduction
  1.1 Modeling content structure for text analysis tasks
  1.2 Modeling relation structure for informative aggregation
  1.3 Contributions
  1.4 Outline
2 Modeling content structure for text analysis tasks
  2.1 Introduction
  2.2 Related work
    2.2.1 Task-specific models for relevance
    2.2.2 Topic modeling
    2.2.3 Discourse models of document structure
  2.3 Model
    2.3.1 Multi-aspect phrase extraction
      2.3.1.1 Problem Formulation
      2.3.1.2 Model overview
      2.3.1.3 Learning
      2.3.1.4 Inference
      2.3.1.5 Leveraging unannotated data
    2.3.2 Multi-aspect sentiment analysis
      2.3.2.1 Problem Formulation
      2.3.2.2 Model
      2.3.2.3 Learning
      2.3.2.4 Inference
  2.4 Data sets
    2.4.1 Multi-aspect sentiment analysis
    2.4.2 Multi-aspect summarization
  2.5 Experiments
    2.5.1 Multi-aspect phrase extraction
      2.5.1.1 Baselines
      2.5.1.2 Evaluation metrics
      2.5.1.3 Results
    2.5.2 Multi-aspect sentiment analysis
      2.5.2.1 Baselines
      2.5.2.2 Evaluation metrics
      2.5.2.3 Results
  2.6 Conclusion
3 Modeling relation structure for informative aggregation
  3.1 Introduction
  3.2 Related work
    3.2.1 Single-aspect sentiment analysis
    3.2.2 Aspect-based sentiment analysis
      3.2.2.1 Data-mining and fixed-aspect techniques for sentiment analysis
      3.2.2.2 Multi-document summarization and its application to sentiment analysis
      3.2.2.3 Probabilistic topic modeling for sentiment analysis
  3.3 Problem formulation
    3.3.1 Model components
    3.3.2 Problem setup
  3.4 Model
    3.4.1 General formulation
    3.4.2 Model extensions
  3.5 Inference
    3.5.1 Inference for model extensions
  3.6 Experiments
    3.6.1 Joint identification of aspect and sentiment
      3.6.1.1 Data set
      3.6.1.2 Domain challenges and modeling techniques
      3.6.1.3 Cluster prediction
      3.6.1.4 Sentiment analysis
      3.6.1.5 Per-word labeling accuracy
    3.6.2 Aspect identification with shared aspects
      3.6.2.1 Data set
      3.6.2.2 Domain challenges and modeling techniques
      3.6.2.3 Cluster prediction
  3.7 Conclusion
4 Conclusion
  4.1 Future Work
A Condensr
  A.1 System implementation
  A.2 Examples

List of Figures

1-1 A comparison of potential content models for summarization and sentiment analysis
  (a) A content model for summarization focused on identifying topics in text.
  (b) A content model for sentiment analysis focused on identifying sentiment-bearing sentences in text.
1-2 Three reviews about the same restaurant discussing different aspects and opinions.
1-3 An excerpt from a DVD review demonstrating ambiguity which can be resolved through knowledge of structure.
1-4 Sample text analysis tasks to which document structure is introduced.
  (a) Example system input and output on the multi-aspect sentiment analysis task
  (b) Example system input and output on the multi-aspect phrase extraction domain
1-5 An example of the desired input and output of our system in the restaurant domain, both at corpus-level and at snippet-level.