Research in the Language, Information and Computation Laboratory of the University of Pennsylvania
Total Page:16
File Type:pdf, Size:1020Kb
University of Pennsylvania ScholarlyCommons IRCS Technical Reports Series Institute for Research in Cognitive Science March 1995 Research in the Language, Information and Computation Laboratory of the University of Pennsylvania Matthew Stone University of Pennsylvania Libby Levison University of Pennsylvania Follow this and additional works at: https://repository.upenn.edu/ircs_reports Stone, Matthew and Levison, Libby, "Research in the Language, Information and Computation Laboratory of the University of Pennsylvania" (1995). IRCS Technical Reports Series. 120. https://repository.upenn.edu/ircs_reports/120 University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-95-05. This paper is posted at ScholarlyCommons. https://repository.upenn.edu/ircs_reports/120 For more information, please contact [email protected]. Research in the Language, Information and Computation Laboratory of the University of Pennsylvania Abstract This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/~cliff-group/94/ cliffnotes.html In addition, you can find many of the papers efr erenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn. Comments University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-95-05. This technical report is available at ScholarlyCommons: https://repository.upenn.edu/ircs_reports/120 The Institute For Research In Cognitive Science CLiFF Notes Research in the Language, Information and Computation Laboratory of the University of Pennsylvania Annual Report: 1994, No. 4 P Editors: Matthew Stone and Libby Levison E University of Pennsylvania 3401 Walnut Street, Suite 400C Philadelphia, PA 19104-6228 N March 1995 Site of the NSF Science and Technology Center for Research in Cognitive Science N University of Pennsylvania IRCS Report 95-05 Founded by Benjamin Franklin in 1740 CLiFF Notes Research in the Language, Information and Computation Laboratory of the University of Pennsylvania Annual Report: 1994, No. 4 Technical Report MS-CIS-95-07 LINC Lab 282 Department of Computer and Information Science University of Pennsylvania Philadelphia, PA 19104-6389 Editors: Matthew Stone & Libby Levison i Contents I Introduction iii II Abstracts 1 Breck Baldwin 2 Tilman Becker 4 Betty J. Birner 6 Sandra Carberry 8 Christine Doran 10 Dania Egedi 13 Jason Eisner 15 Christopher Geib 17 Abigail S. Gertner 19 James Henderson 21 Beth Ann Hockey 23 Beryl Hoffman 26 Paul S. Jacobs 28 Aravind K. Joshi 29 Jonathan M. Kaye 32 Albert Kim 34 Nobo Komagata 36 Seth Kulick 37 Sadao Kurohashi 38 Libby Levison 39 D. R. Mani 41 Mitch Marcus 43 I. Dan Melamed 46 ii Michael B. Moore 47 Charles L. Ortiz, Jr. 48 Martha Palmer 49 Hyun S Park 52 Jong Cheol Park 54 Scott Prevost 55 Ellen F. Prince 58 Lance A. Ramshaw 60 Lisa F. Rau 62 Jeffrey C. Reynar 64 Francesc Ribas i Framis 66 James Rogers 68 Bernhard Rohrbacher 70 Joseph Rosenzweig 72 Deborah Rossen-Knill 73 Anoop Sarkar 76 B. Srinivas 77 Mark Steedman 80 Matthew Stone 82 Anne Vainikka 84 Bonnie Lynn Webber 86 Michael White 89 David Yarowsky 91 III Projects and Working Groups 93 The AnimNL Project 94 The Gesture Jack project 96 Information Theory Reading Group 99 iii The STAG Machine Translation Project 100 The TraumAID Project 102 The XTAG Project 104 IV Appendix 107 iv Part I Introduction This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments,and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax- semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introductioncannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/˜cliff-group/94/cliffnotes.html In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn. v 1.1 Addresses Contributors to this report can be contacted by writing to: contributor’sname department University of Pennsylvania Philadelphia, PA 19104 In addition, a list of publications from the Computer Science department may be obtained by contacting: Technical Report Librarian Department of Computer and Information Science University of Pennsylvania 200 S. 33rd St. Philadelphia, PA 19104-6389 (215)-898-3538 [email protected] 1.2 Funding Agencies This research is partially supported by: ARO including participation by the U.S. Army Research Laboratory (Aberdeen), Natick Laboratory, the Army Research Institute, the Institute for Simulation and Training, and NASA Ames Research Center; Air Force Office of Scientific Research; ARPA through General Electric Government Electronic Systems Division; AT&T Bell Labs; AT&T Fel- lowships; Bellcore; DMSO through the University of Iowa; General Electric Research Lab; IBM T.J. Watson Research Center; IBM Fellowships; National Defense Science and Engineering Graduate Fellowship in Computer Science; NSF, CISE, and Instrumentation and Laboratory Improvement Program Grant, SBE (Social, Behavioral, and Economic Sciences), EHR (Education & Human Re- sources Directorate); Nynex; Office of Naval Research; Texas Instruments; U.S. Air Force DEPTH contract through Hughes Missile Systems. Acknowledgements Thanks to all the contributors. Comments about this report can be sent to: Matthew Stone or Libby Levison Department of Computer and Information Science University of Pennsylvania 200 S. 33rd St. Philadelphia, PA 19104-6389 [email protected] or [email protected] vi Part II Abstracts 1 Breck Baldwin Department of Computer and Information Science [email protected] Uniqueness, Salience and the Structure of Discourse Keywords: Discourse Processing, Anaphora Resolution, Centering, Pragmatics In spoken and written language, anaphora occurs when one phrase “points to” another, where “points to” means that the two phrases denote the same thing in one’s mind. For example, consider the following quote from Kafka’s The Metamorphosis: As Gregor Samsa awoke one morning from uneasy dreams he found himself transformed in his bed into a gigantic insect. The proper noun Gregor Samsa is the antecedent to the anaphor he, and the relation between them is termed anaphora. A thorny problem for those interested in modeling human language capabilities on computers is determining what an anaphor (the pointing phrase) picks out as its antecedent (the pointed-to phrase). Much of the difficulty arises in picking the right antecedent when there are many to choose from. Since anaphora mediates the interrelatedness of language, accurately modeling this phenomenon paves the way to improvements in information retrieval, natural language interfaces and topic