Main Conference Program

Total Page:16

File Type:pdf, Size:1020Kb

Main Conference Program Main Conference Program Monday, June 5 9:00–9:10 Opening Session 9:10–10:10 Keynote Speaker I: Joshua Goodman Email and Spam and Spim and Spat 10:10–10:40 Break Machine Translation I 10:40–11:05 Capitalizing Machine Translation Wei Wang, Kevin Knight and Daniel Marcu 11:05–11:30 Do we need phrases? Challenging the conventional wisdom in Statistical Machine Translation Chris Quirk and Arul Menezes 11:30–11:55 Improved Statistical Machine Translation Using Paraphrases Chris Callison-Burch, Philipp Koehn and Miles Osborne 11:55–12:20 Segment Choice Models: Feature-Rich Models for Global Distortion in Statistical Machine Translation Roland Kuhn, Denis Yuen, Michel Simard, Patrick Paul, George Foster, Eric Joanis and Howard Johnson Inference and Entailment 10:40–11:05 Effectively Using Syntax for Recognizing False Entailment Rion Snow, Lucy Vanderwende and Arul Menezes 11:05–11:30 Learning to recognize features of valid textual entailments Bill MacCartney, Trond Grenager, Marie-Catherine de Marneffe, Daniel Cer and Christopher D. Manning 11:30–11:55 Acquisition of Verb Entailment from Text Viktor Pekar 11:55–12:20 Acquiring Inference Rules with Temporal Constraints by Using Japanese Coordi- nated Sentences and Noun-Verb Co-occurrences Kentaro Torisawa Monday, June 5 (continued) Named Entity Recognition 10:40–11:05 Role of Local Context in Automatic Deidentification of Ungrammatical, Fragmented Text Tawanda Sibanda, Ozlem Uzuner and Ozlem Uzuner 11:05–11:30 Exploiting Domain Structure for Named Entity Recognition Jing Jiang and ChengXiang Zhai 11:30–11:55 Named Entity Transliteration and Discovery from Multilingual Comparable Corpora Alexandre Klementiev and Dan Roth 11:55–12:20 Reducing Weight Undertraining in Structured Discriminative Learning Charles Sutton, Michael Sindelar and Andrew McCallum 12:20–1:50 Lunch Short Papers: Machine Translation, Multi-Lingual Speech 1:50–2:05 Spectral Clustering for Example Based Machine Translation Rashmi Gangadharaiah, Ralf Brown and Jaime Carbonell 2:05–2:20 Bridging the Inflection Morphology Gap for Arabic Statistical Machine Translation Andreas Zollmann, Venugopal Ashish and Vogel Stephan 2:20–2:35 Arabic Preprocessing Schemes for Statistical Machine Translation Nizar Habash and Fatiha Sadat 2:35–2:50 Thai Grapheme-Based Speech Recognition Paisarn Charoenpornsawat, Sanjika Hewavitharana and Tanja Schultz 2:50–3:05 Story Segmentation of Broadcast News in English, Mandarin and Arabic Andrew Rosenberg and Julia Hirschberg 3:05–3:20 Word Pronunciation Disambiguation using the Web Eiichiro Sumita and Fumiaki Sugaya Monday, June 5 (continued) Short Papers: Discourse/Dialogue 1:50–2:05 Agreement/Disagreement Classification: Exploiting Unlabeled Data using Contrast Clas- sifiers Sangyun Hahn, Richard Ladner and Mari Ostendorf 2:05–2:20 Using Phrasal Patterns to Identify Discourse Relations Manami Saito, Kazuhide Yamamoto and Satoshi Sekine 2:20–2:35 Evaluating Centering for Sentence Ordering in Two New Domains Nikiforos Karamanis 2:35–2:50 Computational Modelling of Structural Priming in Dialogue David Reitter, Frank Keller and Johanna D. Moore 2:50–3:05 Museli: A Multi-Source Evidence Integration Approach to Topic Segmentation of Sponta- neous Dialogue Jaime Arguello and Carolyn Rose 3:05–3:20 Automatic Recognition of Personality in Conversation Franois Mairesse and Marilyn Walker Short Papers: Retrieval, Language Models 1:50–2:05 Using the Web to Disambiguate Acronyms Eiichiro Sumita and Fumiaki Sugaya 2:05–2:20 Lycos Retriever: An Information Fusion Engine Brian Ulicny 2:20–2:35 BioEx: A Novel User-Interface that Accesses Images from Abstract Sentences Hong Yu and Minsuk Lee 2:35–2:50 Selecting relevant text subsets from web-data for building topic specific language models Abhinav Sethy, Panayiotis Georgiou and Shrikanth Narayanan 2:50–3:05 Factored Neural Language Models Andrei Alexandrescu and Katrin Kirchhoff 3:05–3:20 Quantitative Methods for Classifying Writing Systems Gerald Penn and Travis Choma 3:20–3:50 Break Monday, June 5 (continued) Word Alignment 3:50–4:15 A Maximum Entropy Approach to Combining Word Alignments Necip Fazil Ayan and Bonnie J. Dorr 4:15–4:40 Alignment by Agreement Percy Liang, Ben Taskar and Dan Klein 4:40–5:05 Word Alignment via Quadratic Assignment Simon Lacoste-Julien, Ben Taskar, Dan Klein and Michael I. Jordan Semantics I 3:50–4:15 An Empirical Study of the Behavior of Active Learning for Word Sense Disambiguation Jinying Chen, Andrew Schein, Lyle Ungar and Martha Palmer 4:15–4:40 Unknown word sense detection as outlier detection Katrin Erk 4:40–5:05 Understanding Temporal Expressions in Emails Benjamin Han, Donna Gates and Lori Levin Parsing I 3:50–4:15 Partial Training for a Lexicalized-Grammar Parser Stephen Clark and James Curran 4:15–4:40 Effective Self-Training for Parsing David McClosky, Eugene Charniak and Mark Johnson 4:40–5:05 Multilingual Dependency Parsing using Bayes Point Machines Simon Corston-Oliver, Anthony Aue, Kevin Duh and Eric Ringger Tuesday, June 6 Parsing II 9:00–9:25 Multilevel Coarse-to-Fine PCFG Parsing Eugene Charniak, Mark Johnson, Micha Elsner, Joseph Austerweil, David Ellis, Isaac Haxton, Catherine Hill, R. Shrivaths, Jeremy Moore, Michael Pozar and Theresa Vu 9:25–9:50 A Fully-Lexicalized Probabilistic Model for Japanese Syntactic and Case Structure Anal- ysis Daisuke Kawahara and Sadao Kurohashi 9:50–10:15 Fully Parsing the Penn Treebank Ryan Gabbard, Seth Kulick and Mitchell Marcus Discourse 9:00–9:25 Exploiting Semantic Role Labeling, WordNet and Wikipedia for Coreference Resolution Simone Paolo Ponzetto and Michael Strube 9:25–9:50 Identifying and Analyzing Judgment Opinions Soo-Min Kim and Eduard Hovy 9:50–10:15 Learning to Detect Conversation Focus of Threaded Discussions Donghui Feng, Erin Shaw, Jihie Kim and Eduard Hovy Spoken and Acoustic Aspects of Language 9:00–9:25 Towards Automatic Scoring of Non-Native Spontaneous Speech Klaus Zechner and Isaac Bejar 9:25–9:50 Unsupervised and Semi-supervised Learning of Tone and Pitch Accent Gina-Anne Levow 9:50–10:15 Learning Pronunciation Dictionaries: Language Complexity and Word Selection Strate- gies John Kominek and Alan W Black 10:15–10:45 Break Tuesday, June 6 (continued) Machine Translation II 10:45–11:10 Relabeling Syntax Trees to Improve Syntax-Based Machine Translation Quality Bryant Huang and Kevin Knight 11:10–11:35 Grammatical Machine Translation Stefan Riezler and John T. Maxwell III 11:35–12:00 Synchronous Binarization for Machine Translation Hao Zhang, Liang Huang, Daniel Gildea and Kevin Knight Dialogue 10:45–11:10 Modelling User Satisfaction and Student Learning in a Spoken Dialogue Tutoring System with Generic, Tutoring, and User Affect Parameters Kate Forbes-Riley and Diane Litman 11:10–11:35 Comparing the Utility of State Features in Spoken Dialogue Using Reinforcement Learn- ing Joel Tetreault and Diane Litman 11:35–12:00 Backoff Model Training using Partially Observed Data: Application to Dialog Act Tagging Gang Ji and Jeff Bilmes Relation Extraction 10:45–11:10 Exploring Syntactic Features for Relation Extraction using a Convolution Tree Kernel Min Zhang, Jie Zhang and Jian Su 11:10–11:35 Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text Aron Culotta, Andrew McCallum and Jonathan Betz 11:35–12:00 Preemptive Information Extraction using Unrestricted Relation Discovery Yusuke Shinyama and Satoshi Sekine 12:00–1:30 Lunch Tuesday, June 6 (continued) Best Paper And Plenary Demo Presentations 1:30–2:00 Probabilistic Context-Free Grammar Induction Based on Structural Zeros Mehryar Mohri and Brian Roark 2:00–2:30 Prototype-Driven Learning for Sequence Models Aria Haghighi and Dan Klein 2:30–3:00 Plenary demos InfoMagnets: Making Sense of Corpus Data Jamie Arguello and Carolyn Rose Question Answering with Web, Mobile and Speech Interfaces Edward Whittaker, Joanna Mrozinski, and Sadaoki Furui From Pipedreams to Products and Promise! Janet Baker and Patri Pugliese 3:00–3:15 Break Tuesday, June 6 (continued) Posters and Demos 3:15–5:15 Posters A Comparison of Tagging Strategies for Statistical Information Extraction Christian Siefkes A Finite-State Model of Georgian Verbal Morphology Olga Gurevich A Maximum Entropy Framework that Integrates Word Dependencies and Grammatical Relations for Reading Comprehension Kui Xu, Helen Meng and Fuliang Weng Answering the question you wish they had asked: The impact of paraphrasing for Question Answering Pablo Duboue and Jennifer Chu-Carroll Comparing the roles of textual, acoustic and spoken-language features on spontaneous- conversation summarization Xiaodan Zhu and Gerald Penn Evolving optimal inspectable strategies for spoken dialogue systems Dave Toney, Johanna Moore and Oliver Lemon Exploiting Variant Corpora for Machine Translation Michael Paul and Eiichiro Sumita Gesture Improves Coreference Resolution Jacob Eisenstein and Randall Davis Illuminating Trouble Tickets with Sublanguage Theory Svetlana Symonenko, Steven Rowe and Elizabeth D. Liddy Improved Affinity Graph Based Multi-Document Summarization Xiaojun Wan and Jianwu Yang Investigating Cross-Language Speech Retrieval for a Spontaneous Conversational Speech Collection Diana Inkpen, Muath Alzghool, Gareth Jones and Douglas Oard Tuesday, June 6 (continued) Measuring Semantic Relatedness Using People and WordNet Beata Beigman Klebanov MMR-based Active Machine Learning for Bio Named Entity Recognition Seokhwan Kim, Yu Song, Kyungduk Kim, Jeong-Won Cha and Gary Geunbae Lee NER Systems that Suit User’s Preferences: Adjusting
Recommended publications
  • Proactive Learning for Building Machine Translation Systems for Minority Languages
    Proactive Learning for Building Machine Translation Systems for Minority Languages Vamshi Ambati Jaime Carbonell [email protected] [email protected] Language Technologies Institute Language Technologies Institute Carnegie Mellon University Carnegie Mellon University Pittsburgh, PA 15213, USA Pittsburgh, PA 15213, USA Abstract based MT systems require lexicons that provide cov- erage for the target translations, synchronous gram- Building machine translation (MT) for many mar rules that define the divergences in word-order minority languages in the world is a serious across the language-pair. In case of minority lan- challenge. For many minor languages there is guages one can only expect to find meagre amount little machine readable text, few knowledge- of such data, if any. Building such resources effec- able linguists, and little money available for MT development. For these reasons, it be- tively, within a constrained budget, and deploying an comes very important for an MT system to MT system is the need of the day. make best use of its resources, both labeled We first consider ‘Active Learning’ (AL) as a and unlabeled, in building a quality system. framework for building annotated data for the task In this paper we argue that traditional active learning setup may not be the right fit for seek- of MT. However, AL relies on unrealistic assump- ing annotations required for building a Syn- tions related to the annotation tasks. For instance, tax Based MT system for minority languages. AL assumes there is a unique omniscient oracle. In We posit that a relatively new variant of active MT, it is possible and more general to have multiple learning, Proactive Learning, is more suitable sources of information with differing reliabilities or for this task.
    [Show full text]
  • Jaime Carbonell Massively Multilingual Language
    Jaime Carbonell Carnegie Mellon University Massively Multilingual Language Technologies The so-called "digital divide" addressed the challenge of poorer people not having access to the internet and all the myriad resources that come from being digitally connected. With the advent and spread of affordable smart phones, the present challenge is the "linguistic divide" where speakers of rare languages, often in poorer nations or regions, lack access to the vast content of information contained only in major languages. We -- CMU and InterAct -- have long sought solutions, via machine translation and via methods that extend to rare languages with minimal on-line data, well beyond the major languages with economic clout that the ma- jor search engines address. The presentation outlines some of the challenges and methods to bridge the linguistic divide. Dr. Jaime Carbonell is the Director of the Language Technologies Institute and Allen Newell Professor of Computer Science at Carnegie Mellon University. He received SB degrees in Physics and Mathe- matics from MIT, and MS and PhD degrees in Computer Science from Yale University. His current research includes machine learning, data / text mining, machine translation, rare-language analysis and computational proteomics. He invented Proactive Machine Learning, including its underlying decision-theoretic framework. He is also known for the Maximal Marginal Relevance principle in information retrieval, for derivational analogy in problem solving and for example-based machine translation and for machine learning in structural biology, and in protein interaction networks. Overall, he has published some 350 papers and books and supervised some 60 PhD dissertations. .
    [Show full text]
  • Information Extraction Lecture 5 – Named Entity Recognition III
    Information Extraction Lecture 5 – Named Entity Recognition III CIS, LMU München Winter Semester 2016-2017 Dr. Alexander Fraser, CIS Administravia • Seminar • There is now a LaTeX template for the Hausarbeit on the Seminar web page • Please don't forget to send me your presentation (as a PDF) after giving it • And, as you know, the Hausarbeit is due 3 weeks after your presentation! 2 Outline • IE end-to-end • Introduction: named entity detection as a classification problem 3 CMU Seminars task • Given an email about a seminar • Annotate – Speaker – Start time – End time – Location CMU Seminars - Example <[email protected] (Jaime Carbonell).0> Type: cmu.cs.proj.mt Topic: <speaker>Nagao</speaker> Talk Dates: 26-Apr-93 Time: <stime>10:00</stime> - <etime>11:00 AM</etime> PostedBy: jgc+ on 24-Apr-93 at 20:59 from NL.CS.CMU.EDU (Jaime Carbonell) Abstract: <paragraph><sentence>This Monday, 4/26, <speaker>Prof. Makoto Nagao</speaker> will give a seminar in the <location>CMT red conference room</location> <stime>10</stime>-<etime>11am</etime> on recent MT research results</sentence>.</paragraph> IE Template Slot Name Value Speaker Prof. Makoto Nagao Start time 1993-04-26 10:00 End time 1993-04-26 11:00 Location CMT red conference room Message Identifier (Filename) [email protected]. EDU (Jaime Carbonell).0 • Template contains *canonical* version of information • There are several "mentions" of speaker, start time and end- time (see previous slide) • Only one value for each slot • Location could probably also
    [Show full text]
  • Probabilistic Approaches for Answer Selection in Multilingual Question Answering
    Probabilistic Approaches for Answer Selection in Multilingual Question Answering Jeongwoo Ko Aug 27 2007 Language Technologies Institute School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 Thesis Committee: Eric Nyberg (Chair) Teruko Mitamura Jaime Carbonell Luo Si (Purdue University) Copyright c 2007 Jeongwoo Ko This work was supported in part by ARDA/DTO Advanced Question Answering for Intelligence (AQUAINT) program award number NBCHC040164 and used the NTCIR 5-6 corpus. Keywords: Answer ranking, answer selection, probabilistic framework, graphi- cal model, multilingual question answering To my family for love and support. iv Abstract Question answering (QA) aims at finding exact answers to a user’s natural language question from a large collection of documents. Most QA systems combine information retrieval with extraction techniques to iden- tify a set of likely candidates and then utilize some selection strategy to generate the final answers. This selection process can be very challenging, as it often entails ranking the relevant answers to the top positions. To address this challenge, many QA systems have incorporated semantic re- sources for answer ranking in a single language. However, there has been little research on a generalized probabilistic framework that models the correctness and correlation of answer candidates for multiple languages. In this thesis, we propose two probabilistic models for answer ranking: independent prediction and joint prediction. The independent prediction model directly estimates the probability of an individual answer candi- date given the degree of answer relevance and the amount of supporting evidence provided in a set of answer candidates. The joint prediction model uses an undirected graph to estimate the joint probability of all answer candidates, from which the probability of an individual candidate is inferred.
    [Show full text]
  • Active Learning
    AI, Machine Learning and Language Technologies: What’s Next? Army Mad Scientists Program March 7, 2017 Jaime Carbonell, and colleagues School of Computer Science Carnegie Mellon University www.cs.cmu.edu/~jgc AI Touches Virtually All Areas of Computer Science Systems Entertain Lang + Theory Tech Tech Fine Arts Comp Artificial Intelligence Bio Sciences Machine Learning Human-Comp Interaction Robotics Computer Humanities Science Engineering 3/6/2017 Jaime G. Carbonell, Language 2 Technolgies Institute AI is Becoming Central to the World Economy (Davos 2016) “The fourth Industrial Revolution” is characterized by: n “Ubiquitous and mobile internet”, n “Smaller and more powerful sensors”, n “Artificial Intelligence”, and n “Machine Learning” -- Prof. Klaus Schwab, founder of the Davos World Economic Forum, 2016 3/6/2017 Jaime G. Carbonell, Language 3 Technolgies Institute Key Components of AI o Automated Perception n Vision, sonar, lidar, haptics, … o Robotic Action n Locomotion, manipulation, … o Deep Reasoning n Planning, goal-oriented behavior, projection, … o Language Technologies n Language, speech, dialog, social nets, … o Machine Learning n Adaptation, reflection, knowledge acquisition, … o Big Data 3/6/2017 Jaime G. Carbonell, Language 4 Technolgies Institute Key Components of AI o Automated Perception n Vision, sonar, lidar, haptics, … o Robotic Action n Locomotion, manipulation, … o Deep Reasoning n Planning, goal-oriented behavior, projection, … o Language Technologies n Language, speech, dialog, social nets, … o Machine Learning Today’s main focus n Adaptation, reflection, knowledge acquisition, … My research o Big Data 3/6/2017 Jaime G. Carbonell, Language 5 Technolgies Institute How Big is Big? Dimensions of Big Data Analytics LARGE-SCALE : TERABYTES PETABYTES EXOBYTES Billions++ of entries: Terabyes/Petabyes of data HIGH-COMPLEXITY Trillions of potential relations among entries (graphs) HIGH-DIMENSIONAL Millions of attributes per entry (but typically sparse encoding) 3/6/2017 Jaime G.
    [Show full text]
  • Active Learning and Crowd-Sourcing for Machine Translation
    Active Learning and Crowd-Sourcing for Machine Translation Vamshi Ambati, Stephan Vogel, Jaime Carbonell Language Technologies Institute Carnegie Mellon University, Pittsburgh, PA 15213, USA fvamshi,vogel,[email protected] Abstract In recent years, corpus based approaches to machine translation have become predominant, with Statistical Machine Translation (SMT) being the most actively progressing area. Success of these approaches depends on the availability of parallel corpora. In this paper we propose Active Crowd Translation (ACT), a new paradigm where active learning and crowd-sourcing come together to enable automatic translation for low-resource language pairs. Active learning aims at reducing cost of label acquisition by prioritizing the most informative data for annotation, while crowd-sourcing reduces cost by using the power of the crowds to make do for the lack of expensive language experts. We experiment and compare our active learning strategies with strong baselines and see significant improvements in translation quality. Similarly, our experiments with crowd-sourcing on Mechanical Turk have shown that it is possible to create parallel corpora using non-experts and with sufficient quality assurance, a translation system that is trained using this corpus approaches expert quality. 1. Introduction tive learning approaches help us identify sentences, which Corpus based approaches to automatic translation like Ex- if translated have the potential to provide maximal improve- ample Based and Statistical Machine Translation systems ment to an existing system. Crowd-sourcing techniques, on use large amounts of parallel data created by humans to the other hand help us reach a significant number of trans- train mathematical models for automatic language transla- lators at very low costs.
    [Show full text]
  • Jaime G. Carbonell -- Curriculum Vita
    Jaime G. Carbonell -- Curriculum Vita Business Address Language Technologies Institute Carnegie Mellon University Pittsburgh, Pennsylvania 15213 USA Telephone: (412) 268-7279 Fax: (412) 268-6298 Email: [email protected] http://www.cs.cmu.edu/~jgc/ Personal: Born: July 29, 1953. Citizenship: USA Academic and Professional Positions Held 1996 - Present Director, Language Technologies Institute, Carnegie Mellon University 1995 - Present Allen Newell Professor of Computer Science, Carnegie Mellon University 2006 - Present Adjunct Faculty, Dept of Computational Biology, U Pittsburgh Medical School. 2003 - 2009 Chief Scientist, Meaningful Machines Inc. 2001 - 2011 Co-Founder, Board chairman, Carnegie Speech Incorporated 1995 - 2003 Co-Founder, Board chairman, Wisdom Technologies Corporation 1986 - 1996 Director, Center for Machine Translation, Carnegie Mellon University 1987 - 1995 Professor of Computer Science, Carnegie Mellon University 1983 - 1987 Associate Professor of Computer Science, Carnegie Mellon University 1983 - 1996 Co-Founder, Director, Scientific advisor, Carnegie Group Inc. 1979 - 1983 Assistant Professor of Computer Science, Carnegie Mellon University 1975 - 1978 Research Fellow in Computer Science, Yale University 1975 - 1977 Teaching Assistant in Computer Science, Yale University 1974 - 1975 Research Assistant, Center for Space Research, MIT 1971 - 1975 Research Programmer, Artificial Intelligence, BBN, Cambridge, MA Education 1975 - 1979 Yale University PhD 1979 – Computer Science MPhil 1977 - Computer Science MS 1976 - Computer Science 1971 - 1975 Massachusetts Institute of Technology SB 1975 - Physics SB 1975 - Mathematics Professional Organizations and Activities Member: ACM (elected Chair of SIGART 1983-85), ACL, AAAI (AAAI Fellow 1988-present, AAAI executive committee 1990-92), Cognitive Science Society, Sigma Xi, AMTA. Government Committees: NIH Human Genome Scientific Advisory Committee, aka "Watson Committee" (1988- 1992).
    [Show full text]
  • Naacl Hlt 2010
    NAACL HLT 2010 Workshop on Active Learning for Natural Language Processing (ALNLP-10) Proceedings of the Workshop June 6, 2010 Los Angeles, California USB memory sticks produced by Omnipress Inc. 2600 Anderson Street Madison, WI 53707 USA c 2010 The Association for Computational Linguistics Association for Computational Linguistics (ACL) 209 N. Eighth Street Stroudsburg, PA 18360 USA Tel: +1-570-476-8006 Fax: +1-570-476-0860 [email protected] ii Introduction Labeled training data is often required to achieve state-of-the-art performance in machine learning solutions to natural language tasks. While traditional supervised learning relies on existing labeled data, active learning algorithms may select unlabeled data for labeling with the goal of reducing annotation costs, while maintaining high accuracy. Thus, active learning has emerged as a promising framework for NLP applications and annotation projects where unlabeled data is readily available (e.g., web pages, audio recordings, minority language data), but obtaining labels is cost-prohibitive. The 2010 Workshop on Active Learning for Natural Language Processing (ALNLP) is the sequel to a successful 2009 meeting of the same name—both co-located with the Annual Conference of the North American Chapter of the Association for Computational Linguistics—with the intent of exploring key practical and theoretical aspects of active learning applied to real-world NLP problems. As we assembled the program committee, the topic’s timeliness resonated with researchers both in machine learning (who see natural language as an important application domain for active learning), and with those in language technologies (who see maturing active learning methods as an important part of their toolbox).
    [Show full text]
  • The Natural Language Sourcebook
    National Center for Research on Evaluation, C R E S S T Standards, and Student Testing Technical Report You can view this document on your screen or print a copy. UCLA Center for the Study of Evaluation in collaboration with: University of Colorado NORC, University of Chicago LRDC, University of Pittsburgh The RAND Corporation NATURAL LANGUAGE SOURCEBOOK Walter Read, Michael Dyer, Eva Baker, Patricia Mutch Frances Butler, Alex Quilici, John Reeves Center for Technology Assessment Center for the Study of Evaluation and Computer Science Department UCLA 1990 This overall project was directed by Eva L. Baker, funded by DARPA and administered by ONR, Contract Number N00014-86-K- 0395. We wish to thank the following people for assistance Craig Fields, MCC, Susan Chipman, ONR, Steve Kaisler, formerly of DARPA, Bob Simpson and Charles Wayne of DARPA, and, at UCLA, Sherry Miranda and Richard Seligman of the Office of Contracts and Grants Administration. In addition, the authors would like to thank the following people for their contributions at various stages in the development of the Natural Language Sourcebook: Kathleen Brennan, Rory Constancio, Cheryl Fantuzzi, Jana Good, Dayle Hartnett, Jill Fain Lehman, Elaine Lindheim, Mark Neder, and Jean Turner. Finally, Jaime Carbonell, Martha Evens, Evelyn Hatch, David Kieras, Carol Lord, and Merlin Wittrock served as consultants for the Sourcebook. Their suggestions were invaluable to the authors. © 1990 by the Regents of the University of California Table of Contents Introduction .............................................. 1 Sourcebook classification ................................. 13 Group I: Single-utterance issues ......................... 14 Identification of syntactic units .................... 15 Ambiguity ............................................ 23 Modifier attachment .................................. 50 Reference ............................................ 61 Metaphor and novel language .........................
    [Show full text]
  • Jaime G. Carbonell -- Curriculum Vita
    Jaime G. Carbonell -- Curriculum Vita Business Address Language Technologies Institute Carnegie Mellon University Pittsburgh, Pennsylvania 15213 USA Telephone: (412) 268-7279 Fax: (412) 268-6298 Email: [email protected] http://www.cs.cmu.edu/~jgc/ Citizenship: USA Academic and Professional Positions Held 2012 - Present University Professor, School of Computer Science, Carnegie Mellon University 1996 - Present Director, Language Technologies Institute, Carnegie Mellon University 1995 - Present Allen Newell Professor of Computer Science, Carnegie Mellon University 2006 - Present Adjunct Faculty, Dept of Computational Biology, U Pittsburgh Medical School. 2001 – Present Co-Founder, Board chairman, Carnegie Speech Corporation 1995 - 2003 Co-Founder, Board chairman, Wisdom Technologies Corporation 1986 - 1996 Director, Center for Machine Translation, Carnegie Mellon University 1987 - 1995 Professor of Computer Science, Carnegie Mellon University 1983 - 1987 Associate Professor of Computer Science, Carnegie Mellon University 1983 - 1996 Co-Founder, Director, Scientific advisor, Carnegie Group Inc. 1979 - 1983 Assistant Professor of Computer Science, Carnegie Mellon University 1975 - 1978 Research Fellow in Computer Science, Yale University 1975 - 1977 Teaching Assistant in Computer Science, Yale University 1974 - 1975 Research Assistant, Center for Space Research, MIT 1971 - 1975 Research Programmer, Artificial Intelligence, BBN, Cambridge, MA Education 1975 - 1979 Yale University PhD 1979 – Computer Science MPhil 1977 - Computer Science MS 1976 - Computer Science 1971 - 1975 Massachusetts Institute of Technology SB 1975 - Physics SB 1975 - Mathematics Professional Organizations and Activities Member: ACM (elected Chair of SIGART 1983-85), ACL, AAAI (AAAI Fellow 1988-present, AAAI executive committee 1990-92), Cognitive Science Society, Sigma Xi, AMTA. Government and other Committees: NSF/CISE Scientific Advisory Committee (2010-2014), NIH Human Genome Scientific Advisory Committee, aka "Watson Committee" (1988-1992).
    [Show full text]
  • Manuela M. Veloso
    Manuela M. Veloso Computer Science Department Home address: Carnegie Mellon University 6645 Woodwell Street Pittsburgh PA 15213-3819 Pittsburgh PA 15217 tel: (412) 268-1474 Citizenship: USA email: [email protected] Birth: Lisbon, Portugal, August 12, 1957 http://www.cs.cmu.edu/˜mmv/ Married; two sons, born 1981 and 1987 Academic Positions July 02 - Professor, Computer Science Department, Carnegie Mellon University July 99 - July 02 Tenured Associate Professor, Computer Science Department, Carnegie Mellon University Aug 99 - Aug 00 Visiting Associate Professor, Electrical Engineering and Computer Science Department, AI Lab, Massachusetts Institute of Technology, on sabbatical leave from Carnegie Mellon University. July 97 - July 99 Associate Professor, Computer Science Department, Carnegie Mellon University Sep 92 - June 97 Assistant Professor Computer Science Department, Carnegie Mellon University Aug 87 - Aug 92 Research Assistant in Computer Science, Carnegie Mellon University Jan 85 - Jul 86 Teaching Assistant/Lecturer in Computer Science, Boston University Oct 80 - Jul 84 Teaching Assistant/Lecturer in Electrical and Computer Engineering, Instituto Superior Tecnico,´ Lisbon, Portugal Education 1992 Ph.D. in Computer Science School of Computer Science, Carnegie Mellon University, Pittsburgh 1986 M.A. in Computer Science Computer Science Department, Boston University, Boston 1984 M.Sc. in Electrical and Computer Engineering Electrical and Computer Engineering Department, Instituto Superior Tecnico,´ Lisbon, Portugal 1980 Licenciatura
    [Show full text]
  • Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics Short Papers
    HLT-NAACL 2006 Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics Short Papers Robert C. Moore, General Chair Jeff Bilmes, Jennifer Chu-Carroll and Mark Sanderson Program Committee Chairs June 4-9, 2006 New York, New York, USA Published by the Association for Computational Linguistics http://www.aclweb.org Production and Manufacturing by Omnipress Inc. 2600 Anderson Street Madison, WI 53704 c 2006 The Association for Computational Linguistics Order copies of this and other ACL proceedings from: Association for Computational Linguistics (ACL) 209 N. Eighth Street Stroudsburg, PA 18360 USA Tel: +1-570-476-8006 Fax: +1-570-476-0860 [email protected] ii Table of Contents Factored Neural Language Models Andrei Alexandrescu and Katrin Kirchhoff. .1 The MILE Corpus for Less Commonly Taught Languages Alison Alvarez, Lori Levin, Robert Frederking, Simon Fung, Donna Gates and Jeff Good. .5 Museli: A Multi-Source Evidence Integration Approach to Topic Segmentation of Spontaneous Dialogue Jaime Arguello and Carolyn Rose . 9 Measuring Semantic Relatedness Using People and WordNet Beata Beigman Klebanov . 13 Thai Grapheme-Based Speech Recognition Paisarn Charoenpornsawat, Sanjika Hewavitharana and Tanja Schultz . 17 Class Model Adaptation for Speech Summarisation Pierre Chatain, Edward Whittaker, Joanna Mrozinski and Sadaoki Furui . 21 Semi-supervised Relation Extraction with Label Propagation Jinxiu Chen, Donghong Ji, Chew Lim Tan and Zhengyu Niu . 25 Temporal Classification of Text and Automatic Document Dating Angelo Dalli . 29 Answering the question you wish they had asked: The impact of paraphrasing for Question Answering Pablo Duboue and Jennifer Chu-Carroll . 33 Gesture Improves Coreference Resolution Jacob Eisenstein and Randall Davis .
    [Show full text]