The Potential of the Computational Linguistic Analysis of Social Media

Total Page:16

File Type:pdf, Size:1020Kb

The Potential of the Computational Linguistic Analysis of Social Media The Potential of the Computational Linguisti c Analysis of Social Media for Population Studies Letizia Mencarini Bocconi University, Dondena Centre for Research on Social Dynamics and Public Policy via Roentgen 1, 20136 Milan, I. [email protected] Abstract demographic studies took place. Rather than simply describing demographic patterns, today The paper provides an outline of the scope demographers are equally concerned in under- for synergy between computational lin- standing both the drivers and the consequences of guistic analysis and population studies. It demographic processes. In doing so, demogra- first reviews where population studies stand in terms of using social media data. phers have assembled an enormously rich set of Demographers are entering the realm of data for explaining not only population processes, big data in force. But, this paper argues, but also the motivational and behavioral drivers population studies have much to gain from behind these processes. However, data generated computational linguistic analysis, especial- by surveys may have peaked. As survey and poll- ly in terms of explaining the drivers be- ing agencies struggle with increasing costs and hind population processes. The paper gives declining survey response rates, statistic produc- two examples of how the method can be ers are increasingly looking towards big data. applied, and concludes with a fundamental Still, given their quantitative pedigree, demogra- caveat. Ye s, computational linguistic anal- phers are perhaps better placed than most other ysis provides a possible key for integrating micro theory into any demographic analy- social scientists to take on the challenge of the sis of social media data. But results may new big data revolution. be of little value in as much as knowledge Demographers are, in fact, already using big about fundamental sample characteristics data to describe demographic processes, including are unknown. data derived from social media. But there are chal- lenges. Big data is messy and unstructured, and this 1 The incomplete data revolution in is a considerable challenge for a scientific field demography acutely concerned with representativeness and un- biased estimation. Social media provides a promis- Demography is the study of population. Tradi- ing avenue, however, as demographers are interest- tionally, demography is concerned with measur- ed not only in describing population processes, but ing and estimating population change by births, also in the motivations that individuals have for deaths and migration. Demography is rooted in their behavior, which, ultimately, generates ob- quantitative methods, with data at its heart. As the served population processes. For demographers in field moved through different epochs of data search of the determinants and consequences of availability, in demography data have always demographic behavior, the linguistic analysis of so- been "big" (Billari and Zagheni, 2017). Starting cial media texts can offer a precious and rich new with the exercise of mapping macro-level trends source. Caution – as this paper highlights – is nec- through population level parameters, based large- essary, since its non-representativeness and partiali- ly on census and administrative records, the field ty makes it problematic in social-science terms. became more theory driven as individual data be- The rapid emergence of big data from social came available. It is fair to say that with the ex- media outpaced social scientists’ capacity for us- plosion in available survey data, a revolution in ing and analyzing them. That having been said, 62 Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media, pages 62–68 New Orleans, Louisiana, June 6, 2018. c 2018 Association for Computational Linguistics demographers have made a start in exploiting so- cent decades. The theory stems from Inglehart’s cial media data. For example, Reis and Brown- work (1971). He argued that with the onset of stein (2010) show that the volume of Internet modernization, individuals now cared more about searches for abortion is inversely proportional to self-realization and less about traditional family local abortion rates and directly proportional to life, which consequently fostered new demo- local restrictions on abortion. Billari et al. (2013) graphic behaviors, such as out-of-wedlock show that Google searches for fertility-related childbearing, cohabitation replacing marriage and queries, like ‘pregnancy’ or ‘birth’, can be used to fertility decline. In other words, values, attitudes predict fertility intentions and consequently fertil- and opinions, play a critical role. Another example ity rates, several months ahead of them being concerns the theoretical concept of gender equali- made public through other data sources. Ojala et ty and equity. As women increasingly attain the al. (2017) use Google Correlate to detect evi- same levels of higher education as men their atti- dence for different socio-economic contexts relat- tudes change. Other than having children, they al- ed to fertility (e.g., teen fertility, fertility in high so want fulfilling work careers (Esping-Andersen income households, etc.). Email data have been and Billari, 2015; Aassve et al., 2015). The sense used to track migrants (Zagheni and Weber, of gender equity (Mencarini, 2014) changes as 2012); Facebook data to monitor migrant stocks women reach men’s level in terms of education, (Zagheni et al., 2017); patterns of short- and long- but traditional attitudes may prevail within house- term migration (Zagheni et al., 2014); and family holds. If so, there is a mismatch between gender change have been derived from Twitter data (Bil- equity and actual equality, which, McDonald lari et al., 2017). These applications are im- (2000) argues, creates a gender conflict, which portant, and have demonstrated that the combina- eventually leads to lower fertility. Yet, another tion of survey and internet data improve predic- important theoretical concept originates in eco- tive power and the accuracy of the described de- nomics. Economic models are used to explain mographic phenomena. Billari and Zagheni changes in divorce, migration drivers, and fertility (2017) triumphally affirm that the Data Revolu- and so forth. Starting with individual preferences, tion is already here for the study of population behavior come out through a process of decision processes. However, these studies are all ulti- making, where individuals’ (presumed) rational mately about describing demographic processes. evaluations are made in order to maximize their So far, progress in exploiting content analysis of wellbeing. As one moves from survey data to a texts and corpora has been limited, and existing social media corpus, these theoretical concepts of- studies have not yet tackled how social media da- fer both challenges and opportunities. On the one ta can explain the behavioral motivations that hand, new methods, not always familiar to de- drive observed population processes. On this mographers, must be implemented. On the other, point, there is massive potential for synergy be- there is opportunity in the fact that social science tween demography and computational linguistics. theories can show us what one should be looking Certain strands of the social sciences have started for in an otherwise complex and sometimes over- looking in this direction, as there are several ex- whelming amount of data. amples in political science and political economy. 3 Social media linguistic analysis as a 2 Why people’s opinions matter middle ground between qualitative and quantitative analysis In order to exploit social media data to explain the determinants of population processes, one has, One important reason behind the slow progress in perforce, to delve into the behavioral theories the field, is, perhaps, that demographers are more commonly invoked in demographic studies. For confident with the analysis of numbers than with population studies, there is no single theory. In- text: i.e. with quantitative rather than qualitative stead, being an interdisciplinary science, demog- analysis. Or, perhaps, there is still uncertainty and raphers borrow from a host of theoretical concepts suspicion about the extent to which social media from across the social sciences. One example is data can be used to properly infer theoretical con- the Second Demographic Transition theory (Van cepts for demography. Developments are being de Kaa, 1987; Lesthaeghe, 2010), which has been made elsewhere in the social sciences. However, a point of reference in family demography in re- 63 the most prominent examples are based on digit- The challenge lies in how text data can be investi- ized historical texts. The approach taken is similar gated for research questions which require closer to what is being done with social media data, in analysis and nuanced interpretation. But neither the sense that one exploits distributional semantic traditional qualitative approaches requiring the techniques. This is a ‘usage-based’ theory of manual screening and classification of all the ma- meaning built upon similarities of linguistic dis- terial, nor quantitative statistical analysis, can be tributions in a corpus (Lenci 2008), and it allows applied. In this sense, social media data texts pro- for the extraction of (near-) synonyms in a con- vide a middle ground between qualitative studies text-dependent way, for each document and period and
Recommended publications
  • Discussion Notes for Aristotle's Politics
    Sean Hannan Classics of Social & Political Thought I Autumn 2014 Discussion Notes for Aristotle’s Politics BOOK I 1. Introducing Aristotle a. Aristotle was born around 384 BCE (in Stagira, far north of Athens but still a ‘Greek’ city) and died around 322 BCE, so he lived into his early sixties. b. That means he was born about fifteen years after the trial and execution of Socrates. He would have been approximately 45 years younger than Plato, under whom he was eventually sent to study at the Academy in Athens. c. Aristotle stayed at the Academy for twenty years, eventually becoming a teacher there himself. When Plato died in 347 BCE, though, the leadership of the school passed on not to Aristotle, but to Plato’s nephew Speusippus. (As in the Republic, the stubborn reality of Plato’s family connections loomed large.) d. After living in Asia Minor from 347-343 BCE, Aristotle was invited by King Philip of Macedon to serve as the tutor for Philip’s son Alexander (yes, the Great). Aristotle taught Alexander for eight years, then returned to Athens in 335 BCE. There he founded his own school, the Lyceum. i. Aside: We should remember that these schools had substantial afterlives, not simply as ideas in texts, but as living sites of intellectual energy and exchange. The Academy lasted from 387 BCE until 83 BCE, then was re-founded as a ‘Neo-Platonic’ school in 410 CE. It was finally closed by Justinian in 529 CE. (Platonic philosophy was still being taught at Athens from 83 BCE through 410 CE, though it was not disseminated through a formalized Academy.) The Lyceum lasted from 334 BCE until 86 BCE, when it was abandoned as the Romans sacked Athens.
    [Show full text]
  • A Pragmatic Stylistic Framework for Text Analysis
    International Journal of Education ISSN 1948-5476 2015, Vol. 7, No. 1 A Pragmatic Stylistic Framework for Text Analysis Ibrahim Abushihab1,* 1English Department, Alzaytoonah University of Jordan, Jordan *Correspondence: English Department, Alzaytoonah University of Jordan, Jordan. E-mail: [email protected] Received: September 16, 2014 Accepted: January 16, 2015 Published: January 27, 2015 doi:10.5296/ije.v7i1.7015 URL: http://dx.doi.org/10.5296/ije.v7i1.7015 Abstract The paper focuses on the identification and analysis of a short story according to the principles of pragmatic stylistics and discourse analysis. The focus on text analysis and pragmatic stylistics is essential to text studies, comprehension of the message of a text and conveying the intention of the producer of the text. The paper also presents a set of standards of textuality and criteria from pragmatic stylistics to text analysis. Analyzing a text according to principles of pragmatic stylistics means approaching the text’s meaning and the intention of the producer. Keywords: Discourse analysis, Pragmatic stylistics Textuality, Fictional story and Stylistics 110 www.macrothink.org/ije International Journal of Education ISSN 1948-5476 2015, Vol. 7, No. 1 1. Introduction Discourse Analysis is concerned with the study of the relation between language and its use in context. Harris (1952) was interested in studying the text and its social situation. His paper “Discourse Analysis” was a far cry from the discourse analysis we are studying nowadays. The need for analyzing a text with more comprehensive understanding has given the focus on the emergence of pragmatics. Pragmatics focuses on the communicative use of language conceived as intentional human action.
    [Show full text]
  • Mathematics: Analysis and Approaches First Assessments for SL and HL—2021
    International Baccalaureate Diploma Programme Subject Brief Mathematics: analysis and approaches First assessments for SL and HL—2021 The Diploma Programme (DP) is a rigorous pre-university course of study designed for students in the 16 to 19 age range. It is a broad-based two-year course that aims to encourage students to be knowledgeable and inquiring, but also caring and compassionate. There is a strong emphasis on encouraging students to develop intercultural understanding, open-mindedness, and the attitudes necessary for them LOMA PROGRA IP MM to respect and evaluate a range of points of view. B D E I DIES IN LANGUA STU GE ND LITERATURE The course is presented as six academic areas enclosing a central core. Students study A A IN E E N D N DG two modern languages (or a modern language and a classical language), a humanities G E D IV A O L E I W X S ID U IT O T O G E U or social science subject, an experimental science, mathematics and one of the creative IS N N C N K ES TO T I A U CH E D E A A A L F C T L Q O H E S O R I I C P N D arts. Instead of an arts subject, students can choose two subjects from another area. P G E A Y S A E R S It is this comprehensive range of subjects that makes the Diploma Programme a O S E A Y H T demanding course of study designed to prepare students effectively for university entrance.
    [Show full text]
  • Scientific Discovery in the Era of Big Data: More Than the Scientific Method
    Scientific Discovery in the Era of Big Data: More than the Scientific Method A RENCI WHITE PAPER Vol. 3, No. 6, November 2015 Scientific Discovery in the Era of Big Data: More than the Scientific Method Authors Charles P. Schmitt, Director of Informatics and Chief Technical Officer Steven Cox, Cyberinfrastructure Engagement Lead Karamarie Fecho, Medical and Scientific Writer Ray Idaszak, Director of Collaborative Environments Howard Lander, Senior Research Software Developer Arcot Rajasekar, Chief Domain Scientist for Data Grid Technologies Sidharth Thakur, Senior Research Data Software Developer Renaissance Computing Institute University of North Carolina at Chapel Hill Chapel Hill, NC, USA 919-445-9640 RENCI White Paper Series, Vol. 3, No. 6 1 AT A GLANCE • Scientific discovery has long been guided by the scientific method, which is considered to be the “gold standard” in science. • The era of “big data” is increasingly driving the adoption of approaches to scientific discovery that either do not conform to or radically differ from the scientific method. Examples include the exploratory analysis of unstructured data sets, data mining, computer modeling, interactive simulation and virtual reality, scientific workflows, and widespread digital dissemination and adjudication of findings through means that are not restricted to traditional scientific publication and presentation. • While the scientific method remains an important approach to knowledge discovery in science, a holistic approach that encompasses new data-driven approaches is needed, and this will necessitate greater attention to the development of methods and infrastructure to integrate approaches. • New approaches to knowledge discovery will bring new challenges, however, including the risk of data deluge, loss of historical information, propagation of “false” knowledge, reliance on automation and analysis over inquiry and inference, and outdated scientific training models.
    [Show full text]
  • Aristotle's Prime Matter: an Analysis of Hugh R. King's Revisionist
    Abstract: Aristotle’s Prime Matter: An Analysis of Hugh R. King’s Revisionist Approach Aristotle is vague, at best, regarding the subject of prime matter. This has led to much discussion regarding its nature in his thought. In his article, “Aristotle without Prima Materia,” Hugh R. King refutes the traditional characterization of Aristotelian prime matter on the grounds that Aristotle’s first interpreters in the centuries after his death read into his doctrines through their own neo-Platonic leanings. They sought to reconcile Plato’s theory of Forms—those detached universal versions of form—with Aristotle’s notion of form. This pursuit led them to misinterpret Aristotle’s prime matter, making it merely capable of a kind of participation in its own actualization (i.e., its form). I agree with King here, but I find his redefinition of prime matter hasty. He falls into the same trap as the traditional interpreters in making any matter first. Aristotle is clear that actualization (form) is always prior to potentiality (matter). Hence, while King’s assertion that the four elements are Aristotle’s prime matter is compelling, it misses the mark. Aristotle’s prime matter is, in fact, found within the elemental forces. I argue for an approach to prime matter that eschews a dichotomous understanding of form’s place in the philosophies of Aristotle and Plato. In other words, it is not necessary to reject the Platonic aspects of Aristotle’s philosophy in order to rebut the traditional conception of Aristotelian prime matter. “Aristotle’s Prime Matter” - 1 Aristotle’s Prime Matter: An Analysis of Hugh R.
    [Show full text]
  • LING 211 – Introduction to Linguistic Analysis
    LING 211 – Introduction to Linguistic Analysis Section 01: TTh 1:40–3:00 PM, Eliot 103 Section 02: TTh 3:10–4:30 PM, Eliot 103 Course Syllabus Fall 2019 Sameer ud Dowla Khan Matt Pearson pronoun: he or they he office: Eliot 101C Vollum 313 email: [email protected] [email protected] phone: ext. 4018 (503-517-4018) ext. 7618 (503-517-7618) office hours: Wed, 11:00–1:00 Mon, 2:30–4:00 Fri, 1:00–2:00 Tue, 10:00–11:30 (or by appointment) (or by appointment) PREREQUISITES There are no prerequisites for this course, other than an interest in language. Some familiarity with tradi- tional grammar terms such as noun, verb, preposition, syllable, consonant, vowel, phrase, clause, sentence, etc., would be useful, but is by no means required. CONTENT AND FOCUS OF THE COURSE This course is an introduction to the scientific study of human language. Starting from basic questions such as “What is language?” and “What do we know when we know a language?”, we investigate the human language faculty through the hands-on analysis of naturalistic data from a variety of languages spoken around the world. We adopt a broadly cognitive viewpoint throughout, investigating language as a system of knowledge within the mind of the language user (a mental grammar), which can be studied empirically and represented using formal models. To make this task simpler, we will generally treat languages as though they were static systems. For example, we will assume that it is possible to describe a language structure synchronically (i.e., as it exists at a specific point in history), ignoring the fact that languages constantly change over time.
    [Show full text]
  • Computational Linguistics - Nicoletta Calzolari
    LINGUISTICS - Computational Linguistics - Nicoletta Calzolari COMPUTATIONAL LINGUISTICS Nicoletta Calzolari Istituto di Linguistica Computazionale del Consiglio Nazionale delle Ricerche (ILC- CNR), Pisa, Italy Keywords: Computational linguistics, text and speech processing, language technology, language resources. Contents 1. What is Computational Linguistics? 1.1. A few sketches in the history of Computational Linguistics 2. Automatic text processing 2.1. Parsing 2.2. Computational lexicons and ontologies 2.3. Acquisition methodologies 3. Applications 4. Infrastructural Language Resources 4.1. European projects 5. NLP in the global information and knowledge society 5.1. NLP in Europe 5.2. Production and “intelligent” use of the digital content (also multimedia) 6. Future perspectives 6.1. The promotion of national languages in the global society and the new Internet generation Glossary Bibliography Biographical Sketch Summary A brief overview of the field of Computational Linguistics is given. After a few sketches on the short history of the field, we focus on the area of Natural Language Processing providing an outline of some of the main components of computational linguistics systems. It is stressed the important role acquired by Language Resources, such asUNESCO corpora and lexicons, in the last –year s.EOLSS They are the prerequisite and the critical factor for the emergence and the consolidation of the data-driven and statistical approaches, which became undoubtedly predominant in the last decade, and are still the prevailing trendSAMPLE in human language technology. CHAPTERS We point at the fact that linguistic technologies – for analysis, representation, access, acquisition, management of textual information – are more and more used for applications such as: (semi-)automatic translation, summarization, information retrieval and cross-lingual information retrieval, information extraction, question answering, classification of documents, search engines on the web, text/data mining, decision support systems, etc.
    [Show full text]
  • Linguistic Analysis of Natural Language Engineering Requirements Carl Lamar Clemson University, [email protected]
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Clemson University: TigerPrints Clemson University TigerPrints All Theses Theses 12-2009 Linguistic Analysis of Natural Language Engineering Requirements Carl Lamar Clemson University, [email protected] Follow this and additional works at: https://tigerprints.clemson.edu/all_theses Part of the Engineering Mechanics Commons Recommended Citation Lamar, Carl, "Linguistic Analysis of Natural Language Engineering Requirements" (2009). All Theses. 671. https://tigerprints.clemson.edu/all_theses/671 This Thesis is brought to you for free and open access by the Theses at TigerPrints. It has been accepted for inclusion in All Theses by an authorized administrator of TigerPrints. For more information, please contact [email protected]. LINGUISTIC ANALYSIS OF NATURAL LANGUAGE ENGINEERING REQUIREMENTS A Thesis Presented to the Graduate School of Clemson University In Partial Fulfillment of the Requirements for the Degree Master of Science Mechanical Engineering by Carl Lamar December 2009 Accepted by: Dr. Gregory M. Mocko, Committee Chair Dr. Joshua D. Summers Dr. Paul Venhovens ABSTRACT In engineering design, the needs of the customer are expressed through engineering requirement statements. These requirement statements are often expressed using natural language because they are easily created and read. However, there are several problems associated with natural language requirements including but not limited to ambiguity, incompleteness, understandability, testability and over specificity. Several representation and analysis tools have been proposed to address these problems within a requirement statement. These tools include formal languages, such as UML and SysML, requirement management tools, such as IBM Telelogic Doors, and natural language processors such as QuARS.
    [Show full text]
  • “Linguistic Analysis” As a Misnomer, Or, Why Linguistics Is in a State of Permanent Crisis
    Alexander Kravchenko Russia, Baikal National University of Economics and Law, Irkutsk “Linguistic Analysis” as a Misnomer, or, Why Linguistics is in a State of Permanent Crisis Key words: methodology, written-language bias, biology of language, consensual domain, language function Ключевые слова: методология, письменноязыковая предвзятость, биология языка, консенсуальная область, функция языка Аннотация Показывается ошибочность общепринятого понятия лингвистического анализа как анализа естественного языка. Следствием соссюровского структурализма и фило- софии объективного реализма является взгляд на язык как на вещь (код) – систему знаков, используемых как инструмент для коммуникации, под которой понимается обмен информацией. При таком подходе анализ в традиционной лингвистике пред- ставляет собой анализ вещей – письменных слов, предложений, текстов; все они являются культурными артефактами и не принадлежат сфере естественного языка как видоспецифичного социального поведения, биологическая функция которого состоит в ориентировании себя и других в когнитивной области взаимодействий. До тех пор, пока наукой не будет отвергнута кодовая модель языка, лингвистический анализ будет продолжать обслуживать языковой миф, удерживая лингвистику в со- стоянии перманентного кризиса. 1. Th e problem with linguistics as a science There is a growing feeling of uneasiness in the global community of linguists about the current state of linguistics as a science. Despite a lot of academic activity in the fi eld, and massive research refl ected in the ever growing number of publications related to the study of the various aspects of language, there hasn’t been a major breakthrough in the explorations of language as a unique endowment of humans. 26 Alexander Kravchenko Numerous competing theoretical frameworks and approaches in linguistic re- search continue to obscure the obvious truth that there is very little understanding of language as a phenomenon which sets humans apart from all other known biological species.
    [Show full text]
  • International Journal of Quantitative and Qualitative Research Methods
    International Journal of Quantitative and Qualitative Research Methods Vol.8, No.3, pp.1-23, September 2020 Published by ECRTD-UK ISSN 2056-3620(Print), ISSN 2056-3639(Online) CIRCULAR REASONING FOR THE EVOLUTION OF RESEARCH THROUGH A STRATEGIC CONSTRUCTION OF RESEARCH METHODOLOGIES Dongmyung Park 1 Dyson School of Design Engineering Imperial College London South Kensington London SW7 2DB United Kingdom e-mail: [email protected] Fadzli Irwan Bahrudin Department of Applied Art & Design Kulliyyah of Architecture & Environmental Design International Islamic University Malaysia Selangor 53100 Malaysia e-mail: [email protected] Ji Han Division of Industrial Design University of Liverpool Brownlow Hill Liverpool L69 3GH United Kingdom e-mail: [email protected] ABSTRACT: In a research process, the inductive and deductive reasoning approach has shortcomings in terms of validity and applicability, respectively. Also, the objective-oriented reasoning approach can output findings with some of the two limitations. That meaning, the reasoning approaches do have flaws as a means of methodically and reliably answering research questions. Hence, they are often coupled together and formed a strong basis for an expansion of knowledge. However, academic discourse on best practice in selecting multiple reasonings for a research project is limited. This paper highlights the concept of a circular reasoning process of which a reasoning approach is complemented with one another for robust research findings. Through a strategic sequence of research methodologies, the circular reasoning process enables the capitalisation of strengths and compensation of weaknesses of inductive, deductive, and objective-oriented reasoning. Notably, an extensive cycle of the circular reasoning process would provide a strong foundation for embarking into new research, as well as expanding current research.
    [Show full text]
  • Register in Computational Linguistics Arxiv Version
    Computational Register Analysis and Synthesis Shlomo Engelson Argamon Department of Computer Science Illinois Institute of Technology Chicago, IL 60616 [email protected] Abstract The study of register in computational language research has historically been divided into register analysis, seeking to determine the registerial character of a text or corpus, and register synthesis, seeking to generate a text in a desired register. This article surveys the different approaches to these disparate tasks. Register synthesis has tended to use more theoretically articulated notions of register and genre than analysis work, which often seeks to categorize on the basis of intuitive and somewhat incoherent notions of prelabeled “text types”. I argue that a integration of computational register analysis and synthesis will benefit register studies as a whole, by enabling a new large-scale research program in register studies. It will enable comprehensive global mapping of functional language varieties in multiple languages, including the relationships between them. Furthermore, computational methods together with high coverage systematically collected and analyzed data will thus enable rigorous empirical validation and refinement of different theories of register, which will have also implications for our understanding of linguistic variation in general. Keywords: computational linguistics, natural language processing, style, stylistics, text classification 1. Introduction Register, broadly construed, refers to variation in language usage within different
    [Show full text]
  • An Introduction to Syntactic Analysis and Theory
    An Introduction to Syntactic Analysis and Theory Hilda Koopman Dominique Sportiche Edward Stabler 1 Morphology: Starting with words 1 2 Syntactic analysis introduced 37 3 Clauses 87 4 Many other phrases: first glance 101 5 X-bar theory and a first glimpse of discontinuities 121 6 The model of syntax 141 7 Binding and the hierarchical nature of phrase structure 163 8 Apparent violations of Locality of Selection 187 9 Raising and Control 203 10 Summary and review 223 iii 1 Morphology: Starting with words Our informal characterization defined syntax as the set of rules or princi- ples that govern how words are put together to form phrases, well formed sequences of words. Almost all of the words in it have some common sense meaning independent of the study of language. We more or less understand what a rule or principle is. A rule or principle describes a regularity in what happens. (For example: “if the temperature drops suddenly, water vapor will condense”). This notion of rule that we will be interested in should be distinguished from the notion of a rule that is an instruction or a statement about what should happen, such as “If the light is green, do not cross the street.” As linguists, our primary interest is not in how anyone says you should talk. Rather, we are interested in how people really talk. In common usage, “word” refers to some kind of linguistic unit. We have a rough, common sense idea of what a word is, but it is surprisingly difficult to characterize this precisely.
    [Show full text]