Computer-Assisted Topic Classification for Mixed-Methods Social Science Research

Total Page:16

File Type:pdf, Size:1020Kb

Computer-Assisted Topic Classification for Mixed-Methods Social Science Research WITP Computer-Assisted Topic Classification for Mixed-Methods Social Science Research Hillard, Purpura, and Wilkerson Dustin Hillard Stephen Purpura John Wilkerson ABSTRACT. Social scientists interested in mixed-methods research have traditionally turned to human annotators to classify the documents or events used in their analyses. The rapid growth of digi- tized government documents in recent years presents new opportunities for research but also new chal- lenges. With more and more data coming online, relying on human annotators becomes prohibitively expensive for many tasks. For researchers interested in saving time and money while maintaining confi- dence in their results, we show how a particular supervised learning system can provide estimates of the class of each document (or event). This system maintains high classification accuracy and provides accu- rate estimates of document proportions, while achieving reliability levels associated with human efforts. We estimate that it lowers the costs of classifying large numbers of complex documents by 80% or more. KEYWORDS. Topic classification, data mining, machine learning, content analysis, information retrieval, text annotation, Congress, legislation Technological advances are making vast of interest to political scientists, a user of the amounts of data on government activity newly Library of Congress’ THOMAS Web site (http:// available, but often in formats that are of limited thomas.loc.gov) can use its Legislative Indexing value to researchers as well as citizens. In this Vocabulary (LIV) to search for congressional article, we investigate one approach to trans- legislation on a given topic. Similarly, a user of a forming these data into useful information. commercial Internet service turns to a topic clas- “Topic classification” refers to the process of sification system when searching, for example, assigning individual documents (or parts of doc- Yahoo! Flikr for photos of cars or Yahoo! uments) to a limited set of categories. It is Personals for postings by men seeking women. widely used to facilitate search as well as in the Topic classification is valued for its ability to study of patterns and trends. To pick an example limit search results to documents that closely Dustin Hillard is a Ph.D. candidate in the Department of Electrical Engineering, University of Washington. Stephen Purpura is a Ph.D. student in Information Science, Cornell University. John Wilkerson is Associate Professor of Political Science at the University of Washington. This project was made possible with support of NSF grants SES-0429452, SES-00880066, SES-0111443, and SES-00880061. An earlier version of the paper was presented at the Coding Across the Disciplines Workshop (NSF grant SES-0620673). The views expressed are those of the authors and not the National Science Foundation. We thank Micah Altman, Frank Baumgartner, Matthew Baum, Jamie Callan, Claire Cardie, Kevin Esterling, Eduard Hovy, Aleks Jakulin, Thorsten Joachims, Bryan Jones, David King, David Lazer, Lillian Lee, Michael Neblo, James Purpura, Julianna Rigg, Jesse Shapiro, and Richard Zeckhauser for their helpful comments. Address correspondence to: John Wilkerson, Box 353530, University of Washington, Seattle, WA 98195 (E-mail: [email protected]). Journal of Information Technology & Politics, Vol. 4(4) 2007 Available online at http://jitp.haworthpress.com © 2007 by The Haworth Press. All rights reserved. doi:10.1080/19331680801975367 31 32 JOURNAL OF INFORMATION TECHNOLOGY & POLITICS match the user’s interests, when compared to approximately 379,000 congressional bill less selective keyword-based approaches. How- titles that trained human researchers have ever, a central drawback of these systems is assigned to one of 20 major topic and 226 their high costs. Humans—who must be trained subtopic categories, with high levels of inter- and supervised—traditionally do the labeling. annotator reliability.1 We draw on this corpus Although human annotators become somewhat to test several supervised learning algorithms more efficient with time and experience, the that use case-based2 or “learning by example” marginal cost of coding each document does methods to replicate the work of human anno- not really decline as the scope of the project tators. We find that some algorithms perform expands. This has led many researchers to ques- our particular task better than others. How- tion the value of such labor-intensive ever, combining results from individual approaches, especially given the availability of machine learning methods increases accuracy computational approaches that require much beyond that of any single method, and provides less human intervention. key signals of confidence regarding the Yet there are also good reasons to cling to a assigned topic for each document. We then proven approach. For the task of topic classifi- show how this simple confidence estimate cation, computational approaches are useful can be employed to achieve additional classifi- only to the extent that they “see” the patterns cation accuracy more efficiently than would that interest humans. A computer can quickly otherwise be possible. detect patterns in data, such as the number of E’s in a record. It can then very quickly orga- nize a dataset according to those patterns. But computers do not necessarily detect the patterns TOPIC CLASSIFICATION FOR that interest researchers. If those patterns are SOCIAL SCIENCE DOCUMENT easy to objectify (e.g., any document that men- RETRIEVAL tions George W. Bush), then machines will work well. The problem, of course, is that many Social scientists are interested in topic clas- of the phenomena that interest people defy sim- sification for two related reasons: retrieving ple definitions. “Bad” can mean good—or individual documents and tracing patterns and bad—depending on the context in which it is trends in issue-related activity. Mixed-method used. Humans are simply better at recognizing studies that combine pattern analyses with such distinctions, although computerized meth- case-level investigations are becoming stan- ods are closing the gap. dard, and linked examples are often critical to Technology becomes increasingly attractive persuading readers to accept statistical find- as the size and complexity of a classification ings (King, Keohane, & Verba, 1994). In Soft task increase. But what do we give up in terms News Goes to War, for example, Baum (2003) of accuracy and reliability when we adopt a draws on diverse corpora to analyze media cov- particular automated approach? In this article, erage of war (e.g., transcripts of “Entertainment we begin to investigate this accuracy/efficiency Tonight,” the jokes of John Stewart’s “The tradeoff in a particular context. We begin by Daily Show,” and network news programs). describing the ideal topic classification system Keyword searches are fast and may be effec- where the needs of social science researchers tive for the right applications, but effective key- are concerned. We then review existing appli- word searches can also be difficult to construct cations of computer-assisted methods in politi- without knowing what is actually in the data. A cal science before turning our attention to a search that is too narrow in scope (e.g., “renew- method that has generated limited attention able energy”) will omit relevant documents, within political science to date: supervised while one that is too broad (e.g., “solar”) learning systems. will generate unwanted false positives. In fact, The Congressional Bills Project (http:// most modern search engines, such as Google, www.congressionalbills.org) currently includes consciously reject producing a reasonably Hillard, Purpura, and Wilkerson 33 comprehensive list of results related to a topic tive attention with shifts in coding protocol as a design criterion.3 (Baumgartner, Jones, & Wilkerson, 2002). Many political scientists rely on existing So, what type of topic classification system databases where humans have classified events best serves the needs of social scientists? If the (decisions, votes, media attention, legislation) goals are to facilitate trend tracing and docu- according to a predetermined topic system (e.g., ment search, an ideal system possesses the fol- Jones & Baumgartner, 2005; Poole & lowing characteristics. First, it should be Rosenthal, 1997; Rohde, 2004; Segal & Spaeth, discriminating. By this we mean that the topic 2002). In addition to enabling scholars to study categories are mutually exclusive and span the trends and compare patterns of activity, reliable entire agenda of topics. The search requires that topic classification can save considerable the system indicate what each document is pri- research time. For example, Adler and Wilkerson marily about, while trend tracing is made more (2008) wanted to use the Congressional Bills difficult if the same document is assigned to Project database to study the impact of congres- multiple categories. Second, it should be sional reforms. To do this, they needed to trace accurate. The assigned topic should reflect the how alterations in congressional committee document’s content, and there should be a sys- jurisdictions affected bill referrals. The fact that tematic way of assessing accuracy. Third, the every bill during the years of interest had already ideal system should be reliable. Pattern and been annotated for topic allowed them to reduce trend tracing require that similar documents be the number of bills that had to be individually classified
Recommended publications
  • An Introduction to the Joint Principles for Data Citation
    RDAP Review EDITOR’S SUMMARY While the conventions of An Introduction to the Joint Principles for Data Citation bibliographic citation have been by Micah Altman, Christine Borgman, Mercè Crosas and Maryann Martone long established, the sole focus is on reference to other scholarly 3 works. Access to the data serving NOTE: This article summarizes and extends a longer report r of the manuscript must be available to any reader of Science ” e b as the basis for scholarly work has published as [ 1]. Contributors are listed in alphabetical order. We and that “ citations to unpublished data [emphasis added] m u been limited. Data citation extends describe contributions to the paper using a standard taxonomy N and personal communications cannot be used to support , described in [ 2]. Micah Altman and Mercè Crosas were the lead 1 important access to material that 4 authors, taking equal responsibility for revisions and authoring claims in a published paper” [ 4]. Too often, however, this e has been largely unavailable for m the first draft of the manuscript from which this is derived. All u proscription and others like it have been honored only in the l o sharing, verification and reuse. The authors contributed to the conception of the Force 11 principles V breach. Few research articles provide access to the data on – Joint Declaration of Data Citation discussed, to the methodology, to the project administration 5 1 and to the writing through critical review and commentary. which they are based, nor specific citations to data on which 0 Principles, finalized in February 2 the findings rely, nor protocols, algorithms, code or other h 2014, is a formal statement pulling c r a together practices used in the ata citation is rapidly emerging as a key practice technology necessary to reproduce, reuse or extend results.
    [Show full text]
  • Case 1:18-Cv-04727-ELR Document 17-11 Filed 10/19/18 Page 1 of 28
    Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 1 of 28 EXHIBIT 10 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 2 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 3 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 4 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 5 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 6 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 7 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 8 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 9 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 10 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 11 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 12 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 13 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 14 of 28 Case 1:18-cv-04727-ELR Document 17-11 Filed 10/19/18 Page 15 of 28 Dr. Michael P.
    [Show full text]
  • 00017-89121.Pdf (110.8
    March 19, 2014 To: The Federal Trade Commision Re: Mobile Device Tracking From: Micah Altman, Director of Research, MIT Libraries; Non Resident Senior Fellow, Brookings Institution I appreciate the opportunity to contribute to the FTC’s considerations of Mobile Device Tracking. These comments address selected privacy risks and mitigation methods. Our perspective is informed by substantial advances in privacy science that have been made in the computer science literature and by recent research conducted by the members of the Privacy Tools for Sharing Research Data project at Harvard University.1 Scope of information The speakers focussed primarily on businesses use of mobile devices to track consumers’ movements through retail stores and nearby environments. However, as noted in the comments made by the Center for Digital Democracy [CDD 2014] and in the workshop discussion (as documented in the transcript), the the general scope of mobile information tracking businesses and third parties extends far beyond this scenario. Based on the current ability for third parties to collect location information from mobile phones alone, third parties have the potential to collect extensive, fine grained, continuous and identifiable records of a persons location and movement history, accompanied with a partial record of other devices (potentially linked to people) encountered over that history. Information sensitivity Generally, information policy should treat information as sensitive when that information, if linked to a person, even partially or probabilistically, is likely to cause substantial harm. There is a broad range of informational harms that are recognized by regulation and by researchers and 1 The Privacy Tools for Sharing Research Data project is a National Science Foundation funded collaboration at Harvard University involving the Center for Research on Computation and Society, the Institute for Quantitative Social Science, the Berkman Center for Internet & Society, and the Data Privacy Lab.
    [Show full text]
  • Numerical Issues in Statistical Computing for the Social Scientist
    Numerical Issues in Statistical Computing for the Social Scientist MICAH ALTMAN JEFF GILL MICHAEL P. McDONALD A JOHN WILEY & SONS, INC., PUBLICATION Numerical Issues in Statistical Computing for the Social Scientist ii WILEY SERIES IN PROBABILITY AND STATISTICS Established by WALTER A. SHEWHART and SAMUEL S. WILKS Editors: David J. Balding, Noel A. C. Cressie, Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, Louise M. Ryan, David W. Scott, Adrian F. M. Smith, Jozef L. Teugels; Editors Emeriti: Vic Barnett, J. Stuart Hunter, David G. Kendall A complete list of the titles in this series appears at the end of this volume. Numerical Issues in Statistical Computing for the Social Scientist MICAH ALTMAN JEFF GILL MICHAEL P. McDONALD A JOHN WILEY & SONS, INC., PUBLICATION Copyright c 2004 by John Wiley & Sons, Inc. All rights reserved. Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: [email protected].
    [Show full text]
  • Numerical Issues in Statistical Computing for the Social Scientist
    JSS Journal of Statistical Software April 2005, Volume 12, Book Review 5. http://www.jstatsoft.org/ Reviewer: Frauke Kreuter University of Maryland, College Park Numerical Issues in Statistical Computing for the Social Scientist Micah Altman, Jeff Gill, Michael P. McDonald John Wiley & Sons, Hoboken, NJ, 2004. ISBN 0-471-23633-0. xv + 323 pp. $94.95. http://www.hmdc.harvard.edu/numerical_issues/ This is a very interesting book in an area that hasn’t gotten much attention in the so- cial sciences, but can expect to have more than just a niche audience with the increase of maximum-likelihood-based applications, a heightened interest in simulations, and a general appreciation for computational statistics. Numerical Issues in Statistical Computing for the Social Scientist is the right book for any social scientist who has stumbled upon error mes- sages relating to convergence problems, non-invertible, or ill-conditioned matrices, and who is not just interested in some rough guidance on what to watch out for, but rather wants to understand the source of these problems down to effects of errors in floating point arithmetic. Micah Altman, Jeff Gill, and Michael P. McDonald state in their preface that this book is intended to serve multiple purposes: Introducing new principles, algorithms and solutions while at the same time serving as a guide to statistical computing. There is a benefit to including new research results in a guidebook, but with it comes the challenge to find the right level of difficulty. As a result, the book seems bimodal, sometimes missing the intermediate applied researcher, who writes modest programs within a given statistical package.
    [Show full text]
  • Micah Altman
    Micah Altman Curriculum Vitae Institute for Quantitative Social Science, Harvard University 1737 Cambridge Street, Cambridge, MA 02138 Phone: (585) 466-4224 Fax: (617) 963-7370 E-mail: [email protected] URL: http://futurelib.org Education Harvard University 1999-2001 Post-doctoral Research Fellow, Department of Government California Institute of Technology 1998 Ph.D., Social Sciences Brown University 1989 B.A., Magna Cum Laude*, Computer Science B.A., Magna Cum Laude*, Ethics and Political Philosophy [*Highest distinction awarded by the University] Honors & Senior Fellow, Information Technology & Politics Section, 2011 Awards American Political Science Association Best Research Software 2009 American Political Science Association (ITP) for BARD Library Technology Excellence Award, Honorable Mention 2009 IGI Global (for The Henry A. Murray Archive) Listed in (Marquis) 2003-4, 2009 Who's Who in America, 57th, 58th, 63rd Edition Best Research Software Award, 2005 American Political Science Association (ITP); (for the VDC System) Annual Meeting Enrichment Fund Award, 2001 Association of American Geographers Best Political Science Research Website, 1999 (for The Record of American Democracy website) Outstanding Dissertation Award, 1999 Western Political Science Association (for the best political science dissertation) Weaver Award, Representation and Electoral Systems Section, 1998 American Political Science Association (for best paper presented at previous meeting) Pre-doctoral Fellowship 1996-7 Harvard-MIT Research Training Group in Political Economy John Randolf Haynes and Dora Haynes Fellowship 1995-6 Anna and James McDonnell Memorial Fellowship 1994-5 Phi Beta Kappa 1989 9/15/2011 Sigma Xi 1989 Research Senior Research Scientist 2006-Present Positions Institute for Quantitative Social Science, Harvard University Archival Director 2007-Present Henry A.
    [Show full text]
  • Open Data, Transparency and Redistricting in Mexico Alejandro Trelles, Micah Altman, Eric Magar and Michael P
    Open Data, Transparency and Redistricting in Mexico Alejandro Trelles, Micah Altman, Eric Magar and Michael P. McDonald* Abstract: The many complaints and protests by citizens generated by the deterioration of the political elite in recent decades are clear evidence, among other things, of the urgent need to strengthen the connections between citizens and their representatives. To this end, the delimitation of the electoral boundaries —also known as redistricting— is key to improve political representation. Given the many technicalities involved in this processes —geographic, statistical, digital, among the most obvious— it is easy to succumb to the temptation of relegating it to specialists and lose sight of its importance for democracy. From our perspective, the use of new technologies, as well as the generation and use of open data, offer an opportunity to strengthen political representation. In this article we discuss Mexico’s redistricting experience, the challenges in terms of transparency, and *Alejandro Trelles is a doctoral candidate in Political Science at the University of Pittsburgh, 4600 Wesley W. Posvar Hall, Pittsburgh, PA, 15260. Tel: +1(412) 979 07 15. E-mail: [email protected]. Micah Altman is director of research in the program on Information Science at the Massachusetts Institute of Technology (MIT), E25-131, 77 Massachusetts Ave, Cambridge, Massachusetts, 02139. Tel: +1(585) 466 42 24. E-mail: [email protected]. Eric Magar is a full-time professor of Political Science at the Instituto Tecnológico Autónomo de México (ITAM), Río Hondo 1, Progreso Tiza- pán, México, D.F., 01000. Tel: +52(55) 56 28 40 79. E-mail: [email protected].
    [Show full text]
  • Practical Approaches to Big Data Privacy Over Time1 Micah Altman,2 Alexandra Wood,3 David R
    Practical Approaches to Big Data Privacy Over Time1 Micah Altman,2 Alexandra Wood,3 David R. O’Brien4 & Urs Gasser5 DRAFT (November 6, 2016) Abstract. Increasingly, governments and businesses are collecting, analyzing, and sharing detailed information about individuals over long periods of time. Vast quantities of data from new sources and novel methods for large-scale data analysis promise to yield deeper understanding of human characteristics, behavior, and relationships and advance the state of science, public policy, and innovation. At the same time, the collection and use of fine-grained personal data over time is associated with significant risks to individuals, groups, and society at large. In this article, we examine a range of long- term data collections, conducted by researchers in social science, in order to identify the characteristics of these programs that drive their unique sets of risks and benefits. We also examine the practices that have been established by social scientists to protect the privacy of data subjects in light of the challenges presented in long-term studies. We argue that many uses of big data, across academic, government, and industry settings, have characteristics similar to those of traditional long-term research studies. In this article, we discuss the lessons that can be learned from longstanding data management practices in research and potentially applied in the context of newly emerging data sources and uses. 1. Corporations and governments are collecting data more frequently, and collecting, storing, and using it for longer periods. Commercial and government actors are collecting, storing, analyzing, and sharing increasingly greater quantities of personal information about individuals over progressively long periods of time.
    [Show full text]
  • Online Software Puts Redistricting Tools in the Hands of the Public by Micah Altman, Ph.D., and Michael P
    Feature The Dawn of Do-It-Yourself Redistricting? Online Software Puts Redistricting Tools in the Hands of the Public By Micah Altman, Ph.D., and Michael P. McDonald, Ph.D. “T here is only one way to do reapportionment— feed into the computer all the factors except political registration.” – Then-Governor Ronald Reagan quoted in the Los Angeles Times, Jan. 21, 1972 “ With sophisticated computer programs, politicians can draw lines to maximize precisely their party’s representation and minimize the other’s. The result is sham legislative elections in which fewer and fewer seats are competitive and moderates of both parties get squeezed out of office.” – Washington Post editorial, May 2, 2004 rizona’s 2nd congressional district is one of the most bizarrely shaped in the country. It consists of a head-shaped chunk in the state’s eastern half attached by a long, thin neck the width of Athe Colorado River as it snakes for a hundred miles through the Grand Canyon to a body-shaped portion that hugs the state’s border with California. Indeed, for sheer formal chutzpah, it may just outdo the salamander-shaped 19th- century Massachusetts district drawn to benefit Governor Elbridge Gerry and dubbed a “gerrymander” by critics. Un- like most gerrymandered districts, though, the rationale for 38 Campaigns & Elections | January 2011 the odd shape of Arizona’s 2nd is not to protect an incumbent or to favor a partic- ular party, but to separate members of the small Hopi Tribe from their longstanding, more numerous rivals in the surrounding Navajo Tribe. The desire to protect the voice of com- munities is just one among a number of competing concerns that factor into draw- Kingman Tempe ing congressional districts.
    [Show full text]
  • Public Comment Meeting Public Access to Federally Supported R&D
    Public Comment Meeting concerning Public Access to Federally Supported R&D Data 16 May 2013, 9 a.m. - 5 p.m. 17 May 2013, 9 a.m. - 12 p.m. NATIONAL ACADEMY OF SCIENCES 2101 Constitution Avenue, NW Washington DC 20418 and http://sites.nationalacademies.org/DBASSE/DBASSE_083053 Sponsored By Draft Agenda Thursday, 16 May 2013 0900 Opening of Meeting Lawrence D. Brown (NAS) Professor Department of Statistics, The Wharton School University of Pennsylvania 0915 Connecting Research, Publications, and Micah Altman Evidence: The Lifecycle and Institutional Director of Research and Head Scientist Ecology of Data Program on Information Science Massachusetts Institute of Technology Libraries 0945 Why Public Access to Data Is So Important Victoria Stodden Assistant Professor of Statistics Columbia University 1015 Break 1045 PUBLIC COMMENT SESSION 1 Universities Libraries Researchers and Students 1200 Lunch 1400 PUBLIC COMMENT SESSION 2 Publishers Private Individuals Federal Employees 1530 Break PUBLIC COMMENT SESSION 3 Walk-in Registrants, if any 1700 Meeting closes for the day Friday, 17 May 2013 0900 Opening of Meeting Lawrence Brown 0905 PUBLIC COMMENT SESSION 4 Digital Repositories and Organizations 1045 Break 1100 Rapporteur’s Report John L. King W. W. Bishop Professor School of Information University of Michigan 1145 Next Steps and Adjournment Lawrence Brown LAWRENCE D. BROWN is Miers Busch professor in the Department of Statistics, The Wharton School, University of Pennsylvania. He is an expert in statistical foundations, conditional inference, sequential methods, exponential families, and decision theory. He is a member of the National Academy of Sciences, and a fellow of both the Institute of Mathematical Statistics and the American Statistical Association.
    [Show full text]
  • Michael P. Mcdonald Dr
    Michael P. McDonald Dr. Michael P. McDonald is Assistant Professor of Government and Politics in the Public and International Affairs Department at George Mason University, and a visiting fellow in Governance Studies at the Brookings Institution. His research interests include voting behavior, redistricting, Congress, American political development, and political methodology. His research appears in American Political Science Review, American Journal of Political Science, Journal of Politics, Political Analysis, State Politics and Policy Quarterly, and PS: Political Science and Politics. He is currently working on a book with Micah Altman and Jeff Gill entitled Statistical Computing for the Social Sciences. His voter turnout research shows that turnout has not been declining, the ineligible population is rising. This research has been featured in the popular press, and he provides voter turnout numbers to academia and the media. On the practical side of politics, McDonald has worked on campaign staff for state legislative campaigns in California, has worked for national polling firms, and has worked as a redistricting consultant in Alaska, Arizona, California, and Michigan. McDonald received his B.S. of Economics from California Institute of Technology and his Ph.D. of Political Science from University of California, San Diego in 1999. He was awarded a one-year post-doctoral fellowship at the Harvard-MIT Data Center. He held a one-year Assistant Professor appointment at Vanderbilt University, and was Assistant Professor of Political Studies at University of Illinois, Springfield before moving to George Mason University. | 1775 Massachusetts Avenue, NW, Washington, DC 20036 | 202.797.6000 | fax 202.797.6004 | brookings edu .
    [Show full text]
  • Laws of Information Economics and Scholarly Publishing
    Information Services & Use 35 (2015) 57–70 57 DOI 10.3233/ISU-150775 IOS Press Information wants someone else to pay for it: Laws of information economics and scholarly publishing Micah Altman a,∗ and Marguerite Avery b,c a Director of Research, MIT Libraries, Massachusetts Institute of Technology, Cambridge, MA, USA E-mail: [email protected] b Director of Scholarly Communication at Hypothes.is, San Francisco, CA, USA c Research Affiliate at Program on Information Science, MIT Libraries, Massachusetts Institute of Technology, Cambridge, MA, USA E-mail: [email protected] Abstract. The increasing volume and complexity of research, scholarly publication, and research information puts an added strain on traditional methods of scholarly communication and evaluation. Information goods and networks are not standard market goods – and so we should not rely on markets alone to develop new forms of scholarly publishing. The affordances of digital information and networks create many opportunities to unbundle the functions of scholarly communication – the central challenge is to create a range of new forms of publication that effectively promote both market and collaborative ecosystems. Keywords: Information, economics, scholarly publishing, affordances 1. The state of scholarly communication: More and less This year marks the sesquarcentenial anniversary of Philosophical Transactions of the Royal Society, a notable milestone upon which to reflect upon the evolution of the scholarly communication ecosystem. In the beginning, three hundred and fifty years ago, scholarly communication was a relatively straight- forward proposition that entailed the collecting of individual research writings into a volume that was reproduced or published. This new method of communicating research findings to a dedicated audience was a vast improvement over the circulation of individual letters written by individual scholars on in- dividual topics.
    [Show full text]