Comparative Recall and Precision of Simple and Expert Searches in Google Scholar and Eight Other Databases
William H. Walters

This document is the final, published version of an article in Portal: Libraries and the Academy, vol. 11, no. 4 (October 2011), pp. 971–1006. It is also available from the publisher’s web site at http://muse.jhu.edu/login?auth=0&type=summary&url=/journals/portal_libraries_and_the_academy/v011/11.4.walters.html and at http://www.press.jhu.edu/journals/portal_libraries_and_the_academy/portal_pre_print/archive/articles/11.4walters.pdf

Abstract: This study evaluates the effectiveness of simple and expert searches in Google Scholar (GS), EconLit, GEOBASE, PAIS, POPLINE, PubMed, Social Sciences Citation Index, Social Sciences Full Text, and Sociological Abstracts. It assesses the recall and precision of 32 searches in the field of later-life migration: nine simple keyword searches and 23 expert searches constructed by demography librarians at three top universities. For simple searches, Google Scholar’s recall and precision are well above average. For expert searches, the relative effectiveness of GS depends on the number of results users are willing to examine. Although Google Scholar’s expert-search performance is just average within the first fifty search results, GS is one of the few databases that retrieves relevant results with reasonably high precision after the fiftieth hit. The results also show that simple searches in GS, GEOBASE, PubMed, and Sociological Abstracts have consistently higher recall and precision than expert searches. This can be attributed not to differences in expert-search effectiveness, but to the unusually strong performance of simple searches in those four databases.
portal: Libraries and the Academy, Vol. 11, No. 4 (2011), pp. 971–1006. Copyright © 2011 by The Johns Hopkins University Press, Baltimore, MD 21218.

Since its introduction in November 2004, Google Scholar (GS) has risen to prominence as a major bibliographic database. In a recent survey of more than 3,000 faculty, Google and Google Scholar were together identified as the third most common mechanism for finding information in academic journals, after “searching electronic databases” and “following citations from other journal articles.”1 Nonetheless, many information professionals have been reluctant to provide systematic access to Google Scholar. In 2005, just 24 percent of North American research libraries included GS in their online database lists, and fewer than twenty percent listed it as a recommended internet search engine. Two years later, just 32 percent of OhioLINK libraries mentioned GS on their web sites.2 Although most doctoral research universities now provide access to GS through their web sites and link resolvers, most bachelor’s- and master’s-level institutions do not.3

Librarians’ lack of enthusiasm for Google Scholar may stem, at least partly, from the unconventional methods used to build the database. While most bibliographic databases index particular journals in their entirety, Google Scholar’s coverage is essentially document- and publisher-based.
Specifically, GS gets its bibliographic records from three sources:
(1) freely available web documents that “look scholarly” in their content or format;
(2) articles or documents supplied by Google Scholar’s partner agencies: journal publishers, scholarly societies, database vendors, and academic institutions;
(3) citations extracted from the reference lists of previously indexed documents.4

Only the records supplied by Google Scholar’s partner agencies are likely to provide consistent coverage of particular journals, and even that journal coverage is not truly comprehensive. As several authors have noted, GS does not index every article available through partner agencies’ web sites.5

Librarians’ reluctance to guide patrons to Google Scholar may also result from a misunderstanding about the database content. Some may regard GS as a subset of Google, when in fact only records of type (1) can be found through the regular Google interface.

For many purposes, the methods used to build the database are less important than the bottom line: Are Google Scholar searches effective in identifying relevant documents? Previous research has shown that for simple keyword searches, GS performs well in comparison with conventional bibliographic databases. Within the field of later-life migration, for example, GS indexes 93 percent of the relevant literature and achieves high recall and precision when a simple keyword search phrase is used.6 Because the GS interface is relatively unsophisticated, however, we might expect that other databases will perform better than GS for more complex searches that draw on expert knowledge and take full advantage of the search features available within each database.
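
For readers less familiar with the two measures, the following is a minimal sketch of how recall and precision are commonly computed for the first k results of a search. It uses the standard information-retrieval definitions, which are assumed here rather than taken from this section; the function names, document identifiers, and relevance set are hypothetical and serve only as an illustration.

```python
from typing import Sequence, Set


def precision_at_k(results: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the first k results that are relevant.

    Assumption: if fewer than k results are returned, the denominator is
    the number of results actually examined, not k.
    """
    top = results[:k]
    if not top:
        return 0.0
    hits = sum(1 for doc in top if doc in relevant)
    return hits / len(top)


def recall_at_k(results: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of all relevant documents that appear in the first k results.

    Assumption: each document has a unique identifier, so duplicates in
    the result list are counted once.
    """
    if not relevant:
        return 0.0
    hits = len(set(results[:k]) & relevant)
    return hits / len(relevant)


# Hypothetical ranked result list and hypothetical set of relevant documents.
results = ["a", "x", "b", "y", "c"]
relevant = {"a", "b", "c", "d"}

print(precision_at_k(results, relevant, 2))  # 0.5  (1 of the first 2 results)
print(recall_at_k(results, relevant, 2))     # 0.25 (1 of 4 relevant documents)
print(precision_at_k(results, relevant, 5))  # 0.6  (3 of the first 5 results)
print(recall_at_k(results, relevant, 5))     # 0.75 (3 of 4 relevant documents)
```

Evaluating both measures at several cutoffs (for example, the first fifty results and beyond) shows how a database's apparent effectiveness can depend on how many results a user is willing to examine.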
This study investigates that possibility, examining Google Scholar’s effectiveness as a tool for serious research. Specifically, it evaluates the performance of GS and eight other databases within the field of later-life migration, an interdisciplinary research area that encompasses elderly migration, retirement migration, post-retirement migration, and related types of geographic mobility. The primary objective of the study is to determine whether GS maintains high recall and precision when expert, rather than simple, searches are conducted. A secondary objective is to identify the databases for which simple or expert searching is especially effective; that is, to explore why expert searching increases the effectiveness of some databases but not others.

The paper presents three sets of comparisons:
(1) simple searches in GS versus those in the eight other databases;
(2) expert searches in GS versus those in the eight other databases;
(3) simple versus expert searches within each of the nine databases.

The first comparison confirms the results of an earlier study.7 The second comparison is based on expert searches constructed by the demography librarians at three major research universities. The third comparison looks at both sets of search results, comparing simple and expert searches within each database.

Context and Previous Research

Early Reviews of Google Scholar

Early reviews of GS were generally negative, emphasizing its lack of controlled vocabulary and subject headings, its inconsistency in reporting author names and journal titles, its idiosyncratic handling of Boolean operators, and the absence of mechanisms for sorting, marking, manipulating, and exporting search results.8 Several early studies also reported that GS retrieved a relatively high proportion of non-scholarly documents. Joann M.
Wleklinski, searching for information on the political scientist Ithiel de Sola Pool, found a relatively large number of citations to sources that were neither peer-reviewed nor authoritative. Susan Gardner and Susanna Eng reached a similar conclusion when searching for papers on homeschooling in GS and three other databases: “There is more variety in Google Scholar and a higher number of results, but they are not necessarily as scholarly or relevant.” Likewise, Burton Callicott and Debbie Vaughn reported that Google Scholar’s retrieval rate, based on the first 100 hits, was somewhat lower than that of EBSCO Academic Search Premier.9

Although serious problems persist (in particular, idiosyncratic search behavior and incomplete or inaccurate bibliographic records),10 recent investigations have reported favorably on Google Scholar’s coverage of the scholarly literature. This can perhaps be attributed to the more systematic nature of recent studies and to improvements in GS that have been made over the past few years. Recent evidence also suggests that undergraduates tend to prefer GS over conventional bibliographic databases due to its simplicity, its speed, and its similarity to Google.11

Coverage (Content) of the Google Scholar Database

Investigations of Google Scholar’s coverage generally use author or title searches to determine whether particular articles are indexed by GS. These studies are concerned not with the effectiveness of the search mechanism, but with the content of the database itself. In the earliest large-scale analysis of this type, Chris Neuhaus and associates generated a random sample of 2,350 journal articles from 47 bibliographic databases, then calculated the proportion of the articles for which records could be found in Google Scholar. GS included citations for sixty percent of the articles, although the results varied dramatically by subject area.
On average, GS provided 76 percent coverage in the natural sciences but only 41 percent coverage in education, 39 percent coverage in the social sciences, and ten percent coverage in the humanities.12 In a similar investigation, Marilyn Christianson searched for articles published in the top ecology journals, reporting that GS included full or partial citations for 89 percent of the 840 sampled articles. Likewise, GS indexed 66 percent of 960 engineering articles selected from Compendex.13 Finally, Philipp Mayr and Anne-Kathrin Walter searched GS for journal titles, counting each journal