Exploring the Feasibility of Applying Data Mining for Library Reference Service Improvement a Case Study of Turku Main Library
Total Page:16
File Type:pdf, Size:1020Kb
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by National Library of Finland DSpace Services Ming Zhan ([email protected]) Exploring the Feasibility of Applying Data Mining for Library Reference Service Improvement A Case Study of Turku Main Library Master thesis in Information and Knowledge Management Master’s Programme Supervisors: Gunilla Widén Faculty of Social Sciences, Business and Economics Åbo Akademi University Åbo 2016 Abstract Data mining, as a heatedly discussed term, has been studied in various fields. Its possibilities in refining the decision-making process, realizing potential patterns and creating valuable knowledge have won attention of scholars and practitioners. However, there are less studies intending to combine data mining and libraries where data generation occurs all the time. Therefore, this thesis plans to fill such a gap. Meanwhile, potential opportunities created by data mining are explored to enhance one of the most important elements of libraries: reference service. In order to thoroughly demonstrate the feasibility and applicability of data mining, literature is reviewed to establish a critical understanding of data mining in libraries and attain the current status of library reference service. The result of the literature review indicates that free online data resources other than data generated on social media are rarely considered to be applied in current library data mining mandates. Therefore, the result of the literature review motivates the presented study to utilize online free resources. Furthermore, the natural match between data mining and libraries is established. The natural match is explained by emphasizing the data richness reality and considering data mining as one kind of knowledge, an easy choice for libraries, and a wise method to overcome reference service challenges. The natural match, especially the aspect that data mining could be helpful for library reference service, lays the main theoretical foundation for the empirical work in this study. Turku Main Library was selected as the case to answer the research question: whether data mining is feasible and applicable for reference service improvement. In this case, the daily visit from 2009 to 2015 in Turku Main Library is considered as the resource for data mining. In addition, corresponding weather conditions are collected from Weather Underground, which is totally free online. Before officially being analyzed, the collected dataset is cleansed and preprocessed in order to ensure the quality of data mining. Multiple regression analysis is employed to mine the final dataset. Hourly visits are the independent variable and weather conditions, Discomfort Index and seven days in a week are dependent variables. In the end, four models in different seasons are established to predict visiting situations in each season. Patterns are realized in different seasons and implications are created based on the discovered patterns. In addition, library-climate points are generated by a clustering method, which simplifies the process for librarians using weather data to forecast library visiting situation. Then the data mining result is interpreted from the perspective of improving reference service. After this data mining work, the result of the case study is presented to librarians so as to collect professional opinions regarding the possibility of employing data mining to improve reference services. In the end, positive opinions are collected, which implies that it is feasible to utilizing data mining as a tool to enhance library reference service. Key words: Data mining, Library reference service, Service improvement, Case study 1 Content ABSTRACT 1 1. INTRODUCTION 5 1.1 THE MOTIVATION OF THE THESIS WORK 6 1.2 THE AIM OF THE THESIS WORK 7 1.3 THE ORGANIZATION OF THE THESIS WORK 8 2. LITERATURE REVIEW 9 2.1 DATA MINING IN THE LIBRARY ENVIRONMENT 9 2.1.1 THE MEANING OF DATA MINING IN LIBRARIES 9 2.1.2 DIFFERENT FORMS OF DATA MINING IN LIBRARIES 10 2.1.3 THE BENEFITS AND ISSUES OF DATA MINING FOR LIBRARIES 13 2.1.4 THE PROCESS OF LIBRARY DATA MINING 16 2.2 LITERATURE REVIEW ON LIBRARY REFERENCE SERVICE 17 2.2.1 THE DEFINITION OF REFERENCE SERVICE 17 2.2.2 THREE MAIN FORMS OF REFERENCE SERVICE 18 2.2.3 MAIN CHALLENGES FOR REFERENCE SERVICE 22 2.2.4 STUDIES ON IMPROVING REFERENCE SERVICE 25 3. THE NATURAL MATCH BETWEEN DATA MINING AND THE LIBRARY 27 3.1 DATA MINING AS KNOWLEDGE 27 3.2 GOING FOR DATA MINING: AN EASY CHOICE FOR LIBRARIES 28 3.3 LIBRARIES SHIFTING FROM DATA POOR TO DATA RICH 31 3.4 DATA MINING: A WISE METHOD TO CONFRONT REFERENCE SERVICE CHALLENGES 33 4. METHODOLOGY 36 4.1 DATA COLLECTION 37 4.2 DATA CLEANSING 38 4.3 DATA PREPROCESSING 40 5. RESULT OF DATA MINING 44 5.1 THE VISITING SITUATION IN WINTER 44 5.2 THE VISITING SITUATION IN SPRING 46 5.3 THE VISITING SITUATION IN SUMMER 48 5.4 THE VISITING SITUATION IN AUTUMN 50 5.5 MODEL CHECKING AND SUMMARY 52 5.6 CLASSIFYING DISCOMFORT INDEX IN THE LIBRARY CONTEXT 55 2 5.7 THE INTERPRETATION OF DATA MINING RESULT 56 5.8 EVALUATION BY LIBRARIANS 58 5. DISCUSSION AND CONCLUSION 60 6. EXPECTATIONS FOR FUTURE STUDIES 62 REFERENCES 63 3 Table 1: The result of data mining and library related key words in Web of Science ...................... 6 Table 2: The breaking point of seasons in Turku from 2009 to 2014 .............................................. 38 Table 3: Various discomfort conditions ............................................................................................... 39 Table 4: Weather condition types and frequencies in Turku from 2009.01.01-2015.06.23 ......... 40 Table 5: Weather condition after merging groups .............................................................................. 41 Table 6: The result of dependent normality test................................................................................. 41 Table 7: The summary of nonparametric tests ................................................................................... 42 Table 8: Dummy coding for seven days in week ................................................................................. 42 Table 9: Dummy coding for weather conditions ................................................................................. 43 Table 10: The descriptive statistics and correlation between variables in winter......................... 44 Table 11: The regression model summary in winter ......................................................................... 45 Table 12: The coefficients of regression model in winter .................................................................. 46 Table 13: The descriptive statistics and correlation between variables in spring ......................... 47 Table 14: The regression model summary in spring .......................................................................... 47 Table 15: The coefficients of regression model in spring .................................................................. 48 Table 16: The descriptive statistics and correlation between variables in summer ...................... 49 Table 17: The regression model summary in summer ....................................................................... 49 Table 18: The coefficients of regression model in summer ............................................................... 50 Table 19: The descriptive statistics and correlation between variables in autumn ....................... 51 Table 20: The regression model summary in autumn ....................................................................... 51 Table 21: The coefficients of regression model in autumn ................................................................ 52 Table 22: Model Summary ..................................................................................................................... 55 Table 23: Library-Climate points in winter, summer and autumn ................................................... 56 Table 24: Interview summary ............................................................................................................... 58 Figure 1: The relationship between bibliomining and four data mining types in the library ....... 13 Figure 2: The process of library data mining ...................................................................................... 16 Figure 3: The connection between three reference service elements .............................................. 19 Figure 4: The example of homepage Q&A based reference service .................................................. 21 Figure 5: The development of Web ....................................................................................................... 28 Figure 6: The development of library model ....................................................................................... 30 Figure 7: Illustrating the natural match between data mining and the library .............................. 35 Figure 8: Model checking in winter ...................................................................................................... 53 Figure 9: Model checking in spring ....................................................................................................... 53 Figure 10: Model checking in summer ................................................................................................. 53 Figure 11: Model checking in autumn .................................................................................................. 54 4 1. Introduction Libraries,