Best Practices for Google Analytics in Digital Libraries

Authored by the Digital Library Federation Assessment Interest Group Analytics working group

Molly Bragg, Duke University Libraries
Joyce Chapman, Duke University Libraries
Jody DeRidder, University of Alabama Libraries
Rita Johnston, University of North Carolina at Charlotte
Ranti Junus, Michigan State University
Martha Kyrillidou, Association of Research Libraries
Eric Stedfeld, New York University

September 2015

The purpose of this white paper is to provide digital libraries with guidelines that maximize the effectiveness and relevance of data collected through the Google Analytics service for assessment purposes. The document recommends tracking 14 specific metrics within Google Analytics, and provides library-centric examples of how to employ the resulting data in making decisions and setting institutional goals and priorities. The guidelines open with a literature review, and also include theoretical and structural methods for approaching analytics data gathering, examples of platform specific implementation considerations, Google Analytics set-up tips and terminology, as well as recommended resources for learning more about web analytics. The DLF Assessment Interest Group Analytics working group, which produced this white paper, looks forward to receiving feedback and additional examples of using the recommended metrics for digital library assessment activities.

Table of contents

Section I: Introduction and Literature Review
  A. Introduction
  B. Literature Review
Section II: Google Analytics Prerequisites
  A. Learn About Google Analytics Practices and Policies
  B. Understand Local Digital Library Infrastructure
  C. Goal Setting
Section III: Recommended Metrics to Gather
  A. Content Use and Access Counts
    1. Content Use and Access Counts Defined
    2. Site Content Reports
    3. Bounce Rate
    4. Download Counts
    5. Time
    6. Pageviews
    7. Sessions
  B. Audience Metrics
    1. Location
    2. Mode of Access
    3. Network Domain
    4. Users
  C. Navigational Metrics
    1. Path Through the Site
    2. Referral Traffic
    3. Search Terms
Section IV: Additional Metrics and Custom Approaches
  A. Dashboards and Custom Reports
  B. Event Tracking
  C. Goals and Conversions
Section V: Examples of Platform-specific Considerations
  A. CONTENTdm
  B. DLXS
Section VI: Tips for Google Analytics Account Setup
  A. Terminology
    1. Properties
    2. Views
Section VII: Further Resources on Google Analytics
Section VIII: Conclusions and Next Steps
Appendix
  Other Methods for Collecting Analytics Data
Bibliography

Section I: Introduction and Literature Review

A. Introduction

Libraries invest significant resources in providing online access to scholarly resources, whether through digitization, detailed cataloging and metadata production, or other methods. Once digital materials are published online, resource managers can take a number of approaches to assessing a digital library program. Qualitative methods include web usability studies, focus groups, surveys, and anecdotal information gathering. Quantitative methods to assess digital libraries often center on tracking information about website visitors and their actions through some means of web traffic monitoring, also known as web analytics. This paper will focus on best practices for collecting and using web analytics data in digital libraries, specifically data gathered through Google Analytics.[1]

While this paper focuses on web analytics, collecting any type of assessment data and using it to inform decision-making can:
● increase understanding of the return on the investment of digital libraries;
● provide more information about users, use, cost, impact, and value;
● help guide improvement of digital library services and the user experience;
● assist in decision-making and strategic focus.
No single type of data or technique can assess every aspect of a digital library; analytics are a single piece of a larger assessment puzzle. This document is intended for digital library managers and curators who want to use analytics to understand more about users of, access to, and use of digital library materials.

The Digital Library Federation Assessment Interest Group (DLF AIG) is using Matusiak’s definition of a digital library as “the collections of digitized or digitally born items that are stored, managed, serviced, and preserved by libraries or cultural heritage institutions, excluding the digital content purchased from publishers.”[2] The authors hope to pave the way for cross-institutional resource managers to share benchmarkable and comparable analytics. Their intention is for the information in this paper to evolve over time as more institutions utilize and enhance these guidelines.

We chose to limit our scope to Google Analytics because many libraries use this tool, and because our task needed to be scoped in order to be attainable.[3] It is important to be aware, however, that changes in technology may cause fluctuation in the usefulness of any tool -- including Google Analytics -- in the future. If there is enough community interest and volunteers, other web analytics services could be considered for inclusion after the Digital Library Federation 2015 Forum. An overview of other methods for collecting web analytics data can be found in the appendix of this document.

[1] "Google Analytics," accessed August 4, 2015, http://www.google.com/analytics/.
[2] Matusiak, K. (2012). Perceptions of usability and usefulness of digital libraries. International Journal of Humanities and Arts Computing, 6(1-2), 133-147. DOI: http://dx.doi.org/10.3366/ijhac.2012.0044.
[3] Over 60% of all websites use Google Analytics: see “Piwik, Privacy,” accessed September 16, 2015, http://piwik.org/privacy/.

This document was authored by the analytics working group of the DLF AIG.
The DLF AIG was formed in spring of 2014.[4] The group arose from two working sessions that took place at the 2013 DLF Forum, “Determining Assessment Strategies for Digital Libraries and Institutional Repositories Using Usage Statistics and Altmetrics”[5] and “Hunting for Best Practices in Digital Library Assessment.”[6] The first of these sessions was concerned with determining how to measure the impact of digital collections; developing areas of commonality and benchmarks in how the community measures collections across various platforms; understanding cost and benefit of digital collections; and exploring how such information can be best collected, analyzed, communicated, and shared effectively with various stakeholders. The second working session set out to test the waters for the potential of a collaborative effort to build community guidelines for best practices in digital library assessment. The two working sessions were well attended and group leaders formed the DLF AIG to foster ongoing conversation.

In the fall of 2014, volunteers from the DLF AIG formed four working groups around citations, analytics, cost assessment, and user studies. The primary purpose of each working group is to develop best practices and guidelines that can be used by all to assess digital libraries in their particular area; the white papers and other products of these working groups can be found on the DLF wiki.[7] The Analytics working group is composed of library staff from around the United States working in the fields of digital programs, assessment, and electronic resources.
The authors of this document are:
● Molly Bragg (Co-coordinator of the working group, Digital Collections Program Manager, Duke University Libraries)
● Joyce Chapman (Co-coordinator of the working group, Assessment Coordinator, Duke University Libraries)
● Jody DeRidder (Head of Metadata and Digital Services, University of Alabama Libraries)
● Rita Johnston (Digitization Project Librarian, University of North Carolina at Charlotte)
● Ranti Junus (Electronic Resources Librarian, Michigan State University)
● Martha Kyrillidou (Senior Director, Statistics and Service Quality Programs, ARL)
● Eric Stedfeld (Project Manager/Systems Analyst, New York University)

[4] See Joyce Chapman’s blog post "Introducing the New DLF Assessment Interest Group," posted May 12, 2014, http://www.diglib.org/archives/5901/.
[5] "Determining Assessment Strategies for Digital Libraries and Institutional Repositories Using Usage Statistics and Altmetrics," accessed August 4, 2015, http://www.diglib.org/forums/2013forum/schedule/21-2/.
[6] "Hunting for Best Practices in Digital Library Assessment," accessed August 4, 2015, http://www.diglib.org/forums/2013forum/schedule/30-2/.
[7] The DLF Assessment Interest Group wiki can be found at http://wiki.diglib.org/Assessment. As of the DLF 2015 annual meeting the citations, users and user studies, and analytics working groups have each produced a white paper and the cost assessment working group has defined digitization processes for data collection and created a digitization cost calculator using data contributed by the community. See each of the group’s individual wiki pages for links and details.

The group began its work by performing a literature review and defining types of audiences, content, and metrics pertinent to digital library assessment. The group continued to refine a list of core metrics to recommend for baseline collection in a digital library program.
In this paper, each metric includes a definition and explanation of importance, as well as library-centric examples for how to work with the metric in Google Analytics. This document was distributed to the larger DLF AIG for feedback and comments in two drafts in July and August 2015. The white paper