
Using Learning Analytics to Understand and Support Collaborative Learning


Using Learning Analytics to Understand and Support Collaborative Learning Mohammed Saqr Academic dissertation for the Degree of Doctor of Philosophy in Information Society at Stockholm University to be publicly defended on Monday 22 October 2018 at 09.00 in L70, NOD-huset Borgarfjordsgatan 12.

Abstract

Learning analytics (LA) is a rapidly evolving research discipline that uses insights generated from data analysis to support learners and optimize both the learning process and learning environment. LA is driven by the availability of massive data records regarding learners, the revolutionary development of big data methods, cheaper and faster hardware, and the successful implementation of analytics in other domains. The prime objective of this thesis is to investigate the potential of learning analytics in understanding learning patterns and learners’ behavior in collaborative learning environments with the premise of improving teaching and learning. More specifically, the research questions comprise: How can learning analytics and social network analysis (SNA) reliably predict students’ performance using contextual, theory-based indicators, and how can social network analysis be used to analyze online collaborative learning, guide a data-driven intervention, and evaluate it? The research methods followed a structured process of data collection, preparation, exploration, and analysis. Students’ data were collected from the online learning management system using custom plugins and database queries. Data from different sources were assembled and verified, and corrupted records were eliminated. Descriptive statistics and visualizations were performed to summarize the data, plot variables’ distributions, and detect interesting patterns. Exploratory statistical analysis was conducted to explore trends and potential predictors, and to guide the selection of analysis methods. Using insights from these steps, different statistical and machine learning methods were applied to analyze the data. The results indicate that a reasonable number of underachieving students could be predicted early using self-regulation, engagement, and collaborative learning indicators. Visualizing collaborative learning interactions using SNA offered an easy-to-interpret overview of the status of collaboration, and mapped the roles played by teachers and students. SNA-based monitoring helped improve collaborative learning through a data-driven intervention. The combination of SNA visualization and mathematical analysis of students’ position, connectedness, and role in collaboration was found to help predict students’ performance with reasonable accuracy. The early prediction of performance offers a clear opportunity for the implementation of effective remedial strategies and facilitates improvements in learning. Furthermore, using SNA to monitor and improve collaborative learning could contribute to better learning and teaching.

Keywords: Learning analytics, Social Network Analysis, Collaborative Learning, Medical Education, Interaction Analysis, Machine Learning.

Stockholm 2018 http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-159479

ISBN 978-91-7797-440-6 ISBN 978-91-7797-441-3 ISSN 1101-8526

Department of Computer and Systems Sciences

Stockholm University, 164 07 Kista

USING LEARNING ANALYTICS TO UNDERSTAND AND SUPPORT COLLABORATIVE LEARNING

Mohammed Saqr

Using Learning Analytics to Understand and Support Collaborative Learning

Mohammed Saqr

©Mohammed Saqr, Stockholm University 2018

ISBN print 978-91-7797-440-6 ISBN PDF 978-91-7797-441-3 ISSN 1101-8526

Printed in Sweden by Universitetsservice US-AB, Stockholm 2018

To the exquisite roses Carmen and Layla

Not knowing when the dawn will come
I open every door;
Or has it feathers like a bird,
Or billows like a shore?

Emily Dickinson

Acknowledgement

Last year, my father left in peace without having the chance to see what he always hoped for; I always wished he could witness this day. Without his hard work throughout his life, which was dedicated to the family’s prosperity, I would not have made it. His encouragement, his faith in me and his assurances were the driving force that kept me going. Words cannot express the gratitude, thankfulness, and appreciation of the support my mother has offered me. It is just unfair that words are used to translate such a great feeling. I am immensely grateful to my main supervisor, Uno Fors. His scientific rigor, expertise, endless support, guidance, and continuous encouragement have paved the way for me to take every step. Uno has always been there for me, has always stood by me when I needed him, and has provided the feedback and advice that helped me learn, progress and advance. I am also vastly grateful to Matti Tedre, my supervisor who introduced me to the art and craft of academic writing, taught me how to think critically, and pushed me forward with his thoughtful, elaborate, and constructive comments. Although he left before I finished, his guidance and feedback still resonate today. I extend my sincere gratitude to Jalal Nouri, my second supervisor, whose invaluable guidance, motivation and fine-grained feedback on every piece of work made a huge difference. Jalal’s help extended to the way I manage time, study, readings, publications, and even social life. Jalal was always willing to help, with everything. Without the steadfast support of my family, who endured difficult times until this work was completed, I would never have moved a step forward. For all that you have offered me, I am forever thankful and grateful.

Special thanks to all my wonderful colleagues and friends who helped, supported and were there for me along this journey, namely, Hazem, Josefina, Nina, Lisa Rolf, Melinda, Qi Dang. I also extend my thankfulness and gratitude to the people who helped make the administrative issues smooth and very swift, namely, Tuija Darvishi, Britt-Marie, Irma, Amos, Eija, and Katarina.

ﻣﺤﻤﺪ ﻣﺤﻤﺪ ﺻﻘﺮ ﻋﺒﺪ اﻟﺠﻠﯿﻞ Mohammed Saqr Stockholm, August 2018


Contents

List of tables
List of figures
List of articles
Abstract
1. Introduction
1.1 Thesis structure
2. Problem and motivation
2.1 General problem definition
2.1.1 Research in collaborative learning settings
2.1.2 The need for efficient monitoring
2.1.3 The need for studies on the impact of learning analytics
2.1.4 The need for generalizability
2.1.5 Learning analytics research in medical education
3. Research aim
4. Definition and taxonomy
5. Theory and philosophy
5.1 A new paradigm?
5.2 Inferring learning from data
5.3 Theory and learning analytics research
5.4 Theoretical underpinning of my research
5.4.1 Self-regulation
5.4.2 Engagement
5.4.3 The theoretical basis of collaborative learning
5.5 My approach to the study of collaborative learning
6. The learning analytics process
6.1 Data capture
6.2 Preprocessing and Preparation
6.3 Analysis & Interpretation
6.3.1 Predictive modeling
6.3.2 Clustering
6.3.3 Content analytics
6.3.4 Social network analysis
6.4 Insightful action
6.5 Feedback
7. Research on learning analytics
7.1 An overview of previous research
7.2 Notable institutional implementations
7.2.1 Course signals
7.2.2 E2Coach
7.2.3 OU Analyse
8. Challenges in the field of learning analytics
9. Collaborative learning
10. Methods
10.1 The general context
10.2 Research methodology
10.2.1 Data capture
10.2.2 Data preprocessing and preparation
10.2.3 Data analysis, interpretation, and reporting
10.3 Research ethics
10.3.1 Data protection and privacy policy
10.3.2 Consent
10.3.3 Ethical approval
11. Overview of the results
12. Discussion
12.1 Limitations
13. Methodological and theoretical contributions
14. Conclusions
16. Future work
References

List of tables

Table 1: Centrality measures of the classmates’ network


List of figures

Figure 1. The learning analytics process cycle.
Figure 2. A graph of classmates’ network.
Figure 3. An LA dashboard showing a visualization of a student activity.
Figure 4. Another example of a teacher dashboard showing class engagement levels.
Figure 5. Course Signals at Purdue showing green (safe) and yellow lights (borderline safe).
Figure 6. A sketch of the E2Coach system. The system matches advice to students’ information and personalizes the messages accordingly.
Figure 7. A schematic representation of the classic seven jumps approach.
Figure 8. A schematic representation of the approach adopted by Qassim University.
Figure 9. A copy of the ethical approval.


List of Abbreviations

AI Artificial intelligence
ALM Automatic linear model
AUC Area under the curve
CART Classification and regression tree
CL Collaborative learning
CS Course signals
CSCL Computer supported collaborative learning
DLA Dispositional learning analytics
EDM Educational data mining
k-NN K-nearest neighbors
LA Learning analytics
LAK Learning analytics & knowledge conference
LMS Learning management systems
NCBE National committee of bioethics
OU Open university
RQ Research question
SNA Social network analysis
TEL Technology-enhanced learning


List of articles

The publications below are included in this thesis:

Study 1: Saqr, M., Fors, U., & Tedre, M. (2017). How learning analytics can early predict under-achieving students in a blended medical education course. Med Teach, 39(7), 757-767. doi:10.1080/0142159X.2017.1309376

Study 2: Saqr, M., Fors, U., & Tedre, M. (2018). How the study of online collaborative learning can guide teachers and predict students' performance in a medical course. BMC Med Educ, 18(1), 24. doi:10.1186/s12909-018-1126-1

Study 3: Saqr, M., Fors, U., Tedre, M., & Nouri, J. (2018). How social network analysis can be used to monitor online collaborative learning and guide an informed intervention. PLoS ONE, 13(3), 1-22. doi:10.1371/journal.pone.0194777

Study 4: Saqr, M., Fors, U., & Nouri, J. (2018). Using social network analysis to understand online problem based learning and predict performance. PLoS ONE. doi:10.1371/journal.pone.0203590

Study 5: Saqr, M., Nouri, J., & Fors, U. (2018) (in press). Time to focus on the temporal dimension of learning. A learning analytics study of the temporal patterns of students' interactions and self-regulation. International Journal of Technology Enhanced Learning.


Abstract

Learning analytics (LA) is a rapidly evolving research discipline that uses insights generated from data analysis to support learners and optimize both the learning process and learning environment. LA is driven by the availability of massive data records regarding learners, the revolutionary development of big data methods, cheaper and faster hardware, and the successful implementation of analytics in other domains. The prime objective of this thesis is to investigate the potential of learning analytics in understanding learning patterns and learners’ behavior in collaborative learning environments with the premise of improving teaching and learning. More specifically, the research questions comprise: How can learning analytics and social network analysis (SNA) reliably predict students’ performance using contextual, theory-based indicators, and how can social network analysis be used to analyze online collaborative learning, guide a data-driven intervention, and evaluate it? The research methods followed a structured process of data collection, preparation, exploration, and analysis. Students’ data were collected from the online learning management system using custom plugins and database queries. Data from different sources were assembled and verified, and corrupted records were eliminated. Descriptive statistics and visualizations were performed to summarize the data, plot variables’ distributions, and detect interesting patterns. Exploratory statistical analysis was conducted to explore trends and potential predictors, and to guide the selection of analysis methods. Using insights from these steps, different statistical and machine learning methods were applied to analyze the data. The results indicate that a reasonable number of underachieving students could be predicted early using self-regulation, engagement, and collaborative learning indicators. Visualizing collaborative learning interactions using SNA offered an easy-to-interpret overview of the status of collaboration, and mapped the roles played by teachers and students. SNA-based monitoring helped improve collaborative learning through a data-driven intervention. The combination of SNA visualization and mathematical analysis of students’ position, connectedness, and role in collaboration was found to help predict students’ performance with reasonable accuracy. The early prediction of performance offers a clear opportunity for the implementation of effective remedial strategies and facilitates improvements in learning. Furthermore, using SNA to monitor and improve collaborative learning could contribute to better learning and teaching.


1. Introduction

A chasm has been growing between the aspirations of the higher education system and what we can realize as educators. The strain between the rising number of learners and the tightening resources of the overtasked teachers has allowed the chasm to grow (Aragon, 2016). Under mounting pressure from learners, the public and governments alike, the search for solutions and opportunities has never been more intense. Sharing goals, objectives, and ambitions, many have turned to technology for a panacea.

The Internet and technology at large have kindled the optimism that intelligent solutions may automate a broad array of resource-intensive tasks, thereby enabling teachers to pay closer attention to their pertinent teaching work (Reiser, 2001; Tamim, Bernard, Borokhovski, Abrami, & Schmid, 2011). Technology provides the potential to create and reuse materials for as long as they remain relevant, in theory facilitating remarkable savings in time and effort (Bichsel, 2013). Furthermore, the Internet enables the immediate distribution of materials to large numbers of students at a relatively low cost of production (Bichsel, 2013; Gulati, 2008; Mazoue, 2014). Such delivery can be consistent for all learners; nobody is provided better service than others, and more importantly not worse (Bichsel, 2013). Technology-enhanced learning (TEL) offers a plethora of possibilities and applications such as simulated experiments, virtual training and interactive multimedia that aim to boost engagement and motivation (Abrami et al., 2008; Bichsel, 2013; Cook et al., 2011; R. Ellaway & Masters, 2008; Wouters, Van Nimwegen, Van Oostendorp, & Van Der Spek, 2013). By enabling forms of communication that transcend the physical boundaries of time and place, the Internet has paved the way for collaborative and interactive milieus that help students reflect on learning, share their experiences, and build new knowledge (Garrison, Anderson, & Archer, 2010; Hrastinski, 2008).

Despite the unyielding rise in the rate of adoption and the scale of applications in technology-enhanced learning, concerns continue to be expressed regarding their effectiveness and the optimum approach (Kirkwood & Price, 2014; G. Wang, Gunasekaran, Ngai, & Papadopoulos, 2016). Successful online learning requires efficient learning design, along with specific learner qualities such as motivation, self-regulation, and learning strategies, as well as timely feedback and proper support (Burnette, O'Boyle, VanEpps, Pollack, & Finkel, 2013). The breadth of TEL offers limited personalized feedback for students and provides few insights regarding the online learning process (Gašević, Jovanović, Pardo, & Dawson, 2017; Graham, 2006; Pardo, Han, & Ellis, 2016).

Inspired by the fruitful application of data science in a multitude of industries and services, educators have used available data in the search for solutions (H. Chen, Chiang, & Storey, 2012; Siemens, 2013). The availability of massive data records regarding learners as well as the exponential surge in computing capabilities and the burgeoning of artificial intelligence (AI) models has rendered possible the handling of large volumes of data that can create a significant impact when used efficiently (Kitchin, 2014).

Learning analytics has emerged to explore the opportunities of sense-making of such data and create actionable insights. Such insights may embolden the provision of effective solutions to imperative issues in education such as attrition, personalized support, quality of the learning experience, improved learning, and curriculum design, as well as informing decision makers. Learning analytics may also reduce gaps in personalized feedback in the resource-limited educational landscape (Lim, 2018; Papamitsiou & Economides, 2014; Siemens, 2013).

It was against such a background that this thesis was conceptualized to explore the potential of using students’ data in collaborative learning environments in order to understand and potentially improve learning and teaching. The thesis aims to use a theory-guided approach to track students’ online activities to identify the indicators of better performance, or the lack thereof, to find opportunities for insightful proactive action. Moreover, the thesis aims to use learning analytics to guide educators and learners to identify gaps in collaborative learning and try to find opportunities for early support and intervention.


1.1 Thesis structure

The background section of the thesis is divided into nine chapters. The first chapter defines the problem and the main drivers behind this research, followed by the aim and research questions. I then go through the definition and taxonomy of learning analytics and closely related concepts, with the aim of shedding light on important terms and how they are related. I subsequently present a detailed chapter regarding the underlying theory, its role in analytics research (and learning analytics in particular), followed by the ways in which it is operationalized in the thesis. In the next chapter, I present the process of learning analytics from a methodological perspective. I place particular emphasis on the methods employed in this thesis, especially predictive modeling and social network analysis. I then explore previous research in the field with notable examples from research and institutional perspectives. This is followed by the challenges facing researchers and institutions, with a particular focus on ethics and privacy. Given that the context of this study is collaborative learning, I discuss the main ideas of this concept in Chapter 9. Following the background section, the methods chapter is introduced with a detailed description of the context, data collection, ethics, analysis, and interpretation. The results of my studies are subsequently presented, followed by the discussion, the contributions of the studies, the limitations, the conclusions, and suggestions for future research.


2. Problem and motivation

In this section, I will cover the general problem, starting with the focus in the field of learning analytics on predictive models that rely on generic indicators with little emphasis on theory. I then highlight the importance of studying the structural and relational aspects of collaborative learning, which constitutes the central theme of the thesis. I will subsequently cover the challenges and research gaps that have been identified as motivating factors behind this research. These comprise linking research to theory and context, the lack of generalizability of research findings in collaborative learning settings, and the paucity of research on using learning analytics to create a real impact on learning and teaching. Owing to the significance of the issue of linking research to theory, I dedicate Chapter 5 to the discussion of the theory, its significance and how it is operationalized in this thesis.

2.1 General problem definition

Driven by the data and the opportunities it may offer, most initial applications of learning analytics have focused on using students’ online trails to monitor their performance and to predict underachievers in need of support (Ferguson, 2012a; Leitner, Khalil, & Ebner, 2017; Shahiri, Husain, & Rashid, 2015; Tsai & Gasevic, 2017a). Researchers have used log data records from the counts of online activities, such as the count of logins or the frequency of clicks (generic indicators). However, these generic indicators have offered limited insights into the learning environment, have failed to contribute to the development of a valid theoretical model, have only a faint connection to actionable and meaningful intervention models, and have not provided suggestions for implementing a corrective strategy (Conde & Hernández-García, 2015; Nesbit, Xu, Winne, & Zhou, 2008; Winne, Nesbit, & Popowich, 2017). Predictive models that rely on generic indicators have been deemed challenging to replicate or generalize, especially in blended learning programs (Conde & Hernández-García, 2015; Tempelaar, Rienties, Mittelmeier, & Nguyen, 2018). Furthermore, this “one-shot” approach has been criticized for only considering the final product (e.g., the count of online activities and course views by the end of the course) and not the actual process, with little attention to time, connection to the actual context, and interactions with peers or communities (Winne et al., 2017).

Learners live in communities that shape their behaviors and enable them to interact with and influence one another. Traditional methods of analysis treat individuals as isolated, discrete units of analysis (Borgatti, Mehra, Brass, & Labianca, 2009; Burt, Kilduff, & Tasselli, 2013; Cela, Sicilia, & Sánchez, 2014), while overlooking the importance of interactions with teachers and peers and the influence of the social structure, which are particularly essential in collaborative learning settings. Advocates of social network analysis argue that a student’s social capital, interactions and position in the social structure have a significant effect on academic achievement, which tends to be overlooked by attribute-based researchers who focus on other factors such as age, gender, and performance (Gašević, Zouaq, & Janzen, 2013; Joksimović et al., 2016; Scott & Carrington, 2014). Such an argument has garnered considerable support from social network researchers. However, no framework exists that could capture the relational or interactional construct of online learning. Furthermore, no conclusive evidence exists regarding which social network constructs are correlated with performance in different contexts, or how students regulate their collaborative learning (Dowell et al., 2015; Hernández-García et al., 2015; Joksimović et al., 2016).

The methodological underpinnings of learning analytics research in collaborative learning should be aligned with theoretical assumptions and contextual conditions in order to ensure that results can be empirically reproducible and research findings can be meaningfully useful (Fincham, Gašević, & Pardo, 2018; Hernández-García et al., 2015). Operationalizing interaction indicators, community structure, and relationships among collaborators in collaborative learning environments is both timely and significant. A model that can capture these indicators may help understand the patterns and behaviors of students in collaborative learning and better predict students’ performance. Furthermore, it may facilitate the creation of a monitoring mechanism for efficient collaborative learning and produce and evaluate informed data-driven intervention. Successful operationalization may also help create more accurate and reproducible models that can be replicated across courses that share a similar context. Accounting for the time dimension, self-regulation and engagement might boost accuracy and yield proxy indicators that are more closely aligned with learning than generic indicators (B. Chen, Knight, & Wise, 2018; Cicchinelli et al., 2018; Molenaar & Järvelä, 2014).
Thus, the importance of using pedagogically and contextually relevant approaches to collaborative learning cannot be overstated. Therefore, I have identified gaps that provided motivations for this research, namely the paucity of research in collaborative learning, the lack of studies concerning the impact of using interaction indicators to guide a data-driven intervention, the difficulty of creating generalizable models, and the potential of learning analytics in medical education, which is of particular relevance to this thesis. These gaps will be discussed in detail in the following section.
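To make the predictive side of this agenda concrete, the sketch below shows one minimal way such theory-based indicators could feed a predictive model. It is an illustration under assumptions, not the actual pipeline of the thesis: the feature names (early_logins, weekly_regularity, forum_posts, formative_attempts), the file students.csv, and the binary underachiever label are hypothetical placeholders.

    # A minimal sketch (Python): flagging potential underachievers from
    # theory-based indicators. All column names and the input file are
    # hypothetical; real indicators would be derived from LMS logs.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    df = pd.read_csv("students.csv")  # one row per student
    features = ["early_logins", "weekly_regularity", "forum_posts",
                "formative_attempts"]
    X, y = df[features], df["underachiever"]  # 1 = below-median performance

    model = make_pipeline(StandardScaler(), LogisticRegression())
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print("Cross-validated AUC: %.2f" % auc)

Cross-validated AUC (area under the curve) is used here because it is the accuracy metric most commonly reported for such early-warning models.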

2.1.1 Research in collaborative learning settings

In collaborative learning, students interact, exchange and communicate information. Generic indicators offer very little to understand the interactions, relationships and groups or communities where this occurs. Interaction and social network analysis indicators are more relevant to participatory learning settings and are deeply grounded in collaborative learning theories. For example, for the constructivist perspective of learning, SNA offers a bird’s-eye view of learners’ interactions and information exchange among peers, highlights active and inactive actors, and identifies isolated and non-interactive students (Dado & Bodemer, 2017; Shum & Ferguson, 2012; Sie, Ullmann, Rajagopal, Cela, Rijpkema, & Sloep, 2012). For connectivism learning theory, SNA maps the connections and networks of information exchange, identifies the influential nodes of information and monitors the spread and diversity of perspectives (Mattar, 2010). For social capital theory, SNA can map the networks of resources and help identify different sorts of privileged access to connections, such as connections to powerful actors and bridging different communities (brokerage capital) (Dika & Singh, 2002; Kwon & Adler, 2014).

Researchers have used SNA to study interactions and relationships in online learning, with the most common research subjects comprising the study of patterns of interaction, knowledge construction, computer-supported collaborative learning (CSCL), and performance prediction (Cela et al., 2014; Dado & Bodemer, 2017; Sie, Ullmann, Rajagopal, Cela, Rijpkema, & Sloep, 2012). Studies addressing patterns of interaction have focused on the potential of the technique, but research has failed to address the impact of using the technique to monitor collaboration and guide intervention. Most studies that have attempted to investigate the role of SNA in the prediction of performance have used correlation statistics with different centrality measures. Although results have been encouraging and indicate a positive correlation, no uniform model to guide the operationalization of SNA indicators exists (Dado & Bodemer, 2017). A uniform model that captures the full gamut of the quantity, direction and significance of interactions among students and teachers is required. Such a model should capture the roles of collaborators and the structural properties of the collaborative groups and their communities. Consequently, it might help create a more efficient monitoring mechanism for collaborative learning, enhance the accuracy and generalizability of predictive models, and guide an informed intervention. These issues will be discussed in detail in the following section.
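As a small illustration of the bird's-eye view described above, the sketch below builds a directed who-replied-to-whom network from forum records and lists students with no interactions at all. The reply records and names are invented for illustration; in practice the edge list would be extracted from LMS discussion-forum logs.

    # A minimal sketch (Python/networkx): a directed interaction network.
    # The participants and reply records below are invented.
    import networkx as nx

    participants = ["amal", "sara", "omar", "lina", "teacher"]
    replies = [("amal", "sara"), ("sara", "amal"), ("omar", "sara"),
               ("teacher", "omar"), ("amal", "teacher")]

    G = nx.DiGraph()
    G.add_nodes_from(participants)
    G.add_edges_from(replies)  # edge u -> v: u replied to v

    # Isolated nodes correspond to non-interactive students.
    print("Isolated participants:", list(nx.isolates(G)))  # -> ['lina']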


2.1.2 The need for efficient monitoring

Regardless of the power of CSCL as a medium for learning, it is insufficient as a guarantor of efficient collaboration. Research has shown that groups do not spontaneously engage in fruitful collaborative interactions (Fransen, Kirschner, & Erkens, 2011; Kreijns, Kirschner, & Jochems, 2003). Fruitful online interactions require careful group composition, efficient learning design, and a stimulating medium that motivates interaction. They may also require the management of roles, scaffolding by teachers, and scripting or orchestrating. Therefore, different analysis mechanisms have been developed to assess the efficiency of collaborative learning to ensure that it meets its intended pedagogical objectives (Fincham et al., 2018; Kreijns et al., 2003; Lambert & Fisher, 2013; Noroozi, Weinberger, Biemans, Mulder, & Chizari; Resta & Laferrière, 2007).

2.1.3 The need for studies on the impact of learning analytics

Considerable criticism has been voiced regarding the large number of studies concerning the potential of learning analytics, whereas few studies have examined the actual impact of intervention strategies (S. Dawson, Jovanovic, Gašević, & Pardo, 2017a). Research into social network analysis in education began in the 1990s and tended to evaluate the potential of using SNA visualization in mapping patterns of interactions or the roles of collaborators. However, according to systematic reviews (Cela et al., 2014; Dado & Bodemer, 2017; Sie, Ullmann, Rajagopal, Cela, Bitter, & Sloep, 2012), no research has been undertaken regarding the impact of intervention using SNA-based strategies. As such, little is known about effective strategies, the extent of benefits received by learners, or the optimal time for an intervention.


2.1.4 The need for generalizability

Some researchers have studied the potential of social network analysis centrality measures as predictors of learning (Cho, Gay, Davidson, & Ingraffea; Dowell et al., 2015; Hernández-García et al., 2015; Jiang, Fitzhugh, & Warschauer, 2014; Joksimović, Gašević, Kovanović, Riecke, & Hatala, 2015; Joksimović et al., 2016; Romero & Ventura, 2013). The results of these studies have indicated a possible positive correlation between centrality measures and academic achievement. For instance, Romero and Ventura (2013) have reported a positive correlation between performance and the number of interactions (degree centrality) and the number of received interactions (indegree centrality). Similar findings have been reported by Hommes et al. (2012) and Joksimović et al. (2016). However, the results were not replicated between studies, and studies of multiple courses have failed to yield consistent results (Dowell et al., 2015; Hernández-García et al., 2015; Jiang et al., 2014; Joksimović et al., 2016). Researchers have suggested that context might constitute a defining factor and have called for studies that address contextual factors as a possible explanation of issues concerning generalizability and reproducibility. Another problem of these studies is the lack of a structured framework that can be used as a basis for a predictive model.
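A minimal sketch of the kind of correlation analysis these studies report, using the two measures named above. The interaction network here is randomly generated and the grades are synthetic (deliberately tied to activity so the correlation is visible); real studies would use observed networks and course grades.

    # A minimal sketch (Python): correlating centrality with performance.
    # Both the network and the grades are synthetic, for illustration only.
    import networkx as nx
    from scipy.stats import spearmanr

    G = nx.gnp_random_graph(30, 0.15, directed=True, seed=1)
    grades = {n: 50 + 2 * G.degree(n) for n in G}  # toy grades tied to activity

    degree = nx.degree_centrality(G)        # all interactions
    indegree = nx.in_degree_centrality(G)   # received interactions only

    nodes = sorted(G)
    for name, c in [("degree", degree), ("indegree", indegree)]:
        rho, p = spearmanr([c[n] for n in nodes], [grades[n] for n in nodes])
        print("Spearman rho (%s centrality vs. grade): %.2f (p=%.3f)" % (name, rho, p))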

2.1.5 Learning analytics research in medical education

Learning analytics has been demonstrated as a potentially reliable method for identifying and supporting students at risk of underachieving in several fields (Kuzilek, Hlosta, Herrmannova, Zdrahal, & Wolff, 2015; Pistilli & Arnold, 2010; Wolff, Zdrahal, Nikolov, & Pantucek, 2013). Today, calls for the medical education community to harness these benefits can be heard (Doherty, Sharma, & Harbutt, 2015; R. H. Ellaway, Pusic, Galbraith, & Cameron, 2014). However, little research has considered the role or potential of learning analytics in tackling the issue of drop-outs in the medical education literature (Saqr, 2018). This apparent gap and requirement for research have stimulated prominent medical educationists to advocate for the development of knowledge and capacity in the field. Ellaway, an editor of the journal Advances in Health Sciences Education and a prominent scholar, summarized this issue in a 2014 position article on big data and analytics: “If health professional education is to control its future, be accountable for the way its programs run, and thereby maintain its contract with society then health professional educators will need to be ready to deal with the complex and compelling dynamics of analytics and Big Data. We therefore need to explore, discuss, and critique these emerging techniques to develop a robust understanding of their strengths and limitations in the service of health professional education.” (R. H. Ellaway et al., 2014, pp. 222).

The reasons behind the medical education community’s delay in following other educational disciplines may be connected to the challenging nature of LA research. LA requires a multidisciplinary approach that includes education, statistics, data science, and instructional design. It also requires a clear ethical framework and data usage policies. Furthermore, it requires institutional readiness and stakeholders’ awareness, as well as funding and support (R. H. Ellaway et al., 2014).


3. Research aim

As a preparation step for this research project, a series of pilot studies were conducted and presented at international conferences (AlGadaa, Alkadi, & Saqr, 2014; AlGhasham, Saqr, & Kamal, 2013; Saqr, AlGhasham, & Kamal, 2014; Saqr, Kamal, & AlGhasham, 2014). These studies investigated the role of learning analytics in predicting students who are at risk of failing a course as well as the study of social network analytics in the context of collaborative learning. This practical research experience, along with the literature review and engagement with the education and learning analytics communities, helped formulate the initial research objectives.

Against a background of the paucity of empirical research on learning analytics in collaborative learning environments in general, and medical education in particular, this thesis aimed to explore the potential of using theory-guided learning analytics to understand and improve collaborative learning. I endeavor to use a model that captures the interactions, relations, self-regulating patterns and attributes of collaborators and their communities in order to help facilitate more effective monitoring of collaborative learning and better informed data-driven intervention, as well as to boost the accuracy of predicting underachieving students and, most importantly, to provide a more replicable learning analytics model.

The aim of this research was:

• To investigate the role of learning analytics in understanding and supporting online collaborative learning.


Two research questions were subsequently formulated:

• How can learning analytics and social network analysis reliably predict students’ performance in collaborative learning environments using contextual, theory-guided indicators?

• How can social network analysis be used to analyze online collaborative learning, guide a data-driven intervention, and evaluate it?


4. Definition and taxonomy

Learning analytics (LA) is a relatively new and rapidly expanding research area. The field was formally recognized following its first widely known definition during the inaugural International Conference on Learning Analytics & Knowledge (LAK) in 2011 (Siemens, 2013). The definition states that learning analytics is the “measurement, collection, analysis, and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs” (Siemens, 2013, pp. 1382). Other definitions also exist; for instance, Elias (2011) has proposed that it constitutes “the selection, capture, and processing of data that will be helpful for students and instructors at the course or individual level” (pp. 4). Cooper (2012) has defined LA as “the process of developing actionable insights through problem definition and the application of statistical models and analysis against existing and/or simulated future data” (pp. 4). Erik Duval’s definition is arguably the simplest even though it is also quite comprehensive, claiming “learning analytics is about collecting traces that learners leave behind and using those traces to improve learning” (Baker, Shum, Duval, Stamper, & Wiley, 2012, pp. 21). Although the LAK definition is contested, and many have called for its revision, it seems to be the concept most used and endorsed by researchers in the field.

Academic analytics and educational data mining (EDM) would appear to be related research endeavors, albeit with a different focus and different stakeholders involved. Academic analytics uses business intelligence and data analysis at the institutional level to support decision making and hence help improve the operating efficiency of organizational processes (bin Mat, Buniyamin, Arsad, & Kassim, 2013; Campbell, DeBlois, & Oblinger, 2007; Ellis, 2013; Siemens, 2013). At the institutional level, analytics are used to assess programs and offer inter-school, intra-school and institutional benchmarking. Its insights may enable institutions to understand their strengths and challenges and foster their growth. Academic analytics has the potential to boost the retention rates of enrolled students as well as the rates of completion of degrees and consequently graduation rates (bin Mat et al., 2013; Campbell et al., 2007; Ellis, 2013; Ferguson, 2012b; Siemens, 2013; Sin & Muthu, 2015). The term was popular during the early days of learning analytics, but over time lost ground to learning analytics, with which it is often used interchangeably.

EDM is a multidisciplinary research endeavor that uses data mining techniques to analyze the data collected or generated during teaching and learning (Bin Mat et al., 2014; Dutt, Ismail, & Herawan, 2017). Methods such as artificial intelligence, text analytics, information retrieval, and cognitive modeling are commonly used to find solutions to educational issues (Dutt et al., 2017; Slater, Joksimović, Kovanovic, Baker, & Gasevic, 2017). EDM has been defined by the International Educational Data Mining Society as “an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings and using those methods to better understand students, and the settings which they learn in” (Siemens & Baker, 2012, pp. 252). EDM has been used to predict students’ future learning behaviors, to create or improve content models and instructional scenarios, and to study the impact of different learning support methods. Other uses include the personalization of learning, adaptive feedback and customized recommendations (Bienkowski, Feng, & Means, 2012; Falakmasir & Habibi, 2010; Papamitsiou & Economides, 2014; Peña-Ayala, 2014).

A growing overlap is evident among academic analytics, educational data mining and learning analytics as terms, given that the focus of these disciplines is the analysis of computer-generated data sets in the field of education research with the objective of enhancing the educational process. Researchers often use the same methods under different labels. It is possible that the future will deepen or otherwise eliminate redundant distinctions in order to provide a clearer taxonomy (Siemens & Baker, 2012; Y. Y. Wong, 2016).


5. Theory and philosophy

As was described in the problem and motivation chapter, linking theory to context constituted a primary driver for this thesis. As such, this chapter is dedicated to discussing the subject in greater detail. First, I will address the debate surrounding the role of empiricism in analytics research and whether it represents a new paradigm or should be part of a continuum. A discussion of the subject of learning inference will be presented thereafter. I subsequently offer an overview of the position of theory in learning analytics research, followed by an overview of learning theories related to my thesis, and how recent research in the field is operationalizing theoretical models. The final section of the chapter discusses the theoretical underpinnings of my research, with a description of how I operationalized each theoretical model to answer the research questions.

5.1 A new paradigm?

Research in the field of learning analytics has been driven by the availability of unprecedented volumes of data generated via students’ interactions with technology (H. Chen et al., 2012; Siemens, 2013). Another driving factor has been the thriving of instrumented data science methods that have leveraged opportunities to address challenging, real-life problems and deliver meaningful insights at an unprecedented scale (G. Wang et al., 2016). Education has followed in the footsteps of rewarding industry models that have harnessed big data to boost their competitive advantage. The disruptive nature of big data analytics in industry has led to a substantial debate regarding the radical paradigm shift that analytics has brought to the ways in which knowledge is defined and research methods are conducted (Boyd & Crawford, 2012; Kitchin, 2014; A. F. Wise & Shaffer, 2015).

In traditional hypothesis-driven research, the scientific experiment investigates a preset premise that conforms to a theory or hypothetical model. The results are often explanations, revisions, and re-evaluations of prior beliefs that lead to improvements in our understanding and the emergence of new questions (Kitchin, 2014; Siemens & Latour, 2015). Analytics is arguably a new data-driven approach to empiricism that seeks to identify novel patterns or rules through data mining and exploration (Anderson, 2008; Kitchin, 2014; Siemens & Latour, 2015). The proposition is that through the analysis of vast amounts of data, using inductive reasoning and a bottom-up approach, discoveries can be made (Kitchin, 2014). Advocates of the approach claim that researchers no longer need to make educated guesses or worry about constructing a hypothesis, since the analysis of full-resolution large data sets can enable the automatic discovery of valuable insights (Anderson, 2008; Kitchin, 2014). They further claim that the exploration of big data can lead to the discovery of patterns and correlations that were not expected and therefore might never have been discovered using traditional approaches (Anderson, 2008; Kitchin, 2014). Although these claims are bold, supporting examples have been demonstrated in which the analysis of big data brought forth new and completely unexpected insights (Anderson, 2008; Boyd & Crawford, 2012; Kitchin, 2014). However, opponents have questioned the validity of this approach, citing numerous examples of spurious correlations resulting from the analysis of big data. Therefore, the risk of misinterpretation cannot be ruled out on the basis of data size or the complexity of the algorithm (Boyd & Crawford, 2012).


5.2 Inferring learning from data

Learning is a complex concept that has over the years garnered considerable research across numerous disciplines such as neuroscience, physiology, psychology, education, philosophy, and computer science. Although the concept is commonly used, no consensus exists on how it should be defined (A. B. Barron et al., 2015; De Houwer, Barnes-Holmes, & Moors, 2013). Nevertheless, learning may be viewed as any process that leads to a change in capacity, behavior or adaptation due to experience or regularities in the environment that cannot be exclusively explained by maturation or aging (Illeris, 2018). Our understanding of the concept of learning continues to evolve. Indeed, whereas the concept was traditionally understood as the acquisition of knowledge and skills, it is now believed to encompass affective, sociocultural and ontogenetic dimensions (A. B. Barron et al., 2015; De Houwer et al., 2013). Given that learning is an evolutionary, dynamic and developmental process that occurs over different spatial and temporal scales, from seconds to minutes to a lifetime, a valid concept of learning that captures its full complexity appears to be far from realizable (Horvath & Lodge, 2016).

As such, inferring learning from the analytics of students’ activities is confronted by the pursuit of an operative definition of learning (Greller & Drachsler, 2012; Lodge, Alhadad, Lewis, & Gašević, 2017). Using answers from other disciplines that have operationalized data insights has failed to help, as learning cannot be reduced to simple dichotomous possibilities as is true in e-commerce (purchase versus no purchase) or medicine (morbidity and mortality). Higher education institutions tend to measure learning according to performance, a practice that has been widely criticized for only capturing the momentary acquisition of knowledge or skills during or immediately following instruction, while failing to account for the acquisition of learning skills, long-term learning, and behavioral changes (Lodge et al., 2017; Soderstrom & Bjork, 2015).

Due to the challenging nature of defining learning or learning outcomes, I have cautiously tried to use assessment and grades as indicators of course performance. Assessment and grades define how students work and drive their learning strategies and approaches to course activities. While far from perfect as a measurement of learning, they are understandably aligned with students’ efforts, plans and learning strategies (Joughin, 2009). By cautiously, I mean that in my studies I have used the relative performance of students compared to one another, rather than using grades. Modeling relative achievement levels might represent a more contextually relevant method than setting a cut score, which tends to be arbitrary and judgmental (Kane, 2017).
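A minimal sketch of what such relative modeling can look like in practice: raw grades are converted into within-course percentile ranks and tertile achievement levels instead of applying a fixed cut score. The grades below are invented for illustration.

    # A minimal sketch (Python/pandas): relative achievement levels.
    # The grades are invented; real data would come from course records.
    import pandas as pd

    grades = pd.Series([62, 71, 85, 90, 55, 78, 66, 88], name="grade")
    percentile = grades.rank(pct=True)  # standing relative to classmates
    level = pd.qcut(grades, q=3, labels=["low", "middle", "high"])  # tertiles

    print(pd.DataFrame({"grade": grades,
                        "percentile": percentile,
                        "level": level}))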

5.3 Theory and learning analytics research

The early field of learning analytics was traditionally focused on creating predictive deficit models for the detection of underachievers, as well as predicting the indicators of poor performance. These models were not always explicit about theory (Clow, 2013; S. Dawson et al., 2017b; Tempelaar, Rienties, & Giesbers, 2015), and eventually attracted substantial criticism as a result (S. Dawson, Mirriahi, & Gasevic, 2015; A. F. Wise & Shaffer, 2015). The concern is that optimizing the educational system based on metrics that do not measure actual learning might risk optimizing away from learning, underscoring the need to build analytics around relevant metrics that are consistent with learning and based on theory (Clow, 2013; S. Dawson et al., 2017b).

The importance of theory in the development of learning analytics cannot be overemphasized; after all, learning analytics is about learning and should contribute to the development of the learning sciences. A well-grounded theory would help advance knowledge in the discipline, promote research about crucial problems, and help advance effective instructional practices (B. Chen, 2015; S. Dawson et al., 2015; Reimann, 2016). The theory would guide research about study design, research methods, and the predictors and predictive models that should be included in the analysis. The theory would guide researchers concerning potential confounding factors, subgroupings, and possible data covariates. Moreover, the theory would offer researchers a conceptual framework to interpret the results and conclusions in order to stimulate insightful action. Most importantly, such a theory would help researchers and stakeholders to generalize findings to broader contexts and population groups (Reimann, 2016; A. F. Wise & Shaffer, 2015). Some attempts at laying the theoretical grounds for the field have been made (Clow, 2012a; Colvin et al., 2015; Gašević, Kovanović, & Joksimović, 2017; Greller & Drachsler, 2012; Reimann, 2016). Nonetheless, substantial work still needs to be done (Gašević et al., 2017; A. F. Wise & Shaffer, 2015).

The widely accepted proposition among the learning analytics community is that if the field cannot realize an immaculate theoretical framework, it should at least aspire to do so (B. Chen, 2015; S. Dawson et al., 2015; Lodge et al., 2017; Reimann, 2016; A. F. Wise & Shaffer, 2015). Among calls to align research to learning theory, recent research has sought to improve data collection methods by embracing proxy indicators of learning that are more aligned with learning theories. Examples include constructivism and social capital theory in the context of collaborative learning (Agudo-Peregrina, Iglesias-Pradas, Conde-González, & Hernández-García, 2014; Chiu, 2014; Rizzuto, LeDoux, & Hatala, 2009; Saqr, 2018; Tervakari et al., 2013). In these studies, researchers have included interaction indicators, social network centrality measures of connectedness and information exchange, and collaborative network structure (Agudo-Peregrina et al., 2014; Chiu, 2014; Rizzuto et al., 2009; Saqr, 2018; Tervakari et al., 2013). Another emerging research trend comprises attempts to contextualize the digital traces recorded from students’ online activities to infer their self-regulation patterns and learning strategies (Barnard, Lan, To, Paton, & Lai, 2009; Cicchinelli et al., 2018; Gašević, Jovanović, et al., 2017; Schraw, 2010; Winne, 2017; Winne et al., 2017). Early reports suggest the validity of using these records of online activity as a representation of online self-regulation (Cicchinelli et al., 2018; Winne et al., 2017).


5.4 Theoretical underpinning of my research

Given that the context of this research was collaborative learning, its operationalization of data collection, analysis and inference builds on existing learning theories relevant to the collaborative learning context. The epistemological basis of collaborative learning comprises diverse educational philosophies and rich theoretical roots, “more like an arbor of vines growing in parallel, crossing and intertwining” (Macgregor, 1990, pp. 21). Constructivist learning theory is relevant to this study and the analysis methods implemented because it deals with students’ interactions, dialogue and shared co-construction of knowledge (Fosnot, 2013; Illeris, 2018; Schunk, 2012). Another theory that explores information exchange in networked learning is connectivism, which is concerned with learning occurring in networks and the exchange of information (Bell, 2010; Siemens, 2004). Furthermore, social capital theory explains how individuals capitalize on their social relations and connections and benefit from their social attributes (Cloete, 2014; Kwon & Adler, 2014).

There is ample empirical evidence indicating that strategies that aim to facilitate collaborative learning can benefit learning as a whole. For instance, a meta-analysis by Bernard et al. (2009) concluded that increasing interactions among learners, with the teacher, or with learning content has a significant positive effect on performance (an adjusted average effect of 0.38). Another meta-analysis found that courses designed to enhance collaborative learning and interaction among learners improved students’ engagement significantly and resulted in superior academic achievement (Borokhovski, Bernard, Tamim, Schmid, & Sokolovskaya, 2016). Research has also illustrated that promoting certain practices can enhance knowledge construction in collaborative settings. Such practices include facilitating discourse (Garrison & Arbaugh, 2007; Lambert & Fisher, 2013), promotive interaction (Gillies, 2016; Kreijns et al., 2003; Resta & Laferrière, 2007), argumentation (Noroozi et al., 2012; Nussbaum, 2008; Wecker & Fischer, 2014), positive interdependence (Abrami, Bernard, Bures, Borokhovski, & Tamim; Resta & Laferrière, 2007) and individual accountability (Gillies, 2016; Kreijns et al., 2003). In this context, teaching presence is an important determinant of a student’s sense of community, satisfaction, and perceived learning (Bernard et al., 2009; Garrison & Arbaugh, 2007; Kreijns et al., 2003; Webb, 2009). Furthermore, interaction with teachers has been found to positively influence student academic achievement (Bernard et al., 2009) and to predict good performance (Cho et al., 2007; Joksimović et al., 2015; Romero & Ventura, 2013).

Two constructs are of prime importance here. First, engagement: online learning requires students to be engaged, to participate actively, and to demonstrate considerable motivation and persistence. Second, self-regulation: online learning is a demanding process and requires successful strategies and regulation (Burnette et al., 2013; Reschly & Christenson, 2012; Winne, 2011). Each of these theories and constructs will be discussed in greater detail in terms of their relevance, with a description of how they were operationalized.

5.4.1 Self-regulation

Self-regulation (SRL) is an effortful process that requires substantial cognitive resources to set goals, monitor progress, adopt learning strategies and acquire academic skills (Broadbent & Poon, 2015; Grossman & McDonald, 2008; Winne, 2011). The process of SRL can be viewed in terms of four loosely sequenced phases. In the first phase, the student researches aspects of the task in terms of constraints and affordances. Some of these aspects might be external, such as the time allocated, access to learning resources and availability of support. Other factors might be internal, such as previous knowledge, motivation, available strategies to handle the task and estimated gains or penalties if it is not performed. The second phase involves planning and goal-setting based on the earlier task research. In the third phase, the learner engages in performing the task and evaluates his or her own performance. The feedback obtained enables the learner to adjust the approach. In the fourth phase, the learner evaluates the whole process and makes large-scale adaptations that may require some of the previous phases to be repeated, or asks for help (Winne, 2011).

SRL may be defined as “an active, constructive process whereby learners set goals for their learning and then attempt to monitor, regulate, and control their cognition, motivation, and behavior, guided and constrained by their goals and the contextual features in the environment” (Pintrich, 2000, pp. 453). The definition conveys several intertwined elements. The first is the active, purposeful engagement of the learner in learning. The second element is the learner’s goal-directed focus on achievement. The third is the use of proactive learning strategies to regulate cognition and enhance performance. The fourth element is contextual and refers to the environment and how it might be either supportive or obstructive. Given that the process of self-regulation is taxing, a student must be motivated to perform, and be supported when this motivation is lacking (Pintrich, 2000; Winne, 2011; Zimmerman, 1990).

Various protocols have been developed to measure self-regulation. As a metacognitive process, it contains both an aptitude aspect and a temporal aspect. In the classroom, aptitude measures the metacognitive approach to learning, the learning strategies, and how a learner monitors his or her approach. The temporal aspect measures events as they occur over a timeline (Winne, 2011; Zimmerman, 2008). As students work in online environments, their activities are recorded in the form of logs with time-stamps. The logs are observable indicators created by students during their engagement with the learning task. Researchers can contextualize these traces to build a description of the event (Baggetun & Wasson, 2006; Schraw, 2010; Winne et al., 2017). Certain self-regulation assumptions can be made according to the type of trace recorded. For instance, access to course objectives, schedules, and requirements can be described as traces of planning (Cicchinelli et al., 2018; Winne et al., 2017).

5.4.1.1 Temporality

The time-stamps recorded for each online activity define the temporal aspect of self-regulation. The constructs of time and the temporality of students’ activities are central concepts in self-regulated learning theory (B. Chen et al., 2018; Winne et al., 2017). Students who efficiently use their time to manage their learning strategies are expected to perform better than their counterparts who do not. A large volume of empirical evidence seems to support this proposition (Broadbent & Poon, 2015; Burnette et al., 2013; Job, Walton, Bernecker, & Dweck, 2015; Wolters & Hussain, 2015). Procrastination in performing learning tasks provides a reliable indicator of the failure of self-regulation and consequently of poor performance (Steel, 2007; You, 2015).

Self-regulation was operationalized as a theoretical basis for my first article in contextualizing data collection (Saqr, Fors, & Tedre, 2017). Data regarding early course login (before the start date), course orientation materials, the course schedule and the course booklet containing the intended learning objectives, course requirements and assessment methods were all used as indicators of task orientation and planning. Data regarding course views, logins, forum postings, access to course materials, and announcements were operationalized as indicators of engagement with the tasks. Data regarding formative assessment and access to grades were used as indicators of self-monitoring and self-evaluation. In order to reflect the continuous and purposeful engagement and motivation of the learner over the duration of the course, I calculated the engagement sub-indicators on a weekly basis; students who were constantly engaged received higher scores. Other aspects of self-regulation were also inferred from the collected data: the proactive approach and goal-directedness were inferred from traces of early course access, regularity of course access, regularity of self-assessment, and access to course assessment materials and grades. Furthermore, time was an essential factor in the calculation of engagement indicators (both early participation and regularity).

The temporality aspect of self-regulation was further studied in detail in article 5 (Saqr, Nouri, & Fors, 2018), in which I mapped patterns of participation in online collaborative learning with emphasis on early contribution during different time constructs. An early pattern reflected proper time management skills, motivation to perform the required task and proactive engagement. Patterns of engagement with learning tasks even during college working hours were also used as indicators of regular and sustainable engagement.
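The weekly sub-indicator idea can be sketched as follows: time-stamped LMS events are bucketed into ISO weeks, and a simple regularity score rewards students who are active in most course weeks. The event log and the four-week course length below are invented; the indicators actually used in the studies were more elaborate.

    # A minimal sketch (Python/pandas): weekly regularity from LMS logs.
    # The log and course length are invented, for illustration only.
    import pandas as pd

    log = pd.DataFrame({
        "student": ["amal", "amal", "amal", "sara", "sara"],
        "timestamp": pd.to_datetime(["2018-09-03", "2018-09-10",
                                     "2018-09-24", "2018-09-04",
                                     "2018-09-05"]),
    })
    course_weeks = 4  # assumed course duration

    log["week"] = log["timestamp"].dt.isocalendar().week
    active_weeks = log.groupby("student")["week"].nunique()
    regularity = active_weeks / course_weeks  # fraction of weeks with activity
    print(regularity)  # amal: 0.75, sara: 0.25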

5.4.2 Engagement

A considerable proportion of robust empirical evidence supports the assertion that engagement is a reliable indicator of learning, such that its value is “no longer questioned” (Reschly & Christenson, 2012, p. 4; P. Trowler & Trowler, 2010). In addition, strategies that aim to increase engagement are likely to result in a desirable outcome for all students. Therefore, engagement has attracted a vast volume of research in higher education as well as in other fields (Reschly & Christenson, 2012; P. Trowler & Trowler, 2010; V. Trowler, 2010). Although there is considerable consensus regarding the significance of the construct in education, its definition, elements, and measurement techniques remain contested (Kahu, 2013). Nevertheless, elements of engagement can be inferred from students’ behavior in online environments (Ma, Han, Yang, & Cheng, 2015; Santos, Klerkx, Duval, Gago, & Rodríguez, 2014).

In study 1, I attempted to extrapolate online activities as proxy indicators for aspects of engagement (Saqr et al., 2017). For instance, commitment, involvement in learning, and rule-following were inferred from the regularity of logging in to the LMS, following up with course announcements, and regular access to course materials. Students’ investment of quality efforts in learning, along with self-regulation of efforts, were inferred from the sustainability of learning efforts, time-on-task, and purposeful access to formative assessments.

Engagement has a social aspect, and so interaction with peers and teachers, and participation in information co-construction, can all be deemed signs of engagement. Similarly, disengagement in participatory and collaborative environments could be manifested as a lack of interaction and social isolation. Therefore, students’ course interactions were operationalized as aspects of social engagement in the final three articles using interaction indicators (Saqr et al., 2017; Saqr, Fors, & Nouri, 2018; Saqr, Nouri, & Fors, 2018).


5.4.3 The theoretical basis of collaborative learning

5.4.3.1 Constructivist learning theories

Based on Jean Piaget’s ideas, constructivist theorists view learning as a contextual, goal-oriented and self-regulated process. The process is cumulative through the utilization of previous background knowledge along with current experiences to construct novel concepts or acquire new information. Learners are expected to take responsibility for constructing their own knowledge. Teachers are viewed as facilitators who stimulate learners’ curious investigation, motivate them to reflect, raise questions, explore answers, justify their own perspectives and debate with others. Learning occurs as a result of self-organization, reflective abstraction, dialogue with the community and the challenging of misunderstandings and erroneous conceptions (Fosnot, 2013; Illeris, 2018; Liu & Matthews, 2005; Schunk, 2012).

Social constructivists emphasize the dialectic aspect of learning. Knowledge is socially (rather than individually) created by groups of learners, leading to the active co-construction of meaning. Learning is shaped by the social and physical environment as well as by dialogue among peers. Vygotsky’s sociocultural theory, although constructivist in nature, emphasizes interpersonal influences and the social context as mediators for knowledge construction and learning (Liu & Matthews, 2005; Vygotsky, 1987). Vygotsky views peer interaction as a very useful tool for the development of knowledge and skills.

The zone of proximal development (ZPD) is a concept that compares what can be achieved by working with peers with what can be achieved by working alone. Vygotsky postulated that when a learner who lies inside the ZPD for a specific task is provided with adequate support and scaffolding, the learner becomes empowered to complete the task efficiently (Liu & Matthews, 2005; Podolskiy, 2012; Vygotsky, 1987). Thus, social constructivism emphasizes the role of group work and interactions with the community to enable students to learn and build upon their ZPD. The process may be enhanced through scaffolding by a teacher or a knowledgeable instructor (Podolskiy, 2012; Vygotsky, 1987).

5.4.3.3 Connectivism learning theory

Connectivism is a learning theory that aims to address learning in the digital age and accounts for the societal changes whereby learning as a process is no longer an individual activity (Bell, 2010; Kop & Hill, 2008; Siemens, 2004). According to connectivism, knowledge and learning exist in the multiplicity of opinions; learning occurs as a process of connecting and interacting with the sources of information. Moreover, connectedness is an important skill for acquiring information and remaining up-to-date in the long term. The theory of connectivism has important implications. First, it emphasizes skills for obtaining and evaluating information and staying up-to-date. Second, it underscores the value of collaboration in a functional community of knowledge and information sharing. Third, it lays the groundwork for developing a greater understanding of information exchange using modern analytic techniques such as social network analytics (Bell, 2010; Siemens, 2008; Siemens, 2004).

5.4.3.4 Social capital theory

According to social capital theory, the benefits one attains from having connections, relationships and a position in the social structure are collectively known as social capital. Benefits include access to resources and opportunities as well as psychological and emotional support, enabling individuals to overcome challenging situations (Burt, 2002, 2004, 2015, 2017; Dika & Singh, 2002; Kwon & Adler, 2014). Moreover, relationships may provide happiness and motivation and act as a driver for the accomplishment of challenging tasks (Kwon & Adler, 2014; A. J. Martin & Dowson, 2009). Individuals who have ties to others in different groups have access to diverse opinions and fresh perspectives; these ties, although weak, provide the competitive advantage of being the gateway to external novel ideas, and offer a higher return on brokerage effort. Acting as a broker between unconnected groups and bridging the communication gaps (structural holes) brings another form of social capital: the social capital of structural holes (Burt, 2002, 2004, 2015, 2017).

5.5 My approach to the study of collaborative learning

For the study of collaborative learning, I used interaction indicators (numbers of posts, reads, views, replies) in study 1 (Saqr et al., 2017) and interaction indicators as well as social network analysis in the four subsequent articles 2, 3, 4, and 5 (Saqr et al., 2017; Saqr, Fors, & Nouri, 2018; Saqr, Fors, & Tedre, 2018; Saqr, Fors, Tedre, & Nouri, 2018; Saqr, Nouri, & Fors, 2018). Given that SNA offers a quantification of interaction indicators, the term SNA will be used throughout this thesis to denote both interaction indicators and social network indicators. SNA may be a convenient as well as practical choice for the study of collaborative learning. SNA visual analysis and mathematical indicators (centrality measures and centralization) are highly aligned with participatory settings and are deeply grounded in collaborative learning theories. To operationalize SNA in a collaborative learning environment, I performed an extensive review of the literature to develop a representative framework that can capture the gamut of relationships, interactions and structural properties of individual collaborators and groups (Cela et al., 2014; Dado & Bodemer, 2017; Sie, Ullmann, Rajagopal, Cela, Bitter, & Sloep, 2012). Six main facets can be revealed by SNA, and are highlighted here with their theoretical basis:
 The visual mapping of communicational activities and the characteristics of the social structure (constructivist learning and connectivism);
 The quantity, direction, and significance of interactions (constructivist learning and social capital theories);
 Position in information transfer (connectivism, constructivist learning, and social capital theories);
 Connectedness and relationships (connectivism and social capital theories);
 Role in collaboration (constructivist learning and community of inquiry);
 Community structure such as groups and classes (constructivism).

From the constructivist view of learning, SNA offers a bird’s-eye view of learners’ interactions, highlighting the participatory status of the group and the roles played by the individual participants. Individual participants may play an active role in the group, such as leaders who drive and stimulate others to collaborate; coordinators who coordinate and moderate interactions; or isolated members who rarely participate. In study 2, mapping the interactions revealed a non-collaborative pattern of interactions, where the information giving and receiving networks were dominated by the teacher. These findings stimulated the design of study 3 to explore an intervention solution (Saqr, Fors, & Tedre, 2018). In study 3, I operationalized different SNA centrality measures and interaction analysis indicators to map the roles of collaborators, revealing a non-participatory pattern on the side of the students and a dominating teacher. These non-collaborative patterns were the target of an intervention that helped bring a meaningful improvement (Saqr, Fors, Tedre, et al., 2018). In study 4, using the SNA centrality measures and the roles played by the collaborators, in the form of the interaction framework, to predict performance and create a reproducible model proved reasonably reliable (Saqr, Fors, & Nouri, 2018).

Furthermore, the role and position in information exchange were indicated by SNA centrality measures. Three centrality measures are of particular relevance to information transfer: betweenness centrality (the number of times a user helped connect or broker information), closeness centrality (the ease by which a collaborator can be reached), and information centrality (the amount of information passing through the participant). A preferential role in the exchange or brokerage of information is a valuable source of social capital, as it affords greater access to resources as well as diverse perspectives (Burt, 2014; Cloete, 2014; Dika & Singh, 2002; Isba, Woolf, & Hanneman, 2017; Mattar, 2010). I used the measures of information transfer to monitor students’ roles in study 2, which revealed little participation on the side of students and required attention (Saqr, Fors, & Tedre, 2018). In article 3, the centrality measures were used to monitor improvements after introducing remedial measures (Saqr, Fors, Tedre, et al., 2018), and to understand the process of interaction as well as to predict performance in study 4 (Saqr, Fors, & Nouri, 2018).

From a constructivist and social capital learning theory perspective, SNA maps the connectedness of a learner, the strength of the ego network (own network) and the networks of resources, and helps identify different types of privileged access (Burt, 2007; Dika & Singh, 2002). The concept of connectedness and the significance of relationships were operationalized using various centrality measures in my studies, in addition to the information transfer centrality measures mentioned earlier. In addition, prestige measures of the range of connections (domain prestige), the strength of peers (Eigen centrality) and rank prestige (the power and performance of connections) were utilized. The centrality measures of connectedness were positively correlated with performance in studies 2 and 4, especially the centrality measures that were representative of the strength of connections to high-performing and well-connected students (Saqr, 2018; Saqr, Fors, & Nouri, 2018; Saqr, Fors, & Tedre, 2018).

It is worth noting here that a single centrality measure is not sufficient to represent a specific concept. In my research, I used multiple centrality measures along with visual representation to infer the role played by collaborators, taking into account the situational context. As an example from study 3, for a student to be labeled as a coordinator, the student needed to have high to moderate outdegree centrality, betweenness centrality, indegree centrality and information centrality, as well as average domain prestige (Saqr, Fors, Tedre, et al., 2018). A more detailed discussion of centrality measures and centralization is presented in Section 6.3.4.
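To illustrate how several centrality measures can be combined into a role label, the sketch below uses Python’s networkx on a toy directed network. The network, the threshold, and the simplified criterion are hypothetical; the actual studies used richer criteria (including domain prestige) together with visual inspection and context.

import networkx as nx

# Toy directed interaction network (hypothetical; "t" is a teacher)
G = nx.DiGraph([("t", "s1"), ("s1", "t"), ("s1", "s2"), ("s2", "s1"),
                ("s1", "s3"), ("s3", "s2"), ("t", "s3")])

indeg = nx.in_degree_centrality(G)
outdeg = nx.out_degree_centrality(G)
betw = nx.betweenness_centrality(G)

def role(node, hi=0.3):
    # A node that both sends and receives many ties and brokers
    # communication is labeled a "coordinator" in this simplification
    if G.degree(node) == 0:
        return "isolated"
    if indeg[node] >= hi and outdeg[node] >= hi and betw[node] >= hi:
        return "coordinator"
    return "participant"

for node in G:
    print(node, role(node))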


6. The learning analytics process

The importance of the process of learning analytics stems from the fact that it provides the methodological basis of how learning analytics research is conceptualized, conducted, implemented, and evaluated. In this chapter, I introduce the process of learning analytics, followed by a detailed account of each stage, with examples where possible. Special emphasis is placed on data capture, analysis, and interpretation. The subject of social network analysis, a central issue in this thesis, is introduced in greater detail, covering the general concept, terminology, visualization, and mathematical analysis. An example of a social network is also added and analyzed mathematically and visually.

The process of learning analytics can be conceptualized as a sequence of interrelated steps or stages (see Figure 1). These comprise: 1) capturing data generated by learners; 2) preprocessing the data for analysis; 3) analyzing the data with statistical and machine-learning methods; and 4) using the insights to apply data-driven interventions or aid the decision-making process. Given that learning analytics is an iterative process, feedback can help improve and refine the methods (H. Chen et al., 2012; Clow, 2012b; Siemens, 2013).


[Figure 1 depicts the learning analytics cycle: Capture → Preprocessing and Preparation → Analysis & Interpretation → Insightful Action → Feedback.]

Figure 1: The learning analytics process cycle (Clow, 2012b).

6.1 Data capture

Every activity a student performs online can be recorded and used for learning analytics, including visiting the online learning management system to access course resources, searching for information in the online library, taking an online examination, writing an assignment, or communicating with a colleague. Hypothetically, there are no limits to what can be collected or used in learning analytics research as long as it conforms to ethical and privacy standards (Leitner et al., 2017; Sergis, Sampson, Leitner, Khalil, & Ebner, 2017).

The scope of learning analytics data collection is rapidly expanding in both breadth and scale due to improved learning technologies and increased awareness among stakeholders regarding the importance of learning analytics. The most common sources are the logs that are recorded by learning management systems (LMS). Examples of data derived from an LMS include the frequency of logins, page views, hits on learning resources, chat records, formative assessments, time spent on tasks and work on assignments. These data may be extracted through analysis of the LMS log system, custom plugins that extract specific parameters such as time-on-task, or LMS built-in analytics dashboards (where applicable) (F. Martin & Ndoye, 2016; Misiejuk & Wasson, 2017; Na & Tasir, 2017; Sergis et al., 2017).

The gathered data may also include registration and demographic data such as age, gender, language proficiency, prior grades, socioeconomic status, scholarships, and residence (Arnold & Pistilli, 2012a, 2012b; Jayaprakash, Moody, Lauria, Regan, & Baron, 2014; Kuzilek et al., 2015; Rienties et al., 2016). In collaborative learning environments, the collected data encompass interaction indicators such as the number, size, and frequency of forum postings, relationships, connections, and learners’ positions. Interaction parameters may be captured via social network analysis techniques, while the content of interactions may be analyzed using data mining and content analytics (Chiu, 2014; Macfadyen & Dawson, 2010; Misiejuk & Wasson, 2017; Saqr, 2018; Shahiri et al., 2015; Tervakari et al., 2013).

Self-reported surveys concerning learning disposition, motivation, and engagement have also been used to account for affective, behavioral, and cognitive dimensions (S. P. Dawson, Macfadyen, & Lockyer, 2009; Shum & Crick, 2012; Tempelaar et al., 2015, 2018), an approach labeled dispositional learning analytics (DLA). Advocates of DLA argue that these data are more actionable and can trigger more informed data-driven interventions (Shum & Crick, 2012; Tempelaar et al., 2018). Temporal profiling of online activity using time series modeling has also been examined as a means of accounting for temporal patterns of variation in students’ activity (Brooks, Thompson, & Teasley, 2015; Goda et al., 2015). Multimodal learning analytics extends recording to neurophysiological, cognitive, physical, and digital data (Andrade & Danish, 2016; Blikstein, 2013; Blikstein & Worsley, 2016). Multimodal capture methods can potentially gather large volumes of fine-grained information that extends beyond the limited scope of traditional learning analytics. Advocates of this approach postulate that such high-resolution, fine-grained physiological data can offer an unprecedented breadth of information about learning (Blikstein, 2013).

Whereas quantitative data collection methods capture the number of times a student uses a specific learning resource, they do not cover the rationale, motives or experiences of the users or stakeholders. The answers to such questions lie within the realm of qualitative research. Qualitative data collection methods include structured interviews and focus group discussions with students and teachers regarding their experience of the visualizations and insights through dashboards. Other common methods include surveys, open-ended questions, video recordings and the analysis of reflections and feedback (Ali, Hatala, Gašević, & Jovanović, 2012; Knight, Tech, States, & Novoselich, 2016; Mazza & Dimitrova, 2004a; Ott, Robins, Haden, & Shephard, 2015; Silius, Tervakari, & Kailanto, 2013; A. Wise, Zhao, & Hausknecht, 2014).
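Returning to the most common quantitative source, the sketch below turns a raw LMS event log into a table of per-student activity counts using pandas. The column and action names are hypothetical, since log formats vary by platform.

import pandas as pd

# Hypothetical raw LMS log export; field names vary by platform
raw = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u1", "u2"],
    "action":  ["login", "page_view", "login", "forum_post", "page_view"],
})

# One row per student, one column per action type: the typical shape
# of LMS-derived indicators such as logins, page views, and forum posts
features = raw.pivot_table(index="user_id", columns="action",
                           aggfunc="size", fill_value=0)
print(features)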

6.2 Preprocessing and Preparation

At this stage, several processes are performed to prepare the data for analysis. The data are gathered from different sources and then compiled into a single data set. Thereafter, the cleaning of incorrect, corrupt or mislabeled data is performed to avoid erroneous conclusions. The final step of processing is data wrangling, or transforming the data into a suitable format for statistical analysis. The process might entail text parsing, visualization, aggregation, and the calculation of certain metrics or feature engineering. The result of the process is a dataset ready for analysis. Mature learning analytics systems automate this process of integration and feature preparation through programmable modules (Mattingly, Rice, & Berge, 2012; Siemens, 2013; Slater et al., 2017).
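The sketch below illustrates this preparation stage on hypothetical data: records are cleaned, sources are merged, and a simple feature is engineered. It is a minimal example of the steps described, not a production pipeline.

import pandas as pd

# Hypothetical extracts from two sources
lms = pd.DataFrame({"user_id": ["u1", "u2", "u2", None],
                    "logins": [12, 7, 7, 3]})
registry = pd.DataFrame({"user_id": ["u1", "u2"],
                         "cohort": ["2017", "2017"]})

clean = (lms.dropna(subset=["user_id"])      # drop corrupted records
            .drop_duplicates()               # drop verbatim duplicates
            .merge(registry, on="user_id"))  # assemble the sources

# Feature engineering: logins relative to the class average
clean["logins_vs_mean"] = clean["logins"] / clean["logins"].mean()
print(clean)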


6.3 Analysis & Interpretation

In tandem with the quest for relevant data sources, the exploration of data analysis methods has been relentless. Several techniques and methods have been used, with common strategies including statistical analysis and predictive modeling, clustering, content analytics and social network analysis. These methods will be reviewed with special emphasis on predictive modeling and social network analysis. Social network analysis will be covered in a separate section, as it constitutes a principal method in the thesis (Misiejuk & Wasson, 2017; Nunn et al., 2016).

6.3.1 Predictive modeling

The predictive modeling of academic achievement may be the most common objective of modeling. The objective is the early identification of students at risk of low achievement or dropping out, in order to offer them a proper intervention that might benefit their progress (Gardner & Brooks, 2018; Leitner et al., 2017). Other less common goals are to forecast course enrollments, predict the potential return on investment in intervention strategies, and alert students’ advisors to potentially disengaged students. Predictive modeling mainly uses data and digital traces recorded by learning management systems as well as other information about students’ demographic and registration data (bin Mat et al., 2014; Leitner et al., 2017; Nunn et al., 2016).

Early studies used simple bivariate correlations to identify factors that correlate with better student performance (A. Y. Wang & Newlin, 2002a, 2002b). Later, regression models became increasingly widely used (Gašević et al., 2016; Macfadyen & Dawson, 2010; Romero & Ventura, 2013; Tempelaar et al., 2015). Recently, machine learning and artificial intelligence models have become more common, such as clustering algorithms (Pardo, Han, & Ellis, 2017), classification algorithms (Romero & Ventura, 2013), decision trees (Spoon et al., 2016) and k-nearest neighbors (Kuzilek, Hlosta, Herrmannova, Zdrahal, Vaclavek, & Wolff, 2015).
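A minimal sketch of such a model is shown below, assuming scikit-learn and synthetic numbers (not the thesis data): a logistic regression flags students at risk of underachievement from two activity indicators.

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic indicators: [weekly logins, forum posts]; 1 = at risk
X = [[12, 3], [2, 0], [9, 4], [1, 1], [15, 6], [0, 0], [8, 2], [3, 1]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
model = LogisticRegression().fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))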


The accuracy of predictive models has steadily improved with the expansion of data sources and recent developments in statistical and machine learning methods. Furthermore, different validation and quality assurance methods are being introduced to boost the reliability of prediction results (Gardner & Brooks, 2018).

Two types of statistical predictive modeling are frequently used. The first type (explanatory modeling) uses causal explanation by means of statistical modeling to build and test a predefined hypothesis. In this method, only the variables thought to cause a certain outcome or supported by a theoretical model are used as predictors (Brooks & Thompson, 2017; Shmueli, 2010). For example, a researcher interested in studying self-regulatory behavior would only include variables of task orientation, task engagement, evaluation, and adjustment as features within a predictive model for self-regulation. Recently, explanatory modeling has gained traction in learning analytics research, driven by the growing interest in aligning methods with theory. The second type is predictive modeling, in which no causal assumptions are made. The main goal in this case is to use the available data to predict, at an early point, a future event or an outcome such as passing a course or completing a degree (Brooks & Thompson, 2017; Shmueli, 2010). Predictive modeling requires no theoretical model and is not concerned with explaining the phenomena under study. Rather, the objective of predictive modeling is to flag a future event, such as a student’s possibility of failing a course or dropping out (Brooks & Thompson, 2017; Shmueli, 2010). By flagging a future event, educators can work to reduce the possibility of it occurring through remedial and supportive strategies.

6.3.2 Clustering

Clustering is concerned with identifying groups of data points that share similar characteristics. Clustering is a widely used method with a varied spectrum of applications that includes the organization of course materials and assignments at different levels of difficulty, the automatic classification of different categories of course feedback, and the provision of formative feedback to different subgroups of students. Clustering is also used to offer personalized recommendations for subsets of students based on shared goals or performance. Personalized recommendations may suggest helpful resources according to the choices of similar students, or strategy recommendations according to intended goals (Avella, Kebritchi, Nunn, & Kanai, 2016; Romero & Ventura, 2010).
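A minimal sketch of the idea, assuming scikit-learn and synthetic activity profiles, is shown below; each resulting cluster could then receive feedback or recommendations tailored to its subgroup.

from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic activity profiles: [page views per week, forum posts]
profiles = [[30, 10], [28, 12], [5, 1], [4, 2], [15, 6], [14, 5]]

scaled = StandardScaler().fit_transform(profiles)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)
print(labels)  # one cluster index per student, e.g. [0 0 1 1 2 2]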

6.3.3 Content analytics

The focus of content analytics is the content, which includes the content of learning resources such as books, handouts, and web resources, as well as learning products such as discussion forums, assessments, assignments, and the content of social interactions. Content analytics may be defined as “automated methods for examining, evaluating, indexing, filtering, recommending, and visualizing different forms of digital learning content, regardless of its producer (e.g., instructor, student) with the goal of understanding learning activities and improving educational practice and research” (Kovanovic, Joksimovic, Gasevic, Hatala, & Siemens, 2017, p. 78).

The field of content analytics includes a wide range of applications such as writing analytics, discourse analytics, and assessment analytics. Content analytics has been used for the creation of recommendation systems, the provision of personalized feedback, knowledge and topic discovery in forum discussions, and the analysis of learning outcomes (Kovanovic et al., 2017; Misiejuk & Wasson, 2017; Romero & Ventura, 2010).
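One common flavor of content analytics, topic discovery in forum posts, can be sketched as below with scikit-learn on a toy corpus; the posts are invented, and real discourse analytics involves considerably more preprocessing and validation.

from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy forum posts (invented); discover two rough topics
posts = ["the heart pumps blood through arteries",
         "cardiac output and blood pressure regulation",
         "exam schedule and assessment requirements",
         "when is the final assessment due"]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(posts)
nmf = NMF(n_components=2, random_state=0).fit(X)

terms = tfidf.get_feature_names_out()
for i, topic in enumerate(nmf.components_):
    top = [terms[j] for j in topic.argsort()[-3:]]
    print(f"topic {i}:", top)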


6.3.4 Social network analysis

Social network analysis is an area of data analysis concerned with the mapping of social structures and relationships and the interactions among actors within these social structures (Borgatti et al., 2009; Burt et al., 2013). Through SNA methods, researchers can study the effects and influences of the structures and an individual’s positions and ties. SNA has been used across a wide array of fields and purposes, and new aspects continue to emerge. For instance, in economics, SNA has been used to analyze organizations’ communicative activities, management efficiency, and decision-making processes (Burt et al., 2013; Fang et al., 2015). In medicine, SNA is rapidly growing to include network analysis of genes, disease propagation, brain communicative activities, and enzyme interactions (Barabasi, Gulbahce, & Loscalzo, 2011; Martínez-López, Perez, & Sánchez-Vizcaíno, 2009). In academic authorship, it has been used to study co-authorship networks, publication trends, and scientific collaboration. The list also includes political, economic, historical, social and criminal sciences, among numerous other fields (Mingers & Leydesdorff, 2015).

6.3.4.1 Basics of social network analysis

The social structure is referred to as a network, and a network is a group of actors who are connected through a relation. For instance, a group of school friends forms a network defined by the friendship relationship. The actors are usually referred to as nodes or vertices; nodes are members of the network, and ties (commonly referred to as edges) are the links that connect the nodes to form a social structure (a network). Actors can comprise humans, organizations, countries and any entity that can be grouped in terms of a relationship. Edges can also be diverse, with examples including marriage, kinship, friendship, business and any relationship that can link two entities (Borgatti et al., 2009; Burt et al., 2013). In this thesis, we will focus on human interactions, and particularly learners’ interactions in academic environments. Networks are often analyzed via two methods: visual and mathematical analysis.
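In code, a network of this kind is simply a graph object built from a list of ties. The sketch below, assuming Python’s networkx and hypothetical reply records, builds a directed network in which an edge points from the replier to the author being replied to.

import networkx as nx

# Hypothetical reply records: (replier, author replied to)
replies = [("Sara", "Sofia"), ("Sofia", "Sara"), ("Michael", "Sara"),
           ("Janette", "Sofia"), ("Sara", "Michael")]

G = nx.DiGraph()
G.add_edges_from(replies)

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
print("Sara replies to:", list(G.successors("Sara")))
print("Sara is replied to by:", list(G.predecessors("Sara")))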


6.3.4.2 Visual analysis of social networks

The visual analysis of social networks displays relationships among network actors in terms of a graph, commonly known as a sociogram. In sociograms, nodes are displayed as circles, and edges as lines. In directed networks (where the target and the source of interaction are defined), the line is rendered as an arrow from the source to the target. The visual analysis offers an overview of the configuration of the network and the relationships among the actors. It can illustrate actors who are strongly connected, isolated, or who possess an advantageous position, such as brokers who bridge communications. The possibility of summing all interactions into one graph can be helpful and informative (Borgatti et al., 2009; Burt et al., 2013).

In order to demonstrate the basics of social networks, I present in Figure 2 a group of 10 classmates who form a group of friends. Each pupil is depicted as a circle (node) and each relationship as an arrow (edge), starting from the source to the target. The whole group is referred to as a network and is visualized as a graph or sociogram. The network is based on nomination; for instance, Sofia considers Janette, Sara, and Carmen as friends, and so there are three arrows representing these relationships, starting from Sofia. Similarly, two pupils (Sara and Janette) consider Sofia a friend. This visual representation (sociogram) is configured to show the node’s size proportional to the number of ties, hence we can see that Sara has the largest size and most connections. It is also configured to show, in blue, pupils who are connecting and lying in between others.
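The visual conventions just described can be reproduced with standard tooling. The sketch below, assuming networkx and matplotlib on a toy directed network (not the actual classmates’ data), sizes each node by its degree and colors it by its betweenness centrality.

import matplotlib.pyplot as plt
import networkx as nx

# Toy directed network (hypothetical ties)
G = nx.DiGraph([("Sofia", "Sara"), ("Sara", "Sofia"), ("Sara", "Michael"),
                ("Michael", "Janette"), ("Janette", "Sofia"),
                ("Michael", "Sara")])

degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G)

# Node size proportional to degree; color mapped to betweenness
nx.draw_networkx(
    G,
    node_size=[300 * degree[n] for n in G],
    node_color=[betweenness[n] for n in G],
    cmap=plt.cm.Blues,
    arrows=True,
)
plt.axis("off")
plt.show()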


Figure 2: A graph of the classmates’ network. Each circle is a node, and each arrow is an edge. The size of each node is configured to reflect the degree centrality, and the color to reflect the betweenness centrality.

6.3.4.3 Mathematical analysis of social networks

Networks can also be mathematically characterized to quantify relationships, connectedness and actors’ importance in the network. Mathematical analysis can describe the whole network, the nodes, or the ties (Borgatti et al., 2009; Golbeck, 2013). Examples of network parameters are size, the number of nodes in the network (this is 10 in the classmates’ network presented in Figure 2), or average degree (the average number of edges per node). Density, which is the total number of interactions relative to the maximum possible (0.56 in the classmates’ network), reflects the degree of cohesion of a network.

At the individual level, nodes are characterized by what are known as centrality measures (prestige measures). Centrality is a mathematical expression of the node’s role, importance (prestige), connectedness, or position. Given that nodes may play different roles and that importance may be viewed differently according to the situation, different centrality measures exist that describe different roles. The centrality measures most relevant to the collaborative context will be discussed here with examples from the classmates’ network (Golbeck, 2013).
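Before turning to the individual measures, the whole-network parameters just described can be computed directly. The sketch below uses networkx on a toy directed network (not the classmates’ data); for a directed network, density is m / (n(n - 1)) for m edges and n nodes.

import networkx as nx

# Toy directed network with n = 3 nodes and m = 4 edges
G = nx.DiGraph([(1, 2), (2, 1), (2, 3), (3, 1)])

n = G.number_of_nodes()
print("size:", n)
print("average degree:", sum(d for _, d in G.degree()) / n)
print("density:", nx.density(G))  # 4 / (3 * 2) ≈ 0.67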

The quantity of ties
 Indegree centrality represents the total number of edges or links an actor receives. In the classmates’ network, it was the number of nominations by others as a friend. Nodes with a high indegree centrality are considered popular, as influencers or leaders. For instance, in a social network such as Twitter, leaders and media stars have a large number of followers and are often seen as social influencers (Golbeck, 2013).
 Outdegree centrality represents the number of outgoing edges or links from an actor. In the classmates’ network, this was the number of nominations of own friends. Outdegree centrality is a measure of activity, sociability, and gregariousness (Golbeck, 2013).
 Degree centrality is the sum of both the outdegree and indegree centrality measures and is used to quantify all of an actor’s connections or interactions (Golbeck, 2013; Rochat, 2009).

Role in the relay and brokerage of information
 Betweenness centrality represents the number of times an actor has been in between two other actors, hence connecting them. Actors with high betweenness centrality have an important role in facilitating communications, connecting others and moderating interactions, and enjoy access to diverse perspectives. High scores of betweenness centrality are deemed brokerage social capital (Golbeck, 2013; Salter-Townshend, White, Gollini, & Murphy, 2012). In the classmates’ network, Sara and Michael had the highest betweenness centralities, and so were the most important in connecting others.
 Information centrality measures the importance of a node in information flow through the network. It is defined as the relative disruption of traffic through the network if the actor were to be removed, and as such it represents a measure of cohesion. Sara had the highest information centrality in the classmates’ network and was the most important person in the passage of information in the network (Latora & Marchiori, 2007).
 Closeness centrality measures how reachable an individual is to all other collaborators and members of the network (Golbeck, 2013; Rochat, 2009). Michael and Janette were the easiest to reach in the classmates’ network.

Connectedness
 Eigenvector centrality measures the connectedness of an individual and of his or her network. A person connected to well-connected actors would demonstrate higher Eigenvector values (Golbeck, 2013). Sara and Michael were the most connected to a powerful network in the classmates’ network.

In order to demonstrate how centrality measures work, I calculated the centrality measures for the classmates’ network displayed in Figure 2. Table 1 displays the centrality measures of each person. Sara has the most nominations as a friend (indegree), and so she is the most popular pupil. Michael and Janette have nominated the highest number of friends, and so they are the most social. However, Michael has the highest degree and largest network of friends. Janette is the easiest to reach (closeness) by any pupil in the network, which may enable her to acquire information more quickly than the others. Although Sara does not have the most connections, she is connected to the most powerful in the network (Eigen centrality), connects most people together (betweenness centrality), and most information passes through her. The interpretation of centrality measures depends on the situation. In the case of information exchange, Sara and Janette have better positions. However, in the case of spreading infection, they might be the most vulnerable, whereas Peter may be the safest. As such, isolating Sara and Michael might be the most effective solution to prevent the spread of infection to other classmates.

Table 1: Centrality measures of the classmates’ network

Node      Indegree  Outdegree  Degree  Closeness  Betweenness  Eigen  Information
Carmen    7         4          11      0.50       0.06         2.14   2.14
Layla     4         4          8       0.57       0.06         1.86   1.86
Carl      2         2          4       0.57       0.02         1.43   1.43
Sofia     2         5          7       0.62       0.04         1.84   1.84
Sara      8         4          12      0.67       0.21         2.78   2.78
Michael   5         8          13      0.73       0.19         2.41   2.41
Janette   2         8          10      0.80       0.05         2.59   2.59
Oscar     5         5          10      0.67       0.05         2.39   2.39
Peter     5         0          5       0.00       0.00         1.84   1.84
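Measures of the kind shown in Table 1 can be computed with standard SNA tooling. The sketch below uses networkx on a small toy network, not the actual classmates’ data; note that networkx computes information centrality on the undirected version of the graph, so the values will differ from Table 1.

import networkx as nx

# Toy directed friendship network (hypothetical ties)
G = nx.DiGraph([("Sara", "Sofia"), ("Sofia", "Sara"), ("Michael", "Sara"),
                ("Sara", "Janette"), ("Janette", "Michael"),
                ("Michael", "Janette")])

measures = {
    "indegree": nx.in_degree_centrality(G),
    "outdegree": nx.out_degree_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
    "information": nx.information_centrality(G.to_undirected()),
}

for name, values in measures.items():
    print(name, {node: round(v, 2) for node, v in values.items()})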


6.4 Insightful action

The ultimate goal of learning analytics is to offer students, teachers, and stakeholders actionable insights. A common way of presenting insights to students and raising their awareness is to use a web panel (dashboard) that provides information regarding activities in the form of graphs and visualizations. Prompts and personalized recommendations can be delivered to the dashboard or via customized messages or notifications. Dashboards (see Figure 3) offer students a visualization of their learning activities, their level of mastery of learning goals, and comparisons with peers, in order to enhance their awareness and reflection and to track learning progression (Atif, 2013; Bodily & Verbert, 2017; Duval, 2011; Nussbaumer et al., 2012; Verbert et al., 2014).

Figure 3: An LA dashboard showing a visualization of a student’s activity. The image is available through a Creative Commons Attribution-Non-Commercial License from the OU Analyse project.


Institutions may use learning analytics to monitor students’ activities and predict underachievers in order that they can be supported (Arnold & Pistilli, 2012; Misiejuk & Wasson, 2017). Learning analytics’ insights extend to tutoring, feedback, personalization, recommendations, and reflection (Moissa, Gasparini, & Kemczinski, 2015). At the teacher level, learning analytics has helped identify effective teaching strategies (Fritz, 2011, 2013), improve instructional design, and assess the effectiveness of learning designs (Bakharia et al., 2016; Lockyer, Heathcote, & Dawson, 2013; Mor, Ferguson, & Wasson, 2015; Persico & Pozzi, 2015; B. T. M. Wong, 2017). It also supports informed decision-making (Ruipérez-Valiente, Muñoz-Merino, Leony, & Kloos, 2015; B. T. M. Wong, 2017), helps teachers understand learners’ behaviors and class tendencies (Ruipérez-Valiente et al., 2015), and helps recognize disengaged students who might need support (Jayaprakash et al., 2014; Kuzilek, Hlosta, Herrmannova, Zdrahal, & Wolff, 2015). An example of a teacher dashboard showing students’ levels of engagement is displayed in Figure 4.


Figure 4: Another example of a teacher dashboard, showing class engagement levels. The image is available through a Creative Commons Attribution-Non-Commercial License from the OU Analyse project.

6.5 Feedback

Each cycle of learning analytics applications generates valuable insights into the process. These insights are expected to enlighten practitioners about improved ways of implementing solutions, avoiding mistakes, and refining strategies. Given that LA is an iterative process, greater knowledge of the students would improve the methods, refine the analytical models, and improve the process. As technology advances, new avenues and further opportunities for exploration and progress emerge (Clow, 2012a).


7. Research on learning analytics

7.1 An overview of previous research

Having introduced the process of learning analytics research in the previous section, in this chapter I provide an overview of how research in this field has progressed and yielded tangible results. I begin by exploring the early trials of inferring learning from students’ activities and how research progressed, with examples from landmark studies. I then present how learning analytics research materialized at the research and institutional levels, using three noteworthy examples (Course Signals, E2Coach, OU Analyse). These examples were chosen for their historical, technological, and methodological importance.

The concept of studying traces of students’ online activities dates back a decade prior to the formalization of the field of learning analytics in 2011. For instance, Wang and Newlin (2000) reported that course page hits and forum postings correlated with the performance of online course takers (A. Y. Wang & Newlin, 2000). In a related study, they explored the potential of studying the data recorded by learning management systems (LMS) in informing teachers about students’ learning and ultimately predicting performance. The study included 121 students enrolled in a Research Methods in Psychology course, with the collected data comprising views of the course page, forum posts, and reads. They used bivariate correlation to report the correlation between the sum of points earned by the students and their activities. Their results indicated a positive correlation between early page views and course performance (A. Y. Wang & Newlin, 2002b).

An example of the early implementations of visual dashboards displaying students’ activities was reported by Mazza et al. (2004).

They used visualization tracking to inform teachers about students who may require attention. The visualization helped teachers to gain an accurate view of the cognitive, behavioral, and social aspects of their students. The authors suggested that the tool helped the course instructors prevent course problems, such as by identifying disengaged students at risk of dropping out. The teachers involved in the study commended the information offered by the visualization and described how it helped them track students’ activities (Mazza & Dimitrova, 2004b).

Ramos et al. (2008) investigated the role of various course components of online activities in the positive final course outcomes of 67 students enrolled in a Community Psychology course. They adopted a novel approach to data analysis by using a step-wise multiple regression model to predict the final outcomes from students’ tracking data. To further validate their findings, they performed cross-validation by using the equation generated from the “screening sample” to predict another test sample, “the calibration sample,” in a Psychopharmacology course with 52 students. Their results indicated that most of the variance in the outcome was explained by the number of clicked pages (Ramos & Yudko, 2008).

Dawson et al. (2009) studied the indicators of student motivation and correlated the findings with data derived from the LMS to test whether patterns of online activity could indicate achievement orientation and therefore underlying motivation. Their findings suggested a positive correlation between participation in discussion forums and achievement orientation (S. P. Dawson et al., 2009). A year later, Macfadyen and Dawson (2010) published a proof of concept paper about how “Mining LMS data” can be used to develop an early warning system for teachers. The study broadened the scope of the gathered data to include network analysis, and the statistical analysis included both correlations and multiple regression. Their findings indicated that the total number of forum posts, number of emails sent, and number of exams completed accounted for 30% of the variation in the students’ final grades. The regression model was able to correctly identify 81% of the students who failed. However, the first noteworthy implementation of an early warning system was described

by Kimberly E. Arnold (2012), who used “Course Signals” at Purdue University to show safety levels as colored lights. A detailed discussion of Course Signals will be presented in the case studies in the next section.

With the introduction of the International Conference on Learning Analytics, research has substantially increased in volume and types of inquiry. Many institutions and countries are racing to harness the power of using data to support students, teachers, and stakeholders. Progress in the field of LA has been seen in three primary areas. First, the expansion of data scope by adding more digital sources, or adding innovative data capture methods such as digital sensors and physiological trackers. Second, improvements to data analysis methods by using advanced machine learning, data mining, or artificial intelligence methods. Third, the exploration of new objectives, such as the assessment of the relationship of physiological functions with learning, the exploration of high-resolution digital traces to infer self-regulation, or the study of the temporal aspects of learning (Blikstein & Worsley, 2016; Rajesh Kumar & Hamid, 2017).

7.2 Notable institutional implementations

The successful implementation of learning analytics to support students’ success and increase retention has proven fruitful at the institutional level. Below is a discussion of three case studies that are noteworthy for their historical importance (Course Signals), their in-house development (E2Coach), or their scale of application (OU Analyse).

7.2.1 Course Signals

Purdue University faced challenges of budget restraints, retention problems, and unprepared students. They sought solutions in developing an analytics system through a project titled “Course Signals” (CS) to help identify students at risk of underachievement (Arnold & Pistilli, 2012; Pistilli & Arnold, 2010). Course Signals monitored students’ activities in real time and channeled their data to a predictive algorithm. The predictive algorithm received input from four sources:
 Student demographic data such as gender, age, or location;
 Prior academic history, including previous high school grades, academic preparation and examination grades;
 Students’ performance, measured by the sum of grades earned so far in the course;
 Effort, as defined by learning management system metrics compared to peers.
The algorithm calculated each student’s risk score and displayed a red, yellow or green signal on the student’s course page (see Figure 5). A red signal alerted to a high likelihood of underachievement and thus an impending problem; a green signal indicated a high likelihood of passing the course; and a yellow signal indicated a borderline status (Arnold & Pistilli, 2012). An intervention would then be initiated by instructors according to each case, ranging from sending an email or an SMS to referral to an academic advisor or a meeting with the course instructor. The system was received positively by both students and staff members. Moreover, the university reported that using learning analytics through Course Signals improved the success rates of first- and second-year students, as well as retention rates (Arnold & Pistilli, 2012; Dietz-Uhler & Hurn, 2013).
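For illustration only, the sketch below mimics the traffic-light idea with an invented scoring rule; the weights, inputs, and thresholds are hypothetical and bear no relation to Purdue’s proprietary algorithm.

# Purely illustrative traffic-light signal; invented weights and thresholds
def risk_signal(effort_vs_peers, grade_so_far, prior_preparation):
    """Combine normalized inputs (0..1, higher is better) into a signal."""
    score = 0.4 * grade_so_far + 0.4 * effort_vs_peers + 0.2 * prior_preparation
    if score < 0.4:
        return "red"      # high likelihood of underachievement
    if score < 0.7:
        return "yellow"   # borderline status
    return "green"        # high likelihood of passing

print(risk_signal(effort_vs_peers=0.2, grade_so_far=0.3,
                  prior_preparation=0.5))  # red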


Figure 5: Course Signals at Purdue showing green (safe) and yellow (borderline safe) lights. Provided courtesy of Purdue University; image used with permission.

7.2.2 E2Coach

E2Coach is an adaptive advice system developed by the University of Michigan to provide personalized recommendations and guidance tailored to students’ goals. The advice is modeled on insights derived from peers, in the form of motivation, recommendations of learning resources, and feedback about performance. E2Coach (see Figure 6) creates a personalized dashboard for each student, containing a list of personal learning goals and objectives and delivering tailored recommendations. As the course advances, the system delivers custom personalized web updates and personal messages on the dashboard. The personal messages address students’ online regulation, deliver supportive messages, and contain comparative graphics and benchmarks. The dashboard works as an awareness tool that prompts students to adopt a more effective strategy. E2Coach has helped motivate students to complete their courses and improve their study habits, and thus to achieve around 5% above their expected grades. Moreover, it offers students the advice they require in uncomfortable situations, such as transitioning to new courses where they have little knowledge of what to do. Teachers have considered E2Coach a supportive tool that helps their students to focus on the important subjects (McKay, Miller, & Tritz, 2012; Wright, McKay, Hershock, Miller, & Tritz, 2014).

Figure 6: A sketch of the E2Coach system. The system matches advice to students’ information and personalizes the messages accordingly. The image is available from Educause.edu through a Creative Commons Attribution-Non-Commercial License.

7.2.3 OU Analyse

OU Analyse, developed by the Open University, is a modern learning analytics system with the primary goal of predicting underperforming students early, at a point where an intervention can still have a meaningful impact. OU Analyse receives data from two main sources: static (demographic) data and data derived from students’ interactions with the learning management system. Four predictive models are used to analyze the data: a Bayesian classifier for finding the most significant LMS resources, k-nearest neighbors (k-NN) and classification and regression trees (CART) for the analysis of demographic/static data, and k-NN with LMS data. Results from the project indicated that identifying at-risk students was possible with 50% accuracy at the beginning of the semester, and accuracy increased to over 90% by the end of the semester (Kuzilek, Hlosta, Herrmannova, Zdrahal, Vaclavek, & Wolff, 2015).
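The sketch below is only an illustration of combining the model families named above (a Bayesian classifier, k-NN, and a CART-style decision tree) by majority vote on synthetic data; it is not OU Analyse’s actual pipeline.

from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic features: [submitted last assignment, weekly LMS clicks]
X = [[1, 20], [0, 2], [1, 15], [0, 1], [1, 18], [0, 3]]
y = [0, 1, 0, 1, 0, 1]  # 1 = at risk

models = [GaussianNB(), KNeighborsClassifier(n_neighbors=3),
          DecisionTreeClassifier(random_state=0)]
votes = [m.fit(X, y).predict([[0, 4]])[0] for m in models]
print("at risk" if sum(votes) >= 2 else "not at risk")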


8. Challenges in the field of learning analytics

The journey towards the development of learning analytics research has not been smooth. Indeed, numerous difficulties and challenges are associated with the journey, some of which are known, whereas others might emerge in the future. I present here a brief discussion of the most common and pressing challenges in the field faced by researchers, stakeholders, and institutions, with emphasis on ethical and privacy issues, as they are currently most pertinent considering the European General Data Protection Regulation (GDPR).

As a relatively new research discipline, many challenges and open questions face researchers and institutions implementing learning analytics for the benefit of their students or institutions. These problems have been the subject of research and inquiry, and numerous international scientific communities are working together to find solutions (Avella et al., 2016; Chatti et al., 2014; Daniel, 2015; Ferguson, 2012a, 2012b; Nunn et al., 2016; Tsai & Gasevic, 2017a, 2017b).

8.1 Theory and connection to learning sciences

One of the most pressing issues in learning analytics is how learning analytics is connected to learning, informed by theory, and able to result in meaningful developments in the learning sciences. A discussion of the theory and its significance was presented in Chapter 5.


8.2 Ever-expanding data sources

The diversity and scope of data sources will continue to grow owing to digitalization and innovation, bringing new challenges of usability, integration, storage, privacy, and relevance (Ferguson, 2012b; Siemens, 2013). Moreover, working with data from different sources and attempting to optimize and integrate these large heterogeneous data sets will require new skills, tools, infrastructure and, most importantly, research methods (Chatti et al., 2014; Ferguson, 2012b).

8.3 Student-centered analytics

Analytics has largely focused on institutional goals and perspectives rather than on the needs of the learners themselves. This problem highlights the importance of a shift towards new areas beyond grades and performance, to include motivation, self-regulated learning, confidence, and meeting future professional objectives. Analytics can also help emphasize the role of formative assessments and students’ work during the course (Ferguson, 2012b).

8.4 Capabilities

Learning analytics requires trained personnel from different specialties and disciplines. Furthermore, it requires highly trained data scientists and statisticians. With current shortages in such specialties, it might prove difficult to build a learning analytics team. Similarly, there are currently deficiencies in the leadership and expertise needed to strategically build and maintain the implementation of learning analytics solutions (Nunn et al., 2016; Tsai & Gasevic, 2017b).

8.5 Validity

Validity refers to how accurate the analysis is. An inaccurate prediction might wrongly mislabel students or expose them to undesired stress. Validity can be boosted by choosing relevant sources of data, applying the correct algorithms, verifying results, and applying a feedback mechanism that permits evaluation of the entire process to avoid drawing conclusions based on erroneous correlations (Ferguson, 2012a).


8.6 Replication

The validation and replication of results remain a challenge in learning analytics as a field, and most likely in education as a whole. The lack of replicability of research findings, as well as the relatively small number of evidence-based approaches to analytics, may hinder many institutions from adopting, investing in, or funding learning analytics research (Pashler & Wagenmakers, 2012; Tsai & Gasevic, 2017b).


8.7 Ethics and privacy issues

Although learning analytics has existed for some time, the volume of research in the field regarding ethical and privacy issues remains relatively low. Moreover, the available guidelines and policies regarding the ethics of conducting analytics research and user privacy have not matched the rapid pace of emerging research. Privacy issues can prove challenging because LA deals with data from multiple sources, potentially entailing different policies (Drachsler et al., 2015; Lang, Macfadyen, Slade, Prinsloo, & Sclater, 2018; Pardo & Siemens, 2014; Prinsloo & Slade, 2018; Rubel & Jones, 2016). The absence of a uniform legal or standard ethical framework has stimulated the LA community to negotiate these complex and relatively new challenges by relying on inspiration from other disciplines such as medical research and industry, as well as the relevant regulatory, cultural, and legal frameworks (Drachsler & Greller, 2016; Tsai & Gasevic, 2017a). The main principles to consider when conducting analytics research or implementing LA in education are summarized below.

8.7.1 Transparency

Transparency, openness, and clarity should be applied at all stages of learning analytics. Stakeholders should be adequately informed about the purposes of the captured data, collection methods, and processing. They should also be informed about the ways in which data will be analyzed and whether any third parties or companies will participate in data processing. Students should be offered clear guarantees that data will not be sold or transferred to other institutions or legal entities without proper consent (Bienkowski et al., 2012; Rubel & Jones, 2016; Tsai & Gasevic, 2017a; Tsai et al., 2018).

8.7.2 Control over data

Although challenging in practice, giving users (students or instructors) the right to access and correct the data collected is in the best interest of all parties involved in LA. Users’ control of data would boost the accuracy of results and enhance user trust in the system and its validity (Pardo & Siemens, 2014; Petersen, 2012; Slade & Prinsloo, 2013). This can be critical with certain types of data, such as correcting enrollment dates and financial data, or modifying the calculated duration of course access due to an excused leave or erroneously captured information (Pardo & Siemens, 2014).

8.7.3 Agency

Students should be encouraged to enjoy agency and voluntarily use the data for their own benefit and development, rather than merely for institutional profiling purposes or to decrease attrition rates. Agency emphasizes that students are participants in a learner-centric model, in contrast to an intervention-centric model that has the primary aim of improving general retention and attrition rates through intervention (Rubel & Jones, 2016; Slade & Prinsloo, 2013).

8.7.4 Consent

Obtaining informed consent is an established tradition in research practice, but in a rapidly evolving field with new tools and methods, it is difficult to maintain a form of consent that covers the current and future development of analytics research (Mantelero, 2014; Roberts, Chang, & Gibson, 2017; Slade & Prinsloo, 2013). Concerns have also been raised that the consent and ethical frameworks created in recent decades may not be suitable in the age of big data and machine learning, citing the vast volume and diversity of collected data and the complexity of procedures, which regular users may be unable to understand in any depth. This may necessitate an increased role for independent authorities and data protection bodies in order to enact the necessary measures to protect users and ensure the correct balance between the benefits to the institution and user privacy (Mantelero, 2014; Prinsloo & Slade, 2018).

Consent in learning analytics essentially shares the same general principles as consent in other research disciplines. Users should be free to give consent voluntarily, without being subject to any kind of negative consequences should they decline. Users should be well informed of the whole process, the types of collected information, storage, de-identification, future usage, and analysis. Details should be given transparently and in full (Kay, Korn, & Oppenheim, 2012; Pardo & Siemens, 2014; Slade & Prinsloo, 2013; Tsai & Gasevic, 2017a). Consent also needs to be revocable, so that users can opt out at any time, at which point the institution stops processing their information and excludes their data from any further processing. Full documentation of the entire process would be helpful in case of conflict (Gruschka & Jensen, 2014; Kay et al., 2012; Pardo & Siemens, 2014; Slade & Prinsloo, 2013).

8.7.5 Responsibility

Institutional policies detailing the rules and responsibilities concerning data processing and the entire process of analytics (capture, reporting, analysis, and feedback) should be clearly defined. A mechanism for complaint that would allow for the investigation of cases of abuse or breach of privacy should be in place, so that users can file concerns and request the removal of their information (Kay et al., 2012; Rodríguez-Triana, Martínez-Monés, & Villagrá-Sobrino, 2016; Roberts et al., 2017).

8.7.6 Minimizing harm

Potential adverse impacts of learning analytics have been identified, such as labeling students, categorization, bias, and the promotion of discriminatory practices. Efforts should thus be made to proactively guard against such issues (Rodríguez-Triana et al., 2016; Tsai & Gasevic, 2017a).


9. Collaborative learning

In this chapter, I present the contextual background of the thesis: the collaborative learning environment. The chapter begins with an introduction to collaborative learning. I then introduce computer-supported collaborative learning (CSCL) as a medium for collaboration.

One of the significant pedagogical developments of the last century was the introduction of collaborative learning, which redefined the relationship between the student and the teacher. The teacher would no longer be regarded as a “sage on the stage” who instructs listening students, but instead as a moderator who facilitates and scaffolds students to actively search, share, collaborate and, most importantly, assume responsibility for their learning (Johnson & Johnson, 2008; Laal & Laal, 2012; Macgregor, 1990). Collaboration requires participants to mutually and intellectually coordinate their efforts to interdependently solve a problem, perform a task, or work together on a project. Collaborative learning typically occurs in groups that share a common learning objective, and it is often characterized by substantial dialogue, argumentation, and debate (Brindley, Blaschke, & Walti, 2009; Jeong & Hmelo-Silver, 2016; Johnson & Johnson, 2009).

9.1 What makes learning collaborative?

Successful collaborative learning requires five essential elements (Johnson & Johnson, 2008, 2009):

Positive interdependence: collaborators work together to accomplish a mutual goal. The success or failure of the group's task is dependent on the work of each individual. Therefore, participants have a genuine interest in supporting each other in performing their roles (Johnson & Johnson, 2008, 2009).

Accountability: positive interdependence increases participants' feelings of responsibility to do their share of the task, and to facilitate other members' work.

Promotive interaction: participants interact in a meaningful way by exchanging necessary resources and knowledge, providing support and constructive feedback to one another, and debating and refining conclusions in order to improve the decision-making process.

Appropriate use of social skills: for interpersonal interactions to be successful, collaborators should have the social and small-group skills that can help them to manage conflicts and stresses.

Group processing: collaborators evaluate the group's performance and reflect on its successes and failures. The process is expected to improve future group performance and dynamics.

9.2 Benefits of collaborative learning

Efficient collaborative learning has several benefits that fit into three main categories: psychological, social, and academic. Psychological benefits include boosting learners' self-esteem, reducing anxiety, and promoting a positive student-teacher relationship. Socially, collaborative learning helps build a supportive environment that promotes cooperation and diversity of understanding, as well as strengthening learning communities. Academically, collaborative learning enhances the development of critical thinking and problem-solving skills. Collaborative learning helps students to retain information for longer, and motivates them to achieve and accomplish their learning objectives (Johnson & Johnson, 2008, 2009; Laal & Ghodsi, 2012).

9.3 Computer-supported collaborative learning

Ever since computers became part of mainstream education, educators have sought to utilize information technology effectively. With the introduction of the Internet, computer-supported collaborative learning (CSCL) has become an essential tool for collaborative learning (Jonassen et al., 1995; Stahl, Koschmann, & Suthers, 2006; Wasson & Mørch, 2000). CSCL offers learners the possibility to interact asynchronously regardless of time or location, thereby enabling collaborators to work at their own pace (Hew, Cheung, & Ng, 2009; Lou, Abrami, & d'Apollonia, 2001; Stahl et al., 2006). CSCL facilitates the process of creating and exchanging information through discourse and interaction with peers and experts (Jeong & Hmelo-Silver, 2016; Resta & Laferrière, 2007). CSCL can be implemented in diverse ways, such as collaborative writing, project-based learning, and technology-mediated discourse. A common implementation of technology-mediated discourse is the threaded discussion board, known as a forum. Forums allow learners to interact asynchronously, collaborate, and construct new knowledge (Garrison, Anderson, & Archer, 1999; Hew et al., 2009; Janssen & Bodemer, 2013). Discussion boards permit permanent access to interactions, enabling learners to reflect and learn from each other in a transparent way (B. Barron, 2003; Janssen & Bodemer, 2013; Slof et al., 2010). Forums have become a standard feature of most major learning management systems and remain among the modules most commonly used by educators (Dahlstrom, Brooks, & Bichsel, 2014).

9.4 Analysis of collaborative learning

Although CSCL may represent an effective medium for collaboration, research has shown that different mechanisms are required to motivate learners, facilitate discourse, manage conflicts, and, most importantly, monitor the collaboration itself. Four main methods are commonly used for the analysis of collaborative learning: interaction analysis, content analysis, educational data mining (content space), and social network analysis (relational space) (Brindley et al., 2009; Dado & Bodemer, 2017; De Wever, Schellens, Valcke, & Van Keer, 2006a; Jeong & Hmelo-Silver, 2016).

9.4.1 Interaction analysis

Interaction analysis presents statistics regarding forum activities and the contributions of participants, such as the frequency of posts by a student, participation statistics, and the number of replies in a discussion. Although these statistics are informative about the level of activity in a forum, they fall short in terms of relational aspects and can hardly be used to judge the quality or dynamics of the interactions in a thread (Dimitracopoulou, 2009; Rodríguez-Triana et al., 2013).

9.4.2 Content analysis

Content analysis is concerned with the qualitative analysis of transcripts in order to reveal the process of knowledge construction and the social dynamics of interactions (De Wever et al., 2006a; Stemler, 2015). Content analysis requires an analysis instrument and manual coding. Content analysis instruments have been criticized for lacking a clear synergy between theory and design, and for less than optimal validation procedures (Calvani, Fini, Molino, & Ranieri, 2010; De Wever, Schellens, Valcke, & Van Keer, 2006b; Mayring, 2014). Manual coding is time-consuming and labor-intensive and may produce contradictory results. Given that no unequivocal standard exists by which the reliability of the analysis can be judged, the practice is only suitable for experienced researchers. Although manual content analysis can be useful in a research setting, it is an impractical choice for teachers or administrators wishing to monitor students' interactions in real time for intervention.

9.4.3 Educational data mining

Educational data mining (EDM) tools may offer an automated analysis of large amounts of text (Al-Razgan, Al-Khalifa, & Al-Khalifa, 2014; Peña-Ayala, 2014; Romero & Ventura, 2013). However, existing tools are too technical for teachers to use or interpret (Al-Razgan et al., 2014; Papamitsiou & Economides, 2014; Romero & Ventura, 2013), are regularly misinterpreted (Papamitsiou & Economides, 2014; Peña-Ayala, 2014), and their implementations often produce contradictory judgments (Papamitsiou & Economides, 2014).

9.4.4 Social network analysis

Social network analysis may be particularly useful for the analysis of the relational space of CSCL, as it offers a detailed mapping of relations and interactions. The technique was discussed in Section 6.3.4.


10. Methods

The five studies of this thesis were based on 13 different courses that share several features. I will discuss the general context of education in the university as well as the specific context of the studies. This will be followed by the approach used for data capture, preprocessing, and analysis.

10.1 The general context

Data collection for the studies took place at the College of Medicine, Qassim University, Saudi Arabia (Studies 1-3), and the College of Dentistry (Studies 4 and 5). Both colleges have curricula based on problem-based learning (PBL), which uses problem scenarios (for instance, clinical cases) as triggers to facilitate discourse and interaction among students. Discussions occur in small groups and are facilitated by teachers (Neville, 2009). The PBL process is structured in a way that helps students to elaborate on and activate prior knowledge (Neville, 2009; Schmidt, Rotgans, & Yew, 2011). Several implementations of PBL can be identified; in particular, the university adopted the Maastricht seven jumps approach (Davis & Harden, 1999). A schematic representation of the classic seven jumps approach is shown in Figure 7. In the seven jumps approach, students are expected to attend two sessions physically, one at the beginning of the week and another at the end. During the first session, students clarify new vocabulary to find its meaning, then identify the problems that need to be addressed. They subsequently brainstorm possible solutions to identify the learning objectives they need to study. In the second session, students share the knowledge and wrap up what they have learned (Davis & Harden, 1999).

[Figure: a cycle of seven steps - clarify terms and objectives of the session; identify the problem; brainstorm; identify the learning objectives; study privately; share the results and learning resources; conclude results and evaluate performance.]

Figure 7: A schematic representation of the classic seven jumps approach.

Qassim University has introduced a blended problem-based learning approach by adding an online forum so that students can continue to interact asynchronously for the whole week until a new problem is introduced. In fact, most interactions take place online. As such, students discuss learning issues, share information, collaboratively construct the required knowledge, and work towards achieving the goals and objectives of the PBL problem on the online forum (Alamro & Schofield, 2012). A schematic representation of the approach adopted by Qassim University is shown in Figure 8.

[Figure: the weekly cycle - a face-to-face session on Sunday (clarifying terms and identifying the objectives); online discussion of the problem with sharing of explanations, solutions, and resources from Sunday to Thursday; and a face-to-face wrap-up session on Thursday to discuss the conclusion.]

Figure 8: A schematic representation of the approach adopted by Qassim University.

Problem-based learning is the primary learning strategy in the basic science phase (years 1-3). In the clinical phase, the curriculum is based on the traditional lecture and clinical sessions approach. Studies 1, 3 and 4 were conducted in problem-based courses, and Study 2 was conducted in a traditional course. E-learning has been used extensively at Qassim University since its introduction in 2008 by the author of this thesis, who served as the supervisor of the e-learning unit. All courses rely on e-learning for announcements, lecture delivery, formative assessments, and collaborative computer-supported discussions. Students are expected (but not obligated) to use the e-learning portal for a wide range of learning activities, rendering it key to their educational progress. As such, the activities of students in e-learning are reflective of students' engagement, self-regulation, and self-evaluation.

The e-learning portal is based on the open source learning management system Moodle, available at www.moodle.org. Moodle is a free modular software built to enhance interactivity through a constructivist-centered design (Dougiamas & Taylor, 2003). Moodle has built-in modules for course delivery, interactivity, assessments, announcements, and feedback, as well as a vibrant set of community-developed plugins that allow for the extension of its functionality. Although Moodle has a very mature logging system, it lacks the features necessary for analytics. Nevertheless, Moodle's developers recently launched a new project, Inspire, which can be deemed Moodle's answer to learning analytics. Inspire is still in the early stages of its development, with limited functionality. Therefore, researchers interested in applying learning analytics research must use external plugins and manual analysis of the Moodle logs to extract the required information (Falakmasir & Habibi, 2010; HQ, 2017).

10.2 Research methodology

The methodological outline of learning analytics research follows the learning analytics process discussed earlier, depicted in Figure 1 (H. Chen et al., 2012; Clow, 2012b; Siemens, 2013). The process starts with data capture, followed by preprocessing and preparation, and later analysis, interpretation, and action. The first three stages are analogous to the standard data mining procedures common to analytics research (Gandomi & Haider, 2014; Wolff, Zdrahal, Herrmannova, Kuzilek, & Hlosta, 2014). The general structure of the methods process is outlined below, followed by specific information concerning each step.

10.2.1 Data capture

This stage involved the recording and gathering of data from the learning management system as well as other sources such as performance data and attendance data. The methods used for data acquisition in this thesis included external Moodle plugins and manual extraction of data from the Moodle logs using structured query language (SQL) queries. While the process was similar in all studies, Study 1 involved slightly different indicators. The details of data collection for each study are as follows:

Study 1: Three types of indicators were used in this study: generic indicators, calculated engagement sub-indicators, and access to specific resources (orientation, evaluation, course objectives, etc.) (Saqr et al., 2017). The data were acquired using external Moodle plugins and database queries. The data were extracted for each week separately to account for temporal activities and the regularity of using the online portal. The Attendance Register plugin was used to calculate the total duration of students being online each week. The Configurable Reports plugin was used to extract page views, logins, forum posts, forum edits, and forum reads. The Analytics Graphs plugin was used to calculate the number of unique days of course and resource access. These plugins are available in the Moodle plugins directory at www.moodle.org/plugins. Lastly, NodeXL, a social network analysis software, was used to calculate betweenness centrality (Smith et al., 2009). Data on views of course orientation (course objectives, requirements, and assessment) and self-evaluation (formative assessment trials and grades obtained) were also collected using Configurable Reports queries.

While the external plugins helped gather the indicators of students' activities, the collected indicators comprised the counts of clicks and views, or "generic indicators", which have in the past been criticized for being less representative of learning and for having only a weak connection to learning theory. Therefore, I calculated the engagement sub-indicators. A detailed explanation of the concept and method of the engagement indicators will be given in the data preprocessing and preparation section.

Studies 2, 3, 4 and 5: The four studies used social network analysis methods; hence, data acquisition and processing were similar. Given that Moodle does not have a plugin to handle or export social interaction data, I used custom SQL queries to extract raw interaction data with the attributes of the users. The attributes collected were: the course ID, the group ID, the source of the interaction, the target of the interaction, the subject of the interaction, the content of the interaction, the forum ID and the sub-forum IDs, and the timestamp (Saqr, Fors, & Nouri, 2018; Saqr, Fors, & Tedre, 2018; Saqr, Fors, Tedre, & Nouri, 2018; Saqr, Nouri, & Fors, 2018).
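To make the extraction step concrete, the sketch below pulls directed reply pairs (who replied to whom) from Moodle's forum tables. It is a minimal illustration rather than the exact queries used in the studies: the table and column names assume the standard Moodle schema (mdl_forum_posts, mdl_forum_discussions), and the connection string is a placeholder to be adapted to the local installation.

    # Minimal sketch: extract directed reply interactions (source -> target)
    # from Moodle's forum tables. Table/column names assume the standard
    # Moodle schema and should be verified against the installed version.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("mysql+pymysql://user:password@localhost/moodle")  # placeholder

    QUERY = """
    SELECT d.course      AS course_id,
           d.forum       AS forum_id,
           p.discussion  AS discussion_id,
           p.userid      AS source,   -- author of the reply
           parent.userid AS target,   -- author of the post being replied to
           p.subject     AS subject,
           p.created     AS timestamp
    FROM   mdl_forum_posts p
    JOIN   mdl_forum_posts parent ON p.parent = parent.id
    JOIN   mdl_forum_discussions d ON p.discussion = d.id
    """

    interactions = pd.read_sql(QUERY, engine)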

10.2.2 Data preprocessing and preparation

In this stage, data extracted through plugins were combined, inspected, and verified for quality. Mislabeled data were corrected, and corrupted records were removed. For Study 1, the engagement sub-indicators were calculated to reflect the regular usage of the online portal over the course duration, comparing each student to the average activity of his or her peers (Saqr et al., 2017). In other words, a student was considered engaged in a given week if his or her online activity was more than one z-score above the average activity of peers. The engagement sub-indicators were calculated for each week, and the total score over the six weeks was computed for each student. The engagement sub-indicators were computed for course views, forum posts, online time, and forum reads. For Studies 2, 3, 4 and 5, the data were assembled, verified, and converted to a file format ready for analysis by the social network analysis software (Saqr, Fors, & Nouri, 2018; Saqr, Fors, & Tedre, 2018; Saqr, Fors, Tedre, & Nouri, 2018; Saqr, Nouri, & Fors, 2018).
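The weekly z-score rule can be illustrated with a short sketch. The column names (student, week, and one column per activity type) are illustrative assumptions, not the exact computation code used in Study 1.

    # Sketch of the weekly engagement sub-indicator: a student counts as
    # engaged in a week when their activity exceeds the peer mean for that
    # week by more than one standard deviation (z > 1).
    import pandas as pd

    def engagement_subindicator(logs: pd.DataFrame, activity: str) -> pd.Series:
        """logs: one row per (student, week); returns engaged weeks per student."""
        z = logs.groupby("week")[activity].transform(
            lambda x: (x - x.mean()) / x.std(ddof=0)
        )
        engaged = (z > 1).astype(int)
        return engaged.groupby(logs["student"]).sum()

    # Computed separately for course views, forum posts, online time, and
    # forum reads, then summed over the six course weeks per student.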

10.2.3 Data analysis, interpretation, and reporting

During this stage, I performed summarization of the data through visualization and descriptive statistics to detect patterns and the distribution of variables and to attain a general overview of the data set. This was followed by exploratory statistical analysis, a step towards the identification of outliers and significant predictors, and a way to test the possible analytical methods (Velleman & Hoaglin, 2012). According to the insights from the visualization and the descriptive and exploratory statistical analysis of the data, the appropriate statistical models were identified.


10.2.3.1 Statistical analysis

In order to answer the question of which variables correlate with better performance, a correlation test was performed in Studies 1, 2 and 4. The correlation test was computed using permutation methods in cases where the data set constituted social network data, to avoid the problem of interdependence. Interdependence is a concern because SNA data are relational by design, and each connection may be dependent on other connections. In such situations, permutation methods may be considered robust and credible for inference on SNA data (Borgatti, Everett, & Johnson, 2018; Hothorn, Hornik, Van De Wiel, & Zeileis, 2008). In permutation testing, the software generates a large number of resampled random distributions (10,000 in this thesis), which are then compared to the observed data to calculate the significance.

In order to predict students' performance, I used automatic linear regression (ALM) in Study 1 as well as in Study 2, where the data belonged to a single course (Saqr et al., 2017; Saqr, Fors, & Tedre, 2018). ALM is a machine-learning algorithm that has the capacity to normalize outliers, include all predictors in the analysis, and identify and remove collinear predictors. Such features are lacking in simple linear regression models. Furthermore, the model offers ensemble capabilities; hence I used the ensemble method bootstrap aggregation (bagging). Bagging is a machine-learning method that enhances the accuracy and stability of the model, reducing variance and overfitting. Bagging is performed by generating random subsamples of the data (100 subsamples in this thesis) and using these subsamples for training. The resulting model is an average of all iterations of the model (Filippou, Cheong, & Cheong, 2015; Yang, 2013). ALM was chosen because it is an automatic method, which demonstrates its feasibility for a learning analytics dashboard that offers insights without human intervention (Filippou et al., 2015; Yang, 2013).
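To make the permutation procedure described above concrete, the sketch below compares an observed correlation coefficient against a null distribution built from 10,000 random permutations. The variable names are illustrative; the studies used dedicated SNA software rather than this exact code.

    # Sketch of a permutation test for a correlation on relational data:
    # shuffling one variable breaks any real association, so the observed
    # coefficient is compared against the permuted (null) coefficients.
    import numpy as np

    def permutation_correlation(x, y, n_perm=10_000, seed=0):
        rng = np.random.default_rng(seed)
        observed = np.corrcoef(x, y)[0, 1]
        null = np.array([np.corrcoef(rng.permutation(x), y)[0, 1]
                         for _ in range(n_perm)])
        p_value = np.mean(np.abs(null) >= abs(observed))  # two-tailed
        return observed, p_value

    # e.g. r, p = permutation_correlation(indegree_centrality, final_grades)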

In order to classify achievement levels, I used logistic regression (Saqr et al., 2017; Saqr, Fors, & Nouri, 2018; Saqr, Fors, & Tedre, 2018). Logistic regression is a powerful algorithm for the classification of binary outcomes. Logistic regression does not require linearity, normality, or equal variance as assumptions. To evaluate model performance, I used the -2 log likelihood, Cox & Snell R-squared, the omnibus tests of model coefficients, and the Hosmer and Lemeshow goodness-of-fit test (Bewick, Cheek, & Ball, 2005; Brooks & Thompson, 2017; Gardner & Brooks, 2018; Peng, Lee, & Ingersoll, 2002).

I subsequently used machine learning methods with the k-nearest neighbors (k-NN) and naive Bayes algorithms (Saqr, Nouri, & Fors, 2018). k-NN was selected because it classifies students based on similarities; hence, students who share specific activity features become easier to group together. The naive Bayes algorithm was also used to evaluate the same data, as it proved in our case to perform more effectively when classifying underachievers (Brooks & Thompson, 2017; Kotsiantis, Zaharakis, & Pintelas, 2007; Zhang, 2016). Both algorithms were run with a ten-fold cross-validation method. The ten-fold cross-validation was performed by rotatory partitioning of the data set into ten random, equal subsamples. Model training was undertaken ten times; in each iteration, nine subsamples were used for training and the remaining subsample for validation. Thus, each subsample was used for both training and validation. The method had the advantage of producing more generalizable models with less overfitting. To judge model performance, I used accuracy, precision, F-measure, and the area under the curve (AUC) (Saqr, Fors, & Nouri, 2018; Saqr, Nouri, & Fors, 2018).

Other statistical tests were also used. The Wilcoxon signed-ranks test was used to compare the difference in students' interactivity parameters before and after the intervention. The Shapiro-Wilk test of normality showed that most variables in the study followed a non-normal distribution (Saqr, Fors, Tedre, et al., 2018).

Model stability, reliability, and validity were stressed in all of the studies. In Studies 1 and 2, an ensemble technique was used to create 100 subsamples and to calculate the average prediction results (Saqr et al., 2017; Saqr, Fors, & Tedre, 2018). In Study 4, the following year's data set was used to validate the results and demonstrate the possibility of forecasting future performance (Saqr, Fors, & Nouri, 2018). In Study 5, a ten-fold cross-validation technique was used to improve the generalizability of the results. Moreover, the data from each course were used to validate the predictions for the subsequent courses (Saqr, Nouri, & Fors, 2018).
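A minimal sketch of this classification set-up follows, using ten-fold cross-validation with k-NN and naive Bayes in scikit-learn. The synthetic data stand in for the real indicator matrix and achievement labels, which cannot be reproduced here.

    # Sketch: k-NN and naive Bayes evaluated with ten-fold cross-validation.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB

    # Synthetic stand-in for the indicator matrix X and achievement labels y.
    X, y = make_classification(n_samples=200, n_features=8, random_state=0)

    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    for name, clf in [("k-NN", KNeighborsClassifier()),
                      ("naive Bayes", GaussianNB())]:
        auc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
        print(f"{name}: mean AUC = {auc.mean():.2f}")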

10.2.3.2 Social network analysis

A combination of visual and mathematical analysis was used. Gephi 0.9 was used for the visual network analysis and for the calculation of centrality measures (the mathematical analysis) (Bastian, Heymann, & Jacomy, 2009). SNA visualization was used to summarize the interactions and relationships and to visualize the social structure. Visualization was also used to guide the mathematical analysis and the identification of collaborators' roles. Visualization was undertaken using a force-directed algorithm, an aesthetically pleasing and simple-to-interpret approach that renders the nodes according to their structural properties. Furthermore, it can be used to create a continuous animated timeline of interaction events (a dynamic network). An animated compilation of all interactions was created in the form of a time-lapse video to demonstrate the timeline of events and better represent the changes in behavior across the time dimension (Bastian et al., 2009; Jacomy, Venturini, Heymann, & Bastian, 2014).
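The studies used Gephi for the visual analysis; as a scriptable alternative, the sketch below produces a comparable force-directed rendering with the Fruchterman-Reingold (spring) layout in networkx. The toy edge list is a stand-in for the extracted reply pairs.

    # Sketch: force-directed rendering of a (toy) interaction network,
    # with node size scaled by degree so central collaborators stand out.
    import networkx as nx
    import matplotlib.pyplot as plt

    edges = [("s1", "s2"), ("s2", "t1"), ("s3", "t1"),
             ("t1", "s1"), ("s2", "s3")]          # toy reply pairs
    G = nx.DiGraph(edges)

    pos = nx.spring_layout(G, seed=42)            # force-directed layout
    sizes = [300 + 150 * G.degree(n) for n in G]
    nx.draw(G, pos, node_size=sizes, arrows=True, with_labels=True)
    plt.show()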

Mathematical analysis

Two levels of mathematical analysis were performed: at the individual level and at the group level. At the individual level, a model was developed to capture the interactions, relationships, and attributes of the collaborators. The model can be summarized as follows:

Individual collaborators’ level:

The quantity of interactions:
• Outdegree centrality represents the quantity of outgoing interactions;
• Indegree centrality represents the total count of received interactions;
• Degree centrality is the total of both the outdegree and indegree centrality measures.

Role in relay and transfer of information:
• Betweenness centrality represents the number of times a collaborator connected others or mediated their interactions;
• Closeness centrality represents how close a collaborator is to others for communication;
• Information centrality measures the role of a collaborator in the information flow within his or her group.

Connectedness:
• Eigenvector centrality measures the connectedness of an individual and the strength of his or her network connectedness;
• Indegree prestige is the number of directly connected collaborators and an estimate of the ego network;
• Proximity prestige is the number of directly or indirectly connected collaborators;
• Rank prestige represents the strength of the network;
• Domain prestige measures the popularity of the collaborator.

Role in collaboration: Based on the visual and centrality measures analysis, roles were extrapolated for students and teachers, guided by the work of Marcos-García, Martínez-Monés, and Dimitriadis (2015). The following roles were identified:
• Leader: an active collaborator who leads the discussion;
• Coordinator: an active collaborator who coordinates interactions;
• Active collaborator: an active collaborator;
• Peripheral: an isolated participant.

The group level:

The parameters of group size, average degree, average indegree, average outdegree, density, and clustering coefficients were calculated. These parameters capture the different interactivity constructs in a network as well as the community's cohesiveness. A detailed discussion of the centrality measures and social network analysis was presented in Chapter 6, Section 6.3.4. A sketch of how these measures can be computed is shown below.
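The following sketch computes the individual- and group-level measures named above with networkx, on the same kind of toy graph. The prestige measures and information centrality are omitted here, as they have no direct equivalents for directed graphs in networkx; the studies relied on dedicated SNA software for those.

    # Sketch: individual- and group-level SNA measures on a toy graph.
    import networkx as nx

    G = nx.DiGraph([("s1", "s2"), ("s2", "t1"), ("s3", "t1"),
                    ("t1", "s1"), ("s2", "s3")])

    # Quantity of interactions
    outdegree = nx.out_degree_centrality(G)
    indegree = nx.in_degree_centrality(G)
    degree = nx.degree_centrality(G)

    # Role in relay and transfer of information
    betweenness = nx.betweenness_centrality(G)
    closeness = nx.closeness_centrality(G)

    # Connectedness
    eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

    # Group-level parameters
    density = nx.density(G)
    clustering = nx.average_clustering(G.to_undirected())
    avg_degree = sum(d for _, d in G.degree()) / G.number_of_nodes()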

10.3 Research ethics

Navigating ethical approval options was not a straightforward endeavor. In my institution, Qassim University in Saudi Arabia, no precedent existed to guide me to the available options for using students' online data for research. Given that working with such kinds of data was new, I tried to tackle the problem by, first, creating an online privacy and data protection policy; second, developing an informed consent covering general data usage purposes such as daily trends of system usage; and third, obtaining ethical approval from the governing body.

10.3.1 Data protection and privacy policy

My first step was to create a data protection and privacy policy that governs the ethical use of data and protects users' rights. The data protection policy was inspired by policies from leading institutions, namely the Open University, United Kingdom (Open University, 2014), the Learning Analytics Community Exchange (LACE) (Drachsler & Greller, 2016), and the Centre for Educational Technology and Interoperability Standards (CETIS) (Sclater & Bailey, 2015). The policy contains ten guarantees for the students (use of the data for students' benefit, transparency, compliance with the law, no harm, anonymity, personal data protection, the right to correct data, the right to withdraw, control over data, and consent for any usage beyond the terms). The policy was distributed to all students and teachers in the institution via email and published on the website, and students had to acknowledge it on their first login. The policy was later revised, updated, and ratified among the college policies by the Policies unit.

Throughout this research, the data protection policy was followed strictly. All data were anonymized, personally identifying information was masked, and no private data or personal information were stored, used in the analysis, or published. It is also important to mention that use of the e-learning portal is neither graded nor mandatory, and depends solely on a student's interest. Moreover, the researcher was not a teacher and did not participate in the grading of any of the courses used in this research.

10.3.2 Consent

The second step was to develop an online consent form that users of the LMS could sign. The consent form granted limited rights to use the anonymized data within the framework of the "Data protection and privacy policy" for research purposes. The consent form was also used to communicate information and privacy rights to the students, and it detailed the cases requiring the student's explicit approval or approval from a governing ethical body. The consent form was built in compliance with the regulations of the National Committee of Bioethics (NCBE) on research on living creatures and with good research practice (Hermerén, 2011), and it was guided by international standards (Tsai & Gasevic, 2017a). The consent form also offered students the option to withdraw with no consequences or denial of service, and it emphasized their rights to privacy as well as the possible uses of the data for research (Hermerén, 2011; Qassim College of Medicine, 2015).

10.3.3 Ethical approval

Given that the research contained data from multiple sources, including students' grades and logged records from the LMS, I applied for and obtained ethical approval from the Qassim Regional Research Ethics Committee. The proposal was first reviewed and approved by the university research center. It was then reviewed by the Qassim Regional Research Ethics Committee, which examined the study proposal, data collection methods, reporting methods, data handling policies, and the outline of the five studies. Having reviewed the documents, the committee issued its approval and stressed that the privacy of students should be protected. A copy of the ethical approval is shown in Figure 9.

Figure 9: A copy of the ethical approval.


11. Overview of the results

This thesis is composed of five interrelated studies that build on insights from one another. The first study attempted to use engagement and self-regulation as proxy indicators for learning in the early identification of underachievers (Saqr et al., 2017). Through the first study, I identified the importance of the collaborative and networked aspects of learning. The second study then investigated the use of collaborative learning indicators that are contextually aligned with the process of collaborative learning (Saqr, Fors, & Tedre, 2018). The study objectives comprised informing teachers and improving the prediction of students' performance. The third study built on the second, using the insights gained from the monitoring of collaborative learning to create an informed intervention and evaluate its efficacy (Saqr, Fors, Tedre, et al., 2018). The fourth study sought to build on the insights gained from the previous three studies and create a more reliable predictive model of achievement in the online PBL environment (Saqr, Fors, & Nouri, 2018). The fifth study tried to account for the dimension of time and temporality in a collaborative learning environment (Saqr, Nouri, & Fors, 2018). The five studies thus help answer the two research questions. The results will be discussed in detail in the following sections.

RQ1: How can learning analytics and social network analysis reliably predict students' performance in collaborative learning environments using contextual, theory-guided indicators?

The first study was conducted with the aim of investigating the online activity variables that best correlate with student performance in a PBL-based course, as well as to identify the indicators that can support the early prediction of underachievers (Saqr et al., 2017). The results indicated a weak correlation between the generic indicators (count of page hits, resource views, number of forum postings, and duration of being online) and student performance. Nonetheless, the parameters of engagement and self-regulation demonstrated a stronger and more consistent correlation with performance. The most significant variables were those reflecting task orientation, self-monitoring, evaluation, and consistency in using the e-learning portal.

The results of automatic linear modeling demonstrated that the variables that reflect self-regulation (task orientation, engagement sub-indicators, and self-evaluation) and forum postings (interactions) represented the most significant predictors of performance. The adjusted R-squared of the model was 0.64, an indication that the model could explain 64% of the variance in the grades. A binary logistic algorithm was then used to investigate the possibility of classifying underachievers using the significant parameters identified in the previous step. I was able to detect 21 out of 26 underachievers, with a sensitivity of 80.8% (CI 60.7% to 93.5%) and a specificity of 96.3% (CI 93.4% to 99.8%); the AUC value was 0.9 and the F-score was 0.96. Using the data on early participation up to the midterm, the prediction accuracy was 85.7%, the sensitivity of detecting high achievers was 96.2%, and the specificity was 42.3%. The AUC was 0.69 and the F-score was 0.91.

The results of Study 1 indicated that engagement and self-regulation indicators were reliable predictors of performance and could be used to predict a substantial number of potential underachievers at an early stage. Given that engagement and self-regulation are modifiable, they are amenable to positive intervention and may be used by educators and stakeholders for data-driven action.

While Study 1 included CSCL forum postings, reads, and views, and investigated how students self-regulated their learning activities, Study 2 examined the relational, structural, and interactive side of collaborative learning (Saqr, Fors, & Tedre, 2018), specifically how participation, interactions, and network parameters correlated with academic performance. The operationalization of interactivity was achieved using a framework capturing the quantity, role, information transfer, and connectedness, along with the group's structural and interactive properties.

Two statistical methods were used, correlation coefficients and automatic linear regression, in order to investigate how social network parameters could predict the final grades or account for grade variance. The results showed that the parameters of the quantity of interactions, akin to the generic indicators of counts of activities, were weakly correlated with better performance. However, indegree centrality showed the highest correlation with performance, indicating that the quality of contributions that stimulated replies from peers was most significant. The results also showed that the SNA variables indicative of the role in information transfer and connectedness were moderately correlated with performance. Using a classification binary logistic algorithm, I was able to successfully classify 85.7% of the students according to their performance (91.7% of high achievers, 72.7% of underachievers). These results demonstrate the value of using social network analysis indicators as predictors of performance.

Having obtained reasonable predictive accuracy using SNA indicators for performance, the next step was to investigate the possibility of creating a model that is reproducible from one course to another, via a theory-guided approach. Although a general model may not be feasible in all contexts, a context-specific model might be achievable in some. Online problem-based learning may represent a good candidate for such a trial because online PBL is a structured approach to collaborative learning and is relatively uniform. The teaching model of PBL has three essential factors: the students, the group, and the tutor (Neville, 2009; Schmidt et al., 2011). As such, the fourth study was conceptualized to study the operationalization of the interactions and collaboration among the three factors (Saqr, Fors, & Nouri, 2018). The study accounted for the roles of students, tutors, and their groups using SNA in four courses spanning a full year of education.


The results indicated a positive correlation between interactivity parameters and performance in all the courses studied, regardless of the course subject. Students who were strongly connected to significant collaborators in small interactive groups performed best. The results of the correlation test were confirmed by the regression analysis. The regression model was able to classify students' performance with an accuracy of 93.3% and an F-measure of 90%. To validate the results and to investigate whether learning analytics data can be used to forecast future students' performance, the following year's data set was used. The results revealed that students' future performance could be predicted with an accuracy of 83.1% and an F-measure of 87.6% when all of the students' data were used. The average accuracy improved to 90.9% when each course was used to predict the corresponding next year's course iteration.

The results of the fourth study offered evidence that learning analytics can use a theory-guided approach to create reliable future predictions. Furthermore, the results provided proof of the power of SNA in operationalizing the activities of students in online collaborative learning environments. Such results present an opportunity to create support strategies that can contribute to better learning and teaching in collaborative learning environments.

Having studied the interactivity constructs in the previous studies, I looked into the time factor in Study 5 (Saqr, Nouri, & Fors, 2018). I hypothesized that temporality might offer a clue regarding students' self-regulation of their collaborative learning and might help improve the replicability and reproducibility of the predictive models. The study investigated the temporal patterns of interactions in collaborative problem-based learning. Four patterns of temporality were investigated (day, week, course, and year level) over a full academic year, in which there were 2,783 interactions by 185 students in four courses.

In the first part of the study, I used visualization to summarize the interactions in a representative graph for each time period so that any patterns could be observed. Plotting the early weekly interactions demonstrated that high achievers were more likely to participate early and consistently. The same findings were also evident at the course level: early course interactions indicated a pattern of high achievers, although this was statistically insignificant. At the year level, high achievers tended to be active during the first course of the academic year. This pattern was reversed by the end of the year, close to the exams, when low achievers participated more often.

Using the temporality variables, I was able to predict high and low achievers with reasonable accuracy. I used data from the first course to predict the second course; the first and second courses to predict the third course; and data from the first three courses to predict achievement in the fourth course. The results showed that temporality data could provide a reasonable predictor of performance, where the class recall for identifying low achievers ranged from 91.7% to 92.9%. The findings of the predictive algorithm confirm an important point: timing can be used with a good level of accuracy to predict students' achievement. Furthermore, given that early participation is the predictor, an early alert indicator may be derived to stimulate timely intervention.

RQ2: How can social network analysis be used to analyze online collaborative learning, guide a data-driven intervention, and evaluate it?

In order to investigate the role of social network analysis in monitoring the efficiency of collaborative learning, visualization was used as a tool to inform students and teachers at the single-discussion level and at the whole-course level in Study 2 (Saqr, Fors, & Tedre, 2018). The use of social network analysis to map the course interactions enabled the analysis of a large number of interactions among students in a single visualization. The course's network visualization demonstrated the role of the teacher, who dominated the discussions. The visualization also enabled the identification of engaged and disengaged students. Such insights offered an obvious opportunity to use the monitoring of collaborative learning to guide teachers about the status of their students. The study also highlighted the value of visualizing individual discussions as the units that make up the full course network, and as possible targets for intervention where they become dysfunctional or non-participatory. Furthermore, visualizing the information-giving and information-receiving networks demonstrated how information exchange could be tracked in a course, as well as how these visualizations can shed light on the key actors in information exchange.

Visualization was augmented by mathematical network analysis, or centrality measures, in four aspects: the quantity of interactivity, the role and position in information transfer, connectedness, and group interactivity. Mathematical network analysis offered a precise view of the course network, each user's position, level of connectedness, and interactions. Such insights are not plausible using traditional interaction analysis methods, which only count the hits or replies while ignoring the importance of structure, relationships, and interactions.

Given that the previous study demonstrated that SNA could effectively enable the monitoring of collaborative interactions, the next logical step was to apply the findings. Thus, Study 3 investigated the possibility of using social network analysis to monitor the efficiency of online collaboration among students and to identify the gaps and shortcomings that may need to be addressed. The second aim of the study was to use the insights from the monitoring to create insightful actions that would help improve and evaluate collaborative learning (Saqr, Fors, Tedre, et al., 2018). The studied courses make use of online clinical case discussions to motivate students to engage in collaborative discussion regarding patient management, clinical reasoning, and the co-construction of knowledge.

The study was divided into three stages. The aim of Stage 1 was to implement a monitoring mechanism, whereby I monitored the three courses using SNA visual and mathematical analysis. Monitoring included the centrality measures relevant to collaborative knowledge construction, namely the level of activity, the role and position in information exchange, and the role played in the group. The group was also monitored for interactivity and cohesion. In Stage 2, the data were analyzed to identify gaps and areas in need of improvement, before designing an applicable, data-driven intervention. In Stage 3, I evaluated the efficacy of the intervention by comparing pre- to post-intervention measures and assessing the value of the procedures.

Monitoring revealed a non-participatory pattern of interactions, where the teacher dominated the interactions and few student-to-student interactions were seen. It also revealed limited information exchange or negotiation on the part of the students. This information was used to design an intervention to address the shortcomings identified in the monitoring stage. The intervention included raising awareness, orientation about the goals of collaboration, and the training of teachers and students. The clinical case discussions were improved using a collaborative script to stimulate argumentation and the co-construction of knowledge. The evaluation of the intervention revealed a significant improvement in collaborative interactions among students, a greater number of active students participating in discussions, and increased group cohesion. As such, this study offered evidence of an impactful intervention using learning analytics methods.
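The pre- to post-intervention comparison relied on the Wilcoxon signed-ranks test described in the methods chapter, as most variables were non-normally distributed. A minimal sketch follows, assuming paired per-student interaction counts before and after the intervention; the numbers are toy values, not study data.

    # Sketch: paired, non-parametric pre/post comparison of interactivity.
    from scipy.stats import wilcoxon

    before = [2, 0, 5, 1, 3, 4, 0, 2, 1, 3]   # toy per-student counts
    after = [4, 2, 6, 3, 5, 4, 1, 5, 2, 6]

    stat, p_value = wilcoxon(before, after)
    print(f"Wilcoxon W = {stat}, p = {p_value:.4f}")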


12. Discussion

The discussion chapter begins with a general introduction, followed by a discussion of the approach and findings for each research question. The significance of the findings for each research question is then examined. I subsequently discuss why SNA was chosen over other methods. The final section discusses the difficulties and the limitations of the thesis.

Although learning analytics for the early identification of underachievers can offer an obvious opportunity for educators to implement supportive remedial strategies, the process has to provide insights into the corrective strategy that should be implemented. Moreover, learning analytics research has to contribute to the understanding of learning and of how the learning process unfolds. The prime focus of this research was to use learning analytics to understand the collaborative learning process through pedagogically and contextually aligned indicators of learning. It also sought to produce insights to help guide an informed, data-driven intervention with real impact for educators and students alike. I additionally sought to include the different dimensions in which learning occurs and to evaluate the ways in which they affect learning and learners.

RQ1: How can learning analytics and social network analysis reliably predict students' performance in collaborative learning environments using contextual, theory-guided indicators?

The first study built upon previous research findings (Saqr et al., 2017) and sought to extend data collection by including indicators regarding access to resources, course views, time on task, online interactions, social network parameters, and formative assessments across different points in time (Gašević et al., 2016; Macfadyen & Dawson, 2010; Tempelaar et al., 2015; Wolff et al., 2014). In order to improve inference, I calculated the engagement sub-indicators (computed indicators that reflect sustained engagement with the online course materials and normalize extreme clicking behavior) as well as access to resources that reflect students' self-regulation (course objectives and orientation materials, engagement with the tasks, formative assessments, and self-evaluation). As such, I used indicators that are more pedagogically aligned with learning and avoided the one-shot approach that merely accounts for the final counts of clicks and views of course pages and resources. This approach highlighted the importance of time, sustained effort, and regularity. Building the inference of students' learning on the proxy indicators of engagement and self-regulation has the advantage of being based on a valid pedagogical model, which teachers can use to scaffold, support, and proactively apply a meaningful intervention (Cicchinelli et al., 2018; Reschly & Christenson, 2012).

The subsequent studies sought to increase the resolution of data collection and account for the interactions, relations, connectedness, and roles of collaborators in the collaborative learning environment. In Study 2, the results from the exploratory statistical analysis with correlation tests, as well as the predictive models across different points in time, demonstrated that students' indegree, role in information exchange, and strength of connectedness were significant indicators of performance. However, indicators of the quantity of interactions were not correlated with performance. Such findings demonstrate the importance of the quality rather than the quantity of students' contributions and underscore the advantage of SNA over traditional interaction analysis, which offers only a simple quantification of students' number of postings (Saqr, Fors, & Tedre, 2018).

In the next study, I used a theory-driven model that accounted for the different factors influencing the learning process, in order to produce a model that may have the advantage of generalizability and that could augment understanding of the learning environment. In order to achieve this goal, the problem-based learning context was chosen because it is a fairly uniform teaching model (Neville, 2009; Schmidt et al., 2011). Study 4 offered evidence that models can be generalizable when guided by theory and context (Saqr, Fors, & Nouri, 2018). In contrast to studies that face the replicability problem of using SNA in collaborative settings (Dowell et al., 2015; Hernández-García et al., 2015; Joksimović et al., 2016; Jiang et al., 2014), the results of this thesis were reasonably replicated from year to year and from course to course. The reason might be the carefully chosen variables, the operationalization of the context, and the similarity of the teaching methods in the four courses.

Due to the potential significance of the time construct, and seeking to further explore the dimension of temporality in collaborative learning, I investigated the role of time as a factor that may explain students' self-regulation (Saqr, Nouri, & Fors, 2018). The results showed that time plays a pivotal role in students' self-regulation of their learning and can be used as a valid metric to differentiate students according to achievement.
A predictive model based on temporality indicators was reliable from one course to the next, which is an important step towards more generalizable and reproducible models.

The online learning environment is rather demanding and requires autonomy and support. This renders the promotion of self-regulation skills of utmost importance to performance (Barnard et al., 2009; Sonnenberg & Bannert, 2015). Research has shown that strategies that target the disengagement of learners can efficiently improve learning and students' achievement (Ballard & Butler, 2016; Cruz-Benito, Therón, García-Peñalvo, & Lucas, 2015; Tempelaar et al., 2015). Accordingly, interventions that target the promotion of self-regulation can have a significant and positive impact on students' academic achievement, as reported for strategies implemented in classrooms, online, and in the workplace (de Boer, Donker-Bergstra, Kostons, Korpershoek, & van der Werf, 2013; Kistner et al., 2010; Sitzmann & Ely, 2011; Zumbrunn, Tadlock, & Roberts, 2011). Therefore, it is reasonable to conclude that using learning analytics to enhance self-regulation and to support students is timely and in the best interests of the students and the teachers.

A positive association between performance and social network metrics has been described by different researchers. For instance, degree and closeness centrality have been reported by Cho et al. (2007) and Gašević et al. (2013). Similarly, degree centrality was reported by Hommes et al. (2012), weighted degree by Joksimović et al. (2016), and out-degree and clustering coefficient by Reychav, Raban, and McHaney (2017). Nevertheless, there is a lack of conclusive evidence regarding which social network constructs correlate best with performance (Dowell et al., 2015; Hernández-García et al., 2015; Joksimović et al., 2016). In my thesis, I have attempted to create a broader model that captures the different interactivity constructs, in contrast to previous research that has used different centrality measures that only partially account for students' interactions (Cho et al., 2007; Joksimović et al., 2015; Romero & Ventura, 2013). The value of the current model is that it combines the significant constructs of interactions: the quantity of interactions, information transfer, connectedness, and social capital at the individual level, as well as group interactivity and cohesion variables. As such, it lays the groundwork for a structured approach to using social network analysis for a greater understanding of collaboration and the factors behind improved performance. Indeed, it helped create a predictive model that has been replicated from year to year.

RQ2: How can social network analysis be used to analyze online collaborative learning, guide a data-driven intervention, and evaluate it?

Monitoring the interactions using visualization offered an easy-to-interpret general view of the status of collaboration in a course, and mapped the roles played by teachers and students. In Study 2, SNA visualizations were used as a tool to inform teachers, while the mathematical analysis was used to offer a precise estimation of the properties of the actors and their networks (Saqr, Fors, & Tedre, 2018). Given that the second study offered insights regarding course monitoring, it was important to investigate how SNA can be used to monitor online collaboration, and how the monitoring data can be used to design, apply, and evaluate a data-driven intervention using social network visual and quantitative analysis (Saqr, Fors, & Nouri, 2018). The monitoring relied on an improved version of the framework developed in the previous study, which included the three constructs for each learner (level of interactivity, information exchange, and role in collaboration) as well as the group parameters. The monitoring of the three courses during the first term of the study revealed a seemingly active (in terms of quantity of interactions) yet non-interactive and non-participatory learning environment. Moreover, few students played an effective role in information exchange, whereas teachers assumed a central, dominating role. The intervention was designed to address the gaps identified in the monitoring stage of the study. The evaluation revealed a significant increase in student-student and teacher-student interactions, improved group cohesion, and a participatory pattern of interactions among course participants.

A considerable volume of research exists in the field of collaborative learning using social network analysis. Most studies have addressed the potential to study patterns of interactions in online learning, knowledge construction, and the identification of prominent actors in collaborative groups (Cela et al., 2014; Dado & Bodemer, 2017). However, my study is probably one of the first empirical reports regarding the potential of SNA-based interventions in a real-life application (Cela et al., 2014; Dado & Bodemer, 2017). The findings of my study underscore the role of SNA-based indicators as valid and reliable metrics that can guide a real impact in collaborative contexts.

This research project can be considered a rather complete realization of the learning analytics cycle. A learning analytics cycle is made up of five interrelated and consecutive stages, as highlighted in Figure 1 and detailed in a separate chapter. The stages of the cycle comprise data capture, preprocessing, analysis, action, and feedback (Clow, 2012b). Data capture, preprocessing, and analysis were performed in all of the studies. The analysis of the first study's data led to the improvement and refinement of Study 5 (Saqr et al., 2017; Saqr, Fors, & Tedre, 2018). The insights from the monitoring in the first study were further used as feedback to create an intervention in Study 3 and to conceptualize Study 4 (Saqr, Fors, & Nouri, 2018; Saqr, Fors, & Tedre, 2018; Saqr, Nouri, & Fors, 2018). As such, this thesis is a complete realization of the whole learning analytics research process.

The breadth, speed, and feasibility of SNA, as well as the rich information it offers, render it more practical than traditional, time- and labor-intensive content analysis methods that necessitate lengthy manual coding and subjective analysis (Mayring, 2014; Stemler, 2015). Given that the study included more than 18,000 interactions from around 950 students and educators, manual content coding and analysis of the interactions would have been very lengthy and demanding, and also laden with issues such as problematic validity, reliability, trustworthiness, and credibility. Furthermore, it would contradict the spirit of this study, which attempted to present an automated, cost-effective, and easy-to-implement method that can help support learners and educators without placing additional burdens on their time or energy. While qualitative content analysis methods are useful as research tools and bring rigor and richness to the analysis of discourse, they remain far from practical to implement in the classroom or for the real-time monitoring of interactions. The difficulty of applying manual content analysis methods to the overwhelming abundance of interaction data is of particular concern and stimulates the search for automated content analysis methods such as EDM (Mayring, 2014; Stemler, 2015).

Saqr, AlGhasham, et al., 2013; Saqr, Kamal, et al., 2014). The most troubling issue I found was the scarcity of tools designed for edu- cational use, especially in problem-based settings. Given that reli- able “student modeling commands the focus of more than half the approaches”, new tools that are specific to education must be de- veloped and validated before being used reliably (Peña-Ayala, 2014, pp. 1457). Although LA has enabled the early prediction of students’ perfor- mance as well as the introduction of remedial strategies, a wide array of challenges and issues behind the successful application of the methods remain. A previous section was dedicated to discuss- ing the general challenges. However, it is critical to discuss the challenges identified through this research. I will discuss these in tandem with the learning analytics process framework. Data capture: although it might be expected that computer-rec- orded data are structured, this was rarely the case. Data recorded by learning management systems demonstrated errors such as cor- rupted records, incorrect information, missing fields and erroneous encodings of information, as well as an absence of important fields due to software or configuration errors. Therefore, the data were carefully inspected and verified, and the corrupted records or erro- neous information were excluded. Furthermore, the process of data extraction was arduous because the tools required for the extraction of raw data were not always readily available. The available tools also necessitated strenuous efforts to keep them current and com- patible with the learning environments. For instance, when the the- sis project was conceptualized, the intention was to use a group of plugins called “engagement analytics” to extract the Moodle data, but by the time the first study was performed, the Moodle logging system had been completely overhauled and the “engagement ana- lytics” plugins were no longer maintained. Similarly, SNA data extraction was planned to be performed using a software application called Meerkat-ed, but this became outdated one year before the studies were conducted. Another issue that was discussed in the ethics chapter comprised the ethical and privacy issue of collecting, storing and handling students’ data. At the time when the thesis project was conceptualized, no policy existed to 98 handle students’ LMS data for research purposes, necessitating the creation of a new policy. Data preprocessing: given the complexity and time required to extract information, the extracted data were not easy to compile, label, and verify. Moreover, as different methods were used to extract the data, the resulting extracts were in different formats and structure. Compiling, as well as verifying and labeling the data, re- quired a serious effort. The quality assurance of data integrity at this stage is of paramount importance; otherwise the results may be flawed. Analysis and interpretation: students’ data were plentiful and perhaps even overwhelming. A large amount of the data recorded by the learning management systems were not relevant to students’ learning and added unnecessary noise to the analysis. For instance, the LMS records every click a student makes, such as clicks on a profile page, editing online personal data or looking at other par- ticipants’ information. Moreover, the length of time a student was logged in to the online system is also recorded, regardless of the fact that the student may be idle or multi-tasking. 
Thus, simple counts of clicks or views, or the total duration of being online, are problematic metrics. Therefore, the relevant information to include in the analysis had to be identified thoughtfully. This may, however, be easier said than done, because no clear guide exists, and researchers have to use multiple exploratory and experimental methods before arriving at a useful answer. As data, methods and software evolve, this challenge will continue to be of serious concern. Another serious problem concerns the opacity of machine learning algorithms and artificial intelligence in general. In most of the studies where machine learning algorithms were used, they proved effective in predicting the future performance of students; however, they offered no indication of the predictors. As such, I used two predictive algorithms to identify the significant predictors (a minimal sketch of this kind of pipeline is given at the end of this section).

Insightful action: Using student data to stimulate action constitutes the basis of learning analytics. However, action can involve numerous parties, including students, teachers, administrators, and other stakeholders, and it requires a considerable effort to change attitudes, practices, and policies. While this thesis project attempted to bring tangible changes to students' learning, the change was not easy because it is required at different levels, such as how students approach their learning, how teachers deal with students, and how curricula are organized. Data insights may arrive quickly, but the changes they call for cannot always be implemented as responsively as desired. Furthermore, the availability of data regarding students does not always translate into institutional change; in our case, we encountered various degrees of resistance at every level during the process of implementing learning analytics.
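To make the challenges above concrete, the following minimal sketch in Python illustrates the kind of pipeline this chapter describes: filtering Moodle log events down to learning-relevant actions, deriving a few simple theory-based indicators, and using two predictive algorithms to surface the significant predictors. It is an illustrative sketch rather than the code used in the studies; the export files, column names, indicator choices, and scikit-learn models are assumptions, and the event names are merely examples of the noise that was filtered out.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Hypothetical export of Moodle's standard log store (one row per event).
logs = pd.read_csv("moodle_log_export.csv")

# Drop clicks that say little about learning (profile views and the like),
# as well as corrupted records with missing user or timestamp fields.
noise = {"\\core\\event\\user_profile_viewed", "\\core\\event\\dashboard_viewed"}
logs = logs[~logs["eventname"].isin(noise)].dropna(subset=["userid", "timecreated"])

# Simple per-student indicators: forum activity as a collaboration proxy,
# and distinct active days as a regularity (self-regulation) proxy.
logs["date"] = pd.to_datetime(logs["timecreated"], unit="s").dt.date
features = logs.groupby("userid").agg(
    forum_events=("component", lambda c: (c == "mod_forum").sum()),
    active_days=("date", "nunique"),
    total_events=("eventname", "size"),
)

# Hypothetical outcome file: userid plus a binary underachiever label.
outcomes = pd.read_csv("outcomes.csv").set_index("userid")
data = features.join(outcomes, how="inner")
X, y = data.drop(columns="underachiever"), data["underachiever"]

# Two algorithms give two complementary views of which predictors matter:
# signed logistic regression coefficients and random forest importances.
logit = LogisticRegression(max_iter=1000).fit(X, y)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(dict(zip(X.columns, logit.coef_[0])))
print(dict(zip(X.columns, forest.feature_importances_)))

In practice, each indicator would of course be grounded in theory and explored before being included, as described in the methods chapters.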

12.1 Limitations

Although the visual and mathematical analysis of collaborative learning provides rich information covering a significant side of collaboration and interaction, it is far from perfect. Considerable improvement is required to help streamline students' online interactions by building better forums that record messages and replies with greater accuracy. Existing forums are confusing: for instance, a reply might be recorded against the thread initiator even if it was directed to another collaborator. In my experience, this issue requires manual correction and adjustment; a simple heuristic of the kind involved is sketched at the end of this section. Moreover, the current mathematical analysis of social networks builds upon interaction models created outside the educational field. Educational researchers may need to review these concepts, and to improve and create novel ways of quantifying interactivity at the individual and group levels.

Throughout the five studies, emphasis was placed on the generalizability of the approach so that it may be applied to a broad range of contexts. Nevertheless, the results of these studies are contextually limited, and the generalization of the findings remains a question for further research.

Another limitation of this research project was the relatively small number of students in Study 2, who were selected to demonstrate an easy-to-understand unit of the collaborative structure with a reductionist approach. A general limitation of the five studies may be the exploratory nature of the research approach. Since learning analytics is a new research endeavor, exploratory methods continue to be the main impetus of research, and this will possibly continue in the future.

The lack of content analysis represents another limitation. Indeed, a valid and reliable content analysis method would have added to the insights. However, as argued in the discussion section, such a method was not readily available, and the methods that are currently available are reused from other domains.

Given that the studies were conducted in blended learning environments, the data captured were limited to the online environment. As such, the face-to-face dimension was not represented in this research.
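As an illustration of the reply re-attribution issue noted above, the heuristic sketch below (in Python, assuming a simplified post structure that is not Moodle's actual schema, and assuming a convention of addressing peers with an "@name" mention) shows one way an edge recorded against the thread initiator could be redirected to the collaborator actually addressed:

def reattribute(posts, participants):
    """posts: dicts with 'author', 'recorded_target', and 'text' keys;
    participants: display names appearing in the thread (assumed data model)."""
    edges = []
    for post in posts:
        target = post["recorded_target"]  # often the thread initiator
        for name in participants:
            # If the reply explicitly addresses someone else, trust the text.
            if name != post["author"] and post["text"].startswith("@" + name):
                target = name
                break
        edges.append((post["author"], target))
    return edges

thread = [{"author": "Sara", "recorded_target": "Omar", "text": "@Lina I agree ..."}]
print(reattribute(thread, {"Omar", "Lina", "Sara"}))  # [('Sara', 'Lina')]

Real discussion data would need richer cues than an "@" mention (quoted text, thread depth, timing), which is precisely why manual adjustment was required in the studies.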


13. Methodological and theoretical contributions

The methodological contribution of this thesis lies in the operationalization of social network analysis in the context of collaborative learning through a structured framework for both mathematical and visual analysis. The framework accounted for the individual learner, connections, interactions, role in information exchange, and the influence of the social structure. The centrality measures outlined by the framework were used to predict a substantial number of potential underachievers with reasonable accuracy, at a time when they could still benefit from early supportive measures (a minimal sketch of such centrality-based features is given below). Furthermore, social network analysis was used to monitor, guide, and test an informed intervention that helped produce a significantly more collaborative pattern of learning, offering a proof of concept of the real potential of the technique.

Using a compilation of all interactions to depict dynamic networks represents a new approach that can facilitate understanding of the timeline of events in online collaborative environments. In Study 2, this approach enabled identification of how the online community was built, who initiated the interactions, and how the interactions were driven. In Study 3, the time-lapse video of interactions demonstrated the differences in course status before and after the intervention.

A further methodological contribution of this thesis is its approach to the study of the time dimension in learning analytics. I used time in the first study as a dimension of self-regulation by stressing regularity. In the fifth study, the patterns of interactions across different periods were structured into a time-based temporality model. The inclusion of the temporality dimension has proven fruitful in characterizing patterns of self-regulation, improving the accuracy of predictive models, and bringing important insights regarding a critical aspect of learning that represents a fertile area of inquiry. Furthermore, the methods used in this study have laid a basic framework on which future research can build, improve, and potentially innovate.

Another notable characteristic of this thesis was its focus on the reproducibility and validation of results. In each of the five cases studied, the validation of results was central to the method of analysis, including resampling, permutation tests, 10-fold cross-validation, and the use of data from different courses to replicate the results. Although the replication of results in learning analytics remains a challenging task, it was achieved with reasonable reliability using a combination of temporality and interaction metrics: course data were used to reliably predict future underachievers in subsequent courses. This finding offers evidence of the possibility of creating more reliable models that can enhance the portability of models, add to the understanding of theory, and bolster trust in the field.

The research presented in this thesis contributes to evidence regarding learning analytics in three main areas: the creation of learner models, support for teaching, and impact on teaching (Ferguson & Clow, 2017). The results of the research concerning learners proved that learning analytics could be used to create informative learner models regarding self-regulation, collaboration, and time strategies. The results have also proven that collaboration analysis can be used to monitor, support and improve collaborative learning.
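For illustration, the minimal sketch below shows how centrality measures of the kind used by the framework can be derived from an interaction edge list and turned into per-student predictor features. The networkx library, the file name, and the column names are assumptions for the purpose of the sketch; the studies used their own extraction and analysis pipeline.

import networkx as nx
import pandas as pd

# Hypothetical edge list: one row per interaction (sender, receiver).
edges = pd.read_csv("interactions.csv")
G = nx.DiGraph()
G.add_edges_from(edges[["sender", "receiver"]].itertuples(index=False))

# Per-student measures of connectedness, position, and role in
# information exchange, usable as features in a predictive model.
centralities = pd.DataFrame({
    "in_degree": dict(G.in_degree()),             # replies received
    "out_degree": dict(G.out_degree()),           # contributions sent
    "betweenness": nx.betweenness_centrality(G),  # brokerage of information
    "closeness": nx.closeness_centrality(G),      # reach within the group
})
print(centralities.head())

Such features can then be validated in the manner described above, for example with 10-fold cross-validation and replication on data from other courses.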
The research articles presented in this thesis are among the first empirical studies of learning analytics and social network analysis in the field of medical education. They are presented with the hope that they might stimulate dialogue about opportunities for a data-driven approach to learning, as well as the challenges that the medical education community needs to discuss (Isba et al., 2017; Saqr, 2018).


14. Conclusions

The findings of these studies have demonstrated that a significant number of underachievers can be predicted early using learning analytics methods. Metrics of self-regulation, engagement, interactivity, and temporality were the most significant indicators of better performance. Educators can use these indicators (which are essentially modifiable and malleable) to provide interventions and early remedial strategies. A theory-grounded approach to data collection, analysis and interpretation has proved both feasible and fruitful.

Time and temporality are essential dimensions of how students self-regulate their learning. Students' early and proactive engagement with their tasks, sustained effort, and consistency proved to be important potential predictors of performance. Moreover, using time-based social network analysis has highlighted the social dynamics of collaborative learning, especially regarding how students engage over time and how the timeline of interactions evolves.

Using social network visualization to summarize the interactions, relationships, and structure of collaborative groups presented a feasible and understandable perspective of the status of collaborators and groups. By combining visualization with mathematical network analysis and centrality measures, an accurate view of collaboration was derived. Using social network analysis to monitor and guide a data-driven intervention proved feasible, cost-effective and efficient. SNA-based interventions helped create more cohesive and interactive groups with fewer isolated collaborators.


A framework that captures the gamut of interactivity, relationships, and the position of collaborators has helped create a reasonably accurate picture of the collaborative environment. In the process, this has facilitated the prediction of underachievers, as well as the monitoring and correction of inefficient collaborative practices.


16. Future work

This research might stimulate future work in numerous ways. An analysis of self-regulation could be performed using finer-grained, high-resolution data points. The inclusion of fine-grained activities such as highlighting concepts, taking notes, or creating learning artifacts may contribute to better contextual inferences (Cicchinelli et al., 2018; Molenaar & Järvelä, 2014; Winne et al., 2017). The abundance of such thorough information could be used to create an accurate contextual timeline of students' activities, strategies for learning, and self-regulation (Sonnenberg & Bannert, 2015; Winne, 2017). Furthermore, it might facilitate understanding of the different factors behind self-regulation that may need to be targeted via intervention or support (Molenaar & Järvelä, 2014; Shum & Crick, 2012; Tempelaar et al., 2018).

The focus of this research was on online learners' interactions. However, students also interacted with learning resources. Using social network analysis to map the interactions with different resources, concepts, and competencies could offer a detailed picture of the interconnected networks of learning competencies and enable us to understand the full picture of learning. Using network analysis along with self-regulation and temporality data might provide further insights into the learning process.

Another avenue for future research may be the introduction of centrality measures that are specific to collaborative learning and information exchange in educational settings. Although centrality measures are numerous and often relevant to learning, situations occur where the existing measures lack the necessary specificity. An example is the lack of a centrality measure that correlates with argumentation in discussions. While betweenness centrality can serve as a rough measure here, it does not represent an accurate measure of that property (a toy sketch of this point follows the list below).

Many unanswered questions remain, including the breadth of tracing, the temporality of self-regulation in the long term beyond time-on-task, the optimum delivery of online resources that do not place excessive stress on learners, and the online strategies that could be used effectively to support learners (Winne, 2017).

Additional research avenues that might be explored in the future are listed below:

• The development and validation of a content analysis method specific to an online problem-based learning context;
• The further development of the social network analysis framework with the inclusion of dynamic and temporal parameters;
• The study of signed and weighted networks in the context of online collaboration;
• Using temporal generative network analysis models to understand why and how ties are formed and maintained or dissolved over time.
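The toy sketch below illustrates the point made above about betweenness centrality: the measure flags a student who bridges two discussion clusters, but it is silent on whether that student's posts contain any argumentation. The names and edges are invented for illustration.

import networkx as nx

G = nx.Graph([
    ("Ali", "Mona"), ("Mona", "Hassan"),        # first discussion cluster
    ("Dina", "Tarek"), ("Tarek", "Nour"),       # second discussion cluster
    ("Hassan", "Broker"), ("Broker", "Dina"),   # one student bridges them
])
betweenness = nx.betweenness_centrality(G)
# 'Broker' scores highest: structurally central, yet nothing in the measure
# says whether any of that student's posts argued for or against anything.
print(max(betweenness, key=betweenness.get))

An argumentation-sensitive measure would have to combine such structural information with the content of the exchanged messages.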


References

Abrami, P. C., Bernard, R. M., Borokhovski, E., Wade, A., Surkes, M. A., Tamim, R., & Zhang, D. (2008). Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 78(4), 1102-1134. doi:10.3102/0034654308326084 Abrami, P. C., Bernard, R. M., Bures, E. M., Borokhovski, E., & Tamim, R. M. (2011). Interaction in distance education and online learning: Using evidence and theory to improve practice. Journal of Computing in Higher Education, 23(2- 3), 82-103. doi:10.1007/s12528-011-9043-x Agudo-Peregrina, Á. F., Iglesias-Pradas, S., Conde-González, M. Á., & Hernández-Garca, Á. (2014). Can we predict success from log data in VLEs? Classification of interactions for learning analytics and their relation with performance in VLE-supported F2F and online learning. Computers in Human Behavior, 31(1), 542-550. doi:10.1016/j.chb.2013.05.031 Alamro, A. S., & Schofield, S. (2012). Supporting traditional PBL with online discussion forums: A study from Qassim Medical School. Med Teach, 34(sup1), S20-S24. doi:10.3109/0142159X.2012.656751 Al-Gadaa, A. H., Alkadi, A., & Saqr, M. (2014). Social Network Analytics to Enhance Students' Learning: An Intervention Study. Paper presented at the SIMEC2014, Qassim. AlGhasham, A., Saqr, M., & Kamal, H. (2013). Using Learning Analytics to Evaluate the Efficacy of Blended Learning in

108

PBL Based Medical Course. Paper presented at the AMEE2013. Al-Razgan, M., Al-Khalifa, A. S., & Al-Khalifa, H. S. (2014). Educational data mining: A systematic review of the published literature 2006-2013. In T. Herawan, M. M. Deris, & J. Abawajy (Eds.), Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013) (pp. 711-719). Singapore: Springer Singapore. Ali, L., Hatala, M., Gašević, D., & Jovanović, J. (2012). A qualitative evaluation of evolution of a learning analytics tool. Computers & Education, 58(1), 470-489. doi:10.1016/j.compedu.2011.08.030 Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete. Wired Magazine, 16(7), 16-07. Andrade, A., & Danish, J. A. (2016). Using multimodal learning analytics to model student behaviour: A systematic analysis of behavioural framing. Journal of Learning Analytics, 3(2), 282-306. doi:10.18608/jla.2016.32.14 Aragon, S. (2016). Teacher shortages: What we know. Teacher shortage series. Education Commission of the States. Arnold, K. E., & Pistilli, M. D. (2012). Course signals at Purdue. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge - LAK '12(May), 267- 267. doi:10.1145/2330601.2330666 Arnold, K. E., & Pistilli, M. D. (2012). Course Signals at Purdue: Using Learning Analytics to Increase Student Success. Paper presented at the Proceedings of the 2nd international conference on learning analytics and knowledge. Atif, A., Richards, D., Bilgin, A., & Marrone, M. (2013). A Panorama of Learning Analytics Featuring the Technologies for the Learning and Teaching Domain. In Proceedings of the 30th Australasian Society for Computers in Learning in Tertiary Education Conference (pp. 68-72). Avella, J. T., Kebritchi, M., Nunn, S. G., & Kanai, T. (2016). Learning analytics methods, benefits, and challenges in

109

higher education: A systematic literature review. Online Learning, 20(2), 13-29. Baggetun, R., & Wasson, B. (2006). Self‐regulated learning and open writing. European Journal of Education, 41(3‐4), 453-472. Baker, R. S. J. D., Buckingham Shum, S., Duval, E., Stamper, J., & Wiley, D. (2012). Panel: Educational Data Mining Meets Learning Analytics. In Proceedings of the Second International Conference on Learning Analytics and Knowledge (pp. 20-22) doi:10.1145/2330601.2330613 Bakharia, A., Corrin, L., Barba, P. D., Kennedy, G., Gasevic, D., Mulder, R., . . . Lockyer, L. (2016). A conceptual framework linking learning design with learning analytics. Proceedings of LAK16 6th International Conference on Analytics and Knowledge 2016(April), 329-338. doi:10.1145/2883851.2883944 Ballard, J., & Butler, P. I. (2016). Learner enhanced technology. Journal of Applied Research in Higher Education, 8(1), 18-43. doi:10.1108/jarhe-09-2014-0074 Barabasi, A. L., Gulbahce, N., & Loscalzo, J. (2011). Network medicine: A network-based approach to human disease. Nat Rev Genet, 12(1), 56-68. doi:10.1038/nrg2918 Barnard, L., Lan, W. Y., To, Y. M., Paton, V. O., & Lai, S.-L. (2009). Measuring self-regulation in online and blended learning environments. The Internet and Higher Education, 12(1), 1-6. doi:10.1016/j.iheduc.2008.10.005 Barron, A. B., Hebets, E. A., Cleland, T. A., Fitzpatrick, C. L., Hauber, M. E., & Stevens, J. R. (2015). Embracing multiple definitions of learning. Trends in Neurosciences, 38(7), 405-407. doi:10.1016/j.tins.2015.04.008 Barron, B. (2003). When smart groups fail. The Journal of the Learning Sciences, 12(3), 307-359. doi:10.1207/S15327809JLS1203_1 Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. Icwsm, 8(2009), 361-362. Bell, F. (2010). Connectivism: Its place in theory-informed research and innovation in technology-enabled learning. 110

The International Review of Research in Open and Distributed Learning, 12(3), 98-118. Bernard, R. M., Abrami, P. C., Borokhovski, E., Wade, C. A., Tamim, R. M., Surkes, M. A., & Bethel, E. C. (2009). A meta-analysis of three types of interaction treatments in distance education. Review of Educational Research, 79(3), 1243-1289. doi:10.3102/0034654309333844 Bewick, V., Cheek, L., & Ball, J. (2005). Statistics review 14: Logistic regression. Crit Care, 9(1), 112-118. doi:10.1186/cc3045 Bichsel, J. (2013). The state of e-learning in higher education: An eye toward growth and increased access. EDUCAUSE Center for Analysis and Research. Bienkowski, M., Feng, M., & Means, B. (2012). Enhancing teaching and learning through educational data mining and learning analytics: An issue brief. Washington, DC: SRI International, 1-57. doi:10.2991/icaiees-13.2013.22 bin Mat, U., Buniyamin, N., Arsad, P. M., & Kassim, R. (2013). An overview of Using Academic Analytics to Predict and Improve Students' Achievement: A Proposed Proactive Intelligent Intervention. Paper presented at the Engineering Education (ICEED), 2013 IEEE 5th Conference on. Bin Mat, U., Buniyamin, N., Arsad, P. M., & Kassim, R. A. (2014). An overview of using academic analytics to predict and improve students' achievement: A proposed proactive intelligent intervention. 2013 IEEE 5th International Conference on Engineering Education: Aligning Engineering Education with Industrial Needs for Nation Development, ICEED 2013(August), 126-130. doi:10.1109/ICEED.2013.6908316 Blikstein, P. (2013). Multimodal Learning Analytics. Paper presented at the Proceedings of the third international conference on learning analytics and knowledge. Blikstein, P., & Worsley, M. (2016). Multimodal learning analytics and education data mining: Using computational technologies to measure complex learning tasks. Journal

111

of Learning Analytics, 3(2), 220-238. doi:http://dx.doi.org/10.18608/jla.2016.32.11 Bodily, R., & Verbert, K. (2017). Review of research on student- facing learning analytics dashboards and educational recommender systems. IEEE Transactions on Learning Technologies, 10(4), 405-418. doi:10.1109/Tlt.2017.2740172 Borgatti, S. P., Everett, M. G., & Johnson, J. C. (2013). Testing Hypothesis. In Analyzing Social Networks (pp. 125–148). Sage. Borgatti, S. P., Mehra, A., Brass, D. J., & Labianca, G. (2009). Network analysis in the social sciences. Science, 323(5916), 892-895. doi:10.1126/science.1165821 Borokhovski, E., Bernard, R. M., Tamim, R. M., Schmid, R. F., & Sokolovskaya, A. (2016). Technology-supported student interaction in post-secondary education: A meta-analysis of designed versus contextual treatments. Computers & Education, 96, 15-28. doi:10.1016/j.compedu.2015.11.004 Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662-679. Brindley, J., Blaschke, L. M., & Walti, C. (2009). Creating effective collaborative learning groups in an online environment. The International Review of Research in Open and Distributed Learning, 10(3). Broadbent, J., & Poon, W. (2015). Self-regulated learning strategies & academic achievement in online higher education learning environments: A systematic review. The Internet and Higher Education, 27, 1-13. doi:10.1016/j.iheduc.2015.04.007 Brooks, C. A., Thompson, C. (2017) Predictive Modelling in Teaching and Learning. In Charles Lang, George Siemens, Alyssa Wise, Dragan Gasevic (Eds.), Handbook of Learning Analytics.. (pp. 61-68). Society for Learning Analytics and Research (SOLAR) doi: 10.18608/hla17

112

Brooks, C., Thompson, C., & Teasley, S. (2015). A Time Series Interaction Analysis Method for Building Predictive Models of Learners using Log Data. Paper presented at the Proceedings of the Fifth International Conference on Learning Analytics And Knowledge - LAK '15. Burnette, J. L., O'Boyle, E. H., VanEpps, E. M., Pollack, J. M., & Finkel, E. J. (2013). Mind-sets matter: A meta-analytic review of implicit theories and self-regulation. Psychological Bulletin, 139(3), 655. doi:10.1037/a0029531 Burt, R. S. (2002). The social capital of structural holes. New Directions in Economic , 201-247. Burt, R. S. (2004). Structural holes and good ideas. The American Journal of Sociology, 110(2), 349-399. doi:10.1086/421787 Burt, R. S. (2015). Reinforced structural holes. Social Networks, 43, 149-161. doi:10.1016/j.socnet.2015.04.008 Burt, R. S. (2017). Structural holes versus network closure as social capital. In N. Lin, K. Cook, & R. S. Burt (Eds.), Social capital: Theory and research. New York: Aldine De Gruyter. Burt, R. (2001) Structural Holes versus Network Closure as Social Capital. In: Lin, N., Cook, K.S. and Burt, R.S., Eds., Social Capital: Theory and Research, Aldine de Gruyter. Burt, R. S., Kilduff, M., & Tasselli, S. (2013). Social network analysis: Foundations and frontiers on advantage. Annual Review of Psychology, 64, 527-547. doi:10.1146/annurev- psych-113011-143828 Calvani, A., Fini, A., Molino, M., & Ranieri, M. (2010). Visualizing and monitoring effective interactions in online collaborative groups. British Journal of Educational Technology, 41(2), 213-226. doi:10.1111/j.1467- 8535.2008.00911.x Campbell, J. P., DeBlois, P. B., & Oblinger, D. G. (2007). Academic analytics: A new tool for a new era. Educause Review, 42(October), 40-57. Cela, K., Sicilia, M. Á., & Sánchez, S. (2014). Social network analysis in e-learning environments: A preliminary 113

systematic review. Educational Psychology Review, 27(1), 219-246. doi:10.1007/s10648-014-9276-0 Chatti, M. A., Lukarov, V., Thüs, H., Muslim, A., Yousef, A. M. F., Wahid, U., ... & Schroeder, U. (2014). Learning analytics: Challenges and future research directions. eleed, 10(1). Chen, B. (2015). From theory use to theory building in learning analytics: A commentary on “Learning Analytics to Support Teachers during Synchronous CSCL”. Journal of Learning Analytics, 2(2), 163-168. doi:10.18608/jla.2015.22.12 Chen, B., Knight, S., & Wise, A. F. (2018). Critical issues in designing and implementing temporal analytics. Journal of Learning Analytics, 5(1), 1-9. Chen, H., Chiang, R. H. L., & Storey, V. C. (2012). and analytics: From big data to big impact. MIS Quarterly, 36(4), 1165-1188. Chiu, C.-C. (2014). Use social network analysis to identify the knowledge-hole from learning portfolios structures through web logs in cognitive apprenticeship and goal- based web-based learning system. PACIS 2014 Proceedings. Cho, H., Gay, G., Davidson, B., & Ingraffea, A. (2007). Social networks, communication styles, and learning performance in a CSCL community. Computers & Education, 49(2), 309-329. doi:http://dx.doi.org/10.1016/j.compedu.2005.07.003 Cicchinelli, A., Veas, E., Pardo, A., Pammer-Schindler, V., Fessl, A., Barreiros, C., & Lindstädt, S. (2018). Finding Traces of Self-regulated Learning in Activity Streams. Paper presented at the 8th International Conference on Learning Analytics and Knowledge. Cloete, A. (2014). Social cohesion and social capital: Possible implications for the common good. Verbum et Ecclesia, 35(3), 1-6. doi:10.4102/ve.v35i3.1331 Clow, D. (2012a). The Learning Analytics Cycle: Closing the Loop Effectively. Paper presented at the Proceedings of the 2nd

114

international conference on learning analytics and knowledge. Clow, D. (2012b). The learning analytics cycle. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge - LAK '12, 134-134. doi:10.1145/2330601.2330636 Clow, D. (2013). An overview of learning analytics. Teaching in Higher Education, 18(6), 683-695. doi:10.1080/13562517.2013.827653 Colvin, C., Rogers, T., Wade, A., Dawson, S., Gašević, D., Buckingham Shum, S., & Fisher, J. (2015). Student retention and learning analytics: A snapshot of Australian practices and a framework for advancement. Sydney: Australian Office for Learning and Teaching. Conde, M. Á., & Hernández-García, Á. (2015). Learning analytics for educational decision making. Computers in Human Behavior, 47, 1-3. doi:10.1016/j.chb.2014.12.034 Cook, D. A., Hatala, R., Brydges, R., & et al. (2011). Technology- enhanced simulation for health professions education: A systematic review and meta-analysis. JAMA, 306(9), 978- 988. doi:10.1001/jama.2011.1234 Cooper, A. (2012). What is analytics? Definition and essential characteristics. CETIS Analytics Series, 1(5), 1-10. Cruz-Benito, J., Therón, R., García-Peñalvo, F. J., & Lucas, P. E. (2015). Discovering usage behaviors and engagement in an educational virtual world. Computers in Human Behavior, 47(2015), 18-25. doi:10.1016/j.chb.2014.11.028 Dado, M., & Bodemer, D. (2017). A review of methodological applications of social network analysis in computer- supported collaborative learning. Educational Research Review, 22, 159-180. doi:10.1016/j.edurev.2017.08.005 Dahlstrom, E., Brooks, D. C., & Bichsel, J. (2014). The Current Ecosystem of Learning Management Systems in Higher Education : Student , Faculty, and IT Perspectives. Retrieved from http://www.educause.edu/ecar

115

Daniel, B. (2015). Big data and analytics in higher education: Opportunities and challenges. British Journal of Educational Technology, 46(5), 904-920. Davis, M., & Harden, R. (1999). AMEE Medical Education Guide No. 15: Problem-based learning: a practical guide. Med Teach, 21(2), 130-140. doi:Doi 10.1080/01421599979743 Dawson, S., Jovanovic, J., Gašević, D., & Pardo, A. (2017a). From Prediction to Impact: Evaluation of a Learning Analytics Retention Program. Paper presented at the Proceedings of the Seventh International Learning Analytics & Knowledge Conference. Dawson, S., Jovanovic, J., Gašević, D., & Pardo, A. (2017, March). From prediction to impact: Evaluation of a learning analytics retention program. In Proceedings of the Seventh International Learning Analytics & Knowledge Conference (pp. 474-478). doi:10.1145/3027385.3027405 Dawson, S., Mirriahi, N., & Gasevic, D. (2015). Importance of theory in learning analytics in formal and workplace settings. Journal of Learning Analytics, 2(2), 1-4. doi:10.18608/jla.2015.22.1 Dawson, S. P., Macfadyen, L., & Lockyer, L. (2009). Learning or performance: Predicting drivers of student motivation. Ascilite 2009 - Auckland, 184-193. de Boer, H., Donker-Bergstra, A. S., Kostons, D. D. N. M., Korpershoek, H., & van der Werf, M. P. C. (2013). Effective Strategies for Self-regulated Learning: A Meta- analysis: GION/RUG Groningen, NL. De Houwer, J., Barnes-Holmes, D., & Moors, A. (2013). What is learning? On the nature and merits of a functional definition of learning. Psychonomic Bulletin & Review, 20(4), 631-642. doi:10.3758/s13423-013-0386-3 De Wever, B., Schellens, T., Valcke, M., & Van Keer, H. (2006a). Content analysis schemes to analyze transcripts of online asynchronous discussion groups: A review. Computers and Education, 46(1), 6-28. doi:10.1016/j.compedu.2005.04.005 De Wever, B., Schellens, T., Valcke, M., & Van Keer, H. (2006b). Content analysis schemes to analyze transcripts of online 116

asynchronous discussion groups: A review. Computers & Education, 46(1), 6-28. doi:http://dx.doi.org/10.1016/j.compedu.2005.04.005 Dietz-Uhler, B., & Hurn, J. (2013). Using learning analytics to predict (and improve) student success: A faculty perspective. Journal of Interactive Online Learning, 12, 17-26. Dika, S. L., & Singh, K. (2002). Applications of social capital in educational literature: A critical synthesis. Review of Educational Research, 72(1), 31-60. doi:10.3102/00346543072001031 Dimitracopoulou, A. (2009). Computer based interaction analysis supporting self-regulation: Achievements and prospects of an emerging research direction. Technology, Instruction, Cognition & Learning, 6(4). Doherty, I., Sharma, N., & Harbutt, D. (2015). Contemporary and future eLearning trends in medical education. Med Teach, 37(1), 1-3. doi:10.3109/0142159X.2014.947925 Dougiamas, M., & Taylor, P. (2003). Moodle: Using Learning Communities to Create an Open Source Course Management System. Paper presented at the World Conference on Educational Media and Technology. Dowell, N. M. M., Skrypnyk, S., Joksimović, S., Graesser, A., Dawson, S., Gašević, D., … Kovanović, V. (2015). Modeling Learners ’ Social Centrality and Performance through Language and Discourse. In Educational Data Mining - EDM’15 (pp. 250–257). Drachsler, H., & Greller, W. (2016). Privacy and Analytics– It’s a DELICATE Issue. A Checklist to Establish Trusted Learning Analytics. Paper presented at the 6th Learning Analytics and Knowledge Conference 2016. Drachsler, H., Hoel, T., Scheffel, M., Kismihók, G., Berg, A., Ferguson, R., . . . Manderveld, J. (2015). Ethical and privacy issues in the application of learning analytics. Proceedings of the Fifth International Conference on Learning Analytics And Knowledge - LAK '15, 390-391. doi:10.1145/2723576.2723642 117

Dutt, A., Ismail, M. A., & Herawan, T. (2017). A systematic review on educational data mining. IEEE Access, 5, 15991-16005. Duval, E. (2011). Attention Please! Learning Analytics for Visualization and Recommendation. In Proceedings of the 1st international conference on learning analytics and knowledge, 9-17. Elias, T. (2011). Learning analytics: The definitions, the processes, and the potential. Learning, 1-22. Ellaway, R., & Masters, K. (2008). AMEE Guide 32: e-Learning in medical education Part 1: Learning, teaching and assessment. Med Teach, 30(5), 455-473. doi:10.1080/01421590802108331 Ellaway, R. H., Pusic, M. V., Galbraith, R. M., & Cameron, T. (2014). Developing the role of big data and analytics in health professional education. Med Teach, 36(3), 216-222. doi:10.3109/0142159X.2014.874553 Ellis, C. (2013). Broadening the scope and increasing the usefulness of learning analytics: The case for assessment analytics. British Journal of Educational Technology, 44(4), 662-664. doi:10.1111/bjet.12028 Falakmasir, M. H., & Habibi, J. (2010). Using Educational Data Mining Methods to Study the Impact of Virtual Classroom in E-Learning. Paper presented at the EDM. Fang, R., Landis, B., Zhang, Z., Anderson, M. H., Shaw, J. D., & Kilduff, M. (2015). Integrating personality and social networks: A meta-analysis of personality, network position, and work outcomes in organizations. Organization Science, 26(4), 1243-1260. Ferguson, R. (2012a). The state of learning analytics in 2012: A review and future challenges. Technical Report KMI-12- 01, 4(March), 18-18. doi:10.1504/IJTEL.2012.051816 Ferguson, R. (2012b). Learning analytics: Drivers, developments and challenges. International Journal of Technology Enhanced Learning, 4(5-6), 304-317. doi:10.1504/IJTEL.2012.051816 Ferguson, R., & Clow, D. (2017). Where is the Evidence?: A Call to Action for Learning Analytics. Paper presented at the

118

Proceedings of the seventh international learning analytics & knowledge conference. Filippou, J., Cheong, C., & Cheong, F. (2015). Designing persuasive systems to influence learning: Modelling the impact of study habits on academic performance. PACIS 2015. Fincham, E., Gašević, D., & Pardo, A. (2018). From social ties to network processes: Do tie definitions matter? Journal of Learning Analytics, 5(2), 9-28. doi:10.18608/jla.2018.52.2 Firm, A. (1992). Homophily and differential returns: Sex differences in network structure and access in an advertising firm. Administrative Science Quarterly, 37(3), 422-447. Fosnot, C. T. (2013). Constructivism: Theory, Perspectives, and practice. Teachers College Press. Fransen, J., Kirschner, P. A., & Erkens, G. (2011). Mediating team effectiveness in the context of collaborative learning: The importance of team and task awareness. Computers in Human Behavior, 27(3), 1103-1113. Fritz, J. (2011). Classroom walls that talk: Using online course activity data of successful students to raise self-awareness of underperforming peers. The Internet and Higher Education, 14(2), 89-97. doi:10.1016/j.iheduc.2010.07.007 Fritz, J. (2013). Using analytics at UMBC: Encouraging student responsibility and identifying effective course designs. EDUCAUSE Center for Applied Research. Retrieved from https://www.educause.edu/ir/library/pdf/ERB1304.pdf Gandomi, A., & Haider, M. (2014). Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35(2), 137-144. doi:10.1016/j.ijinfomgt.2014.10.007 Gardner, J., & Brooks, C. (2018). Evaluating predictive models of student success: Closing the methodological gap. Journal of Learning Analytics, 5(2), 105-125. doi:10.18608/jla.2018.52.7 Garrison, D. R., Anderson, T., & Archer, W. (1999). Critical inquiry in a text-based environment: Computer 119

conferencing in higher education. The Internet and Higher Education, 2(2), 87-105. Garrison, D. R., Anderson, T., & Archer, W. (2010). The first decade of the community of inquiry framework: A retrospective. Internet and Higher Education, 13(1-2), 5- 9. doi:10.1016/j.iheduc.2009.10.003 Garrison, D. R., & Arbaugh, J. B. (2007). Researching the community of inquiry framework: Review, issues, and future directions. The Internet and Higher Education, 10(3), 157-172. Gašević, D., Dawson, S., Rogers, T., & Gasevic, D. (2016). Learning analytics should not promote one size fits all: The effects of instructional conditions in predicting academic success. The Internet and Higher Education, 28, 68-84. doi:10.1016/j.iheduc.2015.10.002 Gašević, D., Jovanović, J., Pardo, A., & Dawson, S. (2017). Detecting learning strategies with analytics: Links with self-reported measures and academic performance. Journal of Learning Analytics, 4(2), 113-128. doi:10.18608/jla.2017.42.10 Gašević, D., Kovanović, V., & Joksimović, S. (2017). Piecing the learning analytics puzzle: A consolidated model of a field of research and practice. Learning: Research and Practice, 3(1), 63-78. doi:10.1080/23735082.2017.1286142 Gašević, D., Zouaq, A., & Janzen, R. (2013). “Choose your classmates, your GPA is at stake!”. American Behavioral Scientist, 57(10), 1460-1479. doi:10.1177/0002764213479362 Gillies, R. (2016). Cooperative learning: Review of research and practice. Australian Journal of Teacher Education, 41(3), 39-54. doi:10.14221/ajte.2016v41n3.3 Goda, Y., Yamada, M., Kato, H., Matsuda, T., Saito, Y., & Miyagawa, H. (2015). Procrastination and other learning behavioral types in e-learning and their relationship with learning outcomes. Learning and Individual Differences, 37, 72-80. doi:10.1016/j.lindif.2014.11.001 Golbeck, J. (2013). Network Structure and Measures. In Analyzing

120

the Social Web (pp. 25–44). Elsevier. https://doi.org/10.1016/B978-0-12-405531-5.00003-1 Graham, C. R. (2006). Blended learning systems: Definition, current trends, and future directions. In C. J. Bonk & C. R. Graham (Eds.), The handbook of blended learning: Global perspectives, local designs (pp. 3−21). San Francisco, CA: Pfeiffer. Greller, W., & Drachsler, H. (2012). Translating learning into numbers: A generic framework for learning analytics. Journal of Educational Technology & Society, 15(3), 42. Grossman, P., & McDonald, M. (2008). Back to the future: Directions for research in teaching and teacher education. American Educational Research Journal, 45(1), 184-205. doi:10.3102/0002831207312906 Gruschka, N., & Jensen, M. (2014). Aligning User Consent Management and Service Process Modeling. Paper presented at the GI-Jahrestagung. Gulati, S. (2008). Technology-enhanced learning in developing nations: A review. International Review of Research in Open and Distance Learning, 9(1), 1-16. Hermeren, G., Almgren, K., Bengtsson, P., Cannon, B., Eriksson, S., & Hoglund, P. (2011). Good research practice. Swedish Research Council 1: 1–131. Research Council 1: 1–131.Hernández-García, Á., González- González, I., Jiménez-Zarco, A. I., & Chaparro-Peláez, J. (2015). Applying social learning analytics to message boards in online distance learning: A case study. Computers in Human Behavior, 47, 68-80. doi:10.1016/j.chb.2014.10.038 Hew, K. F., Cheung, W. S., & Ng, C. S. L. (2009). Student contribution in asynchronous online discussion: A review of the research and empirical exploration. Instructional Science, 38(6), 571-606. doi:10.1007/s11251-008-9087-0 Hommes, J., Rienties, B., de Grave, W., Bos, G., Schuwirth, L., & Scherpbier, A. (2012). Visualising the invisible: A network approach to reveal the informal social side of

121

student learning. Adv Health Sci Educ Theory Pract, 17(5), 743-757. doi:10.1007/s10459-012-9349-0 Horvath, J. C., & Lodge, J. M. (2016). A framework for organizing and translating science of learning research. JC Horvath., JM Lodge, & J. Hattie.(Eds.), From the laboratory to the classroom: Translating science of learning for teachers, 7- 20. Hothorn, T., Hornik, K., Wiel, M. A. van de, & Zeileis, A. (2008). Implementing a Class of Permutation Tests: The coin Package. Journal of Statistical Software, 28(8). https://doi.org/10.18637/jss.v028.i08 HQ, M. (2017). Moodle analytics. Retrieved from https://docs.moodle.org/34/en/Analytics Hrastinski, S. (2008). Asynchronous and synchronous e-learning. Educause Quarterly, 31(4), 51-55. Illeris, K. (2018). Contemporary Theories of Learning: Learning Theorists... In Their Own Words: Routledge. Isba, R., Woolf, K., & Hanneman, R. (2017). Social network analysis in medical education. Medical Education, 51(1), 81-88. doi:10.1111/medu.13152 Jacomy, M., Venturini, T., Heymann, S., & Bastian, M. (2014). ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS ONE, 9(6), e98679. Janssen, J., & Bodemer, D. (2013). Coordinated computer- supported collaborative learning: Awareness and awareness tools. Educational Psychologist, 48 (December 2014), 40-55. doi:10.1080/00461520.2012.749153 Jayaprakash, S. M., Moody, E. W., Lauria, E. J. M., Regan, J. R., & Baron, J. D. (2014). Early alert of academically at-risk students: An open source analytics initiative. Journal of Learning Analytics, 1(1), 6-47. doi:10.18608/jla.2014.11.3 Jeong, H., & Hmelo-Silver, C. E. (2016). Seven affordances of computer-supported collaborative learning: How to support collaborative learning? How can technologies help? Educational Psychologist, 51(2), 247-265. doi:10.1080/00461520.2016.1158654

122

Jiang, S., Fitzhugh, S. M., & Warschauer, M. (2014). Social Positioning and Performance in Moocs. Paper presented at the Workshop on Graph-Based Educational Data Mining, London, United Kingdom. Job, V., Walton, G. M., Bernecker, K., & Dweck, C. S. (2015). Implicit theories about willpower predict self-regulation and grades in everyday life. Journal of Personality and Social Psychology, 108(4), 637. Johnson, D. W., & Johnson, R. T. (2008). Social interdependence theory and cooperative learning: The teacher’s role. The Teacher’s Role in Implementing Cooperative Learning in the Classroom, 9-36. Johnson, D. W., & Johnson, R. T. (2009). An educational psychology success story: Social interdependence theory and cooperative learning. Educational Researcher, 38(5), 365-379. doi:10.1037/pspa0000044 Joksimović, S., Gašević, D., Kovanović, V., Riecke, B. E., & Hatala, M. (2015). Social presence in online discussions as a process predictor of academic performance. Journal of Computer Assisted Learning, 31(6), 638-654. doi:10.1111/jcal.12107 Joksimović, S., Manataki, A., Gašević, D., Dawson, S., Kovanović, V., & de Kereki, I. F. (2016). Translating Network Position into Performance. Paper presented at the Proceedings of the sixth international conference on learning analytics & knowledge., Edinburgh, Scotland. Jonassen, D., Davidson, M., Collins, M., Campbell, J., & Haag, B. B. (1995). Constructivism and computer‐mediated communication in distance education. American Journal of Distance Education, 9(2), 7-26. Joughin, G. (2009). Assessment, learning and judgement in higher education: A critical review. In G. Joughin (Ed.), Assessment, learning and judgement in higher education (pp. 13–29). New York, NY: Springer. Kahu, E. R. (2013). Framing in higher education. Studies in Higher Education, 38(5), 758-773. doi:10.1080/03075079.2011.598505

123

Kane M.T. (2017) Using Empirical Results to Validate Performance Standards. In: Blömeke S., Gustafsson JE. (eds) Standard Setting in Education. Methodology of Educational Measurement and Assessment. Springer, Cham Kay, D., Korn, N., & Oppenheim, C. (2012). Legal, risk and ethical aspects of analytics in higher education. JISC CETIS Analytics Series, 1(6), 1-30. doi:ISSN 2051-9214 Kirkwood, A., & Price, L. (2014). Technology-enhanced learning and teaching in higher education: What is ‘enhanced’and how do we know? A critical literature review. Learning, media and technology, 39(1), 6-36. Kistner, S., Rakoczy, K., Otto, B., Dignath-van Ewijk, C., Büttner, G., & Klieme, E. (2010). Promotion of self-regulated learning in classrooms: Investigating frequency, quality, and consequences for student performance. Metacognition and Learning, 5(2), 157-171. doi:10.1007/s11409-010- 9055-3 Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). doi:10.1177/2053951714528481 Knight, D. B., Brozina, C., & Novoselich, B. (2016). An investigation of first-year engineering student and instructor perspectives of learning analytics approaches. Journal of Learning Analytics, 3(3), 215-238. Kop, R., & Hill, A. (2008). Connectivism: Learning theory of the future or vestige of the past? The International Review of Research in Open and Distributed Learning, 9(3). Kotsiantis, S. B., Zaharakis, I., & Pintelas, P. (2007). Supervised machine learning: A review of classification techniques. Emerging Artificial Intelligence Applications in Computer Engineering, 160, 3-24. Kovanovic, V., Joksimović, S., Gasevic, D., Hatala, M., & Siemens, G. (2017). Content Analytics: The Definition, Scope, and an Overview of Published Research. In C. Lang, G. Siemens, A. F. Wise, & D. Gaševic (Eds.), The Handbook of Learning Analytics (1 ed., pp. 77-92).

124

Alberta, Canada: Society for Learning Analytics Research (SoLAR). DOI: 10.18608/hla17.007 Kreijns, K., Kirschner, P. A., & Jochems, W. (2003). Identifying the pitfalls for social interaction in computer-supported collaborative learning environments: A review of the research. Computers in Human Behavior, 19(3), 335-353. doi:10.1016/S0747-5632(02)00057-2 Kuzilek, J., Hlosta, M., Herrmannova, D., Zdrahal, Z., Vaclavek, J., & Wolff, A. (2015). LAK15 Case Study 1: OU Analyse: Analysing at-risk students at the Open University. Learning Analytics Review. Kuzilek, J., Hlosta, M., Herrmannova, D., Zdrahal, Z., & Wolff, A. (2015). OU Analyse: Analysing at-risk students at the Open University. Learning Analytics Review, 1-16. Kwon, S.-W., & Adler, P. S. (2014). Social capital: Maturation of a field of research. Academy of Management Review, 39(4), 412-422. doi:10.5465/amr.2014.0210 Laal, M., & Ghodsi, S. M. (2012). Benefits of collaborative learning. Procedia - Social and Behavioral Sciences, 31(2011), 486-490. doi:10.1016/j.sbspro.2011.12.091 Laal, M., & Laal, M. (2012). Collaborative learning: What is it? Procedia - Social and Behavioral Sciences, 31(2011), 491- 495. doi:10.1016/j.sbspro.2011.12.092 Lambert, J. L., & Fisher, J. L. (2013). Community of inquiry framework: Establishing community in an online course. Journal of Interactive Online Learning, 12(1), 1-16. Lang, C., Macfadyen, L. P., Slade, S., Prinsloo, P., & Sclater, N. (2018). The Complexities of Developing a Personal Code of Ethics for Learning Analytics Practitioners: Implications for Institutions and the Field. Paper presented at the Proceedings of the 8th International Conference on Learning Analytics and Knowledge. Latora, V., & Marchiori, M. (2007). A measure of centrality based on network efficiency. New Journal of Physics, 9(6), 188. Leitner, P., Khalil, M., & Ebner, M. (2017). Learning analytics in higher education—A literature review. In A. Peña-Ayala (Ed.), Learning Analytics: Fundaments, Applications, and

125

Trends (Vol. 94, pp. 1-23). Cham: Springer International Publishing. Lim, C. P., & Tinio, V. L. (2018). Learning Analytics for the Global South. Liu, C. H., & Matthews, R. (2005). Vygotsky's philosophy: Constructivism and its criticisms examined. International Education Journal, 6(3), 386-399. doi:ISSN: 1443-1475 Lockyer, L., Heathcote, E., & Dawson, S. (2013). Informing pedagogical action: Aligning learning analytics with learning design. American Behavioral Scientist, 57(10), 1439-1459. doi:10.1177/0002764213479367 Lodge, J. M., Alhadad, S. S., Lewis, M. J., & Gašević, D. (2017). Inferring learning from big data: The importance of a transdisciplinary and multidimensional approach. Technology, Knowledge and Learning, 22(3), 385-400. doi:10.1007/s10758-017-9330-3 Lou, Y., Abrami, P. C., & d’Apollonia, S. (2001). Small group and individual learning with technology: A meta-analysis. Review of Educational Research, 71(3), 449-521. doi:Doi 10.3102/00346543071003449 Ma, J., Han, X., Yang, J., & Cheng, J. (2015). Examining the necessary condition for engagement in an online learning environment based on learning analytics approach: The role of the instructor. The Internet and Higher Education, 24, 26-34. Macfadyen, L. P., & Dawson, S. (2010). Mining LMS data to develop an “early warning system” for educators: A proof of concept. Computers & Education, 54(2), 588-599. doi:10.1016/j.compedu.2009.09.008 Macgregor, J. (1990). Collaborative learning: Shared inquiry as a process of reform. New Directions for Teaching and Learning, 1990(42), 19-30. doi:10.1002/tl.37219904204 Mantelero, A. (2014). The future of consumer data protection in the EU Re-thinking the “notice and consent” paradigm in the new era of . Computer Law & Security Review, 30(6), 643-660. Marcos-García, J.-A., Martínez-Monés, A., & Dimitriadis, Y. (2015). DESPRO: A method based on roles to provide 126

collaboration analysis support adapted to the participants in CSCL situations. Computers & Education, 82, 335-353. doi:10.1016/j.compedu.2014.10.027 Martin, A. J., & Dowson, M. (2009). Interpersonal relationships, motivation, engagement, and achievement: Yields for theory, current issues, and educational practice. Review of Educational Research, 79(1), 327-365. doi:10.3102/0034654308325583 Martin, F., & Ndoye, A. (2016). Using learning analytics to assess student learning in online courses. Journal of University Teaching & Learning Practice, 13(133). doi:10.1177/0047239516656369 Martínez‐López, B., Perez, A., & Sánchez‐Vizcaíno, J. (2009). Social network analysis. Review of general concepts and use in preventive veterinary medicine. Transboundary and Emerging Diseases, 56(4), 109-120. doi:10.1111/j.1865- 1682.2009.01073.x Mattar, J. a. (2010). Constructivism and connectivism in education technology: Active, situated, authentic, experiential, and anchored learning. Technology, 1-16. Mattingly, K. D., Rice, M. C., & Berge, Z. L. (2012). Learning analytics as a tool for closing the assessment loop in higher education. Knowledge Management & E-Learning, 4(3), 236. Mayring, P., 2014. Qualitative Content Analysis: Theoretical Foundation, Basic Procedures and Software Solution. Open Access Repository, Klagenfurt. Retrieved from http://nbn-resolving.de/urn:nbn:de:0168-ssoar-395173. Mazoue, J. G. (2014). The MOOC model: Challenging traditional education. Mazza, R., & Dimitrova, V. (2004a). Visualising Student Tracking Data to Support Instructors in Web-based Distance Education. Paper presented at the 13th international conference Mazza, R., & Dimitrova, V. (2004). Visualising student tracking data to support instructors in web-based distance education. In Proceedings of the 13th international World

127

Wide Web conference on Alternate track papers & posters (pp. 154-161). McCarty, C., & Molina, J. L. (2014). Social network analysis. Handbook of Methods in Cultural Anthropology, 631-657. McKay, T., Miller, K., & Tritz, J. (2012, 2012). What to Do with Actionable Intelligence: E 2 Coach as an Intervention Engine. Mingers, J., & Leydesdorff, L. (2015). A review of theory and practice in scientometrics. European Journal of Operational Research, 246(1), 1-19. Misiejuk, K. & Wasson, B. (2017). State of the Field Report on Learning Analytics. SLATE Report 2017-2. Bergen: Centre for the Science of Learning & Technology (SLATE). Moissa, B., Gasparini, I., & Kemczinski, A. (2015). A systematic mapping on the learning analytics field and its analysis in the massive open online courses context. International Journal of Distance Education Technologies, 13(3), 1-24. doi:10.4018/IJDET.2015070101 Molenaar, I., & Järvelä, S. (2014). Sequential and temporal characteristics of self and socially regulated learning. Metacognition and Learning, 9(2), 75-85. doi:10.1007/s11409-014-9114-2 Mor, Y., Ferguson, R., & Wasson, B. (2015). Learning design, teacher inquiry into student learning and learning analytics: A call for action. British Journal of Educational Technology, 46(2), 221-229. doi:10.1111/bjet.12273 Na, K. S., & Tasir, Z. (2017). Identifying At-risk Students in Online Learning by Analysing Learning Behaviour: A Systematic Review. Paper presented at the Big Data and Analytics (ICBDA), 2017 IEEE Conference on. Nesbit, J., Xu, Y., Winne, P., & Zhou, M. (2008). Sequential pattern analysis software for educational event data. Measuring Behavior 2008, 160. Neville, A. J. (2009). Problem-based learning and medical education forty years on. Medical Principles and Practice, 18(1), 1-9. doi:10.1159/000163038

128

Noroozi, O., Weinberger, A., Biemans, H. J., Mulder, M., & Chizari, M. (2012). Argumentation-based computer supported collaborative learning (ABCSCL): A synthesis of 15 years of research. Educational Research Review, 7(2), 79-106. doi:10.1016/j.edurev.2011.11.006 Nunn, S., Avella, J. T., Kanai, T., Kebritchi, M., Nunn, S., & Kanai, T. (2016). Learning analytics methods, benefits, and challenges in higher education: A systematic literature review. Online Learning, 20(2), 13-29. doi:10.24059/olj.v20i2.790 Nussbaum, M. E. (2008). Collaborative discourse, argumentation, and learning: Preface and literature review. Contemporary Educational Psychology, 33(3), 345-359. doi:10.1016/j.cedpsych.2008.06.001 Nussbaumer, A., Berthold, M., Dahrendorf, D., Schmitz, H.-C., Kravcik, M., & Albert, D. (2012). A Mashup Recommender for Creating Personal Learning Environments. Paper presented at the International Conference on Web-Based Learning. Ott, C., Robins, A., Haden, P., & Shephard, K. (2015). Illustrating performance indicators and course characteristics to support students’ self-regulated learning in CS1. Computer Science Education, 25(2), 174-198. doi:10.1080/08993408.2015.1033129 Papamitsiou, Z., & Economides, A. A. (2014). Learning analytics and educational data mining in practice: A systematic literature review of empirical evidence. Educational Technology & Society, 17(4), 49--64. Pardo, A., Han, F., & Ellis, R. A. (2016). Exploring the Relation between Self-regulation, Online Activities, and Academic Performance: A Case Study. Paper presented at the Sixth International Conference on Learning Analytics & Knowledge. Pardo, A., Han, F., & Ellis, R. A. (2017). Combining university student self-regulated learning indicators and engagement with online learning events to predict academic performance. IEEE Transactions on Learning

129

Technologies, 1382(c), 1-1. doi:10.1109/TLT.2016.2639508 Pardo, A., & Siemens, G. (2014). Ethical and privacy principles for learning analytics. British Journal of Educational Technology, 45(3), 438-450. doi:10.1111/bjet.12152 Pashler, H., & Wagenmakers, E. J. (2012). Editors’ introduction to the special section on replicability in psychological science: A crisis of confidence? Perspectives on Psychological Science, 7(6), 528-530. Peña-Ayala, A. (2014). Educational data mining: A survey and a data mining-based analysis of recent works. Expert Systems with Applications, 41(4 PART 1), 1432-1462. doi:10.1016/j.eswa.2013.08.042 Peng, C.-Y. J., Lee, K. L., & Ingersoll, G. M. (2002). An introduction to logistic regression analysis and reporting. The Journal of Educational Research, 2, 3-14. Persico, D., & Pozzi, F. (2015). Informing learning design with learning analytics to improve teacher inquiry. British Journal of Educational Technology, 46(2), 230-248. doi:10.1111/bjet.12207 Petersen, R. J. (2012). Policy dimensions of analytics in higher education. Educause Review, 47(4), 44. Pintrich, P. R. (2000). The role of goal orientation in self-regulated learning. In M. Boekaerts, P. R. Pintrich, & M. Zeidner (Eds.), Handbook of self-regulation (pp. 451–502). San Diego, CA: Academic Press. Pistilli, M. D., & Arnold, K. E. (2010). In practice: Purdue Signals: Mining real-time academic data to enhance student success. About Campus, 15(August), 22-24. doi:10.1002/abc.20025 Podolskiy A.I. (2012) Zone of Proximal Development. In: Seel N.M. (eds) Encyclopedia of the Sciences of Learning (pp. 3485-3487). Springer, Boston, MA Prinsloo, P., & Slade, S. (2018). Student consent in learning analytics: The devil in the details? In J. K. Lester, Carrie; Johri, Aditya and Rangwala, Huzefa (Eds.), Learning Analytics in Higher Education: Current Innovations,

130

Future Potential, and Practical Applications (pp. 118– 139). New York and Abingdon, Oxon: Routledge, Qassim College of Medicine. Qassim College of Medicine privacy policy and user agreement. Retrieved from https://qumed.org/code.htm Rajesh Kumar S., Hamid S. (2017) Analysis of Learning Analytics in Higher Educational Institutions: A Review. In: Badioze Zaman H. et al. (eds) Advances in Visual Informatics. IVIC 2017. Lecture Notes in Computer Science, vol 10645. Springer, Cham Kumar, S. R., & Hamid, S. (2017). Analysis of Learning Analyt- ics in Higher Educational Institutions: A Review. In International Visual Informatics Conference (pp. 185-196). Springer, Cham. Ramos, C., & Yudko, E. (2008). “Hits” (not “discussion posts”) predict student success in online courses: A double cross- validation study. Computers & Education, 50(4), 1174- 1182. doi:10.1016/j.compedu.2006.11.003 Reimann, P. (2016). Connecting learning analytics with learning research: The role of design-based research. Learning: Research and Practice, 2(2), 130-142. Reiser, R. A. (2001). A history of instructional design and technology: Part I: A history of instructional media. Educational technology research and development, 49(1), 53. doi:Doi 10.1007/Bf02504506 Reschly, A. L., & Christenson, S. L. (2012). Jingle, jangle, and conceptual haziness: Evolution and future directions of the engagement construct. In S. L. Christenson, A. L. Reschly, & C. Wylie (Eds.), Handbook of Research on Student Engagement (pp. 3-19). Boston, MA: Springer US. Resta, P., & Laferrière, T. (2007). Technology in support of collaborative learning. Educational Psychology Review, 19(1), 65-83. doi:10.1007/s10648-007-9042-7 Reychav, I., Raban, D. R., & McHaney, R. (2017). Centrality measures and academic achievement in computerized classroom social networks. Journal of Educational Computing Research, 56(4), 589-618. doi:10.1177/0735633117715749


Rienties, B., Boroowa, A., Cross, S., Kubiak, C., Mayles, K., & Murphy, S. (2016). Analytics4Action evaluation framework: A review of evidence-based learning analytics interventions at the Open University UK. Journal of Interactive Media in Education, 1(2), 1-11. doi:10.5334/jime.az
Rizzuto, T. E., LeDoux, J., & Hatala, J. P. (2009). It's not just what you know, it's who you know: Testing a model of the relative importance of social networks to academic performance. Social Psychology of Education, 12(2), 175-189. doi:10.1007/s11218-008-9080-0
Roberts, L. D., Chang, V., & Gibson, D. (2017). Ethical considerations in adopting a university- and system-wide approach to data and learning analytics. In B. Kei Daniel (Ed.), Big Data and Learning Analytics in Higher Education: Current Theory and Practice (pp. 89-108). Cham: Springer International Publishing.
Rochat, Y. (2009). Closeness Centrality Extended to Unconnected Graphs: The Harmonic Centrality Index. Paper presented at the Applications of Social Network Analysis (ASNA) conference.
Rodríguez-Triana, M. J., Martínez-Monés, A., Asensio-Pérez, J. I., & Dimitriadis, Y. (2013). Towards a script-aware monitoring process of computer-supported collaborative learning scenarios. International Journal of Technology Enhanced Learning, 5(2), 151-167. doi:10.1504/IJTEL.2013.059082
Rodríguez-Triana, M. J., Martínez-Monés, A., & Villagrá-Sobrino, S. (2016). Learning analytics in small-scale teacher-led innovations: Ethical and data privacy issues. Journal of Learning Analytics, 3(1), 43-65.
Romero, C., López, M.-I., Luna, J.-M., & Ventura, S. (2013). Predicting students' final performance from participation in on-line discussion forums. Computers & Education, 68, 458-472. doi:10.1016/j.compedu.2013.06.009
Romero, C., & Ventura, S. (2010). Educational data mining: A review of the state of the art. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 40(6), 601-618. doi:10.1109/tsmcc.2010.2053532
Romero, C., & Ventura, S. (2013). Data mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 3(1), 12-27. doi:10.1002/widm.1075
Rubel, A., & Jones, K. M. L. (2016). Student privacy in learning analytics: An information ethics perspective. The Information Society, 32(2), 143-159. doi:10.1080/01972243.2016.1130502
Ruipérez-Valiente, J. A., Muñoz-Merino, P. J., Leony, D., & Kloos, C. D. (2015). ALAS-KA: A learning analytics extension for better understanding the learning process in the Khan Academy platform. Computers in Human Behavior, 47, 139-148. doi:10.1016/j.chb.2014.07.002
Salter-Townshend, M., White, A., Gollini, I., & Murphy, T. B. (2012). Review of statistical network analysis: Models, algorithms, and software. Statistical Analysis and Data Mining, 5(4), 243-264. doi:10.1002/sam.11146
Santos, J. L., Klerkx, J., Duval, E., Gago, D., & Rodríguez, L. (2014). Success, Activity and Drop-outs in MOOCs: An Exploratory Study on the UNED COMA Courses. Paper presented at the Proceedings of the Fourth International Conference on Learning Analytics and Knowledge.
Saqr, M. (2018). A literature review of empirical research on learning analytics in medical education. International Journal of Health Sciences, 12(2), 80-85.
Saqr, M., AlGhasham, A., & Kamal, H. (2014). The Study of Online Clinical Case Discussions with the Means of Social Network Analysis and Data Mining Techniques. Paper presented at AMEE 2014.
Saqr, M., Fors, U., & Nouri, J. (2018). Using social network analysis to understand online problem based learning and predict performance. PLoS ONE, 13(9), e0203590. doi:10.1371/journal.pone.0203590
Saqr, M., Fors, U., & Tedre, M. (2017). How learning analytics can early predict under-achieving students in a blended medical education course. Medical Teacher, 39(7), 757-767. doi:10.1080/0142159X.2017.1309376

Saqr, M., Fors, U., & Tedre, M. (2018). How the study of online collaborative learning can guide teachers and predict students' performance in a medical course. BMC Medical Education, 18(1), 24. doi:10.1186/s12909-018-1126-1
Saqr, M., Fors, U., Tedre, M., & Nouri, J. (2018). How social network analysis can be used to monitor online collaborative learning and guide an informed intervention. PLoS ONE, 13(3), 1-22. doi:10.1371/journal.pone.0194777
Saqr, M., Kamal, H., & AlGhasham, A. (2014). Using Learning Management Systems (LMS) Analytics for Early Detection of Students Underachievement in Blended Courses. Paper presented at AMEE 2014.
Saqr, M., Nouri, J., & Fors, U. (in press). Time to focus on the temporal dimension of learning: A learning analytics study of the temporal patterns of students' interactions and self-regulation. International Journal of Technology Enhanced Learning.
Schmidt, H. G., Rotgans, J. I., & Yew, E. H. J. (2011). The process of problem-based learning: What works and why. Medical Education, 45(8), 792-806. doi:10.1111/j.1365-2923.2011.04035.x
Schraw, G. (2010). Measuring self-regulation in computer-based learning environments. Educational Psychologist, 45(4), 258-266. doi:10.1080/00461520.2010.515936
Schunk, D. H. (2012). Learning Theories: An Educational Perspective (6th ed.). Boston: Pearson.
Sclater, N., & Bailey, P. (2015). Code of practice for learning analytics. Jisc. Retrieved from https://www.jisc.ac.uk/guides/code-of-practice-for-learning-analytics
Marin, A., & Wellman, B. (2011). Social network analysis: An introduction. In J. Scott & P. J. Carrington (Eds.), The SAGE Handbook of Social Network Analysis (pp. 11-25). London: SAGE. doi:10.4135/9781446294413
Sergis, S., Sampson, D. G., Leitner, P., Khalil, M., & Ebner, M. (2017). Learning analytics: Fundaments, applications, and trends. In A. Peña-Ayala (Ed.), Learning Analytics: Fundaments, Applications, and Trends (Vol. 94, pp. 25-64). Cham: Springer International Publishing.
Shahiri, A. M., Husain, W., & Rashid, N. A. A. (2015). A review on predicting student's performance using data mining techniques. Procedia Computer Science, 72, 414-422. doi:10.1016/j.procs.2015.12.157
Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289-310. doi:10.1214/10-STS330
Shum, S. B., & Crick, R. D. (2012). Learning Dispositions and Transferable Competencies. Paper presented at the Proceedings of the 2nd International Conference on Learning Analytics and Knowledge - LAK '12.
Shum, S. B., & Ferguson, R. (2012). Social learning analytics. Journal of Educational Technology & Society, 15(3), 3-26.
Sie, R. L., Ullmann, T. D., Rajagopal, K., Cela, K., Bitter-Rijpkema, M., & Sloep, P. B. (2012). Social network analysis for technology-enhanced learning: Review and future directions. International Journal of Technology Enhanced Learning, 4(3/4), 172-190. doi:10.1504/IJTEL.2012.051582
Siemens, G. (2004). Connectivism: A learning theory for the digital age. Elearnspace: Everything eLearning, 1-8.
Siemens, G. (2008). Learning and knowing in networks: Changing roles for educators and designers. ITFORUM for Discussion, 1-26.
Siemens, G. (2013). Learning analytics: The emergence of a discipline. American Behavioral Scientist, 57(10), 1380-1400. doi:10.1177/0002764213498851
Siemens, G., & Baker, R. S. J. D. (2012). Learning analytics and educational data mining: Towards communication and collaboration. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge - LAK '12, 252-254. doi:10.1145/2330601.2330661
Silius, K., Tervakari, A.-M., & Kailanto, M. (2013). Visualizations of user data in a social media enhanced web-based environment in higher education. International Journal of Emerging Technologies in Learning (iJET), 8(S2), 13-13. doi:10.3991/ijet.v8iS2.2740
Sin, K., & Muthu, L. (2015). Application of big data in education data mining and learning analytics - A literature review. ICTACT Journal on Soft Computing: Special Issue on Soft Computing Models for Big Data, 5(4), 1035-1049.
Sitzmann, T., & Ely, K. (2011). A meta-analysis of self-regulated learning in work-related training and educational attainment: What we know and where we need to go. Psychological Bulletin, 137(3), 421-442. doi:10.1037/a0022777
Slade, S., & Prinsloo, P. (2013). Learning analytics: Ethical issues and dilemmas. American Behavioral Scientist, 57(10), 1510-1529. doi:10.1177/0002764213479366
Slater, S., Joksimović, S., Kovanovic, V., Baker, R. S., & Gasevic, D. (2017). Tools for educational data mining: A review. Journal of Educational and Behavioral Statistics, 42(1), 85-106. doi:10.3102/1076998616666808
Slof, B., Erkens, G., Kirschner, P. A., Jaspers, J. G., & Janssen, J. (2010). Guiding students' online complex learning-task behavior through representational scripting. Computers in Human Behavior, 26(5), 927-939.


McPherson, J. M., & Smith-Lovin, L. (1987). Homophily in voluntary organizations: Status distance and the composition of face-to-face groups. American Sociological Review, 52(3), 370-379.
Smith, M. A., Shneiderman, B., Milic-Frayling, N., Mendes Rodrigues, E., Barash, V., Dunne, C., . . . Gleave, E. (2009). Analyzing (Social Media) Networks with NodeXL. Paper presented at the Proceedings of the Fourth International Conference on Communities and Technologies.
Soderstrom, N. C., & Bjork, R. A. (2015). Learning versus performance: An integrative review. Perspectives on Psychological Science, 10(2), 176-199. doi:10.1177/1745691615569000
Sonnenberg, C., & Bannert, M. (2015). Discovering the effects of metacognitive prompts on the sequential structure of SRL processes using process mining techniques. Journal of Learning Analytics, 2(1), 72-100.
Spoon, K., Beemer, J., Whitmer, J. C., Fan, J., Frazee, J. P., Stronach, J., & Bohonak, A. J. (2016). Random forests for evaluating and informing personalized learning. Journal of Educational Data Mining, 8(2), 20-50.
Stahl, G., Koschmann, T., & Suthers, D. (2006). Computer-supported collaborative learning: An historical perspective. In R. K. Sawyer (Ed.), Cambridge Handbook of the Learning Sciences (pp. 409-426). Cambridge: Cambridge University Press.
Steel, P. (2007). The nature of procrastination: A meta-analytic and theoretical review of quintessential self-regulatory failure. Psychological Bulletin, 133(1), 65-94. doi:10.1037/0033-2909.133.1.65
Stemler, S. E. (2015). Content analysis. Emerging Trends in the Social and Behavioral Sciences: An Interdisciplinary, Searchable, and Linkable Resource, 1-14.
Tamim, R. M., Bernard, R. M., Borokhovski, E., Abrami, P. C., & Schmid, R. F. (2011). What forty years of research says about the impact of technology on learning: A second-order meta-analysis and validation study. Review of Educational Research, 81(1), 4-28. doi:10.3102/0034654310393361
Tempelaar, D. T., Rienties, B., & Giesbers, B. (2015). In search for the most informative data for feedback generation: Learning analytics in a data-rich context. Computers in Human Behavior, 47, 157-167. doi:10.1016/j.chb.2014.05.038
Tempelaar, D., Rienties, B., Mittelmeier, J., & Nguyen, Q. (2018). Student profiling in a dispositional learning analytics application using formative assessment. Computers in Human Behavior, 78, 408-420. doi:10.1016/j.chb.2017.08.010
Tervakari, A. M., Marttila, J., Kailanto, M., Huhtamäki, J., Koro, J., & Silius, K. (2013). Developing learning analytics for TUT Circle. IFIP Advances in Information and Communication Technology, 395, 101-110. doi:10.1007/978-3-642-37285-8_11
Trowler, P., & Trowler, V. (2010). Student Engagement Evidence Summary. York, UK: The Higher Education Academy.
Trowler, V. (2010). Student engagement literature review. The Higher Education Academy, 11, 1-15.
Tsai, Y.-S., & Gašević, D. (2017). Learning analytics in higher education - challenges and policies: A review of eight learning analytics policies. In Proceedings of the Seventh International Learning Analytics & Knowledge Conference - LAK '17 (pp. 233-242). doi:10.1145/3027385.3027400
Tsai, Y.-S., Moreno-Marcos, P. M., Tammets, K., Kollom, K., & Gašević, D. (2018). SHEILA Policy Framework: Informing Institutional Strategies and Policy Processes of Learning Analytics. Paper presented at the 8th International Conference on Learning Analytics and Knowledge.
Open University. (2014). Policy on ethical use of student data for learning analytics. Retrieved from http://www.open.ac.uk/students/charter/sites/www.open.ac.uk.students.charter/files/files/ecms/web-content/ethical-use-of-student-data-policy.pdf
Velleman, P. F., & Hoaglin, D. C. (2012). Exploratory data analysis. In H. Cooper, P. M. Camic, D. L. Long, A. T. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA Handbook of Research Methods in Psychology, Vol. 3: Data Analysis and Research Publication (pp. 51-70). Washington, DC: American Psychological Association.
Verbert, K., Govaerts, S., Duval, E., Santos, J. L., Van Assche, F., Parra, G., & Klerkx, J. (2014). Learning dashboards: An overview and future research opportunities. Personal and Ubiquitous Computing, 18(6), 1499-1514. doi:10.1007/s00779-013-0751-2
Vygotsky, L. (1987). Zone of proximal development. Mind in Society: The Development of Higher Psychological Processes, 5291.
Wang, A. Y., & Newlin, M. H. (2000). Characteristics of students who enroll and succeed in psychology web-based classes. Journal of Educational Psychology, 92(1), 137-143. doi:10.1037/0022-0663.92.1.137
Wang, A. Y., & Newlin, M. H. (2002a). Predictors of performance in the virtual classroom: Identifying and helping at-risk cyber-students. T.H.E. Journal, 29(10), 21.
Wang, A. Y., & Newlin, M. H. (2002b). Predictors of web-student performance: The role of self-efficacy and reasons for taking an on-line class. Computers in Human Behavior, 18(2), 151-163. doi:10.1016/S0747-5632(01)00042-5
Wang, G., Gunasekaran, A., Ngai, E. W. T., & Papadopoulos, T. (2016). Big data analytics in logistics and supply chain management: Certain investigations for research and applications. International Journal of Production Economics, 176, 98-110. doi:10.1016/j.ijpe.2016.03.014


Wasson, B., & Mørch, A. I. (2000). Identifying collaboration patterns in collaborative telelearning scenarios. Journal of Educational Technology & Society, 3(3), 237-248.
Webb, N. M. (2009). The teacher's role in promoting collaborative dialogue in the classroom. British Journal of Educational Psychology, 79(1), 1-28. doi:10.1348/000709908X380772
Wecker, C., & Fischer, F. (2014). Where is the evidence? A meta-analysis on the role of argumentation for the acquisition of domain-specific knowledge in computer-supported collaborative learning. Computers & Education, 75, 218-228.
Winne, P. H. (2011). A cognitive and metacognitive analysis of self-regulated learning. In Handbook of Self-Regulation of Learning and Performance (pp. 15-32).
Winne, P. (2017). Learning analytics for self-regulated learning. In C. Lang, G. Siemens, A. F. Wise, & D. Gašević (Eds.), The Handbook of Learning Analytics (1st ed., pp. 241-249). Alberta, Canada: Society for Learning Analytics Research (SoLAR).
Winne, P. H., Nesbit, J. C., & Popowich, F. (2017). nStudy: A system for researching information problem solving. Technology, Knowledge and Learning, 22(3), 369-376. doi:10.1007/s10758-017-9327-y
Wise, A., Zhao, Y., & Hausknecht, S. (2014). Learning analytics for online discussions: Embedded and extracted approaches. Journal of Learning Analytics, 1(2), 48-71.
Wise, A. F., & Shaffer, D. W. (2015). Why theory matters more than ever in the age of big data. Journal of Learning Analytics, 2(2), 5-13. doi:10.18608/jla.2015.22.2
Wolff, A., Zdrahal, Z., Herrmannova, D., Kuzilek, J., & Hlosta, M. (2014). Developing predictive models for early detection of at-risk students on distance learning modules. International Conference on Learning Analytics and Knowledge, 145-149. doi:10.1145/2460296.2460324
Wolters, C. A., & Hussain, M. (2015). Investigating grit and its relations with college students' self-regulated learning and academic achievement. Metacognition and Learning, 10(3), 293-311. doi:10.1007/s11409-014-9128-9
Wong, B. T. M. (2017). Learning analytics in higher education: An analysis of case studies. Asian Association of Open Universities Journal, 12(1), 21-40. doi:10.1108/AAOUJ-01-2017-0009
Wong, Y. Y. (2016). Academic analytics: A meta-analysis of its applications in higher education. International Journal of Services and Standards, 11(2), 176-192. doi:10.1504/IJSS.2016.077957
Wouters, P., Van Nimwegen, C., Van Oostendorp, H., & Van Der Spek, E. D. (2013). A meta-analysis of the cognitive and motivational effects of serious games. Journal of Educational Psychology, 105(2), 249-265. doi:10.1037/a0031311
Wright, M. C., McKay, T., Hershock, C., Miller, K., & Tritz, J. (2014). Better than expected: Using learning analytics to promote student success in gateway science. Change: The Magazine of Higher Learning, 46(1), 28-34. doi:10.1080/00091383.2014.867209
Yang, H. (2013). The case for being automatic: Introducing the automatic linear modeling (LINEAR) procedure in SPSS Statistics. Multiple Linear Regression Viewpoints, 39, 27-37.
You, J. W. (2015). Examining the effect of academic procrastination on achievement using LMS data in e-learning. Journal of Educational Technology & Society, 18(3), 64-74.
Zhang, Z. (2016). Introduction to machine learning: K-nearest neighbors. Annals of Translational Medicine, 4(11), 218.
Zimmerman, B. J. (1990). Self-regulated learning and academic achievement: An overview. Educational Psychologist, 25(1), 3-17. doi:10.1207/s15326985ep2501_2


Zimmerman, B. J. (2008). Investigating self-regulation and motivation: Historical background, methodological developments, and future prospects. American Educational Research Journal, 45(1), 166-183. doi:10.3102/0002831207312909
Zumbrunn, S., Tadlock, J., & Roberts, E. D. (2011). Encouraging self-regulated learning in the classroom: A review of the literature. Metropolitan Educational Research Consortium (MERC), 1-28.


Studies included in the thesis
