DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS STOCKHOLM, SWEDEN 2021

HOW CAN COMPUTER-BASED PROGRAMMING EXAMS BE IMPLEMENTED FOR ENGINEERING STUDENTS?

RICHARD FARSHCHI ALVAREZ

FREDRIK GÖLMAN

KTH SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Abstract

Programming usually involves the use of a computer and a text editor, yet programming courses at universities and university colleges often conclude with traditional written exams to assess the students’ acquired practical knowledge. This traditional examination method limits the complexity of programming problems and can lead to unintentional subjective assessments.

Previous studies on computer-based programming exams that mimic real programming conditions suggest that the assessment of the students’ acquired knowledge can be made more efficient while both grading and administration processes are simplified. We use Bunge’s general scientific method and case study methodology to develop a system for computer-based programming exams that can be implemented at KTH Kista.

Our results suggest that programming exams can be performed securely on the students’ own computers by booting the computer from a USB flash drive into a restricted, preconfigured system environment with blocked Internet access. Paired with the open-source learning management system Moodle, the exams can be administered electronically and designed with automatic grading processes. To assist exam invigilators, we also developed an observation tool that can detect if the restricted system environment is circumvented.

Despite the absence of real-world tests of the system due to the circumstances caused by the ongoing pandemic, we conclude, with the support of previous studies, that our proposed solution for computer-based programming exams can improve the quality and efficiency of programming courses and their examination processes.

Keywords

Examination; Programming; Learning management system; Moodle; Computer-based; Digital; Cheat prevention

KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Abstract

Programming usually includes the use of a computer and a text editor, yet programming courses given at institutions of higher education often conclude with traditional written exams to assess the students’ acquired practical knowledge. This traditional examination method restricts the complexity of programming problems and may result in unintentional subjective assessments.

Previous research on computer-based programming exams that mimic real programming conditions suggest that the assessment of the students’ acquired knowledge can be made more efficient while also simplifying both grading and administration processes. We use Bunge’s general scientific method and case study methodology to develop a system for computer-based programming exams that can be implemented at KTH Kista.

Our results suggest that programming exams can be safely performed on the students’ own computers by booting the computer into a restricted preconfigured system environment with blocked Internet access from a usb flash drive. Paired with the open-source learning management system Moodle, the exams can be administered electronically and designed with automatic grading processes. To help exam invigilators we also developed an observation tool that can detect if the restricted system environment is circumvented.

Despite the lack of real tests of the system due to the circumstances caused by the ongoing pandemic, we conclude, with the support of previous studies, that our proposed solution for computer-based programming exams can improve the quality and efficiency of programming courses and their examination processes.

Keywords

Examination; Programming; Learning management system; Moodle; Computer-based; Digital; Cheat-prevention


Acknowledgements

We would like to express our deepest gratitude and appreciation to our supervisor Anders Sjögren, the Program Director for the Bachelor’s Degree Program in Computer Engineering at KTH. Thank you for all the insightful guidance and feedback that made this thesis possible, and thank you for all the valuable knowledge you have imparted throughout our education.

A special thanks also goes out to our thesis examiner Johan Montelius, Associate Professor in Communication Systems at KTH. Thank you for all the feedback and suggestions during this thesis and for your courses that got us here.


Table of Contents

1 Introduction
1.1 Background
1.2 Problem
1.3 Purpose
1.4 Goal
1.5 Societal benefits, Ethics and Sustainability
1.6 Methodology / Methods
1.7 Stakeholders
1.8 Delimitations
1.9 Disposition
2 Theoretical background and literature review
2.1 Related studies
2.2 Existing solutions
3 Methodology and methods
3.1 Research methodology and methods
3.2 Project methodology and methods
4 Results
4.1 Case study design
4.2 Solution requirements
4.3 Existing solutions as viable alternatives
4.4 Proposed solution
4.5 Case study questionnaires
5 Discussion
5.1 Existing solutions
5.2 Proposed solution
5.3 Project assignments in lieu of written exams
5.4 Validity and reliability
6 Conclusion
6.1 Limitations
6.2 Future work
References
Appendix A


1 Introduction

This thesis investigates the benefits of computer-based exams for programming courses and evaluates the systems for computer-based exams currently implemented at various institutions of higher education to find a suitable solution that could be implemented for programming courses at KTH Kista.

1.1 Background

Written exams using pen and paper are a common examination method for courses given at institutions of higher education [1]. They are simple to administer, and the rules that students must adhere to are easy to enforce. However, they have their shortcomings, especially in the examination of practical programming skills.

This thesis has its roots in the limitations that a written exam using pen and paper often imposes on programming problems. One limitation is that the problems must be simplified for multiple reasons. While students would normally be expected to be familiar with the basic syntax of the programming language in question, the unavailability of application programming interface (API) reference materials, an integrated development environment (IDE), or at the very least a text editor and a way to execute the code, severely restricts the complexity of programming problems that are viable on a paper-based exam. Another limitation is that there must often be some room for syntactic errors in code written on paper, which poses the risk of variable assessments on behalf of the examiner [2]. Paper-based exams may even result in unintentional subjective assessments for multiple reasons, including examiners being unfamiliar with some programming language features, examiners being biased towards certain ways of solving a problem, and examiner fatigue from repeatedly correcting a large number of exams, leading to a grading process of lesser quality.

Since programming usually includes the use of a computer, it is only fitting that programming exams should do the same to mimic real programming conditions. The use of computers could also be beneficial as a tool in the grading and administration processes of exams, allowing for automatic grading and archiving of digitally submitted answers. There are, however, downsides to using computers during exams. They introduce several problems that must be addressed, mainly how to prevent students from cheating by using prohibited resources or by plagiarizing the work of others. This may be especially true for Bring Your Own Device (BYOD) exams compared to exams taking place on the institutions’ computers [3]. Other problems that may arise are ensuring there are sufficient computers to lend to students who are unable to bring their own and making sure there are contingency plans in place in case of network issues or temporary power losses.


While the world transitions into a more digital society, one could presume that support for computer-based exams will increase over time. Research suggests that they are already generally well received by students and teachers alike, even by people with weaker computer skills and in locations where infrastructure may be questionable [4].

1.2 Problem

The main research problem that this thesis attempts to solve is defined as:

“How can computer-based programming exams be implemented for engineering students?”

The secondary research problem that helps solve the main problem is defined as:

“How can computer-based programming exams be implemented for engineering students at KTH Kista?”

1.3 Purpose

The purpose of this thesis is to increase the quality and efficiency of programming courses and their examination processes.

1.4 Goal

The goal is to evaluate existing solutions for computer-based exams and develop a solution that could be implemented for programming courses given at KTH Kista.

1.5 Societal benefits, Ethics and Sustainability

The main intention is for this study, along with the proposed solution, to be usable by KTH Kista and other institutions that are interested in systems for computer-based exams.

Ethically, there are some possible concerns that need to be considered if an institution chooses to implement computer-based exams. A notable concern is that some students may not have their own computers. This issue could be mitigated by allowing the exams to be taken on existing lab machines or by keeping computers in stock for exam purposes. Another ethical aspect that needs consideration is related to accessibility. Since computer-based exams introduce a greater degree of complexity, it is important to ensure that students with disabilities are offered sufficient assistance. The main ethical aspect, however, is to ensure that cheating is not facilitated by the exam being computer-based.


With regards to sustainability, we could see a benefit for the environment in the reduced usage of paper. If computer-based exams have a positive impact on the quality of the courses, we could also be looking at increased human sustainability as students graduate with greater knowledge. The impact on economic sustainability is unclear. The simplified management of exams, including the grading process, could reduce the overall costs of the institutions; however, maintaining a new system, which may include additional computers and other devices, could introduce additional costs.

1.6 Methodology / Methods

The methodology used to answer the research problem is based on Bunge’s general scientific method for technological research and includes case study methodology. Theoretical knowledge is obtained through literature studies. The prototype for KTH Kista is developed iteratively using a lightweight Kanban-like process following Lean and Agile principles, guided by the MoSCoW method for requirement prioritization to keep the project within its budget and delivery frames.

1.7 Stakeholders

The stakeholder for this project is KTH, which has expressed an interest in the investigation into computer-based exams for programming courses given at KTH Kista.

1.8 Delimitations

The topic and problem area of this thesis have many relevant facets, and while the authors would like to investigate them all, the following delimitations must be set due to time constraints:

• This thesis does not provide an in-depth investigation into all existing solutions mentioned.
• This thesis does not investigate a solution for examinations performed at a distance.
• While computer-based exams may be applicable and implemented for courses in most study areas, this thesis focuses on programming courses. The investigation into a generalized solution is left for future research.
• Solutions that require paid licences are generally not considered viable alternatives for the proposed solution; however, they are still mentioned and evaluated for completeness.


1.9 Disposition

This thesis contains the following chapters:

• Chapter 2 presents the scientific findings of related studies and describes existing solutions for computer-based exams currently in use at various institutions of higher education.
• Chapter 3 describes the technical and scientific methodologies and methods used in the project.
• Chapter 4 presents the results, containing an evaluation of existing solutions for the selected case and the proposed solution.
• Chapter 5 presents an evaluation and discussion of the presented results, written exams as an examination method, and the validity and reliability of this thesis.
• Chapter 6 presents a conclusion, identified limitations, and suggestions for future work.


2 Theoretical background and literature review

This chapter presents the scientific findings of related studies in section 2.1 and describes existing solutions for computer-based exams in section 2.2.

2.1 Related studies

Several studies suggest that computer-based programming exams are more efficient in the assessment of the students’ acquired knowledge while also being the favored choice among students. For instance, Kumar found that more than 75 percent of students in an introductory computer science course believed that “online tests were better at testing their learning in Computer Science I than written tests” [5], while Daly and Waldron, in their study comparing on-computer lab tests to written exams in an introductory programming course, found that “lab exams are more accurate assessors of programming ability than traditional methods such as written exams or programming assignments” [6].

In another study, Grissom et al. compared two groups of students taking a computer science course one year apart, facing the same binary tree implementation problem [7]. The first group consisted of eighteen students who took the exam on paper and managed 17 percent correct solutions, whereas the second group consisted of twenty-four students who took the exam on computers, with access to an IDE and the Java API documentation, and managed 58 percent correct solutions. While the difference in results is significant, it should be noted that each group’s sample is quite small, leaving some interpretability concerns. In another study, Stephenson at the Department of Computer Science, University of Calgary, presents several findings that are in line with what is intuitively expected, as well as several findings and approaches that are surprising [8]. First, he describes their exam as a hybrid solution consisting of a paper-based multiple-choice questionnaire and a computer-based programming test with multiple problems. The multiple-choice questionnaire serves to test the basic knowledge that the students are expected to have, and it must be completed before access is granted to the advanced part of the exam, which takes place at the computer. The interesting part about the computer-based exam is that not only are API references and IDEs allowed, but also Google searches and Stack Overflow posts. The reasoning behind this choice is that the examination is supposed to assess practical abilities in real-world situations.

The risk of cheating is mitigated by carefully constructing problems that are not readily available online; such problems may share some similarities with existing ones, but not to the degree of making copy and paste viable. Another approach to reducing the risk of cheating is to provide code snippets that must be used in the solution, thus setting constraints on how the tasks can be solved. Students are naturally also not allowed to communicate with anyone during the exam.

The investigation was conducted on the same course on two separate occasions, with 36 and 126 enrolled students respectively. The study showed no findings suggesting that any attempts at cheating had occurred or that the test design was unsatisfactory. On the contrary, both students and instructors were generally positive towards the exam being computer-based. In the course evaluation, one question was formulated: “The evaluation methods used for determining the course grade were fair”, with a scale from one to seven, where just above 50 percent answered 7 - “strongly agree”, and roughly 30 percent answered 6 - “agree”, for a total average of 6.22 out of 7.

Stephenson also brings forward some specific aspects surrounding the computer-based exam, both positive and negative. Noteworthy positive aspects include improved grading accuracy, explained by grading occurring in a partially automated manner through automated tests. He argues that unusual solutions that produce the correct results pass the tests, whereas a manual grading process may mark such a solution as incorrect due to the examiner being unfamiliar with some features of the programming language. Similarly, incorrect solutions were not prematurely graded as correct based on faulty assumptions, as may happen on paper.

The study also suggests that less paper management is involved when exams are submitted and graded electronically. The tasks involved in the management of the exams were either simplified or eliminated completely. Stephenson’s reasoning that answers are easier to read is believable, yet slightly amusing. We could probably all agree that some students have poor handwriting, and that repeated erasing and rewriting may affect the interpretability of submitted answers. However, he also argues that: “We have also wondered, at times, if students have intentionally obfuscated their writing on an exam with the hope that a grader will guess in their favour when struggling to interpret the answers”.

The most noteworthy negative aspect in Stephenson’s study relates to questions not being reusable in the approach with full Internet access, and one surprising neutral aspect highlighted is that no significant time savings were made.

Lastly, in the on-computer part of the exam, the students could choose between using their own computers or existing lab machines, and roughly 70 percent chose to work exclusively from their own computers, showing that on-computer tests can be possible without providing lab machines for every enrolled student. Interestingly, as this was an option, roughly 15 percent of the students chose to use both their own machines and the lab machines at the same time, essentially using the second machine as an extra monitor.


2.2 Existing solutions

There are multiple solutions currently available for computer-based exams that are not necessarily designed for programming courses. This section describes some of those most frequently used in Sweden and its neighboring countries, and briefly some used at other international institutions. They can be divided into server-side software, often referred to as Learning Management Systems (LMS) or purely assessment-oriented software, and client-side software, often in the form of a secure browser with systems in place to prevent cheating. An evaluation of the viability of these solutions for programming courses at KTH Kista is presented in section 4.3 and further discussed in section 5.1.

2.2.1 Learning Management Systems

As the name suggests, a Learning Management System (LMS) is a platform that provides a complete learning experience. In higher education, this often takes the form of courses. LMSs can provide many types of content, different communication channels, announcements, different types of questions for tests and quizzes, enrollment handling and various groupings, and handling of documents and other files. There are many different LMSs on the market, where some are commercial products and some are open source. Some of the most well-known LMSs are Canvas, Blackboard and Moodle, each described in the following subsections.

2.2.1.1 Moodle

Moodle is a free and fully open-source LMS, licensed under GNU GPLv3, and may as such be freely modified and redistributed [9]. Moodle was originally developed by Martin Dougiamas, with its initial release in 2002. It is written in PHP and is actively developed, with new releases following a recurring, predictable life cycle and available Long-Term Support (LTS) versions. Moodle is cross-platform and runs on any system that supports PHP and one of the supported databases, such as MySQL. A web server, such as Apache, is also required to present it to the end user. Moodle supports many features by default but is also barebones in the sense that it does not target one specific audience; rather, it can be customized through its rich community, which provides a large number of plugins and themes. As of May 2021, Moodle’s plugin directory holds over 1,800 plugins, with a total of 503,300 recent downloads.

2.2.1.2 Canvas

Canvas is developed by Instructure, Inc., which was founded in 2008 by the two Brigham Young University students Brian Whitmer and Devlin Daley [10]. It is written in Ruby on Rails and licensed under AGPLv3. It was initially made available as open source in 2011; however, while the core is still open source today, much of its functionality and add-ons became proprietary in 2020.

2.2.1.3 Blackboard

Blackboard, also known as Blackboard Learn, is developed by Blackboard Inc. [11]. Its initial release was in 1997, with the latest stable release dated October 2014. Blackboard is commercial proprietary software. While being one of the more popular LMSs on the market, it has also suffered criticism regarding reliability issues, which has resulted in some educational institutions switching to other solutions [12].

2.2.2 Pure E-Assessment software

This section describes two solutions that focus entirely on the educational assessment aspect. E-assessment platforms are generally server-side software that provide a suite of different examination types, which can often be combined with some of the tools mentioned in section 2.2.3 for greater protection against cheating.

2.2.2.1 Inspera Assessment

Inspera can be seen as a major player with regards to computer-based exams. Inspera Assessment is a cloud-based service developed in Norway that provides a feature-rich environment to administer and deliver tests to students and is widely used in Sweden [13].

On their blog, Inspera states that educational institutions ask themselves: “with a bit of customisation and a lock-down browser, why can’t we just do digital exams in our LMS/VLE? Why do we need yet another tool?” [14]. They argue that the short answer is: “different platforms are built with different purposes, and therefore have different strengths. VLE/LMS platforms are great for online training, communication between students and teachers, distribution of content and collaboration during the course. An assessment platform, like Inspera Assessment, provides that functionality as well, but it is furthermore specifically designed to securely support all processes and tasks related to assessment. While learning and assessment are tightly connected, the requirements to the platforms that support them are not the same.”

They then proceed to expand on it in four different categories.

• End-to-end online assessment solution
• Scalability
• Security
• Pedagogical impact


End-to-end assessment refers to the fact that a pure assessment platform provides the whole scope of functionality required to administer exams, everything from designing and delivering the exam to marking and sharing results, and they specifically mention that Inspera Assessment is role-based, i.e., authors, markers, invigilators, etc., whereas VLE/LMS platforms often lack such functionality.

Scalability refers both to being able to handle a large number of students at the same time and to being able to handle a wide variety of exam types for different areas of study. They highlight that Inspera Assessment supports over thirty different question types by default. They also mention that in 2019, over two million exam submissions were handled in over seventy institutions across the world. They finally conclude that stability on such large-scale exams simply is not possible with VLE/LMS delivery infrastructure.

Security, not surprisingly, refers to how digital exams may be susceptible to cheating. The main way that Inspera Assessment handles this is by allowing tests to be generated on-the-fly from a collection of appropriate questions. Furthermore, they support the use of a lockdown browser, Inspera Safe Exam Browser, which presumably is a fork of Safe Exam Browser, described in section 2.2.3.3. They argue that this is only sometimes possible with a VLE/LMS platform, but that other security factors such as system infrastructure, monitoring, and registration scalability make such a solution unsustainable. Lastly, Inspera Assessment supports plagiarism checks.

Pedagogical impact refers to the way in which Inspera Assessment supposedly provides superior marking functionality with its advanced annotations that allow for criteria-based marks and supporting a wide variety of communications channels in which feedback can be provided to the students.

The comparison is finally concluded with a “Best used for” bullet point list which, for Inspera Assessment, goes as follows:

• Formative assessments.
• Summative assessments.
• Open and closed book exams.
• Bring-Your-Own-Device exams, at-home testing, and on-site exams.

Whereas a LMS would be “Best used for”:

• Quizzes and tests.
• Formative assessments.
• Peer reviews.

Among the higher education institutions using it are Lund University, Uppsala University, Gothenburg University, Chalmers University, Umeå University, Swedish University of Agricultural Sciences, Jönköpings University and Malmö University [15][16]. Other notable universities in Scandinavia that offer computer-based exams through Inspera include the University of Oslo and the University of Iceland.

2.2.2.2 Wiseflow

Wiseflow is another computer-based exam platform commonly used throughout Europe. It was originally developed by personnel at Aarhus University who founded Uniwise, the company currently responsible for the product. Like Inspera Assessment, Wiseflow is a cloud platform used to administer and deliver tests to students and can be bundled with a secure browser, FLOWLock/LockDown browser, which, similar to Safe Exam Browser (described in section 2.2.3.3), attempts to prevent students from cheating by controlling the system it is running on. In response to the ongoing Covid-19 pandemic, Wiseflow in combination with FLOWLock has since April 2020 provided additional functionality for exams taken at a distance: taking snapshots of the student via web camera and automatically analyzing those images to ensure that the expected student is taking the exam [17].

Among institutions in Sweden using Wiseflow are Linköping University, Karlstad University, Örebro University and Högskolan i Borås [16]. Another notable university in Scandinavia that uses Wiseflow is Aarhus University in Denmark.

2.2.3 Client-side software

In this context, the client-side software consists almost entirely of different tools to restrict or surveil the student’s environment. So-called lockdown browsers are essentially web browsers that may only be used to access specific resources while also restricting the user’s ability to control the system. Surveillance can occur in different ways, ranging from video and still images to logging.

2.2.3.1 DigiExam

DigiExam has its origin in Sweden, being developed by IMS Global Learning Consortium, and is perhaps the solution that officially integrates with the largest number of learning management systems [18]. Some of the most well-known are Blackboard, Moodle, Canvas and Ping Pong. DigiExam provides functionality both for the teachers to create and administer exams and for students in the form of a secure environment to take the exam in, while controlling the system to prevent cheating.

DigiExam is used by Lund University, Umeå University and Stockholm School of Economics [16][19]. From an international perspective, a notable university that uses DigiExam is Columbia University in New York.

2.2.3.2 Exam Monitor

Exam Monitor is developed and used by the University of Southern Denmark [20][21]. It attempts to prevent cheating by recording user activity: taking still images of the user’s screen at short intervals, capturing audio continuously, and keeping track of user processes. An interesting aspect of Exam Monitor is that it does not necessarily require an Internet connection but can store its recordings locally for later submission, which eliminates an otherwise worrisome point of failure in wireless connections. The authors are unaware of any Swedish universities using Exam Monitor, but one notable Scandinavian university that uses it is the University of Bergen in Norway.

2.2.3.3 Safe Exam Browser

Safe Exam Browser is an open-source secure browser that provides a client-side environment focused on preventing the computer from being used as a prohibited means of assistance [22]. It is essentially a web browser which restricts which resources are accessible while also restricting running processes on the system. It is developed and maintained by a team at ETH Zurich, a university ranked as one of the best technical universities in the world. Safe Exam Browser officially supports direct LMS integration with Moodle and ILIAS.

2.2.3.4 LockDown

LockDown Browser is proprietary software developed by the American company Respondus [23]. As with other systems mentioned in this section, it also aims to provide a secure environment on the client side. Besides being proprietary, and unlike Safe Exam Browser, it comes with a hefty price tag, starting at 2,795 USD per annum at the time of writing this thesis. LockDown officially integrates with several well-known LMSs, among others Blackboard, Moodle, and Canvas.

2.2.4 Tools

This section describes the two most relevant Moodle plugins for programming exams in sections 2.2.4.1 and 2.2.4.2. The remainder of this section focuses on other tools vital to handling functionality with regards to code execution in a controlled environment as well as network restrictions.


2.2.4.1 CodeRunner

CodeRunner is a Moodle plugin, developed by Richard Lobb, University of Canterbury, and Tim Hunt, The Open University, UK, providing programming-related question types for Moodle as well as evaluation of programming code submitted by the students [24]. It can be configured to run in a sandbox environment on a separate server, a so-called Job Engine (Jobe) server. It supports Python 2, Python 3, C, C++, Java, PHP, Pascal, JavaScript, Octave and Matlab by default but can be extended to evaluate other languages with some effort from the administrator. It allows automatic grading in the form of tests, similar to unit tests used in programming. It allows for a wide variety of exam design configurations, including, among other settings, whether to show students which test cases their submissions have passed, and whether to allow multiple submissions after feedback, with or without a configurable penalty for failed attempts.
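To make the automatic grading concrete, the following sketch illustrates the kind of question and test cases that could be configured in CodeRunner. It is illustrative only: the function name, test values and expected outputs are hypothetical, and the actual configuration is done through Moodle’s question editor.

    // Hypothetical exam question: implement sumArray so that it returns
    // the sum of all numbers in an array. A correct reference answer:
    function sumArray(numbers) {
        return numbers.reduce((total, n) => total + n, 0);
    }

    // Each test case runs the submission and compares its printed output
    // with the expected output; all cases must match for full marks.
    console.log(sumArray([1, 2, 3]));   // expected output: 6
    console.log(sumArray([]));          // expected output: 0
    console.log(sumArray([-5, 5, 10])); // expected output: 10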

CodeRunner has been used to evaluate programming exams at the University of Canterbury for over a decade, evaluating several million student submissions. According to the developers, a modern octa-core Moodle server can handle well over a thousand Python submissions per minute, assuming the submissions are not overly computationally expensive, and still maintain a response time of less than three to four seconds. In a real-world exam with about 500 students, the developers found that the Moodle server was under only light to moderate load.

2.2.4.2 Virtual Programming Lab

Virtual Programming Lab (VPL) is another Moodle plugin, developed by the Department of Computer Science and Systems, University of Las Palmas de Gran Canaria, Spain [25]. VPL may be used to design exams and supports a wide variety of settings, including blocking copy and paste functionality and providing students with sets of boilerplate code to start from. Student-submitted source code is run in a sandboxed environment with the help of the so-called VPL-Jail-System, which can restrict resource allocation. The execution can end for four reasons: it exits normally; it terminates once it depletes its allocated resources; it terminates upon user request; or it terminates after the Moodle server requests it, e.g., when a user attempts to execute another task while already having one running, which is not permitted. VPL supports the following languages out of the box: Ada, C, C++, C#, Fortran, Haskell, Java, Matlab/Octave, Pascal, Perl, PHP, Prolog, Python, Ruby, Scheme, SQL and VHDL.


2.2.4.3 Job Engine

A Job Engine (Jobe) server is an environment whose function is to compile, if needed, and run jobs [26]. Jobe is mainly written in PHP and implemented using the CodeIgniter framework. It only runs on Linux.

The expected input is the source code to be evaluated, the standard input and an optional list of additional files. The output from the Jobe server is some status information as well as the output and potential error output from the execution. Communication occurs over a RESTful API.
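As a sketch of what this communication could look like, the following request submits a small job to a hypothetical Jobe host. The request shape follows the run_spec format of the Jobe REST API, but the host name is a placeholder and the field names should be verified against the documentation for the version in use (Node 18+ is assumed for the built-in fetch).

    // Submit a run request to a Jobe server and print the result.
    (async () => {
        const response = await fetch(
            "http://jobe.example.org/jobe/index.php/restapi/runs", {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({
                run_spec: {
                    language_id: "python3",      // language to run the job with
                    sourcecode: "print(40 + 2)", // the code to be evaluated
                    input: ""                    // data supplied on standard input
                }
            })
        });
        const result = await response.json(); // status, output and error output
        console.log(result.stdout);           // "42" on a successful run
    })();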

A Jobe server has been used as the default way of evaluating student submissions at the University of Canterbury since July 2014, with only a few minor issues experienced over several hundred thousand submissions. Its sandboxing functionality is implemented with the help of the DOMjudge runguard program, which restricts jobs submitted by students to run with limited resource allocation, but it does not restrict system calls. It is imperative that the system running Jobe is properly firewalled, as other safeguards against unauthorized access are not enabled by default.

2.2.4.4 Uncomplicated Firewall

Uncomplicated Firewall (ufw) is, as the name suggests, meant to provide a simpler way to create rules for network traffic compared to the otherwise quite verbose and complex iptables configuration [27]. Common parameters for such rules include whether the traffic is incoming or outgoing, to or from what destination it is addressed, what port it is on, and what protocol the data uses. Ufw is the default firewall configuration tool in Ubuntu but is also available by default in many other Linux distributions.


3 Methodology and methods

This chapter describes the methodologies and methods used for conducting the research in section 3.1, and for fulfilling the degree project commitment with regards to the project management triangle in section 3.2, to obtain the results required to solve the main research problem: “How can computer-based programming exams be implemented for engineering students?”.

The result that would solve the research problem is a proposed solution for computer-based programming exams that:

1. mimics real programming conditions,
2. allows for an increased degree of exam problem complexity,
3. prevents students from plagiarizing the work of others and using prohibited resources, i.e., cheating, and
4. allows for answers to be submitted as digital files.

3.1 Research methodology and methods

This section describes how the research is conducted to obtain scientifically valid results. Subsection 3.1.1 introduces Bunge’s general scientific method for technological research, subsection 3.1.2 introduces the case study methodology, subsection 3.1.3 introduces the questionnaire for data collection, and subsection 3.1.4 describes an adapted research method based on the above, and its usage to produce the results that would solve the research problem.

3.1.1 Bunge’s general scientific method for technological research

Bunge [28] defines a method to be scientific if the following conditions are met: a) it is intersubjective in the sense that it gives roughly the same results for all competent users, b) it can be checked or controlled by alternative methods, and c) there are well-confirmed hypotheses or theories that help explain how it works. Bunge continues to describe the scientific method as a series of cognitive operations performed in order, which is known as the general scientific method. An adaptation of this method for technological research is described by Andersson and Ekholm [29] using the following steps:

1. How can the current problem be solved?
2. How can a technology/product be developed to solve the problem in an effective way?
3. What documentation/information is available and required to develop the technology/product?
4. Develop the technology/product based on the data/information in step 3. If the technology/product proves satisfactory, proceed to step 6.
5. Try a new technology/product.
6. Create a model/simulation of the proposed technology/product.
7. What is entailed, i.e., what are the consequences of the model/simulation in step 6?
8. Test the application of the model/simulation. If the outcome is not satisfactory, proceed to step 9, otherwise proceed to step 10.
9. Identify and correct for deficiencies in the model/simulation.
10. Evaluate the result in relation to existing knowledge and practice, and identify new problem areas for further research.

3.1.2 Case study methodology

A case study investigates a case to answer specific research questions. The “case” in a case study can be either an individual, a group, an institution, or a community [30]. Yin [31] defines a case study as an empirical method that investigates a contemporary phenomenon in depth and within its real-world context, especially when the boundaries between phenomenon and context may not be clear. Yin states that case study research is most likely to be appropriate to “how” and “why” questions and that stating a proposition moves the research in the right direction. He continues to define four basic types of designs for case studies which are either single- or multiple-case holistic or embedded designs, illustrated by the 2 x 2 matrix in Figure 1. The single-case study is analogous to a single experiment, which according to Ragin, cited in [32], constitutes a qualitative research method as opposed to the multiple-case design which constitutes a quantitative method.

Figure 1. Basic types of designs for case studies.

A case study can be conducted by following five major process steps. These steps are almost the same for any kind of empirical study [32]:

1. Case study design: objectives are defined, and the case study is planned.
2. Preparation for data collection: procedures and protocols for data collection are defined.
3. Collecting evidence: execution with data collection on the studied case.
4. Analysis of collected data.
5. Reporting.

3.1.3 Questionnaire for data collection

The questionnaire, as described by Dawson [33], can serve as both a qualitative and a quantitative methodology for gathering data. The close-ended questionnaire follows a set format to generate statistics in quantitative research and can be produced in great numbers, while the open-ended questionnaire leaves blank sections for the respondents to write their answers, generating subjective data such as opinions and experiences. A combination of both open and closed questions is often used and allows for the gathering of both statistical and subjective data. The questionnaire can also be either self-administered or interviewer-administered, meaning that the questionnaire is either delivered to the respondents, who fill it in on their own, away from the researcher, or that the questionnaire is filled in by, or in the presence of, the researcher.

3.1.4 Adapted research method

The adapted method used to conduct the research is based on Bunge’s general scientific method for technological research and includes a holistic (single unit of analysis) single-case study performed at KTH Kista, for which a proposed solution is developed and evaluated using questionnaires with both open and closed questions. The adapted method includes the following steps, also illustrated in Figure 2:

1. Define the research problem.
2. Perform background research:
   a. Literature study.
   b. Case study design.
3. Suggest and validate a solution.
4. Implement and test the suggested solution. If the outcome is not satisfactory, return to step 3.
5. Evaluate the results. If the results are not satisfactory, return to step 3.
6. Present the results.
7. Discuss and evaluate the accepted solution.


The first step involves defining the research problem to be investigated. This is followed by performing background research on the problem. The literature study step includes researching related studies and similar solutions for computer-based exams. The case study design step involves selecting and researching the conditions for the case and stating a proposition for the case study. The objectives for the case study are then defined and the study is planned. In step 3, a solution that would solve or advance the solution of the research problem is suggested and validated. In step 4, the suggested solution is implemented and tested, and in step 5 the results are evaluated. If the solution proves to be satisfactory, it is accepted. Steps 3-5 with internal testing are performed iteratively until all requirements are met and a working solution is developed. The evaluations performed in step 5 of the final iteration include data collected from questionnaires given to test participants of the real-world testing. In step 6, the results are presented, and in step 7 a discussion is made about the proposed solution along with suggestions for further research and development.

Figure 2. Adapted research method.

3.2 Project methodology and methods

This section describes the methodology and methods used to fulfill the degree project commitment with regards to the project management triangle, a model in which the corners of a triangle make up three interdependent constraints for the project: time, cost, and scope. The degree project that encompasses the research and development that make up this thesis is constrained by a strict deadline and budget for its completion, leaving scope as the only aspect that can be made flexible. The project is kept within these parameters by incorporating principles from Lean and Agile methodologies and using the MoSCoW method for requirement prioritization. This section is included in this thesis for the fulfillment of the degree project objectives, which include demonstrating the ability to plan and with appropriate methods undertake tasks within predetermined parameters.

Subsection 3.2.1 introduces the MoSCoW method, and subsection 3.2.2 describes the adapted project methodology and its usage to fulfill the project commitment.


3.2.1 MoSCoW method

The MoSCoW method is a prioritization technique developed by Clegg [34] used to decide the order in which requirements should be completed. It is often used when there is a fixed deadline to a project to help shift the focus to the most important requirements. MoSCoW itself is an acronym that stands for must, should, could and won’t, which are the four levels of importance that a requirement can be labeled as, where:

• Must – are the critical requirements that must be fulfilled within the given timeframe.
• Should – are the important, but not necessary, requirements that should be fulfilled.
• Could – are the desirable requirements of less importance that could be fulfilled.
• Won’t – are the least important requirements that will not be fulfilled.

3.2.2 Adapted project methodology

The adapted project methodology used for the work behind this thesis incorporates both Lean and Agile principles, as described by the Poppendiecks [35] and the Agile Manifesto [36], many of which overlap, and uses a process for planning and handling tasks similar to Kanban: a lightweight process that uses a board with To Do tasks, focusing on visualization, flow, and limiting work in progress.

The adapted methodology follows the Lean principle of team empowerment, which suits this project well given that the team is small and that the work is done remotely. The team members are self-organized and pull tasks from a simplified Kanban board which tracks a set of prioritized tasks. The tasks correspond to project requirements that are identified and analyzed using the MoSCoW method, where the “must” label is set only to requirements that are deemed critical for solving the research problem. Additional requirements of lower importance are implemented as the remaining time and budget permits.


The identification and analysis of requirements is done in the background research stage of the adapted research method, illustrated in Figure 3. Tasks that fulfill a requirement are then broken down into multiple subtasks that preferably can be completed in different ways and improved upon iteratively with the steps: suggest and validate solution, test solution, and evaluate results. This helps reduce waste, increase delivery speed, and delay decisions that reduce flexibility in the project.

Figure 3. Extended model for adapted research method.


4 Results

This chapter presents the results that solve the research problem of this thesis, formulated in section 1.2 as: “How can computer-based programming exams be implemented for engineering students?”.

This question is answered by the generalization of the results of the conducted case study that solves the secondary research problem, formulated as: “How can computer-based programming exams be implemented for engineering students at KTH Kista?”.

Section 4.1 describes the case study design and the constraints of the case. Section 4.2 describes the requirements for the proposed solution, presented as the results of the conducted MoSCoW analysis together with a short description of the most critical requirements. Section 4.3 presents the results of the study into the viability of existing solutions for computer-based exams, adapted for programming courses, for the selected case. Section 4.4 presents the proposed solution that meets the requirements listed in section 4.2, and section 4.5 presents the questionnaires for data collection that serve to evaluate the testing of the proposed solution.

4.1 Case study design

The case study attempts to solve the secondary research problem. The case is KTH Kista for which a solution is developed and tested. A proposition for this case study is stated as: “The proposed solution for computer-based programming exams increases the overall quality of programming courses”.

This proposition was supposed to be evaluated through data collected from questionnaires filled out by the people participating in the testing of the proposed solution, including the students taking the exam, the invigilators supervising the students and the teachers responsible for the courses in which the exams were given. Unfortunately, testing the proposed solution in real situations has not been possible due to scheduling issues and the fact that most exams are taken at a distance due to the restrictions caused by the ongoing Covid-19 pandemic. The proposition is instead evaluated from the results of the authors’ internal testing of the proposed solution. The conclusions from these evaluations are presented in chapter 6. Ideally, the proposed solution would have been tested with real participants in real courses with exams taken in classrooms with invigilators present.

The case is bounded by a set of infrastructural constraints, of which the most notable one is related to network connectivity. The institution offers Internet access in classrooms mostly through wireless networks, either through a private local network or the commonly available Eduroam network for institutions of higher education. There are a select few classrooms that offer ethernet connectivity; however, these classrooms are often booked and are rarely used for exams.

Another constraint is related to the number of lab computers available. If the exam is taken solely on lab computers offered by the institution, there is a set limit to the number of students that can take the exam simultaneously. These computers are usually meant for general use, and booking them might cause problems for other students that depend on lab computers for their studies.

The availability of sufficient power is not a concern; however, in the case that every student might need access to a power outlet at the same time, wall outlet extensions are likely required.

Desk screens or dividers are not installed in the classrooms, which limits the number of students that can be seated in each classroom to reduce the risk of computer screens being visible to other students.

4.2 Solution requirements

The prioritized list of requirements in Table 1 is the result of the MoSCoW analysis performed on the defined set of requirements based on the stakeholder’s wishes and the infrastructural constraints of the case for which the solution is developed.

The “must” requirements represent the most critical aspects of the proposed solution. The ability to perform programming exams on students’ own computers, so-called BYOD exams, is deemed critical since the institution is unable to provide and maintain a large number of computers for many reasons. Ensuring that exams are fair and that the computers are not able to provide prohibited forms of assistance requires the ability to limit the system environment as deemed fit and the ability to block Internet access. The standard rules that apply during an exam must also be followed, including the ability to verify the identities of the examinees and to limit the writing time. Support for video surveillance and exams taken at a distance are interesting features that would be beneficial to the system; however, they are deemed less important at this time and are thus excluded. Other requirements that would improve the solution are labeled as “should” or “could”.


Table 1. A prioritized list of requirements for the proposed solution.

must:
• Bring Your Own Device
• Limited system environment
• Internet access blocking
• Student identity verification
• Writing time limitation

should:
• Automatic detection of security system circumvention attempts
• Ability to choose and configure preferred programming environment
• Backups of work in progress

could:
• Support for automatic grading
• Limited Internet access for approved sources
• Student/teacher communication through approved messaging channel

won’t:
• Support for video surveillance
• Support for exams taken at a distance

4.3 Existing solutions as viable alternatives

Moodle was chosen as the server-side platform to provide the functionality for the computer-based exams. As the number of subjects was limited, i.e., only programming exams, it did not make sense to use a full-scale solution such as the pure e-assessment platforms Inspera Assessment and Wiseflow. That left the LMS alternatives, mainly Blackboard and Canvas. Blackboard is entirely proprietary and has at times suffered from stability issues. Canvas was very interesting, in large part because it is a system currently in use at KTH, which likely means a lower learning curve. Both Blackboard and Canvas suffered from very little detail being available with regards to plugins for expanding functionality. This, along with the fact that Moodle has a large repository of plugins, a strong community, is fully open source, is stable, and has a long history of predictable release cycles, led to it being the final choice. Moodle is also lean by default and allows flexible configuration.

CodeRunner in combination with a Jobe server was evaluated first. Both CodeRunner and VPL could have been great alternatives, but CodeRunner fulfills all the requirements in a satisfactory manner, and the fact that it and Jobe are developed by the same developer is positive from a compatibility perspective, leading to it becoming the final choice.


Client-side solutions in the form of lockdown browsers were not viable since they firstly overlap with the control that can be enforced directly in the provided system environment, and secondly negate the whole idea of providing a natural everyday environment for the student to work in. Safe Exam Browser also does not support Linux, and LockDown comes with a hefty price tag, which in both cases works against their viability. Exam Monitor could have been interesting, but the proposed observation tool presented in section 4.4 serves a similar purpose, albeit in a different manner.

4.4 Proposed solution

The proposed solution for computer-based programming exams at KTH Kista is based on the imposition of a limited system environment on the students’ own computers with Moodle as the backend. Subsection 4.4.1 describes the client side of the system and subsection 4.4.2 describes the backend. Subsection 4.4.3 describes the invigilator observation tool that helps invigilators monitor ongoing exams. Figure 4 shows a use case model for the proposed solution and its implemented functionality. This solution meets all the requirements in Table 1 except for the last two “could” requirements, which have been excluded in this solution.

Figure 4. Use Case model of the proposed solution.


4.4.1 Client

The idea is that the student boots the computer into a preconfigured operating system from a usb flash drive handed out right before the exam begins. Student identity verification and exam writing time limitation are handled by the invigilators present in the classroom.
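As an illustration, preparing such a flash drive could amount to writing a preconfigured disk image to it. In the sketch below, the image name and target device are placeholders; the device must be verified (e.g., with lsblk) before running, since dd irrevocably overwrites its target.

    # Write the preconfigured exam environment image to a usb flash drive.
    sudo dd if=exam-environment.img of=/dev/sdX bs=4M status=progress conv=fsync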

The operating system chosen is Ubuntu Linux, version 20.04.2, which is configured with multiple programming environments to choose from. The student is given access to a regular user account with a home folder, but no root access or other elevated privileges. This ensures that the student is unable to remove the limitations imposed on the system.

The use of prohibited resources is prevented by blocking Internet access, blocking or removing the drivers for external usb storage devices, and removing the man pages, the software documentation usually found on Unix systems such as Ubuntu. Internet access is blocked by configuring rules in ufw, the default firewall configuration tool for Ubuntu, to disallow any incoming or outgoing connections except outgoing traffic on port 80 to the Moodle server.
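A minimal sketch of the described ruleset, assuming the Moodle server is reachable at the placeholder address 192.0.2.10:

    # Deny all traffic by default, then allow only web traffic to Moodle.
    sudo ufw default deny incoming
    sudo ufw default deny outgoing
    sudo ufw allow out to 192.0.2.10 port 80 proto tcp
    sudo ufw enable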

4.4.2 Backend

The backend is built with Moodle, which can handle the automatic saving of the state of a student’s ongoing exam as well as running traditional plagiarism checks. Support for quizzes that can evaluate submitted source code is added by installing the CodeRunner plugin, which is used inside Moodle to design exams that can be tested and evaluated automatically against prespecified test cases. Figure 5 shows an example question with provided boilerplate code and Figure 6 shows the test case execution results for that question.

Figure 5. Moodle utilizing CodeRunner. A simple JavaScript question is displayed with provided boilerplate code.

Figure 6. Evaluation and test case execution results of the question in Figure 5.

CodeRunner is configured to use a separate Jobe server to evaluate the code submitted by the students in a sandboxed environment. The Jobe server in turn is firewalled to only allow incoming traffic on port 80 from the Moodle server and to block all outgoing traffic.
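The corresponding rules on the Jobe server could look as follows, again with a placeholder address for the Moodle server:

    # Only the Moodle server may reach Jobe on port 80; all other traffic is blocked.
    sudo ufw default deny incoming
    sudo ufw default deny outgoing
    sudo ufw allow in from 192.0.2.10 to any port 80 proto tcp
    sudo ufw enable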

Both the Moodle server and the Jobe server run Ubuntu 20.04.2. The installed Moodle version is 3.9.6, the installed CodeRunner version is 4.0.0, and the installed Jobe version is 1.6.5. The web server serving Moodle to the users is Apache 2, and the MySQL database utilized by Moodle is version 8.0.23.

4.4.3 Invigilator observation tool

The requirement to support automatic detection of security system circumvention attempts is met by a separate tool called the invigilator observation tool, which was developed for this solution to help invigilators identify students that attempt to circumvent the limited system environment by rebooting their systems into the underlying operating system.

The tool consists of three parts. The first is a small client that runs continuously in the background of the restricted operating system on the student computers. Its function is to ping a server repeatedly with a predetermined time interval between each attempt.
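A minimal sketch of such a client, assuming a JSON endpoint on the observation server; the URL, ID and interval below are placeholders (Node 18+ is assumed for the built-in fetch):

    // Background client on the student machine: ping the observation
    // server at a fixed interval, identified by the flash drive's ID.
    const SERVER_URL = "http://192.0.2.20:3000/ping";
    const USB_ID = 3;
    const INTERVAL_MS = 10000;

    setInterval(async () => {
        try {
            await fetch(SERVER_URL, {
                method: "POST",
                headers: { "Content-Type": "application/json" },
                body: JSON.stringify({ usbId: USB_ID })
            });
        } catch (err) {
            // Occasional failures are expected on wireless networks; the
            // server's thresholds decide when absence becomes suspicious.
        }
    }, INTERVAL_MS);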

The second is a web-based client developed in React, a JavaScript library for building user interfaces, which invigilators can use to configure and watch the students based on a seating grid. This configuration could also be done by teachers or technical assistants if needed.

The third and final part is the server-side software developed on top of Node and Express, a JavaScript runtime environment and a framework for building server applications respectively, which ties it all together by communicating with each party. The idea here is that each usb flash drive is uniquely identified by an ID number, and the invigilator can, after configuring a grid of “seats” in the room, attach this ID number to a seat. After attaching all students to their corresponding seats, the invigilator can begin watching these usb flash drives by starting a session with the “Watch” button. Figure 7 shows the grid creation and Figure 8 shows the session registration.

Figure 7. Grid creation.

Figure 8. Session registration.

There are four different states that can be applied to a student: “Online”, “Inactive”, “Unknown” and “Offline”, where the latter two are levels of suspicious absence. “Online”, marked as green in the observation tool, means that the student’s system is actively pinging the server and should naturally be interpreted as everything being in order. “Inactive”, marked as gray, can be used to indicate allowed forms of absence, e.g., bathroom breaks or a finished exam. “Unknown”, marked as green with a red border, means that the student’s system has missed pinging within a threshold. This should not be interpreted as an obvious absence, though; there can be several valid reasons for this state, including that, relative to the configured thresholds, the students do not all start pinging at the same time, and that the wireless connection can be unstable, with connections dropping or network packets getting lost. “Offline”, marked as red in the observation tool, means that the student’s system has missed actively pinging the server for an interval long enough that the reason behind it needs to be investigated. It is worth noting that this still does not necessarily mean that some dishonest activity is going on; it could instead be caused by valid reasons such as the wireless connection having dropped without being noticed, or other technical issues. Figures 9-11 show how the tool displays the different states.

Figure 9. Session update. A red border indicates potential absence.

Figure 10. Session update. A red box indicates suspicious absence that should be further investigated.


Figure 11. Session update. Usb-id 3 is marked as absent and usb-id 5 is marked as inactive, e.g., indicating a finished exam.
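To make these state transitions concrete, the following is a minimal server-side sketch using Express; the endpoint paths, thresholds and in-memory storage are our own assumptions, chosen only to illustrate how the states could be derived from ping timestamps:

const express = require('express');

const app = express();
const lastPing = new Map(); // usb ID -> timestamp of the most recent ping
const inactive = new Set(); // usb IDs manually marked "Inactive" by an invigilator

// Illustrative thresholds; the real tool makes these configurable.
const UNKNOWN_AFTER_MS = 15 * 1000;
const OFFLINE_AFTER_MS = 60 * 1000;

// Endpoint pinged repeatedly by the client on each student's system.
app.post('/ping/:usbId', (req, res) => {
  lastPing.set(req.params.usbId, Date.now());
  res.sendStatus(200);
});

// Endpoint polled by the invigilator's web client for one drive's state.
app.get('/status/:usbId', (req, res) => {
  const id = req.params.usbId;
  if (inactive.has(id)) {
    return res.json({ usbId: id, state: 'Inactive' });
  }
  const last = lastPing.get(id);
  const silence = last === undefined ? Infinity : Date.now() - last;
  let state = 'Online';
  if (silence > OFFLINE_AFTER_MS) state = 'Offline';
  else if (silence > UNKNOWN_AFTER_MS) state = 'Unknown';
  res.json({ usbId: id, state });
});

app.listen(80);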

An important goal with this helper tool has been to ensure its ease of use, as the technical understanding of invigilators may vary widely. It should also be noted that this tool is not required to be used alongside the rest of the proposed solution; it is merely an optional helper tool for the invigilators to easily recognize possible cheating attempts. It was also important to keep the tool cross-platform compatible, which is why a web-based client was the natural choice.

4.5 Case study questionnaires

The questionnaires presented here are meant to be used for data collection when testing the proposed solution in real exam situations. As mentioned in chapter 3.1.4, they consist of both open and closed questions, where the closed questions have prepared answer alternatives. Three different questionnaires have been designed, with some overlapping questions. Subsection 4.5.1 presents the questionnaire for students, subsection 4.5.2 the one for invigilators, and subsection 4.5.3 the one for teachers.

4.5.1 Student questionnaire

1. How difficult did you find working in a Linux environment?
2. How fairly did this exam assess your acquired knowledge?
3. How well did this system resemble real programming conditions?
4. How well did this exam increase the complexity of the tasks compared to a pen and paper exam?
5. How valuable was the ability to execute your code?
6. How highly would you rate the possibility of cheating by using this system for exams compared to pen and paper exams? Also, describe your concerns, if any.
7. open: What, if any, aspects of the system could be improved?
8. open: How would you describe your experience using this system for computer-based exams? Also, what difficulties did you encounter, if any?
9. open: What, if any, concerns do you have with regard to cheating while using this system?

4.5.2 Invigilator questionnaire

1. How much did you enjoy monitoring this exam?
2. How easy did you find using the observation tool?
3. How useful did you find the observation tool as a security measure?
4. open: What, if any, aspects of the system could be improved?
5. open: How would you describe your experience using this system for computer-based exams? Also, what difficulties did you encounter, if any?
6. open: What, if any, concerns do you have with regard to cheating while using this system?
7. open: What cheating attempts, if any, did you discover?

4.5.3 Teacher questionnaire

1. How much did you enjoy using this system for your programming course?
2. How well did this system simplify the administration process of the exam?
3. How difficult was it to design test cases for the exam problems?
4. How likely is it that you would design test cases for your exam problems?
5. How many exam problems did you design and how many had test cases?
6. How well did this system allow for programming solutions consisting of multiple files?
7. How well did this system allow for an increased problem complexity?
8. How did you experience describing the observation tool to the invigilators?
9. How useful did you find the observation tool used by the invigilators as a security measure?
10. How well does this system prevent cheating or unfair advantages?
11. open: What, if any, aspects of the system could be improved?
12. open: How would you describe your experience using this system for computer-based exams? Also, what difficulties did you encounter, if any?
13. open: What, if any, concerns do you have with regard to cheating while using this system?
14. open: How would you suggest handling the recovery process of usb flash drives between exams?

5 Discussion

This chapter presents an evaluation and discussion about the presented results and the methods used to obtain those results. Section 5.1 discusses the viability of existing solutions for the selected case. Section 5.2 evaluates and discusses the proposed solution. Section 5.3 discusses project assignments in lieu of written exams. Section 5.4 evaluates the validity and reliability of this thesis for solving the research problem, formulated in chapter 1.2 as: “How can computer-based programming exams be implemented for engineering students?”.

5.1 Existing solutions

Chapter 2 introduced various available solutions for computer-based exams, such as Inspera Assessment and Wiseflow, two examples of pure e-assessment platforms that were decided against for the proposed solution for multiple reasons. They market themselves as full-scale solutions for computer-based exams; however, the objective was not to derive a solution for all types of exams but mainly for programming exams. One of their major selling points was therefore more of a disadvantage for our purposes, as it could add unnecessary complexity. It is also possible that bringing in a pure e-assessment tool, as opposed to an LMS platform, may introduce an additional learning curve. They also come with a hefty price tag.

Instead, various LMS-based backends were considered. Since KTH already uses Canvas at the time of writing this thesis, attempts were made to find suitable plugins to support programming exams on Canvas. The search for Canvas plugins, however, yielded few results. Furthermore, in an LMS configured for everyday use, which is the case at KTH, students already have access to forum posts and chat messaging, which could make it difficult to prevent cheating during exams and may thus require two separate LMS installations either way. For these reasons, our investigation shifted towards other backends.

Out of Blackboard and Moodle, we found Moodle, with its large repository of plugins, a strong community, and the fact that it is open source, to be the most suitable solution; these qualities help ensure that continued maintenance and development can be done with ease. Another point in favor of Moodle is that it is relatively barebones by default and does not have a specific target audience like Canvas, which is mainly targeted at higher educational institutions. Moodle is more flexible and can be configured to suit most levels of learning.

Two Moodle plugins were identified as strong candidates to implement the required functionality for the evaluation of programming tasks: CodeRunner and VPL. The initial choice fell on CodeRunner, which has the potential to fulfill our requirements convincingly. CodeRunner can effectively be combined with a Jobe server, and part of the appeal of this combination is that they share the same developer, which is likely to result in greater compatibility. The fact that said developer has been actively using these tools in his profession at the University of Canterbury, New Zealand, for about a decade indicates that the solution might be similarly well suited for our purposes. While the authors of this thesis have not tested the scalability of this solution with regard to performance, the scalability argument brought forward by the team at Inspera Assessment (chapter 2.2.2.1) should be taken with a grain of salt. It may be applicable to some solutions that integrate with LMS backends and may as such be a generalization; however, the authors find no reason to question the experiences and resulting statements made by the CodeRunner developers. That is not to say that Inspera Assessment, for instance, is not a great e-assessment tool that performs well with regard to scalability. On the contrary, the authors are confident that it is; however, some of the major selling points brought forward by Inspera, mentioned in chapter 2.2.2.1, would not really address weak points of our proposed solution. The bottom line is that companies with tools that come at a cost have an incentive to generate revenue, and as outlined in chapter 1.8, the scope of this investigation does not consider software requiring a paid license as a viable alternative, at least not when a demo version has not been easily attainable.

As for client-side software that actively contributes to the proposed solution, rather than tools already present within the operating system such as text editors and compilers, the most common examples in similar solutions are, unsurprisingly, various ways of controlling the student’s system. Most commonly, such software is simply a lockdown-browser. Among those, the most frequently used appears to be Safe Exam Browser (SEB), introduced in chapter 2.2.3.3, with another notable one being LockDown, introduced in chapter 2.2.3.4. These types of software can contribute significantly to preventing cheating; however, there are also downsides. LockDown is proprietary software and comes with a hefty price tag, which immediately rules it out as an alternative. SEB, on the other hand, is open source and free of charge but unfortunately does not support Linux, which rules it out as an alternative for the proposed solution. Another downside that speaks heavily against any lockdown-browser is that it directly nullifies the idea that the student should be able to work as normally as possible, in the sense that it locks the student into a restricted browser environment.

Another downside, not relevant for the proposed solution but for solutions where the student takes the exam on their own computer and operating system, is that students could find various ways to circumvent the protection provided by lockdown-browsers. One way could be to hard reset the system, continue in an unrestricted manner accessing prohibited resources, and then simply restart the lockdown-browser to continue with the exam [37]. Another way, for a lockdown-browser like SEB which is open source, could be to manipulate the source code in a manner that circumvents the protection. It has been shown that one could utilize two parallel installations, of which one has been tampered with, where the identifier mechanism of the unmodified version is used in conjunction with the modified version. One could then modify the source code of the tampered version to allow control of the task manager or to alt-tab out of the process, which in essence would render the protection useless [37].

5.2 Proposed solution

This section presents an evaluation of the proposed solution. Subsection 5.2.1 evaluates its fulfillment of the requirements. Subsections 5.2.2 and 5.2.3 discuss some of the choices made during development with regard to the chosen operating system and the prevention of cheating, respectively. Subsection 5.2.4 discusses how the must-requirements affect the proposed solution.

5.2.1 Fulfillment of requirements

The proposed solution fulfills almost all the requirements identified in Table 1, chapter 4.2. The following list describes how each requirement has been met.

must Bring Your Own Device ✔

The requirement to allow for exams to be taken on the students’ own computers is met by creating a preconfigured exam environment on bootable usb flash drives. The environment, built with Ubuntu Linux, allows the students to connect to a Moodle server where the exam questions can be found.

must Limited system environment ✔

This requirement is met by configuring the exam environment on the usb flash drives to limit what the students can do on the computer. The students are given credentials for a regular user account and have access to a home folder where they can store their files. Access to the rest of the filesystem and to system configurations requires root privileges.

must Internet access blocking ✔

This requirement is met by blocking all unauthorized network connections through the firewall.

must Student identity verification ✔

Identity verification is handled by exam invigilators present in the exam rooms.


must Writing time limitation ✔

Limiting writing time is handled by exam invigilators present in the exam rooms. Additional time limitations can be configured through Moodle.

should Automatic detection of security system circumvention attempts ✔

This requirement is met by the invigilator observation tool, which can be used to track students during exams to ensure that their limited exam environment is not circumvented. A web-based client used by the invigilators displays the status of both active and absent students to help identify suspicious behaviour.

should Ability to choose and configure preferred programming environment ✔

This requirement is met by including a bundle of pre-installed programming environments in the imposed operating system.

should Backups of work in progress ✔

This requirement is met by Moodle, which can handle the automatic saving of the state of a student’s ongoing exam. Many of the programming environments also have built-in support for storing state in case of unexpected events.

could Support for automatic grading ✔

This requirement is met by the Moodle plugin CodeRunner which supports automatic grading through predefined test cases.

could Limited Internet access for approved sources ✖

could Student/teacher communication through approved messaging channel ✖

won’t Support for video surveillance

won’t Support for exams taken at a distance

These requirements have been excluded and are instead suggested as future improvements.


5.2.2 Operating system

Ubuntu Linux was chosen as the operating system for multiple reasons. Firstly, it is licensed under the GNU General Public License, which means that it may be copied, changed, and distributed freely, provided that further distributions are not restricted and that its source code is made available. Secondly, it is a common distribution with good compatibility with various hardware. This is important to ensure a smooth examination process for as many students as possible, regardless of what computer model they own and intend to use for the exam. In the event of compatibility issues, or students not having a computer for the exam, the institution is meant to hold a small stock of loanable systems that are compatible. Thirdly, the releases of updates for Ubuntu follow a predictable roadmap, and updated software packages are provided reasonably quickly. Lastly, Ubuntu, and Linux overall, provides a versatile system environment that is highly configurable.

While the Linux environment may be new to many students, it could be beneficial for aspiring developers to familiarize themselves with it. Most of the relevant software is visually almost identical across operating systems, and its usage on Linux should therefore not introduce any difficulties. The chosen Ubuntu flavor ships with the Budgie desktop environment, which does not embrace a traditional start menu like the one found in Windows; the way in which programs are launched through the graphical interface is instead more similar to some Mac and mobile operating systems. Aspiring developers unfamiliar with this environment should be able to quickly understand how it works.

5.2.3 Prevention of cheating

The three identified ways in which students could access unauthorized resources while using the proposed solution are external storage devices, the Internet, and the man-pages. As mentioned in the solution description in chapter 4.4, the mounting of external drives is disabled, and while the student is allowed to connect to any network he or she pleases, the only traffic allowed on the system is outgoing traffic on port 80 to the server running the Moodle LMS. The man-pages have been made unavailable by removing them altogether. These restrictions can only be changed with root access to the operating system, and thus the students are only given regular user privileges. It goes without saying that the root credentials should be handled in such a way that the possibility of a student acquiring root access is essentially non-existent.

One remaining issue is the underlying operating system on the computer. It could be possible for the students to reboot the system into the underlying operating system with full Internet access and unlimited unauthorized resources. The invigilators would of course be present to observe, but it cannot be expected that all invigilators are technically versed in various operating systems, or that they would even notice the rebooting process and the time spent away from the restricted system. In fact, the students could even be running the same operating system natively, in which case the two would be nearly identical visually. The invigilator observation tool was developed to solve this potential issue by helping invigilators identify such cheating attempts.

While it is hard to provide bulletproof protection against cheating, the authors believe that the presented solution, with all its countermeasures in place, provides sufficient protection to ensure that cheating attempts are counteracted, at least at a level similar to that of current pen and paper exams, and it does so without violating personal integrity. The authors find this aspect to be important not only for complying with the GDPR but also for ethical reasons.

5.2.4 Solution requirements

This subsection aims to shed some light on how the must-requirements from the MoSCoW analysis, presented in Table 1, chapter 4.2, can reasonably be assumed to affect the proposed solution, and how their removal might affect the examination process. As four of these five requirements are actual constraints, it seems reasonable to assume that cheating would increase if they were all removed. Projecting to what degree cheating would increase would be purely speculative, and as such the authors simply conclude that the must-requirements are all well formulated.

5.2.4.1 Bring Your Own Device

As discussed in section 4.4, the proposed solution limits resources directly within the operating system, which mitigates the need for various higher-level lockdown solutions such as lockdown-browsers. Removing the BYOD requirement would in essence force the institution to provide the systems on which the students take the exam. From an implementation perspective, the removal of this requirement would simplify the proposed solution, as there would be no underlying operating system to be concerned about; however, depending on the number of systems that would need to be provided, it may not be a feasible approach.

5.2.4.2 Limited system environment

The limitations imposed on the system environment consist of disabling the ability to mount external storage devices and the ability to access the built-in documentation in the form of man-pages. The requirement Internet access blocking, discussed in section 5.2.4.3, could also have been seen as a limitation of the system environment, as it is actually handled within the operating system; depending on network configurations, however, it could also have been solved externally, whereas the two restrictions included in this requirement must always be solved within the operating system. It would be naive to think that the ability to freely open documentation that may assist with some problem, or to mount storage devices containing prohibited resources, would not be exploited by students with dishonest intentions. That makes this must-requirement an integral part of the proposed solution.

5.2.4.3 Internet access blocking

Similarly to how lifting the restrictions discussed in section 5.2.4.2 would likely result in a greater degree of cheating from students with dishonest intentions, removing the requirement to block Internet access would likely be as bad, if not worse, given how easily searches can be tailored to the problem at hand.

5.2.4.4 Student identity verification

The removal of student identity verification would make impersonation possible, which could allow another person with greater knowledge to write the exam in place of the actual student. The removal of this requirement would naturally not be acceptable.

5.2.4.5 Writing time limitation

The removal of the writing time limitation would likely result in students taking more time for their exams than intended by the examiners. This, in turn, would likely result in greater average scores. However, extra time does not provide the student with knowledge they did not already have, and this requirement is as such, while still important, the least critical of the must-requirements.

5.3 Project assignments in lieu of written exams

The proposed solution introduces the possibility of increasing the complexity of programming tasks compared to pen and paper exams; however, the complexity must still be limited for a few reasons. The main reason is the time restrictions that apply: it is simply not feasible to write complex programs in their entirety within the duration of a regular exam. Additionally, the design of the exam may require more than one problem to ensure examination of all learning goals of the course, which as a consequence leads to lesser complexity per task. Furthermore, while a goal has been to propose a solution with conditions as realistic as possible, the reality is that group work is very common. These are clearly relative strengths of project assignments, which, within reason, do not have to restrict the complexity of tasks and better imitate real working conditions.

There are not only advantages to project assignments, though. The advantage of group work in better imitating real working conditions comes with a significant disadvantage: it severely limits the possibility of individual assessment. Most of the time, working in groups is also a necessity to maintain a reasonable workload for the teachers. Furthermore, it is hard to ensure that students do not acquire help from other students, friends, or family. This creates the need for some form of oral examination to ensure that each student actually meets the learning goals of the course.

Nørmark et al. at the Department of Computer Science at Aalborg University argue in a similar manner when they investigate the different examination forms they employ: oral exam, written exam, project exam, and Mini Project Programming exam (MIP) [38]. In their study, there are mainly two differences between a project exam and a MIP. The first is the number of participants, which generally ranges from two to four for a MIP and up to seven for a project exam. The second is the duration of the task: a MIP generally lasts a few days, whereas a project exam can span a significantly longer period. One notable omission is that their study does not seriously consider computer-based written exams, arguing that they were unfeasible due either to having to provide a large number of available computers from which the exam could be taken, or to having to deal with potential cheating on BYOD. This was in 2008, though, and solutions for BYOD exams have certainly improved since then. Even though computer-based exams are likely to improve the quality of the exams significantly, the arguments brought forward above, as well as the study of Nørmark et al., are still valid.

Project assignments in groups, where the students must research the problem and collaborate to come up with a solution, may also contribute positively both to the learning process and to a more pleasant study environment. Nørmark et al. state that problem-based learning (PBL) is the most dominant teaching activity at Aalborg University. At Helsinki University of Technology, PBL was incorporated into a yearly programming course between 1999 and 2003 [39]. The dropout rate from this course during this period ranged from 0 to 29 percent, with an average of 17 percent. Comparatively, in traditional computer science courses, the dropout rate ranged from 41 to 51 percent, with an average of 45 percent.

The authors do, however, state that they do not consider the data statistically valid: “In our subjective judgment, the quality of learning has been high in the PBL course. The students that pass the course generally submit good projects and are able to describe the implementation in a manner that shows good conceptual understanding. However, since the PBL course differs from the traditional courses in the structure, the student profile, and the evaluation method used, we cannot present any statistically valid data about the overall quality of learning compared to the traditional courses.” Additionally, both students who had taken the PBL course and students who had taken the traditional courses took an advanced Java course in 2003 and 2004, where the score a student could get ranged from 0 to 5. The PBL students’ average score was 2.92 in 2003 and 2.83 in 2004, compared to the students from the traditional courses, who scored 2.78 and 2.44 in 2003 and 2004, respectively.

Chis et al. performed a case study at the National College of Ireland where 53 students enrolled in a 9-week module as part of a Higher Diploma in Science in Computing degree, a one-year conversion course [40]. The module was split into three parts of three weeks each: classes and objects, repetition statements, and array data structures. These parts were taught in a traditional manner, as a Flipped Classroom (FC), and as FC combined with PBL (FC-PBL), respectively. In FC, the theoretical lecture material is distributed beforehand for the students to digest, and the scheduled lectures are used for practical learning with the assistance of the lecturer and potential teaching assistants, instead of being devoted to theoretical learning. The students had to take a pre-test to assess prior knowledge. The failure rates of the traditional manner, FC, and FC-PBL were 28 percent, 25 percent, and 2 percent, respectively. The rates at which students received higher grades in the traditional manner, FC, and FC-PBL were 60 percent, 36 percent, and 72 percent, respectively. The remainder of the students attained average grades.

Chis et al. conclude that the combination of FC and PBL seems to improve results, especially for weaker students, but they also state that they cannot rule out that the different subjects were of unequal difficulty.

The authors of this thesis are of the opinion that a combination of both examination forms is likely the best approach. It also seems like the variation could have a positive impact on the learning process. Finally, in the context of an entire educational programme, the latter parts of the programme often require knowledge from earlier stages, which over time may make it unsustainable not to put in the honest work. Thus, the process may in a natural manner assist in mitigating dishonesty.

5.4 Validity and reliability

This section evaluates the validity and reliability of the used methods and the presented results.

5.4.1 Validity and reliability of methods

The adapted research method used in this thesis is heavily based on Bunge’s general scientific method, which is a proven and reliable method for conducting research that produces valid results. The validity and reliability of the case study elements of the adapted research method, however, can be argued. In Josefsson’s study into the efficacy of case studies as scientific products, she states that most studies argue against each other, but that it is mostly older studies that argue against case studies being used for scientific research [41]. The generalization of case study results is said to serve as a reasoned example of what results could be obtained for similar cases. Other cases might have specific conditions that would render the results invalid; even similar cases might have different conditions over time, rendering the method unreliable. Thus, our conclusion is that the method has high validity and high reliability only for similar cases with similar conditions.

5.4.2 Validity and reliability of results

The proposed solution is a system that allows programming exams to be performed on the students’ own computers. The system imposes certain limitations on the user for security reasons but otherwise provides real programming conditions during exams. As such, the limitations imposed by paper-based programming exams are no longer present: there is no longer a need for exam problem simplification, and the risk of unintentional subjective assessments has been greatly reduced, or eliminated altogether, with the support for automatic grading.

The proposed solution shows how computer-based programming exams can be implemented for engineering students at KTH Kista, and thus, solves the secondary research problem. By generalization, serving as an example for how computer-based programming exams can be implemented overall, it also solves the main research problem. Even though extensive testing with real participants has not been possible, the results are assessed as having high validity.

The results are not reliable, though, as other researchers might find completely different solutions for computer-based programming exams. The design of the proposed solution was heavily affected by the state of available software, the state of the infrastructure at KTH, and the wishes expressed by KTH, all of which might change in the near future. Thus, we conclude that the results have low reliability.


6 Conclusion

The purpose of this thesis is to increase the quality and efficiency of programming courses and their examination processes by solving the research problem, formulated in chapter 1.2 as: “How can computer-based programming exams be implemented for engineering students?”.

Our goal has been to provide an evaluation of existing solutions for the selected case and to suggest a solution in that context. Both of these goals have been achieved.

The proposed solution is likely to have a positive impact on programming courses in terms of the quality and efficiency of the grading and administration processes, and it is likely better able to evaluate the knowledge that examiners want to examine compared to traditional paper-based exams. Despite real-world tests not having been possible, the validation steps performed while deriving the solution suggest that the system would work as per the requirements. Thus, along with the findings of previous studies presented in chapter 2.1, we conclude that our results support the purpose of this thesis and the case study proposition, stated in chapter 4.1 as: “The proposed solution for computer-based programming exams increases the overall quality of programming courses”.

The proposed solution is a system in which students use their own computers with an imposed limited system environment, available through usb flash drives. The environment is configured to restrict the use of prohibited resources while allowing a natural configurable programming environment for the students. The system uses the Moodle LMS as its backend, which contains the digital exams and provides automatic grading functionality, among other things. A separate tool is also provided to help invigilators monitor ongoing exams. The public GitHub repository contains the source code for the observation tool and the necessary documentation and instructions for implementation. A link to the repository can be found in Appendix A.

6.1 Limitations

Given the circumstances caused by the ongoing pandemic, with scheduled on-campus events such as exams having been cancelled and instead performed remotely, it has not been possible to perform the real-world tests the authors would have wanted. This has resulted in unused questionnaires, and therefore the evaluation of our solution, including the suggested observation tool developed by the authors, remains incomplete.

Another limitation is the recovery process of the usb flash drives between exams, for which an effective solution has not been found in this study.


6.2 Future work

As the authors did not have access to licenses for some interesting solutions, such as Inspera Assessment and Wiseflow, a hands-on comparison between these and different LMS solutions, specifically with regard to computer-based programming exams, would make for interesting future research. The authors obtained some results regarding the quality of computer-based exams compared to pen and paper exams for programming, but these are mostly based on student perception. It could be interesting to further investigate the correlation between the results of these two types of exams and how students fare at the end of the educational programme.

Other related areas that may be worth investigating are the additional issues that may exist for students with functional disabilities. Also, as this thesis is primarily focused on non-distance computer-based programming exams, further investigation into how computer-based exams are best taken at a distance, and how cheating could best be prevented in that setting, may be worth pursuing. It could also be worth exploring what long-term environmental effects a wider implementation of computer-based exams could have compared to pen and paper exams.

For the proposed solution, a natural next step could be to make select Internet resources available where deemed allowed, e.g., API reference documentation for the programming language in question. Depending on requirements, this may be doable with the current implementation, but the functionality could also be further extended with other tools.

Yet another area that could benefit significantly from further work is finding an effective solution for the restoration process of the operating system on the usb flash drives while preserving persistent storage during the actual exams.


References

[1] N. Olsson, “Examination vid universitet och högskolor - ur studentens synvinkel” [Examination at universities and university colleges - from the student’s point of view], Högskoleverket. [Online]. Available: https://www.uka.se/download/18.12f25798156a345894e2d51/1487841931894/9710S.pdf.

[2] R. Lobb, J. Harlow, “Coderunner: a tool for assessing computer programming skills”, ACM Inroads, vol. 7, issue 1, pp. 47–51, 2016. doi: 10.1145/2810041.

[3] A. Chirumamilla, G. Sindre, A. Nguyen-Duc, “Cheating in e-exams and paper exams: the perceptions of engineering students and teachers in Norway”, Assessment & Evaluation in Higher Education, 45:7, pp. 940-957, 2020. doi: 10.1080/02602938.2020.1719975.

[4] A. Tella, M. T. Bashorun, “Attitude of Undergraduate Students Towards Computer-Based Test CBT: A Case Study of the University of Ilorin, Nigeria”, International Journal of Information and Communication Technology Education, vol. 8, issue 2, pp. 33-45, 2012. doi: 10.4018/JICTE.2012040103. [Online]. Available: https://www.igi-global.com/gateway/article/65576.

[5] A. N. Kumar, “The design of online tests for Computer Science I and their effectiveness,” FIE’99 Frontiers in Education. 29th Annual Frontiers in Education Conference. Designing the Future of Science and Engineering Education. Conference Proceedings (IEEE Cat. No.99CH37011), San Juan, PR, USA, 1999, pp. 13B3/1-13B3/5 vol.3. doi: 10.1109/FIE.1999.840371.

[6] C. Daly, J. Waldron. “Assessing the assessment of programming ability”, In Proceedings of the 35th SIGCSE Technical Symposium on Computer Science Education, SIGCSE ’04, pages 210–213, New York, NY, USA, 2004. ACM.

[7] S. Grissom, L. Murphy, R. McCauley, S. Fitzgerald, “Paper vs. Computer-based Exams: A Study of Errors in Recursive Binary Tree Algorithms”, In Proceedings of the 47th ACM Technical Symposium on Computing Science Education (SIGCSE '16). doi: 10.1145/2839509.2844587.

[8] B. Stephenson, “An Experience Using On-Computer Programming Questions During Exams”, In Proceedings of the 23rd Western Canadian Conference on Computing Education (WCCCE '18). doi: 10.1145/3209635.3209639.

[9] Moodle, “Moodle - Open-source learning platform | Moodle.org”. [Online]. Available: https://moodle.org. [Accessed: 23-Mar-2021].

[10] Canvas, “Canvas LMS | Learning Management System | Instructure”. [Online]. Available: https://www.instructure.com/canvas. [Accessed: 23-Mar-2021].


[11] Blackboard, “Blackboard Learn - An Advanced LMS | Blackboard”. [Online]. Available: https://www.blackboard.com/teaching-learning/learning-management/blackboard-learn. [Accessed: 23-Mar-2021].

[12] H. Natanson, “Failed tech, missed warnings: How Fairfax schools’ online learning debut went sideways”, The Washington Post, 2020. [Online]. Available: https://www.washingtonpost.com/local/education/fairfax-schools-online-learning-blackboard/2020/04/18/3db6b19c-80b5-11ea-9040-68981f488eed_story.html. [Accessed: 27-Mar-2021].

[13] Inspera Assessment, “Inspera Assessment | Product | Inspera”. [Online]. Available: https://www.inspera.com/assessment. [Accessed: 24-Mar-2021].

[14] A. Sisarica, “What is the difference between an assessment platform and a VLE/LMS?”, 2020. [Online]. Available: https://www.inspera.com/blog/the-difference-between-an-assessment-platform-and-vle-lms. [Accessed: 24-Mar-2021].

[15] Inspera Assessment, “Customer Stories | Learn | Inspera”. [Online]. Available: https://www.inspera.com/customer-stories. [Accessed: 24-Mar-2021].

[16] H. Sandström, M. Brenner, “Datoriserad tentamen: SUNET-inkubator slutrapport”, SUNET Inkubator, 2015.

[17] WISEflow, “Moving Assessments Online with WISEflow”, 2020. [Online]. Available: https://www.uniwise.co.uk/blog/moving-assessments-online-with-wiseflow. [Accessed: 25-Mar-2021].

[18] DigiExam, “Digital exams, high stakes testing, invigilation and proctored exams.”. [Online]. Available: https://www.digiexam.com/. [Accessed: 25-Mar-2021].

[19] DigiExam, “DigiExam Case Studies - How we help our customers with digital exams”. [Online]. Available: https://www.digiexam.com/stories/. [Accessed: 25-Mar-2021].

[20] Exam Monitor, “ExamMonitor | A.I.-Driven Remote Proctoring with Professional Review”. [Online]. Available: https://examsoft.com/solutions/exam-monitor. [Accessed: 25-Mar-2021].

[21] University of Southern Denmark Exam Monitor, “Exam Monitor - University of Southern Denmark”. [Online]. Available: https://sdu.exammonitor.dk/. [Accessed: 25-Mar-2021].

[22] Safe Exam Browser, “Safe Exam Browser - About”. [Online]. Available: https://safeexambrowser.org/about_overview_en.html. [Accessed: 25-Mar-2021].


[23] LockDown Browser, “LockDown Browser - Respondus”. [Online]. Available: https://web.respondus.com/he/lockdownbrowser/. [Accessed: 25-Mar-2021].

[24] CodeRunner, “Moodle plugins directory: CodeRunner”. [Online]. Available: https://moodle.org/plugins/qtype_coderunner. [Accessed: 26-Mar-2021].

[25] Virtual Programming Lab, “Moodle plugins directory: Virtual Programming Lab”. [Online]. Available: https://moodle.org/plugins/mod_vpl. [Accessed: 26-Mar-2021].

[26] Job Engine, “trampgeek/jobe: jobe is a server that runs small programming jobs in a variety of programming languages”. [Online]. Available: https://github.com/trampgeek/jobe. [Accessed: 28-Mar-2021].

[27] Uncomplicated Firewall, “UncomplicatedFirewall - Ubuntu Wiki”. [Online]. Available: https://wiki.ubuntu.com/UncomplicatedFirewall. [Accessed: 20-Mar-2021].

[28] M. Bunge. “Epistemology and Methodology I: Exploring the World”, Vol. 5 of Treatise on Basic Philosophy, Dordrecht, Holland, D. Reidel Publishing Company, 1983.

[29] N. Andersson, A. Ekholm, “Vetenskaplighet”, Dep. of Construction and Architecture, Lund University, Lund, Sweden, 2002.

[30] B. Gillham, “Case Study Research Methods”, London, Continuum, 2000.

[31] R. K. Yin, “Case study research and applications: design and methods”, 6th ed. Thousand Oaks, California, SAGE Publications, Inc., 2018.

[32] P. Runeson, M. Höst, “Guidelines for conducting and reporting case study research in software engineering”, Empirical Software Engineering, Vol. 14, No. 2, 2009, p. 131-164. doi: 10.1007/s10664-008-9102-8.

[33] C. Dawson. “Practical Research Methods”, Oxford, UK, How To Books Ltd, 2002.

[34] D. Clegg, R. Barker. “Case Method Fast-Track: A Rad Approach”, Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1994.

[35] M. Poppendieck, T. Poppendieck, “Lean Software Development: An Agile Toolkit”, Addison-Wesley Professional, 2003.

[36] “Manifesto for Agile Software Development”, 2001, Accessed on: April. 12, 2021. [Online]. Available: https://agilemanifesto.org/.

[37] A. Heintz, “Cheating at Digital Exams”, Master’s thesis, Norwegian University of Science and Technology, Norway, 2017.

[38] K. Nørmark, L.L. Thomsen, K. Torp, “Mini Project Programming Exams”, in J. Bennedsen, M.E. Caspersen, M. Kölling (eds.), Reflections on the Teaching of Programming, Lecture Notes in Computer Science, vol. 4821, Springer, Berlin, Heidelberg, 2008. doi: 10.1007/978-3-540-77934-6_18.

[39] E. Nuutila, S. Törmä, L. Malmi, “PBL and Computer Programming — The Seven Steps Method with Adaptations”, Computer Science Education, 15:2, pp. 123-142, 2005. doi: 10.1080/08993400500150788.

[40] A.E. Chis, A.N. Moldovan, L. Murphy, P. Pathak, C.H. Muntean. "Investigating Flipped Classroom and Problem-based Learning in a Programming Module for Computing Conversion Course". Journal of Educational Technology & Society 21, no. 4 (2018): 232-47. Available: https://www.jstor.org/stable/26511551.

[41] T. Josefsson, “How good are case studies as scientific products?”, Dissertation, 2016.


Appendix A

GitHub link to the source code and documentation for the proposed solution: https://github.com/ralvarezkth/II142X
