Design and Development of an App for Statistical Data Analysis Learning

Bachelor’s Thesis
Bachelor’s Degree in Industrial Technology Engineering
REPORT
Author: Eduard Serrahima de Cambra
Supervisor(s): Lluís Marco Almagro
Examination session: June 2015
Escola Tècnica Superior d’Enginyeria Industrial de Barcelona

Abstract

As statistics grows in importance in our society, driven by the need to analyze ever-increasing amounts of available data, the role of statistics in education needs to be reinforced. However, the lack of free, open-source statistical tools designed specifically for educational purposes makes this task more difficult. Concern about this gap led to this project.

An app designed for learning statistical data analysis has been developed from scratch and programmed entirely in R. Several studies and in-depth research were carried out to design the application, named StatClip. The design followed a logical, sequential structure: review of similar existing programs, characterization of target users, definition of user needs, and translation of those needs into program characteristics.

The result is a fully functional program with several important and useful functionalities implemented. More importantly, the whole architecture and structure of the software has been designed to make it easier to continue developing StatClip and adding functionality to it.

An emotional design study (more specifically, a study following the kansei engineering model) has also been performed, to take into account the sensations and feelings that StatClip’s user interface conveys to users. The results led to rethinking some features of the program’s user interface (UI), although in general the initial UI design was found to be adequate.

Much of the effort in this project has been devoted to programming the application.
As a result, the app’s skeleton and structure, as well as many important functionalities, are already programmed. From now on, StatClip can continue growing and expanding, with the addition of new functionality, so that it becomes a complete and useful statistical educational tool.

Index

INTRODUCTION
Objectives of the StatClip project
Scope of the project
1. SOFTWARE RESOURCES FOR EDUCATION IN STATISTICS
1.1. Study of the available programs
1.1.1. Programming languages
1.1.1.1. R
1.1.1.2. Python
1.1.1.3. GNU Octave
1.1.2. Commercial Statistical Software
1.1.2.1. Minitab Statistical Software
1.1.2.2. JMP
1.1.2.3. SPSS
1.1.3. Freely available Statistical and Data Analysis Software
1.1.3.1. Statistical Lab
1.1.3.2. Statcato
1.1.3.3. SalStat Statistics
1.1.3.4. SOFA Statistics
1.1.3.5. Others
2. USERS AND PROGRAM CHARACTERISTICS DEFINITION
2.1. Users
2.1.1. Main Users
2.1.2. Operator Users
2.2. Needs to be covered by StatClip
2.2.1. Main Users’ needs
2.2.2. Operator Users’ needs
2.3. Program characteristics
3. DESIGN OF THE SOLUTION
3.1. Theoretical design: StatClip’s philosophy
3.1.1. Menu Options
3.1.1.1. Data
3.1.1.2. Graphics
3.1.1.3. Computations
3.1.1.4. Statistics
3.2. Theoretical Design: UI’s first approach
3.2.1. General frame
3.2.2. Data
3.2.2.1. Load Data Set
3.2.2.2. Create Simulated Data
3.2.3. Graphs
3.2.4. Computations and Statistics
3.3. Programming language election: R
3.3.1. What is R?
3.3.2. Shiny
3.3.2.1. Structure of a Shiny app
4. STATCLIP IMPLEMENTATION
4.1. Structure
4.2. R Programming
4.2.1. User Interface (UI)
4.2.1.1. ui.R
4.2.1.2. UI-sidebar.R
4.2.1.3. UI-body.R
4.2.2. Server
4.3. Code examples: “Load Data Set” and “Histogram”
4.3.1. Load Data Set
4.3.1.1. User Interface
4.3.1.2. Server
4.3.2. Histogram
4.3.2.1. User Interface
4.3.2.2.
