Recent Advances in Computer Engineering, Communications and Information Technology

Comparative analysis of statistical software used in the education of non-statistician students

KLARA RYBENSKA, JOSEF SEDIVY, LUCIE KUDOVA
Department of Technical Subjects, Department of Informatics
Faculty of Education, Faculty of Science, Faculty of Arts
University of Hradec Kralove
Rokitanskeho 62, 500 03 Hradec Kralove
CZECH REPUBLIC
[email protected] http://www.uhk.cz, [email protected] http://www.uhk.cz, [email protected] http://www.uhk.cz

Abstract: - A frequently used tool for processing statistical data in the sciences and humanities is the IBM SPSS program. It is a very powerful tool and a de facto standard. Its main disadvantage is its high price, which restricts its use in an academic environment, both in teaching and when students work individually on their own computers. Currently, there are two tools that could at least partially replace IBM SPSS in the teaching of science disciplines: PSPP (http://www.gnu.org/software/pspp/) and SOFA (http://www.sofastatistics.com). Both are available under licenses that permit free use not only for learning but also for commercial purposes. This article aims to find out the most common ways in which IBM SPSS is used at the University of Hradec Králové and to suggest a possible alternative to the commercial program for teaching data processing to students in non-statistical study programs.

Key-Words: - statistical software, open source, IBM SPSS, PSPP, data processing, science education.

1 Introduction
Quantitative research uses a logical reasoning process. Very simply, this process can be laid out in a few steps. It starts with the formulation of the research problem, for which plenty of literature should be read in order to examine how the problem was solved in the past and what solutions earlier researchers reached. Based on the literature, we set our own research goals and construct hypotheses, which we then operationalize. Based on these steps, we select the appropriate method of data collection and determine the measuring instrument (e.g. a questionnaire). We determine the size of the sample and then go into the field to collect data. The collected data are interpreted in the context of previous theories and conclusions.

Research data obtained from specific empirical observations and experiments are usually not used for scientific interpretation in raw form, but serve as the basis for applying related theoretical methods. Because empirical data frequently take a quantitative form and theoretical methods often use mathematical operations, one speaks, not quite correctly, of "quantitative methods". In particular, we focus here on statistical methods. More generally and more precisely, one should speak of formal methods, i.e. methods that use a formal symbolic system. These certainly include mathematical and statistical methods, but also methods such as abstract algebra, formal logic and probability theory, which are not restricted to data of a numerical nature. These formal methods, as relatively new tools of scientific knowledge, increasingly penetrate the research realm of science.

2 Introducing the tested software
IBM SPSS: A paid set of tools suitable for data analysis, from planning and data collection through processing to the reporting of the processed results. The software is designed not only for analysts and programmers of statistical applications in specialized companies, but also for students who are mainly engaged in exploring statistical science disciplines. The program provides a comprehensive set of fully integrated modules and allows you to find specialized functions that are useful for performing statistical surveys, on the basis of which you can make better decisions in the natural sciences and in social and economic areas. The software is paid, and the licensing model allows only a certain kind of rental [1].

ISBN: 978-960-474-361-2

Despite its uncomfortably high price, SPSS offers users the most important analytical features: a variety of simple and advanced statistical procedures designed to best meet the needs of the user [2]. It provides an overview of the data and speeds up the summarizing of results in different outputs, which the user can export (text files, Microsoft Word, PDF, Excel, image formats such as JPEG, PNG, BMP and others). The software can be installed on various operating systems and distributions [3].

PSPP: The free program PSPP, designed for statistical analysis of selected data, is still in development. Even so, it is a strong competitor to the paid SPSS. It is very similar to SPSS in appearance and function, because it was designed as its replacement. The software allows you to perform various tests, analysis of variance and Cronbach's alpha, and it is very suitable for the analysis of large data sets because it can work with the data very quickly [4].

Fig. 1 Sample histograms in PSPP (source: authors)

Graphs (Fig. 1) that the user creates in the program can be converted into standard formats (PostScript, PNG), and output files can be converted to a text file, PDF, HTML, SVG or OpenDocument [2], [4]. The program can open and import Excel files and even allows you to choose data from the table the user is working with. Its weaknesses include instability, but only when a task is entered incorrectly [3], [5].

The description of this product was given above, so it is not necessary to repeat it. For this reason, we focus here only on describing the requirements that the tested program should ideally meet, while presenting the results that we achieved with the PSPP software.

PSPP provides all the descriptive statistics that statistical software has to handle. Although it does not allow changing the color scheme or varying the display of the result table (Fig. 2), it operates much faster than SPSS, and its actual use in this case is more user friendly.

Fig. 2 Sample of the required descriptive statistics functions processed in PSPP (source: authors)

Besides Pearson's test, PSPP also provides tests based on kurtosis (Fig. 3). It cannot perform further tests of normality (the Kolmogorov-Smirnov test and the Shapiro-Wilk goodness-of-fit test).

Fig. 3 Sample output of a test based on kurtosis in PSPP (source: authors)

PSPP provides the tools necessary to perform the chi-square test (Fig. 4). In addition to the test itself, the program allows quick processing, and the test results can be saved in several formats (PDF, HTML, OpenDocument, etc.) or printed directly.
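To make the chi-square test mentioned above more concrete, the following minimal Python sketch computes the chi-square statistic and degrees of freedom for a contingency table by hand. It is an illustration only: it is neither PSPP syntax nor the authors' procedure, and the function name `chi_square_independence` and the sample table are hypothetical.

```python
def chi_square_independence(observed):
    """Return (chi2, dof) for a contingency table given as a list of rows.

    Hypothetical helper for illustration; a statistical package such as PSPP
    or SPSS reports the same statistic together with its p-value.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (count - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Made-up 2x2 table: rows are two student groups, columns are yes/no answers.
table = [[10, 20], [20, 10]]
chi2, dof = chi_square_independence(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}")  # prints "chi2 = 6.667, dof = 1"
```

For one degree of freedom, the critical value at the 0.05 significance level is 3.841, so in this made-up example the hypothesis of independence would be rejected.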


Fig. 4 Chi-square test processed in PSPP (source: authors)

PSPP also offers several kinds of t-test (Fig. 5).

Fig. 5 T-test processed in PSPP (source: authors)

Non-parametric tests: PSPP does not perform non-parametric tests, which can be considered one of the few shortcomings of this software.

Reliability: The open-source program PSPP can perform the Cronbach's alpha test (Fig. 6).

Fig. 6 Cronbach's alpha processed in PSPP (source: authors)

Correlation: The tested PSPP software contains tools for performing Spearman and Pearson tests and meets the requirements for correlation analysis.

Factor analysis: PSPP can perform factor analysis (Fig. 7).

Fig. 7 Factor analysis in PSPP (source: authors)

2 Research study
At the Faculty of Education, the doctoral study program Information and Communication Technologies in Education uses NCSS software; in other fields, Microsoft Office Excel versions 2010 and 2013 are preferred.


Fig. 8 Graph showing the frequency of use of statistical software at selected faculties of the University of Hradec Kralove (source: authors).

Using questionnaires and interviews with teachers at selected faculties of the University of Hradec Králové, we investigated the most common reasons that lead teachers to teach their preferred software. In total, we contacted 25 teachers (including part-time staff) from different faculties.

Fig. 9 Graph showing teachers' reasons for teaching specific software. Respondents could select multiple options (source: authors).

7 Conclusion
The research, which focused on comparing and analyzing the tools for statistical data processing used in the teaching of natural sciences, confirmed that the paid IBM SPSS software is a high-quality tool for teaching students. The program offers a comprehensive set of tools and handles the required tasks seamlessly. Besides its high price, however, its major drawback is the installation process, which is tedious and not always flawless. The best candidate for its replacement appears to be PSPP, for the following reasons. This software can fully serve for teaching sociological disciplines and processing statistical data, and it is completely free and very easy to install and maintain. The program mimics SPSS, which we see as another advantage, particularly for users who will be moving from the paid software. PSPP is a sound statistical program which, with few exceptions, has all the required functions; its shortcomings are that it cannot perform non-parametric tests or the Kolmogorov-Smirnov and Shapiro-Wilk tests of normality. In defense of these shortcomings, it can be noted that PSPP is still in development, so we can expect it to become a fully functional replacement for SPSS soon.

The aim of the research was to compare the overall capabilities of PSPP with those of SPSS in order to determine whether, and in which areas, the free program could replace the paid one. The research has shown that it is possible to replace IBM SPSS, although not in all functions. Given its zero cost, the possibility of using PSPP on any computer, including at home, and the possibility of commercial use, it can be unequivocally recommended for science disciplines requiring the processing of statistical data, with the prospect that students will later use the software in their employment and training activities.

This article was created under the project called Specific research done at UHK Hradec Králové.

References:
[1] HUBALOVSKY, S. Rectification column - mathematical modeled and computer simulated real system. International Journal of Mathematics and Computers in Simulation. 2012, vol. 1. ISSN 1998-0159.
[2] HUBALOVSKA, M., HUBALOVSKY, S. Implementation of the Systems Approach in Mathematical Modeling, Dynamic Simulation and Visualization Using MS Excel Spreadsheet. International Journal of Mathematics and Computers in Simulation. 2013, vol. 7, no. 2, pp. 267-276. ISSN 1998-0159.
[3] HUBALOVSKY, S., JELINEK, J., SEDIVY, J. Mathematical modeling and computer simulation of optimal reaction time of the Lupine protein hydrolysis using fermented whey. International Journal of Mathematical Models and Methods in Applied Sciences. 2012, edit. 6, vol. 2, pp. 394-401. ISSN 1998-0140.


[4] HUBALOVSKY, S. Rectification column - mathematical modeled and computer simulated real system. International Journal of Mathematics and Computers in Simulation. 2012, vol. 6, no. 1. ISSN 1998-0159.
[5] CHROMY, J., DRTINA, . Porovnání modelů komunikace a didaktických aspektů výuky při výuce i v praxi. Media4u magazine. Praha: J. Chromý, 2012. 4 pp. ISSN 1214-9187.
[6] CHROMY, J., NĚMEČEK, R., POLGÁR, R. Vybrané vlivy na úroveň vědeckých konferencí. Media4u magazine. Praha: J. Chromý, 2012. 8 pp. ISSN 1214-9187.
