Comparison of Commonly Used Statistics Package Programs
Black Sea Journal of Engineering and Science 2(1): 26-32 (2019)
Open Access Journal, e-ISSN: 2619-8991, BSPublishers
Review, Volume 2 - Issue 1: 26-32 / January 2019

Ömer Faruk KARAOKUR1*, Fahrettin KAYA2, Esra YAVUZ1, Aysel YENİPINAR1

1 Kahramanmaraş Sütçü Imam University, Faculty of Agriculture, Department of Animal Science, 46040, Onikişubat, Kahramanmaraş, Turkey
2 Kahramanmaraş Sütçü Imam University, Andırın Vocational School, Computer Technology Department, 46410, Andırın, Kahramanmaraş, Turkey

Received: October 17, 2018; Accepted: December 04, 2018; Published: January 01, 2019

Abstract

The specialized computer programs used in the collection, organization, analysis, interpretation and presentation of data are known as statistical software. Descriptive statistics and inferential statistics are the two main statistical methodologies in some of the software used in data analysis. Descriptive statistics summarizes the data in a sample using indices such as the mean or standard deviation. Inferential statistics draws conclusions from data that are subject to random variation, such as observational errors and sampling variation. In this study, statistical software used in data analysis is examined under two main headings: open source (free) and licensed (paid). For this purpose, the five most commonly used programs were selected from each group. The statistical analyses and analysis outputs of the selected software were examined comparatively. As a result of this study, the features of licensed and unlicensed programs are presented to researchers in a comparative way.

Keywords: Statistical software, Statistical analysis, Data analysis

*Corresponding author: Kahramanmaraş Sütçü Imam University, Faculty of Agriculture, Department of Animal Science, 46040, Onikişubat, Kahramanmaraş, Turkey
E mail: [email protected] (Ö. F. KARAOKUR)

Ömer Faruk KARAOKUR https://orcid.org/0000-0002-3436-8415
Fahrettin KAYA https://orcid.org/0000-0003-1666-4859
Esra YAVUZ https://orcid.org/0000-0002-5589-297X
Aysel YENİPINAR https://orcid.org/0000-0001-2345-6789

Cite as: Karaokur OF, Kaya F, Yavuz E, Yenipinar A. 2019. Comparison of commonly used statistics package programs. BSJ Eng Sci, 2(1): 26-32.

1. Introduction

The statistical analysis process is the collection, organization, analysis, interpretation and presentation of data. It starts with defining the method and the population. The population consists of observations collected by selected methods at various times, and the data obtained from each observation serves as a member of the overall group.

Statistical analysis is examined in two groups: descriptive statistics and inferential statistics. Descriptive statistics summarize population data by describing what was observed in the sample graphically or numerically. The numeric descriptors are the mean and standard deviation for continuous data types, and the percent and frequency for categorical data.

Inferential statistics allows inferences to be made about the population from patterns in the sample while taking randomness into consideration. The inference can extend to forecasting and prediction. This may also include extrapolation and interpolation of time series or spatial data.

Analysis of variance (ANOVA), chi-square test, correlation, factor analysis, Mann-Whitney U, mean square weighted deviation (MSWD), regression analysis, Student's t-test, time series analysis, k-nearest neighbors (k-nn), majority classifier, group method of data handling algorithms, logistic regression, uplift modeling, naive Bayes classifier, support vector machines, etc. are some of the statistical tests and procedures used in predictive analysis.

Applying statistical tests and procedures is time-consuming when the amount of data is large. Statistical analysis package programs help to avoid this loss of time. A number of statistical analysis package programs have emerged to meet the needs of researchers, students and businesses. In this study, the features of selected licensed and unlicensed programs are presented to researchers, and the statistical analysis programs are compared on the basis of this information.

2. Software

2.1. Open Source (Free) Statistical Analysis Package Programs

2.1.1. SAS University Edition

SAS Analytics has launched a free program called SAS University Edition for higher education in the context of educational support. SAS University Edition is a free edition that includes selected SAS products for learning and teaching statistics and quantitative methods. It provides easy access to statistical software for courses in fields such as research, engineering, social sciences, economics, computer science, trade, medicine and health. SAS University Edition is available in two versions: SAS Download, which runs locally on your machine, and AWS Marketplace, which requires an internet connection. Both versions work on the Windows, Linux and Mac OS X operating systems. The SAS University Edition package also includes access to the SAS Analytics community, where you can access e-learning classes, training videos, and more SAS resources. SAS University Edition consists of the Base SAS, SAS/STAT, SAS/IML, SAS Studio and SAS/Access components.

Base SAS provides the SAS programming language, Output Delivery System graphics (ODS Graphics) and reporting procedures. It also provides convenient data analysis through an intuitive, easy-to-learn 4GL syntax, including correlations, intuitive statistics, distribution and table analysis. The SAS Macro Facility reduces coding by modularizing work for easy reuse and maintenance, and Structured Query Language (SQL) is supported.

SAS/STAT offers analyses such as analysis of variance, Bayesian analysis, categorical data analysis, clustering, discriminant analysis, distribution analysis, group sequential design, multivariate analysis, psychometric analysis, regression and spatial analysis. It also offers methods and techniques such as descriptive statistics, exact methods, market research, missing value estimation, and power and sample size.

SAS/IML provides a matrix programming language for more specific analysis and data research, with matrix functions, linear algebra and statistical functions, time series functions, interactive data analysis, etc.

SAS Studio provides hundreds of SAS statements and procedures along with programming tools that improve productivity, such as autocomplete and built-in syntax help, and a browser-based interface for basic data processing and basic statistical tasks that builds SAS code.

SAS/Access provides easy connectivity between SAS and other file formats for data access and integration (Anonymous, 2018a).

An intuitive interface, a powerful programming language, comprehensive and reliable tools, a robust and flexible matrix programming language and a great user community are among the benefits and best features offered to users by SAS University Edition.

2.1.2. Paleontological statistics (PAST)

Paleontological Statistics (PAST) is free statistical software for data analysis, graph creation, data manipulation, univariate and multivariate statistics, time series, ecological analysis, morphometrics, stratigraphy and spatial analysis. PAST provides users with an easy-to-use analysis package that includes statistical, graphical and modeling functions. PAST is easy to use because it provides an interactive user interface, easy coding, and a spreadsheet-type data entry form in which Excel users can enter their data.

Users can fit many types of curves. Linear fitting with bootstrapping and permutation, Breusch-Pagan and Durbin-Watson tests, generalized linear models including logit regression, and log-log, logistic, polynomial, lin-log, von Bertalanffy and sum-of-sines models are some of the curve fitting types available in PAST. PAST provides a good platform for multivariate statistics including principal coordinates, principal components, non-metric multidimensional scaling, and much more. PAST also offers time series analysis features including spectral analysis, cross-correlation, autocorrelation, Walsh transform, wavelet transform, runs test, short-time Fourier transform, Markov chains, Mantel correlogram and periodogram. Directional statistics, angular mean with confidence interval, kernel density estimation of point density, rose plots, alignment detection and more are the geometric analysis methods offered by PAST. Parsimony analysis, including heuristic algorithms and Fitch, Dollo and Wagner characters, is also performed by PAST. PAST also has biostratigraphy features and gridding features including thin plate spline and moving average (Anonymous, 2018b).

Curve fitting, multivariate statistics, time series analysis and geometrical analysis are the best of the features offered by PAST to users.

... files as well as having popup windows that provide access to supported statistical analysis and transforms. Features can be used through interactive dialog boxes that display each command's options and required parameters. PSPP can generate high-quality graphics (box and whisker plots, normal probability plots and histograms) to help visualize the distribution of data.

2.1.3. MacAnova

MacAnova was written and developed by two statisticians from the Statistical School of the University of Minnesota, Gary W. OEHLERT