Introduction to the R Project for Statistical Computing for Use at ITC


D G Rossiter
University of Twente, Faculty of Geo-information Science & Earth Observation (ITC), Enschede (NL)
http://www.itc.nl/personal/rossiter
August 14, 2012

[Cover figures: "Actual vs. modelled straw yields" (Actual against Modelled); grain yield, lbs per plot, against column number; "Frequency histogram, Meuse lead concentration" (frequency against lead concentration, mg kg−1; counts shown above bar, actual values shown with rug plot); "GLS 2nd-order trend surface, subsoil clay %" (N against E)]

Version 4.0. Copyright © D G Rossiter 2003 – 2012. All rights reserved. Non-commercial reproduction and dissemination of the work as a whole freely permitted if this original copyright notice is included. To adapt or translate please contact the author.

Contents

0 If you are impatient ...
1 What is R?
2 Why R for ITC?
  2.1 Advantages
  2.2 Disadvantages
  2.3 Alternatives
    2.3.1 S-PLUS
    2.3.2 Statistical packages
    2.3.3 Special-purpose statistical programs
    2.3.4 Spreadsheets
    2.3.5 Applied mathematics programs
3 Using R
  3.1 R console GUI
    3.1.1 On your own Windows computer
    3.1.2 On the ITC network
    3.1.3 Running the R console GUI
    3.1.4 Setting up a workspace in Windows
    3.1.5 Saving your analysis steps
    3.1.6 Saving your graphs
  3.2 Working with the R command line
    3.2.1 The command prompt
    3.2.2 On-line help in R
  3.3 The RStudio development environment
  3.4 The Tinn-R code editor
  3.5 Writing and running scripts
  3.6 The Rcmdr GUI
  3.7 Loading optional packages
  3.8 Sample datasets
4 The S language
  4.1 Command-line calculator and mathematical operators
  4.2 Creating new objects: the assignment operator
  4.3 Methods and their arguments
  4.4 Vectorized operations and re-cycling
  4.5 Vector and list data structures
  4.6 Arrays and matrices
  4.7 Data frames
  4.8 Factors
  4.9 Selecting subsets
    4.9.1 Simultaneous operations on subsets
  4.10 Rearranging data
  4.11 Random numbers and simulation
  4.12 Character strings
  4.13 Objects and classes
    4.13.1 The S3 and S4 class systems
  4.14 Descriptive statistics
  4.15 Classification tables
  4.16 Sets
  4.17 Statistical models in S
    4.17.1 Models with categorical predictors
    4.17.2 Analysis of Variance (ANOVA)
  4.18 Model output
    4.18.1 Model diagnostics
    4.18.2 Model-based prediction
  4.19 Advanced statistical modelling
  4.20 Missing values
  4.21 Control structures and looping
  4.22 User-defined functions
  4.23 Computing on the language
5 R graphics
  5.1 Base graphics
    5.1.1 Mathematical notation in base graphics
    5.1.2 Returning results from graphics methods
    5.1.3 Types of base graphics plots
    5.1.4 Interacting with base graphics plots
  5.2 Trellis graphics
    5.2.1 Univariate plots
    5.2.2 Bivariate plots
    5.2.3 Trivariate plots
    5.2.4 Panel functions
    5.2.5 Types of Trellis graphics plots
    5.2.6 Adjusting Trellis graphics parameters
  5.3 Multiple graphics windows
    5.3.1 Switching between windows
  5.4 Multiple graphs in the same window
    5.4.1 Base graphics
    5.4.2 Trellis graphics
  5.5 Colours
6 Preparing your own data for R
  6.1 Preparing data directly in R
  6.2 A GUI data editor
  6.3 Importing data from a CSV file
  6.4 Importing images
7 Exporting from R
8 Reproducible data analysis
  8.1 The NoWeb document
  8.2 The LaTeX document
  8.3 The PDF document
  8.4 Graphics in Sweave
9 Learning R
  9.1 Task views
  9.2 R tutorials and introductions
  9.3 Textbooks using R
  9.4 Technical notes using R
  9.5 Web Pages to learn R
  9.6 Keeping up with developments in R
10 Frequently-asked questions
  10.1 Help! I got an error, what did I do wrong?
  10.2 Why didn’t my command(s) do what I expected?
  10.3 How do I find the method to do what I want?
  10.4 Memory problems
  10.5 What version of R am I running?
  10.6 What statistical procedure should I use?
A Obtaining your own copy of R
  A.1 Installing new packages
  A.2 Customizing your installation
  A.3 R in different human languages
B An example script
C An example function
References
Index of R concepts

List of Figures

1 The RStudio screen
2 The Tinn-R screen
3 The R Commander screen
4 Regression diagnostic plots
5 Finding the closest point
6 Default scatterplot
7 Plotting symbols
8 Custom scatterplot
9 Scatterplot with math symbols, legend and model lines
10 Some interesting base graphics plots
11 Trellis density plots
12 Trellis scatter plots
13 Trellis trivariate plots
14 Trellis scatter plot with some added elements
15 Available colours
16 Example of a colour ramp
17 R graphical data editor
18 Example PDF produced by Sweave and LaTeX
19 Results of an RSeek search
20 Results of an R site search
21 Visualising the variability of small random samples

List of Tables

1 Methods for adding to an existing base graphics plot
2 Base graphics plot types
3 Trellis graphics plot types
4 Packages in the base R distribution for Windows

0 If you are impatient ...

1. Install R and RStudio on your MS-Windows, Mac OS/X or Linux system (§A);
2. Run RStudio; this will automatically start R within it;
3. Follow one of the tutorials (§9.2) such as my “Using the R Environment for Statistical Computing: An example with the Mercer & Hall wheat yield dataset” [48];
4. Experiment!
5. Use this document as a reference.

1 What is R?

R is an open-source environment for statistical computing and visualisation. It is based on the S language developed at Bell Laboratories in the 1980s [20], and is the product of an active movement among statisticians for a powerful, programmable, portable, and open computing environment, applicable to the most complex and sophisticated problems, as well as “routine” analysis, without any restrictions on access or use.
Here is a description from the R Project home page: “R is an integrated suite of software facilities