An Introduction to R

Longhow Lam
Under Construction, Feb-2010 (some sections are unfinished!)
longhowlam at gmail dot com

Contents

1 Introduction
  1.1 What is R?
  1.2 The R environment
  1.3 Obtaining and installing R
  1.4 Your first R session
  1.5 The available help
    1.5.1 The on line help
    1.5.2 The R mailing lists and the R Journal
  1.6 The R workspace, managing objects
  1.7 R Packages
  1.8 Conflicting objects
  1.9 Editors for R scripts
    1.9.1 The editor in RGui
    1.9.2 Other editors

2 Data Objects
  2.1 Data types
    2.1.1 Double
    2.1.2 Integer
    2.1.3 Complex
    2.1.4 Logical
    2.1.5 Character
    2.1.6 Factor
    2.1.7 Dates and Times
    2.1.8 Missing data and Infinite values
  2.2 Data structures
    2.2.1 Vectors
    2.2.2 Matrices
    2.2.3 Arrays
    2.2.4 Data frames
    2.2.5 Time-series objects
    2.2.6 Lists
    2.2.7 The str function

3 Importing data
  3.1 Text files
    3.1.1 The scan function
  3.2 Excel files
  3.3 Databases
  3.4 The Foreign package

4 Data Manipulation
  4.1 Vector subscripts
  4.2 Matrix subscripts
  4.3 Manipulating Data frames
    4.3.1 Extracting data from data frames
    4.3.2 Adding columns to a data frame
    4.3.3 Combining data frames
    4.3.4 Merging data frames
    4.3.5 Aggregating data frames
    4.3.6 Stacking columns of data frames
    4.3.7 Reshaping data
  4.4 Attributes
  4.5 Character manipulation
    4.5.1 The functions nchar, substring and paste
    4.5.2 Finding patterns in character objects
    4.5.3 Replacing characters
    4.5.4 Splitting characters
  4.6 Creating factors from continuous data

5 Writing functions
  5.1 Introduction
  5.2 Arguments and variables
    5.2.1 Required and optional arguments
    5.2.2 The '...' argument
    5.2.3 Local variables
    5.2.4 Returning an object
    5.2.5 The Scoping rules
    5.2.6 Lazy evaluation
  5.3 Control flow
    5.3.1 Tests with if and switch
    5.3.2 Looping with for, while and repeat
  5.4 Debugging your R functions
    5.4.1 The traceback function
    5.4.2 The warning and stop functions
    5.4.3 Stepping through a function
    5.4.4 The browser function

6 Efficient calculations
  6.1 Vectorized computations
  6.2 The apply and outer functions
    6.2.1 The apply function
    6.2.2 The lapply and sapply functions
    6.2.3 The tapply function
    6.2.4 The by function
    6.2.5 The outer function
  6.3 Using Compiled code
    6.3.1 The .C and .Fortran interfaces
    6.3.2 The .Call and .External interfaces
  6.4 Some Compiled Code examples
    6.4.1 The arsim example
    6.4.2 Using #include <R.h>
    6.4.3 Evaluating R expressions in C

7 Graphics
  7.1 Introduction
  7.2 More plot functions
    7.2.1 The plot function
    7.2.2 Distribution plots
    7.2.3 Two or more variables
    7.2.4 Graphical Devices
  7.3 Modifying a graph
    7.3.1 Graphical parameters
    7.3.2 Some handy low-level functions
    7.3.3 Controlling the axes
  7.4 Trellis Graphics
    7.4.1 Introduction
    7.4.2 Multi panel graphs
    7.4.3 Trellis panel functions
    7.4.4 Conditioning plots
  7.5 The ggplot2 package
    7.5.1 The qplot function
    7.5.2 Facetting
    7.5.3 Plots with several layers

8 Statistics
  8.1 Basic statistical functions
    8.1.1 Statistical summaries and tests
    8.1.2 Probability distributions and random numbers
  8.2 Regression models
    8.2.1 Formula objects
  8.3 Linear regression models
    8.3.1 Formula objects
    8.3.2 Modeling functions
    8.3.3 Multicollinearity
    8.3.4 Factor (categorical) variables as regression variables
  8.4 Logistic regression
    8.4.1 The modeling function glm
    8.4.2 Performance measures
    8.4.3 Predictive ability of a logistic regression
  8.5 Tree models
    8.5.1 An example of a tree model
  ...