Introduction to Programming

Session 1. Getting Started with R

Dr. Chao Zheng, Mathematical Sciences & S3RI

Introduction to the Introduction Sessions

The introduction sessions are designed for MSc Stats & OR students. During the sessions you will:
▶ learn the basics of R
▶ get a (good) impression of R and its usefulness in statistics
▶ see some examples of statistical analysis using R

Do NOT be afraid if you:
▶ have no experience in R;
▶ do not understand some of the R commands I demonstrate;
▶ do not understand the statistical concepts I mention.

You are NOT expected to become familiar with R after these two sessions!

Outline for Session 1 (Week 0)

1. R Introduction
▶ What is R
▶ Why use R
▶ Who is using R

2. First Taste of R
▶ Install R and RStudio
▶ Basic commands in R
▶ Features of RStudio
▶ Play with your first real-world dataset

1. R introduction

What is R?
▶ R is a language and a software environment for statistical computing and graphics.
▶ In that sense it is like Matlab, SAS, Excel, Stata, SPSS, etc.
▶ R is based on another programming language, S, which was created by John Chambers in 1976 while at Bell Labs. R's semantics were also inspired by Scheme.
▶ R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is developed by the R Core Team. The first version (v1.0) of R was released on 29 February 2000.
▶ Latest version of R at the time of writing: v4.0.3

Figure: Ross Ihaka (left) and Robert Gentleman (right)

Why Use R? Lingua Franca of Statistics

It is "THE" language for statisticians and data scientists.
▶ Nearly all statistical methods are implemented in R, from the elementary to the state of the art.
▶ The main features of R are:
  ◦ data handling and storage facilities;
  ◦ operators for matrix (and array) manipulation;
  ◦ data analysis tools;
  ◦ graphical facilities.

Why Use R? Language of the Future

It is quickly becoming one of the most popular (statistics) languages.

Figure: Popularity of different programming languages. R and Python both show growing trends. Note that most competitors in this figure are general-purpose programming languages.

Why Use R? Open-Source

R is an open-source programming language. This means that anyone can work with R without needing a license or paying a fee. Users can produce their own add-on packages implementing new methods and upload them to an online repository called CRAN, from which other users can download them.
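As a rough sketch of how an add-on package from CRAN is used in practice: the package name below is only an example chosen for illustration, not one prescribed by these slides. The install line is left commented out because it needs an internet connection.

```r
# Example package name; an assumption made for illustration only.
pkg <- "ggplot2"

# requireNamespace() returns TRUE if the package is already installed.
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (!is_installed) {
  message("Package '", pkg, "' is not installed yet.")
  # Uncomment the next line to download and install it from CRAN:
  # install.packages(pkg)
} else {
  # Attach the package so its functions can be called directly.
  library(pkg, character.only = TRUE)
  message("Package '", pkg, "' is loaded.")
}
```

A package only needs to be installed once, but it has to be loaded with library() in every new R session.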

Figure: Number of R packages created

Why Use R? Quality Graphs

R makes it easy to produce aesthetic, visually appealing graphs, which sets it apart from other programming languages.

Figure: A gorgeous chart created with an R package, showing the most trafficked cycle routes in London.

Why Use R? Other Reasons

▶ Compatibility: R is highly compatible and can be paired with many other programming languages like C, C++, Java, and Python.

▶ Easy to get help: When you have a question about R and cannot figure it out yourself, do not forget the resources available. Be aware that answers to your questions can often be found on websites such as StackOverflow and R-Bloggers.

▶ Eye-catching reports: With packages like Shiny and R Markdown, reporting analysis results together with the data, plots and R scripts is extremely easy. You can choose your report format flexibly among LaTeX, Word and web apps. For example, your MATH6166/6173 lab sheets and solutions are all created with R Markdown.

Who is using R?

Figure: Breakdown of the use of R by industry (left). Academia comes first, as R is a language for statistics-related research. R is also the first choice in the healthcare industry, followed by government and consulting. "Big" companies from different areas that are using R (right).

Figure: "The face you make when you create your first plot using R and proudly present it to a journalist."

Is R Difficult to Learn?

2. First taste of R and RStudio

Installing R on your own machine

1. Go to www.r-project.org
2. Click on download R
3. Choose a mirror. It is recommended that you choose either the 0-Cloud mirror or one of the UK mirrors.
4. Click on Download R for … depending on your operating system.
5. Install R by following the instructions.
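Once the installation has finished, one way to confirm that R works is to ask it which version is running. This is only a suggested check, not part of the official installation steps; the object names r_major and r_minor are made up for this sketch.

```r
# Print the version of R that is currently running.
print(R.version.string)

# More detail: R version, operating system and attached packages.
print(sessionInfo())

# Major and minor version numbers, stored as separate strings.
r_major <- R.version$major
r_minor <- R.version$minor
```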

First Taste of R

Open R and type the following commands:

print("Hello World")

Math operators in R:

2 + 2

6 - 3 + 2

3 * 4

4 / 2

log(10)

exp(3)

2^3

sqrt(1.5)

Vector and Matrix

Function to create a vector: c()

c(1, 2, 3, 4, 5, 6)

c(1:6)

c(1:100)

Function to create a matrix: matrix()

matrix(1:6, nrow=2, ncol=3, byrow=TRUE)

matrix(1:6, nrow=2, ncol=3, byrow=FALSE)

To recall a previous command, press the "up" arrow key.
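A short sketch of what you can do with a vector or matrix once it has been created. The object names v and m are chosen here for illustration. Note that 1:6 already produces a vector, so wrapping it in c() is optional.

```r
# A numeric vector; 1:6 is shorthand for c(1, 2, 3, 4, 5, 6).
v <- 1:6

length(v)   # number of elements: 6
v[2]        # second element: 2 (R starts counting at 1)
v * 2       # element-wise arithmetic: 2 4 6 8 10 12
sum(v)      # 21

# A 2 x 3 matrix filled row by row.
m <- matrix(v, nrow = 2, ncol = 3, byrow = TRUE)

dim(m)      # dimensions: 2 3
m[2, 3]     # element in row 2, column 3: 6
t(m)        # transpose, a 3 x 2 matrix
```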

Write Comments

Use # to write a comment:

# This is a comment that will not be run

matrix(1:6, nrow=2, ncol=3, byrow=TRUE) # matrix by row

matrix(1:6, nrow=2, ncol=3, byrow=FALSE) # matrix by column
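To see what the byrow argument changes, the two matrices can be stored and compared. This is a small sketch; the names by_row and by_col are made up for the example.

```r
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
by_col <- matrix(1:6, nrow = 2, ncol = 3, byrow = FALSE)

by_row[1, ]  # first row when filled by row: 1 2 3
by_col[1, ]  # first row when filled by column: 1 3 5

# TRUE and FALSE must be spelled exactly; a typo such as FLASE is
# treated as an unknown object name and causes an error.
identical(by_row, by_col)  # FALSE: the two layouts differ
```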

Help

Getting help with an R function: help()

help(matrix)

help(c)

help(print)

?matrix
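Besides help() and ?, a few other built-in functions can be handy when you do not remember the exact name or arguments of a function. The sketch below uses only functions that ship with R; the object name norm_functions is made up for the example.

```r
# Show the arguments of a function without opening its help page.
args(matrix)

# List objects whose names contain "norm" (for example rnorm and dnorm).
norm_functions <- apropos("norm")
print(norm_functions)

# Run the examples from a function's help page.
example(mean)

# Search all installed help pages for a keyword (?? is shorthand for this).
# help.search("matrix")
```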

Assign An Object

Use <- to assign values to an object
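As a minimal sketch of how assignment behaves (the names x and y are chosen for illustration): an object keeps its value until it is overwritten, and objects computed from it are not updated automatically.

```r
x <- 4        # create x with the value 4
y <- x * 2    # use x in an expression; y is 8

x <- 10       # overwrite x; y is NOT updated automatically
y             # still 8

# Names are case-sensitive: X does not exist just because x does.
exists("x")   # TRUE
exists("X")   # FALSE, unless an object called X was created earlier
```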

a <- 4

a

a <- 2 + 2

a

2 + 2 -> a

a

A <- 14

A

R is case-sensitive, so A and a are different objects. You can sometimes use the equals sign "=" to assign an object, but this is not recommended.

Generate Random Numbers

Try the following commands, which generate random number(s) from N(µ, σ²):

rnorm(1)

rnorm(10)

rnorm(10, mean=2, sd=3)

help(rnorm)

Try the following commands, which generate random numbers from Unif(a, b):

runif(1)

runif(10)

runif(10, min=2, max=3)
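Random numbers change every time the commands are run. A sketch of how to make them reproducible with set.seed() and check that the draws behave as expected is given below; the seed value 123 and the object names are arbitrary choices. Note that rnorm() takes the standard deviation σ through its sd argument, not the variance σ².

```r
# Fix the seed so the same "random" numbers are produced on every run.
set.seed(123)   # 123 is an arbitrary choice
x <- rnorm(1000, mean = 2, sd = 3)

mean(x)  # close to 2
sd(x)    # close to 3

# Resetting the seed reproduces exactly the same draws.
set.seed(123)
y <- rnorm(1000, mean = 2, sd = 3)
identical(x, y)  # TRUE

# Uniform draws always fall between min and max.
u <- runif(1000, min = 2, max = 3)
range(u)
```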

Summary

Sometimes we have a big table and just want a brief look at it. For this we can use the functions head(), tail() and summary():

a <- rnorm(100000)

head(a)

tail(a)

head(a, n=10L)

b <- matrix(rnorm(10000), nrow=100) # 100 * 100 matrix

head(b)

summary(a)

summary(b)
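To see what summary() reports, it can help to try it on a small vector where the answers are easy to check by hand. The numbers below are made up for this sketch.

```r
x <- c(2, 4, 4, 5, 7, 9, 11)   # small made-up data set

summary(x)        # Min, 1st Qu., Median, Mean, 3rd Qu., Max

# The same numbers computed one at a time.
min(x)            # 2
median(x)         # 5
mean(x)           # 6
max(x)            # 11
quantile(x, c(0.25, 0.75))   # 1st and 3rd quartiles

# For a matrix, summary() reports each column separately,
# and dim() gives the number of rows and columns.
m <- matrix(x[1:6], nrow = 3)
dim(m)            # 3 2
summary(m)
```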

Installing RStudio on your own machine

RStudio is a piece of software that provides a free and more user-friendly GUI (graphical user interface) to R. It does not replace R. Statistical analyses are still run by typing code! For RStudio to work, you need both R and RStudio installed on your machine.

You can install RStudio as follows:
1. Go to the RStudio website
2. Click on Download under RStudio
3. Click on Download under RStudio Desktop
4. Download the installer appropriate for your operating system.
5. Install RStudio by following the instructions.

First Taste of RStudio

Open RStudio and try the following:
1. Create an R script file: File, New File, R Script
2. Type the same commands you just tried in R
3. Select and run each of your commands
4. Clear your console/workspace
5. Save your R script file: File, Save (or Save As)
6. Customize your own RStudio layout and appearance
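Clearing the workspace (step 4) can also be done with commands rather than menus. A small sketch, using made-up object names demo_a and demo_b:

```r
demo_a <- 4
demo_b <- rnorm(10)

ls()                  # names of the objects currently in the workspace

rm(demo_a, demo_b)    # remove specific objects by name

# To remove every object in the workspace at once, you could run:
# rm(list = ls())

# In RStudio, Ctrl+L clears the console display only;
# it does not delete any objects.
```

Clearing the console and clearing the workspace are different things: the first only tidies what is shown on screen, while the second deletes the stored objects.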

Load your first Dataset: Pokemon

pokemon <- read.csv(file="Desktop/Pokemon.csv", header=TRUE)

head(pokemon)

str(pokemon)

summary(pokemon)
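If read.csv() fails, the file path is often the cause. The sketch below shows the same read-and-inspect workflow on a tiny table so that it runs without the Pokemon file. The table, its column names (Name, Type, Attack) and its values are invented for illustration and are not the real Pokemon data.

```r
# Write a tiny made-up table to a temporary CSV file.
toy <- data.frame(
  Name   = c("A", "B", "C"),
  Type   = c("Grass", "Fire", "Water"),
  Attack = c(49, 52, 48)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Check that the file exists before reading it; a wrong path is a
# common reason for read.csv() to fail.
file.exists(csv_path)

dat <- read.csv(file = csv_path, header = TRUE)

dim(dat)            # 3 rows, 3 columns
names(dat)          # column names
dat$Attack          # one column, accessed with $
mean(dat$Attack)    # average of a numeric column
table(dat$Type)     # counts for a categorical column

getwd()             # relative paths are resolved from this directory
```

The same functions (dim, names, $, mean, table) can be applied to the pokemon object once it has been loaded, using whatever column names that file actually contains.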

Tasks for You

1. Install R and RStudio on your own computer.
2. Try the commands you learned in this session.
3. Explore the Pokemon data a little using R.

To download the slides, R code and the Pokemon data for this session:
1. Go to blackboard.soton.ac.uk
2. Log in to your account
3. Select the module Statistical Computing or Statistical Computing for Data Scientists
4. Choose Course Content
5. Open the folder Introduction Session

If you are not able to open the module page on Blackboard, you can download the above materials from www.maths.lancs.ac.uk/~zhengc5/teaching.html.
