Generic Functions in R
Martin Gregory, Merck Serono
PhUSE 2013, Brussels, October 2013

Outline

1 Introduction
2 Objects in R
3 Generic functions
4 Do it yourself
5 Conclusion

Preliminary definition

A generic function is one which may be applied to different types of inputs, producing results that depend on the type of input.

Examples are plot() and summary().

We demonstrate these differences using sample data from Dalgaard’s Introductory Statistics with R.

Sample data: melanom

    > melanom
       no status days ulc thick sex
    1 789      3   10   1   676   2
    2  13      3   30   2    65   2
    3  97      2   35   2   134   2
    4  16      3   99   2   290   1
    5  21      1  185   1  1208   2
    ...

where status 1=dead from melanoma, 2=alive, 3=dead from other causes; days is the observation time in days; thick is the tumor thickness in 1/100 mm; and sex 1=female, 2=male.

Kaplan-Meier survival function

Estimate the Kaplan-Meier survival function using sex as an explanatory variable:

    > surv.bysex <- survfit(Surv(days,status==1)~sex)

- Surv() converts status to a suitable censoring variable
- survfit() calculates the K-M survival function and associated statistics

View the survival function

    > summary(surv.bysex)
    Call: survfit(formula = Surv(days, status == 1) ~ sex)

                    sex=1
     time n.risk n.event survival std.err lower 95% CI upper 95% CI
      279    124       1    0.992 0.00803        0.976        1.000
      295    123       1    0.984 0.01131        0.962        1.000
      386    121       1    0.976 0.01384        0.949        1.000
     ...

                    sex=2
     time n.risk n.event survival std.err lower 95% CI upper 95% CI
      185     76       1    0.987  0.0131        0.962        1.000
      204     75       1    0.974  0.0184        0.938        1.000
      210     74       1    0.961  0.0223        0.918        1.000
     ...
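The preliminary definition can also be tried without the survival package or the full melanom data. The following is a minimal sketch using base R only; the object names num, fac and df are assumptions introduced for illustration, and the values are the thick and sex entries of the five melanom rows printed earlier.

```r
# Hypothetical objects built from the five melanom rows shown on the slide.
num <- c(676, 65, 134, 290, 1208)                    # thick
fac <- factor(c(2, 2, 2, 1, 2),
              labels = c("female", "male"))          # sex: 1=female, 2=male
df  <- data.frame(thick = num, sex = fac)

# One generic, three kinds of input, three different kinds of result.
s_num <- summary(num)   # numeric vector: handled by summary.default
s_fac <- summary(fac)   # factor: handled by summary.factor (level counts)
s_df  <- summary(df)    # data frame: handled by summary.data.frame

print(s_num)
print(s_fac)
print(s_df)

# getS3method() from package utils looks up the method for a given class.
m <- getS3method("summary", "data.frame")
print(is.function(m))
```

The call is always summary(); only the class of the argument decides which method does the work.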
summary(melanom)

    > summary(melanom)
         status           days          thick           sex
     Min.   :1.000   Min.   :  10   Min.   :  10   Min.   :1.000
     1st Qu.:1.000   1st Qu.:1525   1st Qu.:  97   1st Qu.:1.000
     Median :2.000   Median :2005   Median : 194   Median :1.000
     Mean   :1.790   Mean   :2153   Mean   : 292   Mean   :1.385
     3rd Qu.:2.000   3rd Qu.:3042   3rd Qu.: 356   3rd Qu.:2.000
     Max.   :3.000   Max.   :5565   Max.   :1742   Max.   :2.000

plot(melanom)

    > plot(melanom)

[Figure: scatterplot matrix of the data frame columns]

plot(surv.bysex)

    > plot(surv.bysex)

[Figure: Kaplan-Meier survival curves, one per sex]

Objects in R

The R Language Definition defines an object as a specialized data structure which can be referred to through symbols or variables.

The symbols are themselves objects and are accessible to programs.

Built-in R objects

The most basic types of objects are vectors:

    > x <- c(1,2,3)
    > y <- x-1
    > z <- y-1

All elements have the same data type.
We can view the contents of vectors:

    > x;y;z
      x        y        z
    [1] 1    [1] 0    [1] -1
    [2] 2    [2] 1    [2]  0
    [3] 3    [3] 2    [3]  1

Built-in R objects

Matrices are two-dimensional vectors and can, for example, be built by “stacking” vectors:

    > mat <- cbind(x,y,z)

and we can examine these:

    > mat
         x y  z
    [1,] 1 0 -1
    [2,] 2 1  0
    [3,] 3 2  1

Arrays are n-dimensional vectors.

Built-in R objects

Lists are collections of objects of any type:

    > lst <- list(x,y,mat,z)

and we can examine these:

    > lst
    [[1]]
    [1] 1 2 3

    [[2]]
    [1] 0 1 2

    [[3]]
         x y  z
    [1,] 1 0 -1
    [2,] 2 1  0
    [3,] 3 2  1

    [[4]]
    [1] -1  0  1

Structure of the survfit object

The basic type of an object is found with the typeof() function:

    > typeof(surv.bysex)
    [1] "list"

The class of the object with class():

    > class(surv.bysex)
    [1] "survfit"

The names of the elements of an object with names():

    > names(surv.bysex)
     [1] "n"         "time"      "n.risk"    "n.event"   "n.censor"
     [6] "surv"      "type"      "strata"    "std.err"   "upper"
    [11] "lower"     "conf.type" "conf.int"  "call"

Contents of the survfit object

    > str(surv.bysex,give.attr=FALSE,vec.len=3,give.length=FALSE)
    List of 14
     $ n        : int 126 79
     $ time     : num 99 232 279 295 355 386 469 667 ...
     $ n.risk   : num 126 125 124 123 122 121 120 119 ...
     $ n.event  : num 0 0 1 1 0 1 1 1 ...
     $ n.censor : num 1 1 0 0 1 0 0 0 ...
     $ surv     : num 1 1 0.992 0.984 ...
     $ type     : chr "right"
     $ strata   : Named int 124 79
     $ std.err  : num 0 0 0.0081 0.0115 ...
     $ upper    : num 1 1 1 1 ...
     $ lower    : num 1 1 0.976 0.962 ...
     $ conf.type: chr "log"
     $ conf.int : num 0.95
     $ call     : language survfit(formula = Surv(days, status == 1) ~ sex)

Generic functions are simple

Generic functions are simple: the complete definition of plot() is a single line:

    plot <- function (x, y, ...) UseMethod("plot")

UseMethod

- examines the class x of its first argument
- searches for a function plot.x
- executes plot.x with the arguments passed to plot

If plot.x is not found, UseMethod repeats the search using the next element of the class attribute, if any. If no class-specific function is found, it uses plot.default.

Convert polar to cartesian

Plotting in R requires cartesian coordinates, but we might have data in polar form. We define:

- a class polar
- a class cartes
- a function xypos which takes either polar or cartesian coordinates as input and returns cartesian coordinates

Define classes

Create data frames for both cartesian and polar coordinates:

    > pxy <- data.frame(x=c(1,2,3),y=c(1,2,3))
    > class(pxy) <- c(class(pxy),"cartes")
    > qrt <- data.frame(r=c(sqrt(2),sqrt(8),sqrt(18)),theta=pi/4)
    > class(qrt) <- c(class(qrt),"polar")

These objects both have two classes:

    > class(pxy);class(qrt)
    [1] "data.frame" "cartes"
    [1] "data.frame" "polar"

Define classes

We can call the summary function

    > summary(pxy)
           x             y
     Min.   :1.0   Min.   :1.0
     1st Qu.:1.5   1st Qu.:1.5
     Median :2.0   Median :2.0
     Mean   :2.0   Mean   :2.0
     3rd Qu.:2.5   3rd Qu.:2.5
     Max.   :3.0   Max.   :3.0

and see that the method for data frames is called.

Define generic function

    xypos <- function(z,...) UseMethod("xypos")

- keep the number of parameters minimal, because all parameters specified on the generic must appear on every method
- use ... for passing class-specific parameters

Cartesian method

    xypos.cartes <- function(z, x = "x", y = "y", names, ...) {
      # select the two coordinate columns by name
      xy <- data.frame(z[c(x, y)])
      # rename the columns only if names were supplied
      if (!missing(names)) colnames(xy) <- names
      class(xy) <- c(class(xy), "cartes")
      return(xy)
    }

Polar method

    xypos.polar <- function(z, r = "r", theta = "theta", names, ...) {
      # x = r*cos(theta), y = r*sin(theta)
      xy <- data.frame(z[r] * cos(z[theta]), z[r] * sin(z[theta]))
      # default column names; a single supplied name is used for x only
      if (missing(names) || length(names) == 0) names <- c("x", "y")
      if (length(names) == 1) names[2] <- "y"
      colnames(xy) <- names
      class(xy) <- c(class(xy), "cartes")
      return(xy)
    }

Example calls

Passing cartesian coordinates:

    > xypos(pxy)
      x y
    1 1 1
    2 2 2
    3 3 3
    > xypos(pxy,names=(c("U","V")))
      U V
    1 1 1
    2 2 2
    3 3 3

If we pass in polar coordinates:

    > qrt
             r     theta
    1 1.414214 0.7853982
    2 2.828427 0.7853982
    3 4.242641 0.7853982
    > xypos(qrt)
      x y
    1 1 1
    2 2 2
    3 3 3

Conclusion

Generic functions

- produce a sensible result based on the input object type
- hide complexity from the user
- reduce the amount of checking when programming
- contribute to standardization
- allow for the extension of existing functions to new objects.
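The last point, extending an existing function to a new object, might look like the following sketch. The method summary.polar and the fields it returns are assumptions made for illustration and are not part of the slides. Note also that the new class is placed first in the class vector here, whereas the slides append it: with the slides' order, summary() finds summary.data.frame first, as the summary(pxy) example shows.

```r
# Polar data as on the "Define classes" slide.
qrt <- data.frame(r = c(sqrt(2), sqrt(8), sqrt(18)), theta = pi / 4)

# Assumption: put "polar" FIRST so that UseMethod() tries summary.polar
# before falling back to summary.data.frame.
class(qrt) <- c("polar", class(qrt))

# Hypothetical method extending the existing generic summary().
summary.polar <- function(object, ...) {
  out <- list(n           = nrow(object),           # number of points
              r.range     = range(object$r),        # smallest and largest radius
              theta.range = range(object$theta))    # smallest and largest angle
  class(out) <- "summary.polar"
  out
}

res <- summary(qrt)            # dispatches to summary.polar
print(class(res))

# The object is still a data frame, so data-frame methods remain available.
print(inherits(qrt, "data.frame"))
```

Nothing in summary() itself had to change; defining a function with the name summary.polar is enough for the dispatch mechanism to pick it up.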