Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Generic Functions in R
Martin Gregory
Merck Serono
PhUSE 2013
Brussels, October 2013
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
1 Introduction
2 Objects in R
3 Generic functions
4 Do it yourself
5 Conclusion
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Preliminary definition
A generic function is one which may be applied to different types
of inputs producing results depending on the type of input.
Examples are plot() and summary()
We demonstrate these differences using sample data from
Dalgaard’s Introductory Statistics with R
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Sample data: melanom
> melanom
no status days ulc thick sex1 789 3 10 1 676 22 13 3 30 2 65 23 97 2 35 2 134 24 16 3 99 2 290 15 21 1 185 1 1208 2...
where
status 1=dead from melanoma, 2=alive, 3=dead from other
causes,
days is the observation time in days,
thick is the tumor thickness in 1/100mm and
sex 1=female, 2=male
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Kaplan-Meier survival function
Estimate the Kaplan-Meier survival function using sex as an
explanatory variable:
> surv.bysex <- survfit(Surv(days,status==1)~sex)
Surv() converts status to a suitable censoring variable
survfit() calculates the K-M survival function and associated statistics
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
View the survival function
> summary(surv.bysex)Call: survfit(formula = Surv(days, status == 1) ~ sex)
sex=1lower upper
time n.risk n.event survival std.err 95% CI 95% CI279 124 1 0.992 0.00803 0.976 1.000295 123 1 0.984 0.01131 0.962 1.000386 121 1 0.976 0.01384 0.949 1.000...
sex=2lower upper
time n.risk n.event survival std.err 95% CI 95% CI185 76 1 0.987 0.0131 0.962 1.000204 75 1 0.974 0.0184 0.938 1.000210 74 1 0.961 0.0223 0.918 1.000...
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
summary(melanom)
> summary(melanom)status days thick sex
Min. :1.000 Min. : 10 Min. : 10 Min. :1.0001st Qu.:1.000 1st Qu.:1525 1st Qu.: 97 1st Qu.:1.000Median :2.000 Median :2005 Median : 194 Median :1.000Mean :1.790 Mean :2153 Mean : 292 Mean :1.3853rd Qu.:2.000 3rd Qu.:3042 3rd Qu.: 356 3rd Qu.:2.000Max. :3.000 Max. :5565 Max. :1742 Max. :2.000
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
plot(melanom)
> plot(melanom)
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
plot(surv.bysex)
> plot(surv.bysex)
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Objects in R
The R Language Definition defines an object as a
specialized data structure which can be referred tothrough symbols or variables
The symbols are themselves objects and are accessible to programs.
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Built-in R objects
The most basic types of objects are vectors:
> x <- c(1,2,3)> y <- x-1> z <- y-1
All elements have the same data type.
we can view the contents of vectors:
> x;y;zx y z
[1] 1 [1] 0 [1] -1[2] 2 [2] 1 [2] 0[3] 3 [3] 2 [3] 1
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Built-in R objects
Matrices are two-dimensional vectors and can, for example, be
built by “stacking” vectors:
> mat <- cbind(x,y,z)
and we can examine these:
> matx y z
[1,] 1 0 -1[2,] 2 1 0[3,] 3 2 1
Arrays are n-dimensional vectors
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Built-in R objects
Lists are collections of objects of any type:
> lst <- list(x,y,mat,z )
and we can examine these:
> lst
[[1]] [[2]][1] 1 2 3 [1] 0 1 2
[[3]] [[4]]x y z [1] -1 0 1
[1,] 1 0 -1[2,] 2 1 0[3,] 3 2 1
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Structure of the survfit object
The basic type of an object is found by the typeof() function:
> typeof(surv.bysex)[1] "list"
The class of the object with class():
> class(surv.bysex)[1] "survfit"
The names of the elements of an object with names():
> names(surv.bysex)[1] "n" "time" "n.risk" "n.event" "n.censor"[6] "surv" "type" "strata" "std.err" "upper"
[11] "lower" "conf.type" "conf.int" "call"
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Contents of the survfit object
> str(surv.bysex,give.attr=FALSE,vec.len=3,give.length=FALSE)List of 14$ n : int 126 79$ time : num 99 232 279 295 355 386 469 667 ...$ n.risk : num 126 125 124 123 122 121 120 119 ...$ n.event : num 0 0 1 1 0 1 1 1 ...$ n.censor : num 1 1 0 0 1 0 0 0 ...$ surv : num 1 1 0.992 0.984 ...$ type : chr "right"$ strata : Named int 124 79$ std.err : num 0 0 0.0081 0.0115 ...$ upper : num 1 1 1 1 ...$ lower : num 1 1 0.976 0.962 ...$ conf.type: chr "log"$ conf.int : num 0.95$ call : language
survfit(formula = Surv(days, status == 1) ~ sex)
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Generic functions are simple
Generic functions are simple: the complete definition of plot() is a
single line:
plot <- function (x, y, ...) UseMethod("plot")
UseMethod
examines the class x of its first argument
searches for a function plot.x
executes plot.x with the arguments passed to plot
If plot.x is not found, UseMethod repeats the search using the next
class attribute, if any. If no class-specific function is found, it uses
plot.default
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Convert polar to cartesian
Plotting in R requires cartesian coordinates but we might have
data in polar form.
We define:
a class polar
a class cartes
a function xypos which takes either polar or cartesian coords
as input and returns cartesian
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Define classes
Create data frames for both cartesian and polar coordinates:
> pxy <- data.frame(x=c(1,2,3),y=c(1,2,3))> class(pxy) <- c(class(pxy),"cartes")
> qrt <- data.frame(r=c(sqrt(2),sqrt(8),sqrt(18)),theta=pi/4)> class(qrt) <- c(class(qrt),"polar")
These objects both have two classes:
> class(pxy);class(qrt)[1] "data.frame" "cartes"[1] "data.frame" "polar"
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Define classes
We can call the summary function
> summary(pxy)x y
Min. :1.0 Min. :1.01st Qu.:1.5 1st Qu.:1.5Median :2.0 Median :2.0Mean :2.0 Mean :2.03rd Qu.:2.5 3rd Qu.:2.5Max. :3.0 Max. :3.0
and see that the method for data frames is called.
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Define generic function
xypos <- function(z,...) UseMethod("xypos")
minimal number of parameters because all parameters
specified on the generic must appear on every method
use . . . for passing class-specific parameters
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Cartesian method
xypos.cartes <- function(z,x="x",y="y",names,...) {xy <- data.frame(z[c(x,y)])if (!missing(names)) colnames(xy) <- namesclass(xy) <- c(class(xy),"cartes")return(xy)
}
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Polar method
xypos.polar <- function(z,r="r",theta="theta",names,...) {xy <- data.frame(z[r]*cos(z[theta]),z[r]*sin(z[theta]))if (missing(names) || length(names)==0) names=c("x","y")if (length(names)==1) names[2]="y"colnames(xy) <- namesclass(xy) <- c(class(xy),"cartes")return(xy)
}
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Example calls
Passing cartesian coordinates:
> xypos(pxy) > xypos(pxy,names=(c("U","V")))x y U V
1 1 1 1 1 12 2 2 2 2 23 3 3 3 3 3
If we pass in polar co-ordinates:
> qrt > xypos(qrt)r theta x y
1 1.414214 0.7853982 1 1 12 2.828427 0.7853982 2 2 23 4.242641 0.7853982 3 3 3
Outline Introduction Objects in R Generic functions Do it yourself Conclusion
Conclusion
Generic functions
produce a sensible result based on the input object type
hide complexity from the user
reduce the amount of checking when programming
contribute to standardization
allow for the extension of existing functions to new objects