Data Visualization Using R
Sameer Bamnote
Cytel Statistical Software & Services Pvt. Ltd., India
July 19, 2014
Sameer Bamnote (PhUSE SDE 2014) Data Visualization Using R July 19, 2014 1 / 41
Disclaimer
Any views or opinions presented in this presentation are solely those of the author and do not necessarily represent those of the company.
Outline
1 Introduction
2 About R
3 Graphics in R
  Base graphics
  Grid graphics
  Lattice graphics
  ggplot2 graphics
4 Illustrations
  Multiple figures in Single Panel
  Errorbar plot
  Kaplan Meier Survival Curve
  Forest plot using R package metafor
  Forest plot using R package ggplot2
5 Summary
6 Conclusion
Introduction
Data visualization is the study of the visual representation of data.
It presents data in a pictorial or graphical format.
Data in spreadsheets is hard to visualize.
It helps to see analytical results presented visually, find what is relevant among a large number of variables, communicate concepts and hypotheses to others, and even support prediction.
Data visualization makes interpretation easier.
About R
R is open-source software.
It is available for free and supported by a strong user community.
The current R is the result of a collaborative effort with contributions from all over the world.
It was developed by Ross Ihaka and Robert Gentleman at the University of Auckland.
It is highly extensible and flexible.
It implements modern statistical methods.
Graphics in R
R is a powerful environment for data visualization
Integrated graphics
Good quality of graphics
Full control over graphics
Graphs can be reproduced
A huge number of R packages for graphics are available
R graphs can be viewed on screen
R graphs can be saved in various formats such as pdf, png, jpg, wmf, tiff and ps
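As a minimal sketch of saving a graph to a file: the base grDevices functions pdf() and postscript() open a file device, and dev.off() closes it. png(), jpeg() and tiff() follow the same pattern where the R build supports them. The data and the temporary file names below are illustrative assumptions, not part of the original slides.

```r
# Hypothetical data, used only for illustration
set.seed(1)
x <- runif(20)
y <- runif(20)

# Temporary output files (assumed names; any writable path works)
pdf_file <- tempfile(fileext = ".pdf")
ps_file  <- tempfile(fileext = ".ps")

pdf(pdf_file, width = 6, height = 4)   # open a PDF device (size in inches)
plot(x, y, pch = 20, col = "red", main = "Saved as PDF")
dev.off()                              # closing the device completes the file

postscript(ps_file)                    # open a PostScript device
plot(x, y, pch = 20, col = "blue", main = "Saved as PostScript")
dev.off()

file.exists(c(pdf_file, ps_file))      # check that both files were written
```

The same plotting code is reused for each format; only the device function changes.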
Graphical Environments in R
Low-Level Graphics
Base graphics
Grid graphics
High-Level Graphics
lattice
ggplot2
Base graphics
Plotting functions:
plot - x-y plotting
barplot - bar plots
boxplot - box & whisker plot
hist - histogram
pie - pie charts
dotchart - dot plots
image, heatmap, contour, persp - functions to generate image-like plots
qqnorm, qqline, qqplot - distribution comparison plots
pairs, coplot - display of multivariate data
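A few of the functions listed above can be tried on the mtcars data set that ships with R. This is only a sketch: the choice of data set and variables is an assumption made for illustration. Capturing the return values is optional, since each call also draws its plot.

```r
# mtcars ships with R (datasets package)
data(mtcars)

# Histogram of fuel consumption; hist() also returns the bin counts
h <- hist(mtcars$mpg, main = "Histogram of mpg", xlab = "Miles per gallon")

# Box-and-whisker plot of mpg for each number of cylinders
bp <- boxplot(mpg ~ cyl, data = mtcars,
              xlab = "Cylinders", ylab = "Miles per gallon")

# Bar plot of the number of cars per cylinder count
counts <- table(mtcars$cyl)
mids <- barplot(counts, xlab = "Cylinders", ylab = "Number of cars")

# Normal Q-Q plot with a reference line
qqnorm(mtcars$mpg)
qqline(mtcars$mpg)
```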
Scatter plot: Basic
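R code
> set.seed(1234)
> # Assumed reconstruction: the definition of y was lost in extraction; 10 rows (a-j), columns A-C and uniform values match the plots
> y <- matrix(runif(30), ncol=3, dimnames=list(letters[1:10], LETTERS[1:3]))
> plot(y[,1], y[,2])
[Figure: scatter plot of y[, 1] (x-axis) against y[, 2] (y-axis)]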
Scatter plot: Pairs
R code
> pairs(y)
[Figure: scatter plot matrix of the columns A, B and C of y]
Scatter plot: with label
R code
> plot(y[,1], y[,2], pch=20, col="red", main="Symbols and Labels")
> text(y[,1]+0.03, y[,2], rownames(y))
[Figure: "Symbols and Labels": scatter plot of y[, 1] against y[, 2] with red points labelled a to j]
Scatter plot: More Examples
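Using important plotting parameters
> grid(5, 5, lwd = 2)  # adds a grid to the plot already on the device
> op <- par(mar=c(6,6,5,3))  # assumed values: the original par() call was lost in extraction
> plot(y[,1], y[,2], type="p", col="red", cex.lab=1.2, cex.axis=1.2,
+      cex.main=1.2, cex.sub=1, lwd=4, pch=20, xlab="x label",
+      ylab="y label", main="My Main", sub="My Sub")
> par(op)  # restore the previous graphical parameters
Used arguments
mar - specifies the margin sizes around the plotting area: c(bottom, left, top, right)
col - color of symbols
pch - type of symbols, samples: example(points)
lwd - line width (for symbols, the width of the outline)
cex.* - control font sizes (cex.lab, cex.axis, cex.main, cex.sub)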
Scatter plot: More examples
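Adding a regression line
> plot(y[,1], y[,2])
> myline <- lm(y[,2] ~ y[,1])  # assumed reconstruction: the middle of this line was lost in extraction
> abline(myline)
> summary(myline)
Plot on log scale
> plot(y[,1], y[,2], log="xy")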
Line plot
R code
> plot(y[,1], type="l", lwd=2, col="blue")
[Figure: line plot of y[, 1] (y-axis) against Index (x-axis, 1 to 10)]
Line plot with multiple variables
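R code
> split.screen(c(1,1))  # screen() below needs a screen to be set up first
> plot(y[,1], ylim=c(0,1), xlab="Measurement", ylab="Intensity", type="l", lwd=2, col=1)
> for(i in 2:length(y[1,])) {
+   screen(1, new=FALSE)  # keep drawing on the same screen without erasing it
+   plot(y[,i], ylim=c(0,1), type="l", lwd=2, col=i, xaxt="n", yaxt="n", ylab="",
+        xlab="", main="", bty="n")
+ }
> close.screen(all.screens=TRUE)
[Figure: line plot of Intensity (0 to 1) against Measurement (1 to 10), one line per column of y]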
More Functions in Base graphics
Graph Type      R function
Barplot         barplot
Histogram       hist
Density plot    density
Pie Chart       pie
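Unlike barplot, hist and pie, density() only computes a kernel density estimate; the curve appears once the result is passed to plot(). A minimal sketch, with a simulated sample as the assumed input:

```r
# Hypothetical sample, used only for illustration
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)

d <- density(x)                       # kernel density estimate (nothing is drawn yet)
plot(d, main = "Density plot", xlab = "x", lwd = 2, col = "blue")
rug(x)                                # mark the individual observations on the x-axis
```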
Grid Graphics Environment
What is grid?
Low-level graphics system
Highly flexible and controllable system
Does not provide high-level functions
Intended as development environment for custom plotting functions
Pre-installed on new R distributions
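To give a feel for the low-level style of grid, the sketch below draws a few primitives inside a viewport. The viewport name, sizes and coordinates are arbitrary assumptions chosen for illustration.

```r
library(grid)

grid.newpage()                                     # start a clean page
vp <- viewport(x = 0.5, y = 0.5, width = 0.8, height = 0.6,
               name = "panel")                     # a region covering part of the page
pushViewport(vp)                                   # make it the current drawing context

grid.rect(gp = gpar(col = "grey40", fill = "grey95"))   # frame of the region
grid.points(x = unit(c(0.2, 0.5, 0.8), "npc"),
            y = unit(c(0.3, 0.7, 0.5), "npc"),
            pch = 20, gp = gpar(col = "red"))           # three points in relative units
grid.text("Drawn with grid primitives", y = unit(0.9, "npc"))

current_name <- current.viewport()$name            # name of the viewport being drawn in
popViewport()                                      # return to the top-level viewport
```

Everything is positioned explicitly, which is why grid suits building custom plotting functions rather than quick one-line plots.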
Lattice
What is lattice?
High-level graphics system
Developed by Deepayan Sarkar
Implements Trellis graphics system from S-Plus
Simplifies high-level plotting tasks, arranging complex graphical features
Syntax similar to R's base graphics
Lattice plot example
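R code
> library(lattice)
> # Assumed reconstruction: the xyplot() call was lost in extraction; it matches the 1:8 axes and the four panels A-D
> p1 <- xyplot(1:8 ~ 1:8 | rep(LETTERS[1:4], each=2), as.table=TRUE)
> plot(p1)
[Figure: lattice plot of 1:8 against 1:8 in four panels labelled A, B, C and D]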
Lattice plot: Density plot
R code
> data(Chem97, package = "mlmRev")
> densityplot(~ gcsescore | factor(score), data = Chem97, groups = gender,
+             plot.points = FALSE, auto.key = TRUE)
[Figure: density of gcsescore by gender (M, F), one panel per score level (0, 2, 4, 6, 8, 10)]
ggplot2
What is ggplot2?
High-level graphics system
Implements grammar of graphics from Leland Wilkinson
Streamlines many graphics workflows for complex plots
Syntax centered around the main ggplot function
Simpler qplot function provides many shortcuts
ggplot function accepts two arguments:
(1) Data set to be plotted
(2) Aesthetic mappings provided by the aes function
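The two arguments can be seen in a short sketch. It assumes the ggplot2 package is installed and uses a random subsample of its diamonds data; the sample size of 200 and the choice of layers are assumptions made for illustration, not the code from the slides.

```r
library(ggplot2)

set.seed(1234)
# Hypothetical subsample of the diamonds data that ships with ggplot2
dsmall <- diamonds[sample(nrow(diamonds), 200), ]

# Argument 1: the data set; argument 2: the aesthetic mappings built with aes()
p <- ggplot(dsmall, aes(x = carat, y = price, colour = color)) +
  geom_point() +                                    # layer 1: one point per diamond
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x,
              se = FALSE, colour = "black")         # layer 2: a single fitted line
print(p)
```

Layers added with + inherit the data and mappings given to ggplot(), so each geom only states what differs.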
qplot function
qplot syntax is similar to R's basic plot function
Arguments:
x: x-coordinates (e.g. col1)
y: y-coordinates (e.g. col2)
data: data frame with corresponding column names
xlim, ylim: e.g. xlim=c(0,10)
log: e.g. log="x" or log="xy"
main: main title; see ?plotmath for mathematical formula
xlab, ylab: labels for the x- and y-axes
color, shape, size
...: many arguments accepted by plot function
qplot: Scatterplot
R code (the assignment that creates p was lost in extraction)
> print(p)
[Figure: "Dot Size and Color Relative to Some Values": scatter plot of y against x (both 2 to 10); point color shows cat (A, B) and point size shows x]
qplot: Scatterplot with Regression Line
R code (the right-hand sides of the dsmall and p assignments were lost in extraction)
> set.seed(1234)
> dsmall <- ...
> p <- ...
> print(p)
[Figure: scatter plot of price (0 to 15000) against carat (0.5 to 2.5) with a regression line]
ggplot: Scatterplot
R code (the assignment that creates p was lost in extraction)
> print(p)
[Figure: scatter plot of price (5000 to 15000) against carat (0.5 to 2.5), points colored by diamond color (D, E, F, G, H, I, J)]
ggplot: Scatterplot with Regression Line
R code (the assignment that creates p was lost in extraction)
> print(p)
ggplot: Jitter Plot
R code (the assignment that creates p was lost in extraction)
> print(p)
[Figure: jitter plot of price/carat (2000 to 10000) for each diamond color (D, E, F, G, H, I, J), points colored by color]
Multiple Figures in Single Panel
Requirement - Multiple figures in a single panel
[Figure: four panels (Plot 1 to Plot 4), each showing Individual Values (y-axis, 0.0 to 0.8) against Treatment (x-axis: Trt A, Trt B)]
Multiple graphs in Single Panel
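R Code
# a (individual values) and b (treatment positions) are defined outside this snippet
pdf("D:/Example1.pdf")
par(mfrow=c(2,2)) # Divide the plotting area into a 2 x 2 grid
# Plot 1: values at their treatment positions
plot(b, a, xaxt = "n", xlab = "Treatment", ylab = "Individual Values",
     xlim = c(1,5), main = "Plot 1", cex = 1.5, col = "blue", pch = "o")
axis(1, at = c(2,4), labels = c("Trt A", "Trt B"))
# Plot 2: same data with horizontal jitter to separate overlapping points
plot(jitter(b, amount = 0.2), a, xaxt = "n", xlab = "Treatment",
     ylab = "Individual Values", xlim = c(1,5), main = "Plot 2", cex = 1.5,
     col = "blue", pch = "o")
axis(1, at = c(2,4), labels = c("Trt A", "Trt B"))
# The code for Plots 3 and 4 is not shown in the source
dev.off()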
Errorbar Plot
Using the built-in R data set ToothGrowth, with its variables renamed to rep, trt and time.
Errorbar plot using lineplot.CI from the sciplot package in R
[Figure: "Errorbar plot using lineplot.CI function in sciplot package": Value (MEAN +/- SE), 5 to 30, against Time (Hours) at 0.5, 1 and 2, one line each for Trt A and Trt B]
Errorbar plot using sciplot package
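R Code
library(sciplot)  # provides lineplot.CI
# tg is the renamed ToothGrowth data frame (columns rep, trt, time)
lineplot.CI(time, rep, group = trt, data = tg, cex = 1.5,
            xlab = "Time (Hours)", ylab = "Value (MEAN +/- SE)",
            cex.lab = 1.3, x.leg = 1, y.leg = 30, col = c("red","dark green"),
            pch = c(16,16), ylim = c(5,30), err.width = 0.05, xaxt = "n", lwd = 2)
# Draw the x-axis by hand: the three time levels sit at positions 1, 2 and 3
axis(1, at = c(1,2,3), labels = c(0.5, 1, 2))
title("Errorbar plot using lineplot.CI function in sciplot package")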
Errorbar plot using ggplot2 package
Errorbar plot using the ggplot2 package in R
[Figure: "Errorbar plot in ggplot2 package": Value (Mean +/- SE), 5 to 30, against Time (Hours), 0.5 to 2.0, one line each for Trt A and Trt B (legend: trt)]
R Code
library(ggplot2)
# summary is a data frame with one row per trt and time, holding the mean (rep) and standard error (se)
ggplot(summary, aes(x=time, y=rep, colour=trt)) +
  geom_errorbar(aes(ymin=rep-se, ymax=rep+se), width=.05, lwd = 0.8) +
  geom_line(lwd=0.8) +
  geom_point(cex=3) + xlab("Time (Hours)") + ylab("Value (Mean +/- SE)") +
  ggtitle("Errorbar plot in ggplot2 package") +
  theme_bw() +
  scale_y_continuous(limits=c(5,30), breaks=0:30*5)
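The slides do not show how the data frame of means and standard errors is built. One way to do it with base R is sketched below. The mapping of the ToothGrowth columns (len to rep, supp to trt, dose to time) is inferred from the axis values in the figures, and the name summary_df is a hypothetical choice that avoids masking the base function summary().

```r
# ToothGrowth ships with R; the column mapping below is an assumption
tg <- data.frame(rep  = ToothGrowth$len,
                 trt  = ToothGrowth$supp,
                 time = ToothGrowth$dose)

# Mean of rep for every trt x time combination
means <- aggregate(rep ~ trt + time, data = tg, FUN = mean)

# Standard error of the mean: sd / sqrt(n)
ses <- aggregate(rep ~ trt + time, data = tg,
                 FUN = function(v) sd(v) / sqrt(length(v)))
names(ses)[names(ses) == "rep"] <- "se"

# One row per group with columns trt, time, rep (mean) and se
summary_df <- merge(means, ses, by = c("trt", "time"))
summary_df
```

A data frame of this shape can be passed as the first argument of the ggplot() call, and tg can be used as the data argument of lineplot.CI.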
R Code (truncated in the source after "pd")
pd
[Figure: "Errorbar plot with Jittering in ggplot2 package": Value (Mean +/- SE), 5 to 30, against Time (Hours), 0.5 to 2.0, one line each for Trt A and Trt B (legend: trt)]
Comparison of Kaplan Meier Survival Curves
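R Code
> library(survival)  # provides Surv() and survfit()
> library(ISwR)
> # Assumed reconstruction: the assignment was lost and the plot call is truncated in the source; the melanom data and formula are inferred from the legend and axis range
> fit.bysex <- survfit(Surv(days, status == 1) ~ sex, data = melanom)
> plot(fit.bysex, conf.int=TRUE, col=c("red","blue"), lty=1:2, main="Comparing Survival Curves by Sex")
> legend(1000, 0.2, c("Male","Female"), lty=c(2,1), lwd=c(1,1), col=c("blue","red"))
[Figure: "Comparing Survival Curves by Sex": Kaplan-Meier curves with confidence bands, survival probability (0.0 to 1.0) against time (0 to 5000), Male and Female]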
Forest plot using R package metafor
[Figure: forest plot; x-axis: Relative Risk on a log scale with ticks at 0.05, 0.25, 1.00 and 4.00. The values printed in the figure are tabulated below.]

                               Vaccinated        Control
Author(s) and Year             TB+     TB-       TB+     TB-      Relative Risk [95% CI]

Systematic Allocation
Comstock et al, 1976            27   16886        29   17825      0.98 [0.58, 1.66]
Comstock & Webster, 1969         5    2493         3    2338      1.56 [0.37, 6.53]
Comstock et al, 1974           186   50448       141   27197      0.71 [0.57, 0.89]
Rosenthal et al, 1961           17    1699        65    1600      0.25 [0.15, 0.43]
RE Model for Subgroup                                             0.65 [0.32, 1.32]

Random Allocation
Coetzee & Berjak, 1968          29    7470        45    7232      0.63 [0.39, 1.00]
TPT Madras, 1980               505   87886       499   87892      1.01 [0.89, 1.14]
Vandiviere et al, 1973           8    2537        10     619      0.20 [0.08, 0.50]
Hart & Sutherland, 1977         62   13536       248   12619      0.24 [0.18, 0.31]
Rosenthal et al, 1960            3     228        11     209      0.26 [0.07, 0.92]
Ferguson & Simes, 1949           6     300        29     274      0.20 [0.09, 0.49]
Aronson, 1948                    4     119        11     128      0.41 [0.13, 1.26]
RE Model for Subgroup                                             0.38 [0.22, 0.65]

Alternate Allocation
Stein & Aronson, 1953          180    1361       372    1079      0.46 [0.39, 0.54]
Frimodt-Moller et al, 1973      33    5036        47    5761      0.80 [0.52, 1.25]
RE Model for Subgroup                                             0.58 [0.34, 1.01]

RE Model for All Studies                                          0.49 [0.34, 0.70]
Forest plot using R package ggplot2
Summary
R is useful for data visualization
R offers several graphics environments: base, grid, lattice and ggplot2
R can be used to explore data and create a wide range of graphics
Different packages often provide different functions for the same purpose
Conclusion
The examples show that graphs generated in R are good in terms of appearance, time taken for coding and overall quality of the graph. R can therefore be an ideal choice for exploratory data visualization and for creating graphs.
Thanks for your attention!