mixed models in r using the lme4 package part 2: lattice...

33
Presenting data xyplot densityplot dotplot Mixed models in R using the lme4 package Part 2: Lattice graphics Douglas Bates University of Wisconsin - Madison and R Development Core Team <[email protected]> University of Lausanne July 1, 2009

Upload: others

Post on 20-May-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Mixed models in R using the lme4 packagePart 2: Lattice graphics

Douglas Bates

University of Wisconsin - Madisonand R Development Core Team

<[email protected]>

University of LausanneJuly 1, 2009

Page 2: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 3: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 4: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 5: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 6: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 7: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Exploring and presenting data

• When possible, use graphical presentations of data. Timespend creating informative graphical displays is well invested.

• Ron Snee, a friend who spent his career as a statisticalconsultant for DuPont, once said, “Whenever I am writing areport, the most important conclusion I want to communicateis always presented as a graphic and shown early in the report.On the other hand, if there is a conclusion I feel obligated toinclude but would prefer people not notice, I include it as atable.”

• One of the strengths of R is its graphics capabilities.

• There are several styles of graphics in R. The style inDeepayan Sarkar’s lattice package is well-suited to the type ofdata we will be discussing.

• Deepayan’s book, Lattice: Multivariate Data Visualizationwith R (Springer, 2008) provides in-depth documentation andexplanations of lattice graphics.

Page 8: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

The formula/data method of specifying graphics

• The first two arguments to lattice graphics functions areusually formula and data.

• This specification is used in model-fitting functions (lm, aov,lmer, ...) and in other functions such as xtabs.

• The formula incorporates a tilde, (∼), character. A one-sidedformula specifies the value on the x-axis. A two-sided formulaspecifies the x and y axes.

• The second argument, data, is usually the name of a dataframe.

• Many optional arguments are available. Ones that we will usefrequently allow for labeling axes (xlab, ylab), and controllingthe type of information, type.

• The lattice package is not attached by default. You mustenter library(lattice) before you can use lattice functions.

Page 9: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 10: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

A simple scatterplot in lattice

> xyplot(optden ~ carb, Formaldehyde)

carb

optd

en

0.2

0.4

0.6

0.8

0.2 0.4 0.6 0.8

Page 11: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Scatterplots in lattice

• A scatter plot is the most versatile plot in applied statistics. Itis simply a plot of a numeric response, y, versus a numericcovariate, x.

• The lattice function xyplot produces scatter plots. I typicallyspecify type = c("g","p") requesting a background grid inaddition to the plotted points.

• The type argument takes a selection from

”p” points”g” background grid”l” lines

”b” both points and lines”r” reference (or “regression”) straight line

”smooth” scatter-plot smoother lines

• In evaluating a scatterplot the aspect ratio (ratio of verticalsize to horizontal size) can be important. In particular,differences in slopes are most apparent near 45o.

Page 12: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

General principles of lattice graphics

• The formula is of the form ∼x or y∼x or y∼x | f where x isthe variable on the x axis (usually continuous), y is thevariable on the y axis and f is a factor that determines thepanels.

• Titles can be added with xlab, ylab, main and sub. Titles canbe character strings or, more generally, expressions that allowfor special characters, subscripts, superscripts, etc. Seehelp(plotmath) for details.

• The groups argument, if used, specifies different point stylesand different line styles for each level of the group. If lines arecalculated, each group has separate lines.

• If groups is used, we usually also use auto.key to add a keyrelating the line or point styles to the groups.

• The layout specifies the number of columns and rows ofpanels.

Page 13: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

An enhanced scatterplot of the Formaldehyde data

Amount of carbohydrate (ml)

Opt

ical

den

sity

0.2

0.4

0.6

0.8

0.2 0.4 0.6 0.8

Page 14: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Saving plots

• I recommend using the facilities in the R application to saveplots and transcripts.

• To save a plot, ensure that the graphics window is active anduse the menu item File→Save To Clipboard→Windows

Metafile. (On a Mac, save as PDF.) Then switch to a wordprocessor and paste the figure.

• Adjust the aspect ratio of the graphics window to suit thepasted version of the plot before you copy the graphic.

• Those who want more control (and less cutting and pasting)could consider the Sweave system or the odfWeave package.

Page 15: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 16: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Histograms and density plots• A histogram is a type of bar plot created from dividing

numeric data into adjacent bins (typically having equal width).• The purpose of a histogram is to show the distribution or

density of the observations. It is almost never a good way ofdoing this.

• A densityplot is a better way of showing the density or, evenbetter, comparing the densities of observations associatedwith different groups. Also, densityplots for different groupscan be overlaid.

• If you have only a few observations you may want to use acomparative box-and-whisker plot (bwplot) or a comparativedotplot instead. Density plots based on a small number ofobservations tend to be rather “lumpy”.

• If the data are bounded, perhaps because the data must bepositive, a density plot can blur the boundary. However, thismay indicate that the data are more meaningfully representedon another scale.

Page 17: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Histogram of the InsectSprays data> histogram(~count, InsectSprays)

count

Per

cent

of T

otal

0

5

10

15

20

25

30

0 5 10 15 20 25

Page 18: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Density plot of the InsectSprays data> densityplot(~count, InsectSprays)

count

Den

sity

0.00

0.01

0.02

0.03

0.04

0.05

0.06

−10 0 10 20 30

●● ●●●

●● ●● ●●●● ● ●

● ●● ●● ●●

●●● ● ●

● ●●●

● ●● ●●● ● ●

●●● ●●●●● ●

●●●●● ●●

● ●● ●● ●●

●●● ●

●●

●●

●●

Page 19: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Density plot of the square root of the count> densityplot(~sqrt(count), InsectSprays, xlab = "Square root of count")

Square root of count

Den

sity

0.00

0.05

0.10

0.15

0.20

0.25

0 2 4 6

●●●●

●●● ●● ●●●

● ● ●●●● ●● ●

●● ●● ● ●● ●

●●● ●

●● ●● ● ●●●● ●

●●●●

●●●● ●●

●●●

●●

●●●

● ●●● ●

●● ●●●●

Page 20: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Density plot of the square root with fancy label> densityplot(~sqrt(count), InsectSprays, xlab = expression(sqrt("count")))

count

Den

sity

0.00

0.05

0.10

0.15

0.20

0.25

0 2 4 6

●● ●●●●

● ●●

●●●● ● ●● ●● ●● ● ●●

●●

● ●● ●● ●● ●●●

●● ● ●●●●●●●●● ●●●

● ●● ●●● ●● ●● ●●

●●● ●●● ●●●●

Page 21: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Comparative density plot of square root> densityplot(~sqrt(count), InsectSprays, groups = spray,+ auto.key = list(columns = 6))

count

Den

sity

0.0

0.5

1.0

1.5

0 2 4 6

●● ●●●●●

●●

●●●

● ● ●● ●● ●● ●

●●

●● ●●● ●● ●● ●●

● ●● ● ●●●● ●●●●

● ●● ●●

●● ●●● ●● ●● ●●● ●● ●●

● ●●●●

A B C D E F

Page 22: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Comparative density plot, separate panels> densityplot(~sqrt(count) | spray, InsectSprays, layout = c(1,+ 6))

count

Den

sity

0.00.51.01.5

0 2 4 6

●● ●●●●● ●● ●●●

A0.00.51.01.5

● ● ●● ●● ●● ● ●● ●

B0.00.51.01.5

● ● ●● ●● ●● ●● ● ●

C0.00.51.01.5

● ● ●●●● ●●●●● ●

D0.00.51.01.5

● ●● ●● ●●● ●● ●●

E0.00.51.01.5

●● ● ●● ●●● ●●●●

F

Page 23: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Comparative density plot, separate panels, strip at left> densityplot(~sqrt(count) | spray, InsectSprays, layout = c(1,+ 6), strip = FALSE, strip.left = TRUE)

count

Den

sity

0.0

0.5

1.0

1.5

0 2 4 6

●● ●●●●● ●● ●●●

A

0.0

0.5

1.0

1.5

● ● ●● ●● ●● ● ●● ●

B

0.0

0.5

1.0

1.5

● ● ●● ●● ●● ●● ● ●

C

0.0

0.5

1.0

1.5

● ● ●●●● ●●●●● ●

D

0.0

0.5

1.0

1.5

● ●● ●● ●●● ●● ●●

E

0.0

0.5

1.0

1.5

●● ● ●● ●●● ●●●●

F

Page 24: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Comparative density plot, separate panels, reordered> densityplot(~sqrt(count) | reorder(spray, count), InsectSprays)

count

Den

sity

0.0

0.5

1.0

1.5

0 2 4 6

● ● ●● ●● ●● ●● ● ●

C

0.0

0.5

1.0

1.5

● ●● ●● ●●● ●● ●●

E

0.0

0.5

1.0

1.5

● ● ●●●● ●●●●● ●

D

0.0

0.5

1.0

1.5

●● ●●●●● ●● ●●●

A

0.0

0.5

1.0

1.5

● ● ●● ●● ●● ● ●● ●

B

0.0

0.5

1.0

1.5

●● ● ●● ●●● ●●●●

F

Page 25: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Outline

Presenting data

Scatter plots

Histograms and density plots

Box-and-whisker plots and dotplots

Page 26: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Box-and-whisker plot and dotplot

• A box-and-whisker plot gives a rough summary (based on thefive-number summary - min, 1st quartile, median, 3rd quartile,max) of the distribution.

• A dotplot consists of points on a number line. For a largenumber of data values we jitter the y values to avoidoverplotting. By default a density plot also shows a dotplot.

• Box-and-whisker plots or dotplots are often used forcomparison of groups.

• It is widely believed that a comparative boxplot should havethe response on the vertical axis. Most of the time it is moreeffective to put the response on the horizontal axis.

• If the default ordering of the groups is arbitrary reorder themaccording to the level of the response (mean response, bydefault).

• Reordering makes it easier to see if the variability increaseswith the level of the response.

Page 27: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Vertical comparative box-and-whisker plot> bwplot(sqrt(count) ~ spray, InsectSprays)

coun

t

0

1

2

3

4

5

A B C D E F

Page 28: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Horizontal comparative box-and-whisker plot> bwplot(spray ~ sqrt(count), InsectSprays)

count

A

B

C

D

E

F

0 1 2 3 4 5

Page 29: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Reordered horizontal comparative box-and-whisker plot> bwplot(reorder(spray, count) ~ sqrt(count), InsectSprays)

count

C

E

D

A

B

F

0 1 2 3 4 5

Page 30: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Compressed horizontal comparative box-and-whisker plot> bwplot(reorder(spray, count) ~ sqrt(count), InsectSprays,+ aspect = 0.2)

count

C

E

D

A

B

F

0 1 2 3 4 5

• You can extract much more information from this, compressedplot than from the original vertical box-and-whisker plot.

• In Edward Tufte’s phrase, this plot has a greater“information/ink ratio”.

Page 31: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Comparative dotplots

• When the number of observations per group is small, abox-and-whisker plot can obscure the structure of the data,rather than illuminating it.

• By default, the density plot provides a dotplot on the “rug”.

• A comparative dotplot displays all of the data. The principlesdescribed for a comparative boxplot (factor on vertical axis,reorder levels if no natural order, choose an appropriate scale)apply here too.

• By default, the character in the dotplot is filled. I often useoptional arguments pch = 21 and jitter.y = TRUE to avoidoverplotting.

• Setting type = c("p","a") provides a line joining the groupaverages.

• Interaction plots can be produced as a comparative dotplotwith groups

Page 32: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Comparative dotplot of InsectSprays

> dotplot(reorder(spray, count) ~ sqrt(count), InsectSprays,+ type = c("p", "a"), pch = 21, jitter.y = TRUE)

count

C

E

D

A

B

F

0 1 2 3 4 5

●●●●

●●● ●● ●●●

●● ●● ●● ●●

●●● ●

● ● ●● ●● ●●

●● ● ●

● ● ●●●

● ●●●●

●●

●●

●●● ●

●●

●● ●●

●●●

●● ●●● ●

●●●

Page 33: Mixed models in R using the lme4 package Part 2: Lattice ...lme4.r-forge.r-project.org/slides/2009-07-01-Lausanne/2LatticeD.pdf · 7/1/2009  · There are several styles of graphics

Presenting data xyplot densityplot dotplot

Summary

• In order of importance the graphic displays I consider arescatter plots, density plots, box-and-whisker plots, dot plotsand histograms.

• Pay careful attention to layout and axis labels. Include unitsin the axis labels, if known.

• For mixed models we always have at least one unorderedcategorical covariate and often have a numeric response.Comparative dot plots and box-and-whisker plots will beimportant data presentation techniques for us.

• Plots of a continuous response by levels of a categoricalvariable work best with the category on the vertical axis.Consider reordering the levels of the category if they do nothave a natural order.