slideshare r book preview slides

Upload: christopher-breach

Post on 25-Jan-2017

Category:

Data & Analytics


Who Should Use This Book

R is a powerful, free statistical computing environment and programming language, but it can have a steep learning curve. Users who simply want to plot simple graphs and perform simple linear regression, however, can learn the necessary skills quickly.

The purpose of this brief text is to provide a simple and easy ‘how to’ guide for absolute R novices who need to rapidly plot data and perform simple linear regression without too much effort. Once you start using R and acquire a feel for it, hopefully you’ll want to learn more about the details of the language and its power and utility for data analysis.

The chapters are sometimes short and very specific. This is by design so that subject matter is relatively easy to locate.

This is not a book about statistics. For statistical theory, there are many good texts that can be consulted.

This is also not a book for R experts.

Constructive feedback concerning content and errors is welcomed. ([email protected] )

Table of Contents

CHAPTER 1
INSTALLING R AND RSTUDIO
1.1 INTRODUCTION
1.2 DOWNLOADING R
1.3 DOWNLOADING RSTUDIO

CHAPTER 2
THE RSTUDIO ENVIRONMENT
2.1 INTRODUCTION
2.2 THE RSTUDIO LAYOUT
2.3 THE SCRIPT WINDOW
2.4 THE CONSOLE/COMMAND WINDOW
2.5 THE PLOT/UTILITIES WINDOW
2.6 THE ENVIRONMENT/HISTORY WINDOW

CHAPTER 3
THE WORKING DIRECTORY
3.1 INTRODUCTION
3.2 CREATE YOUR OWN RSTUDIO FOLDER
3.3 FIND THE CURRENT WORKING DIRECTORY
3.4 CHANGE THE WORKING DIRECTORY

CHAPTER 4
R PACKAGES
4.1 INTRODUCTION
4.2 THE PACKAGES WINDOW
4.3 DEFAULT PACKAGES
4.4 INSTALLING NEW PACKAGES

CHAPTER 5
BASIC DATA PREPARATION
5.1 INTRODUCTION
5.2 SINGLE X-Y PAIRS
5.2.1 THE SPREADSHEET
5.2.2 READ THE CSV FILE INTO R
5.3 DATA WITH MULTIPLE Y FOR EACH X
5.3.1 THE SPREADSHEET

CHAPTER 6
BASIC SCATTERPLOTS
6.1 INTRODUCTION
6.2 BASIC INTERACTIVE SCATTERPLOTS
6.2.1 BASIC INTERACTIVE PLOTTING: RUBBER ABRASION DATA
6.2.2 ALTERNATIVE METHOD OF PLOTTING
6.2.3 CHANGING THE X AND Y SCALES
6.2.4 MODIFYING THE PLOT CHARACTER AND COLOUR
6.3 BASIC INTERACTIVE HISTOGRAMS
6.4 PLOTTING USING SCRIPTS
6.5 SAMPLE SCRIPT FOR THE RUBBER DATA
6.6 TIDYING UP THE ENVIRONMENT AND CONSOLE WINDOWS
6.6.1 CLEANING UP THE GLOBAL ENVIRONMENT
6.6.2 CLEANING UP THE CONSOLE

CHAPTER 7
SIMPLE LINEAR REGRESSION
7.1 INTRODUCTION
7.2 CHECKING THE ASSUMPTIONS (DIAGNOSTICS): RUBBER ABRASION DATA
7.2.1 ASSUMPTION 1: LINEAR RELATIONSHIP OF YI AGAINST XI AND FITTING A STRAIGHT LINE
7.2.2 ASSUMPTION 2: NORMAL DISTRIBUTION OF THE RANDOM ERROR VARIANCE
7.2.2.1 HISTOGRAM OF RESIDUALS
7.2.2.2 NORMAL PROBABILITY PLOT OF RESIDUALS
7.2.3 ASSUMPTION 3: CONSTANT RANDOM ERROR VARIANCE
7.2.3.1 RAW RESIDUALS VERSUS FITTED VALUES IN R
7.2.3.2 STANDARDISED RESIDUALS VERSUS FITTED VALUES IN R
7.2.4 ASSUMPTION 4: INDEPENDENCE OF THE RANDOM ERROR
7.3 SLR DIAGNOSTICS WITH BUILT-IN R COMMANDS
7.3.1 GENERATING DIAGNOSTIC PLOTS WITH R
7.3.2 PLOTTING MULTIPLE DIAGNOSTIC GRAPHS ON A SINGLE PAGE
7.3.3 MODIFYING DIAGNOSTIC PLOTS WITH R

CHAPTER 8
PLOTTING CONFIDENCE INTERVALS & ERROR BARS
8.1 INTRODUCTION
8.2 CONFIDENCE INTERVALS IN R
8.2.1 A BASIC R SCRIPT TO PLOT CONFIDENCE INTERVALS FOR THE RUBBER ABRASION DATA
8.2.2 CONFIDENCE INTERVAL PART 1: GENERATING VALUES OF THE EXPLANATORY VARIABLE
8.2.3 CONFIDENCE INTERVAL PART 2: GENERATING VALUES OF THE RESPONSE VARIABLE FOR THE CONFIDENCE INTERVAL
8.2.4 CONFIDENCE INTERVAL PART 3: PLOTTING THE CONFIDENCE INTERVAL AND THE FITTED LINE
8.2.5 PLOTTING CONFIDENCE INTERVALS WITHOUT THE FITTED LINE
8.2.6 PLOTTING CONFIDENCE INTERVALS SEPARATELY FROM THE FITTED LINE
8.3 ERROR BARS IN R
8.3.1 Y-ERROR BARS
8.3.2 X-ERROR BARS

CHAPTER 9
PLOTTING MULTIPLE SETS OF DATA ON A SINGLE GRAPH
9.1 INTRODUCTION
9.2 CU WIRE DATA SET
9.2.1 INTERACTIVE PLOTTING OF MULTIPLE DATA
9.2.2 INTERACTIVE ANALYSIS OF MULTIPLE DATA
9.2.3 SCRIPT FOR PLOTTING MULTIPLE DATA
9.3 AN EXAMPLE SCRIPT TO PLOT THE DATA WITH A LEGEND
9.3.1 EASY LEGEND PLACEMENT
9.3.2 PLACING THE LEGEND WITH A MOUSE CLICK
9.3.3 POSITIONING THE LEGEND WITH COORDINATES
9.3.4 LEGEND WITHOUT BORDERS
9.4 MULTIPLE PLOTS SCRIPT WITH LEGEND PLACEMENT

CHAPTER 10
PLOT CUSTOMISATION I: TEXT MODIFICATION
10.1 INTRODUCTION
10.2 MAIN TITLE
10.3 BASIC AXIS LABEL AND AXIS TITLE MODIFICATIONS
10.3.1 COLOURS AND FONT SIZE
10.3.2 SUPERSCRIPTS & SUBSCRIPTS
10.3.3 GREEK CHARACTERS

CHAPTER 11
PLOT CUSTOMISATION II: MODIFYING AXES AND TEXT
11.1 INTRODUCTION
11.2 HIGH & LOW LEVEL PLOT FUNCTIONS AND COMMANDS
11.3 POSITIONING AXES, AXIS TITLE AND TEXT LABELS
11.3.1 THE PLOT AREA
11.3.2 THE PLOT MARGINS
11.3.3 POSITIONING AXES, AXIS TITLE AND TEXT LABELS
11.3.4 FINE CONTROL OF THE AXIS SCALES
11.3.5 TICK LENGTHS AND DIRECTION
11.3.6 ADDING GRIDLINES
11.3.7 ADDING TEXT TO THE PLOT AREA
11.3.8 PRODUCING A REPORT QUALITY GRAPH
11.3.9 EXPORTING A REPORT QUALITY GRAPH

CHAPTER 12
PLOT CUSTOMISATION III: ADDING A SECOND Y-AXIS
12.1 INTRODUCTION
12.2 SECOND Y AXIS ON THE RIGHT
12.3 SECOND Y AXIS ON THE LEFT
