module -1 (r training) - intro to r - copy


Upload: rohitgahlan

Post on 09-Jan-2016


Page 1: Module -1 (R Training) - Intro to R - Copy

7/17/2019 Module -1 (R Training) - Intro to R - Copy

http://slidepdf.com/reader/full/module-1-r-training-intro-to-r-copy 1/21

1

Copyright © Ivy Professional School - 2009-10 (All Rights Reserved)

IVY Professional School

Program: KPO Training

Module: Introduction to R

Session: 1 & 2

AXP Internal 9/17/15


Lecture Overview

Why R, and R Paradigm

References, Tutorials and links

R Overview

Setting up R & RStudio for use

R Interface

R Workspace

Help

R Packages

Input/Output

Reusing Results


Why R?

It's free!

It runs on a variety of platforms including Windows, Unix and MacOS.

It provides an unparalleled platform for programming new statistical methods in an easy and straightforward manner.

It contains advanced statistical routines not yet available in other packages.

It has state-of-the-art graphics capabilities.


R Overview

R is a comprehensive statistical and graphical programming language and is a dialect of the S language:

1988 - S2: RA Becker, JM Chambers, A Wilks

1992 - S3: JM Chambers, TJ Hastie

1998 - S4: JM Chambers

R: initially written by Ross Ihaka and Robert Gentleman at the Dep. of Statistics of the U of Auckland, New Zealand during the 1990s. Since 1997: international "R-core" team of 15 people with access to a common CVS archive.

You can enter commands one at a time at the command prompt (>) or run a set of commands from a source file.

There is a wide variety of data types, including vectors (numerical, character, logical), matrices, dataframes, and lists.

To quit R, use > q()

Most functionality is provided through built-in and user-created functions, and all data objects are kept in memory during an interactive session.

Basic functions are available by default. Other functions are contained in packages that can be attached to a current session as needed.

A key skill to using R effectively is learning how to use the built-in help system. Other sections describe the working environment, inputting programs and outputting results, and installing new functionality through packages.

A fundamental design feature of R is that the output from most functions can be used as input to other functions. This is described in Reusing Results.
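The data types listed on this slide can be sketched in a few lines. The object names and values below are illustrative assumptions, not taken from the slides; the last lines show the output of one function being passed straight into another.

```r
# Illustrative values only; none of these objects come from the slides.
v_num <- c(1.5, 2.5, 4.0)                    # numeric vector
v_chr <- c("a", "b", "c")                    # character vector
v_lgl <- c(TRUE, FALSE, TRUE)                # logical vector

m  <- matrix(1:6, nrow = 2)                  # matrix with 2 rows, 3 columns
df <- data.frame(id = v_chr, value = v_num)  # data frame: one row per id
l  <- list(numbers = v_num, flags = v_lgl)   # list: components may differ in type

# The output of mean() becomes the input of round()
rounded_mean <- round(mean(v_num), 1)
print(rounded_mean)
```

Calling str() on any of these objects is a quick way to see what type R has actually created.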


Installing R and RStudio

Go to the R homepage: http://www.r-project.org/ and follow the installation instructions.

"RStudio is a new integrated development environment (IDE) for R"

Install the "desktop edition" from this link: http://www.rstudio.org/download/


Using RStudio

Script editor

View help, plots & files; manage packages

View variables in workspace and history file

R console


Set Up Your Workspace

Create your working directory

Open a new R script file


R Introduction

Results of calculations can be stored in objects using the assignment operators:

An arrow (<-) formed by a smaller-than character and a hyphen, without a space!

The equal character (=).

These objects can then be used in other calculations. To print the object, just enter the name of the object. There are some restrictions when giving an object a name:

Object names cannot contain 'strange' symbols like !, +, -, #.

A dot (.) and an underscore (_) are allowed, as is a name starting with a dot.

Object names can contain a number but cannot start with a number.

R is case sensitive: X and x are two different objects, as are temp and temP.

Example:

> # An example
> x <- c(1:10)
> x[(x>8) | (x<5)]
> # yields 1 2 3 4 9 10

> # How it works
> x <- c(1:10)
> x
> # 1 2 3 4 5 6 7 8 9 10
> x > 8
> # F F F F F F F F T T
> x < 5
> # T T T T F F F F F F
> x > 8 | x < 5
> # T T T T F F F F T T
> x[c(T,T,T,T,F,F,F,F,T,T)]
> # 1 2 3 4 9 10
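The naming rules above can be tried directly at the console. This is a minimal sketch; all object names are made up for illustration, and make.names() is used only to show how base R itself would repair a name that breaks the rules.

```r
# Both assignment operators store a value in an object
total <- 3 + 4
ratio = total / 2

# Dots and underscores are allowed; a name may also start with a dot
my.value <- 1
my_value <- 2
.hidden  <- 3    # not listed by ls() unless all.names = TRUE

# Names are case sensitive: temp and temP are different objects
temp <- "lower"
temP <- "upper"
same_object <- identical(temp, temP)
same_object

# make.names() turns an invalid name (leading digit, space) into a valid one
repaired <- make.names("2nd value")
repaired
```

A name that starts with a digit, such as 2nd, is rejected by the parser before any assignment happens, which is why the sketch passes it to make.names() as a string.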


R Introduction (Continued)

To list the objects that you have in your current R session, use the function ls or the function objects.

> ls()
[1] "x" "y"

So to run the function ls we need to enter the name followed by an opening ( and a closing ). Entering only ls will just print the object: you will see the underlying R code of the function ls. Most functions in R accept certain arguments. For example, one of the arguments of the function ls is pattern, which is matched as a regular expression. To list all objects whose names contain the letter x (use pattern="^x" for names that start with x):

> x2 = 9
> y2 = 10
> ls(pattern="x")
[1] "x"  "x2"

If you assign a value to an object that already exists, the contents of the object will be overwritten with the new value (without a warning!). Use the function rm to remove one or more objects from your session.

> rm(x, x2)

Let's create two small vectors with data and a scatter plot.

z2 <- c(1,2,3,4,5,6)
z3 <- c(6,8,3,5,7,1)
plot(z2,z3)
title("My first scatter plot")

R Warning! R is a case sensitive language. FOO, Foo, and foo are three different objects.
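The behaviour of ls, rm and silent overwriting can be checked without touching your own workspace. The sketch below assumes a scratch environment and made-up object names (max_y is included only to show that pattern matches anywhere in a name).

```r
# Work in a scratch environment so the demo leaves your workspace alone
scratch <- new.env()
assign("x",     1:3, envir = scratch)
assign("x2",    9,   envir = scratch)
assign("y2",    10,  envir = scratch)
assign("max_y", 4,   envir = scratch)

contains_x    <- ls(scratch, pattern = "x")   # any name containing "x"
starts_with_x <- ls(scratch, pattern = "^x")  # only names starting with "x"

# Re-assignment replaces the old value without any warning
assign("x2", "now a string", envir = scratch)
new_x2 <- get("x2", envir = scratch)

# Remove objects by name, then list what is left
rm(list = c("x", "x2"), envir = scratch)
remaining <- ls(scratch)
remaining
```

At the console the same calls work without the envir argument; the scratch environment is only there to keep the example self-contained.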


R Workspace

Objects that you create during an R session are held in memory; the collection of objects that you currently have is called the workspace. This workspace is not saved on disk unless you tell R to do so. This means that your objects are lost when you close R without saving them, or worse, when R or your system crashes during a session.

When you close the RGui or the R console window, the system will ask if you want to save the workspace image. If you choose to save the workspace image, all the objects in your current R session are saved in a file .RData. This is a binary file located in the working directory of R, which is by default the installation directory of R.

During your R session you can also explicitly save the workspace image. Go to the 'File' menu and then select 'Save Workspace...', or use the save.image function.

# save to the current working directory
save.image()

# just checking what the current working directory is
getwd()

# save to a specific file and location
save.image("C:\\Program Files\\R\\R-2.5.0\\bin\\.RData")

If you have saved a workspace image and you start R the next time, it will restore the workspace, so all your previously saved objects are available again. You can also explicitly load a saved workspace, which could be the workspace image of someone else. Go to the 'File' menu and select 'Load workspace...'.
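The menu actions have command-line equivalents in save() and load(). The sketch below is an illustration under assumptions: the file is a temporary one, the objects a and b are invented, and the image is loaded into a separate environment so that someone else's objects cannot overwrite yours.

```r
# Hypothetical file location: a temporary file, so nothing in the
# working directory is overwritten
img <- tempfile(fileext = ".RData")

a <- 1:5
b <- "some text"
save(a, b, file = img)      # save selected objects
# save.image(img)           # would save the entire workspace instead

# Load into a fresh environment to inspect an image safely
restored <- new.env()
loaded_names <- load(img, envir = restored)  # returns the restored names
loaded_names
same_a <- identical(restored$a, a)
same_a
```

Calling load(img) without the envir argument restores the objects straight into the current workspace, replacing any objects with the same names.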


R Workspace (continued)

Commands are entered interactively at the R user prompt. The up and down arrow keys scroll through your command history. You will probably want to keep different projects in different physical directories.

R gets confused if you use a path in your code like c:\mydocuments\myfile.txt

This is because R sees "\" as an escape character. Instead, use

c:\\mydocuments\\myfile.txt or c:/mydocuments/myfile.txt

getwd() # print the current working directory
ls() # list the objects in the current workspace
setwd(mydirectory) # change to mydirectory
setwd("c:/docs/mydir")

# view and set options for the session
help(options) # learn about available options
options() # view current option settings
options(digits=3) # number of digits to print on output

# work with your previous commands
history() # display last 25 commands
history(max.show=Inf) # display all previous commands

# save your command history
savehistory(file="myfile") # default is ".Rhistory"

# recall your command history
loadhistory(file="myfile") # default is ".Rhistory"
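A short sketch of the path and option handling described above. The directory names are hypothetical, and tempdir() stands in for a project directory so the example can run anywhere; both setwd() and options() return the previous setting, which makes it easy to restore.

```r
# file.path() joins components with forward slashes, so no escaping is needed
p <- file.path("c:", "docs", "mydir", "myfile.txt")
p

# In a string, a doubled backslash is one backslash character
n_chars <- nchar("c:\\docs")
n_chars

# Change the working directory temporarily, then restore it
old_wd <- setwd(tempdir())   # setwd() returns the previous directory
getwd()
setwd(old_wd)

# options() returns the old settings, so they can be put back
old_opts <- options(digits = 3)
short_pi <- format(pi)
options(old_opts)
short_pi
```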


R Help & R Datasets

Help: Once R is installed, there is a comprehensive built-in help system. At the program's command prompt you can use any of the following:

help.start() # general help
help(foo) # help about function foo
?foo # same thing
apropos("foo") # list all functions containing string foo
example(foo) # show an example of function foo

# search for foo in help manuals and archived mailing lists
RSiteSearch("foo")

# get vignettes on using installed packages
vignette() # show available vignettes
vignette("foo") # show specific vignette

Datasets: R comes with a number of sample datasets that you can experiment with. Type data() to see the available datasets. The results will depend on which packages you have loaded. Type help(datasetname) for details on a sample dataset.
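As a non-interactive illustration of the same tools, the sketch below searches for functions by name and inspects one of the bundled datasets. The choice of "mean" as the search string and mtcars as the dataset is an assumption made for the example.

```r
# Functions whose names contain "mean"; the exact list depends on
# which packages are attached
hits <- apropos("mean")
head(hits)

# mtcars ships with the datasets package, which is attached by default
data(mtcars)
mtcars_dim <- dim(mtcars)
mtcars_dim
head(mtcars, 3)

# List the datasets of one package without opening the pager
ds <- data(package = "datasets")$results[, "Item"]
has_mtcars <- "mtcars" %in% ds
has_mtcars
```

From here, help(mtcars) describes the variables in the dataset.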


R Packages

One of the strengths of R is that the system can easily be extended. The system allows you to write new functions and package those functions in a so-called 'R package' (or 'R library'). The R package may also contain other R objects, for example data sets or documentation. There is a lively R user community, and many R packages have been written and made available on CRAN for other users. To give just a few examples, there are packages for portfolio optimization, drawing maps, exporting objects to html, time series analysis, spatial statistics, and the list goes on and on.

When you download R, a number of packages (around 30) are downloaded as well. To use a function in an R package, that package has to be attached to the system. When you start R, not all of the downloaded packages are attached; only seven packages are attached to the system by default. You can use the function search to see a list of packages that are currently attached to the system. This list is also called the search path.

> search()
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:datasets"  "package:utils"
[7] "package:methods"   "Autoloads"         "package:base"

To attach another package to the system you can use the menu or the library function. Via the menu: select the 'Packages' menu and select 'Load package...'; a list of available packages on your system will be displayed. Select one and click 'OK'; the package is now attached to your current R session. Via the library function:

> library(MASS)
> shoes
$A
[1] 13.2  8.2 10.9 14.3 10.7  6.6  9.5 10.8  8.8 13.3

$B
[1] 14.0  8.8 11.2 14.2 11.8  6.4  9.8 11.3  9.3 13.6
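Attaching is not the only way to reach a package's objects. The sketch below, which assumes MASS is installed (it is normally bundled with R as a recommended package), checks for the package first, attaches it, and also shows the pkg::name form that works without attaching.

```r
# Packages on the search path before anything is attached
before <- search()

# requireNamespace() checks availability without attaching anything
if (requireNamespace("MASS", quietly = TRUE)) {
  library(MASS)                     # attach: shoes is now visible by name
  mean_a <- mean(shoes$A)
  # pkg::name reaches an object without relying on the search path
  mean_b <- mean(MASS::shoes$B)
  detach("package:MASS")            # tidy up the search path again
}

stats_attached <- "package:stats" %in% before
stats_attached
```

The guard keeps the script from stopping with an error on a system where the package is missing.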


R Packages (Continued)

The function library can also be used to list all the available libraries on your system with a short description. Run the function without any arguments:

> library()
Packages in library 'C:/PROGRA~1/R/R-25~1.0/library':

base        The R Base Package
boot        Bootstrap R (S-Plus) Functions (Canty)
class       Functions for Classification
cluster     Cluster Analysis Extended Rousseeuw et al.
codetools   Code Analysis Tools for R
datasets    The R Datasets Package
DBI         R Database Interface
foreign     Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, ...
graphics    The R Graphics Package

install = function() {
  install.packages(c("moments","graphics","Rcmdr","hexbin"),
                   repos="http://lib.stat.cmu.edu/R/CRAN")
}

install()
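A variation on the install function above is to install only what is missing. This is a sketch under assumptions: missing_packages is a hypothetical helper, the mirror URL is a commonly used CRAN address rather than the one on the slide, and graphics is left out of the list because it ships with R itself. The download step is switched off so the code runs without network access.

```r
# Hypothetical helper: which of the wanted packages are not installed yet?
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("moments", "Rcmdr", "hexbin")   # package names from the slide
to_get <- missing_packages(wanted)
to_get

# Install only what is missing. Guarded so the sketch does nothing by
# default; set run_install to TRUE to actually download.
run_install <- FALSE
if (run_install && length(to_get) > 0) {
  install.packages(to_get, repos = "https://cloud.r-project.org")
}
```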


R Conflicting objects

It is not recommended, but R allows the user to give an object a name that already exists.

If you are not sure whether a name already exists, just enter the name in the R console and see if R can find it. R will look for the object in all the libraries (packages) that are currently attached to the R system. R will not warn you when you use an existing name.

> mean = 10
> mean
[1] 10

The object mean already exists in the base package, but is now masked by your object mean. To get a list of all masked objects use the function conflicts.

> conflicts()
[1] "body<-" "mean"

Note: You can safely remove the object mean with the function rm() without risking deletion of the mean function.

Calling rm() removes only objects in your working environment by default.
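The masking described above can be reproduced and undone in a few lines. A minimal sketch with illustrative values; it relies on two documented behaviours of R: when a name is called as a function, non-function objects are skipped during lookup, and base::mean always refers to the packaged version.

```r
mean <- 10                          # masks base::mean in the workspace

# A call still finds the function, because the number 10 is skipped
called <- mean(c(1, 2, 3))

# The :: operator names the packaged version explicitly
direct <- base::mean(c(1, 2, 3))

rm(mean)                            # removes only your own object
still_function <- is.function(mean) # base::mean is visible again
still_function
```

Relying on this lookup rule makes code hard to read, so choosing a different object name remains the better habit.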


Output

The sink() function defines the direction of the output.

# direct output to a file
sink("myfile", append=FALSE, split=FALSE)

# return output to the terminal
sink()

The append option controls whether output overwrites or adds to a file. The split option determines if output is also sent to the screen as well as the output file.

Here are some examples of the sink() function.

# output directed to output.txt in c:\projects directory.
# output overwrites existing file. no output to terminal.
sink("c:/projects/output.txt")

# output directed to myfile.txt in the current working directory.
# output is appended to existing file. output also sent to terminal.
sink("myfile.txt", append=TRUE, split=TRUE)
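The effect of append can be verified by reading the file back. The sketch below writes to a temporary file (an assumption, chosen so that no file in your project is overwritten) and always pairs each sink(file) call with a closing sink().

```r
out_file <- tempfile(fileext = ".txt")  # hypothetical output location

sink(out_file)                  # append = FALSE, split = FALSE by default
print(summary(c(1, 2, 3, 4)))   # goes to the file, not the screen
sink()                          # back to the terminal

sink(out_file, append = TRUE)   # add to the same file
cat("second block\n")
sink()

captured <- readLines(out_file)
captured
```

If a script stops with an error while output is diverted, the diversion stays active, so calling sink() again at the console is the usual way to get output back.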


Graphs

To redirect graphic output use one of the following functions. Use dev.off() to return output to the terminal.

Example - output graph to jpeg file

jpeg("c:/mygraphs/myplot.jpg")
plot(x)
dev.off()
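The same open, plot, close pattern applies to the other file devices. The sketch below swaps in pdf() for jpeg(), an assumption made because the pdf device is available in every R build, and writes to a temporary file with invented data.

```r
plot_file <- tempfile(fileext = ".pdf")  # hypothetical output location

pdf(plot_file)                  # open the file device
plot(c(1, 2, 3), c(2, 4, 8), main = "Written to a file")
invisible(dev.off())            # close the device so the file is complete

written <- file.exists(plot_file) && file.info(plot_file)$size > 0
written
```

Until dev.off() is called the file may be incomplete, which is the most common reason for an empty or unreadable plot file.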


Reusing Results

One of the most useful design features of R is that the output of analyses can easily be saved and used as input to additional analyses.

Example 1: lm(mpg~wt, data=mtcars)

This will run a simple linear regression of miles per gallon on car weight using the dataframe mtcars. Results are sent to the screen. Nothing is saved.

Example 2: fit <- lm(mpg~wt, data=mtcars)

This time, the same regression is performed but the results are saved under the name fit. No output is sent to the screen. However, you can now manipulate the results.

str(fit) # view the contents/structure of "fit"

The assignment has actually created a list called "fit" that contains a wide range of information (including the predicted values, residuals, coefficients, and more).

# plot residuals by fitted values
plot(fit$residuals, fit$fitted.values)

To see what a function returns, look at the value section of the online help for that function. Here we would look at help(lm). The results can also be used by a wide range of other functions.

# produce diagnostic plots
plot(fit)
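Besides the $ access shown above, base R provides accessor functions that take the saved fit as input. This sketch reuses the regression from Example 2; the two car weights passed to predict() are invented for illustration.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# Accessor functions take the fitted model as their input
b   <- coef(fit)        # named vector: (Intercept) and wt
res <- residuals(fit)   # one residual per car
fv  <- fitted(fit)      # one fitted value per car

# Feed the saved model into another function to predict new cases
new_cars <- data.frame(wt = c(2.5, 3.5))   # illustrative weights, in 1000 lbs
pred <- predict(fit, newdata = new_cars)
pred
```

summary(fit) works the same way and returns its own list, which can in turn be stored and reused.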


THANK YOU


Tutorials & Web links

Tutorials

(Each of the following tutorials is in PDF format)

P. Kuhnert & B. Venables, An Introduction to R: Software for Statistical Modeling & Computing

J.H. Maindonald, Using R for Data Analysis and Graphics

B. Muenchen, R for SAS and SPSS Users

W.J. Owen, The R Guide

D. Rossiter, Introduction to the R Project for Statistical Computing for Use at the ITC

W.N. Venables & D.M. Smith, An Introduction to R

Web links

Paul Geissler's excellent R tutorial

Dave Robert's Excellent Labs on Ecological Analysis

Excellent Tutorials by David Rossiter

Excellent tutorial on nearly every aspect of R (c/o Rob Kabacoff)

Introduction to R by Vincent Zoonekynd

R Cookbook

Data Manipulation Reference


Web links (continued)

R time series tutorial

R Concepts and Data Types presentation by Deepayan Sarkar

Interpreting Output From lm()

The R Wiki

An Introduction to R

Import / Export Manual

R Reference Cards

KickStart

Hints on plotting data in R

Regression and ANOVA

Appendices to Fox Book on Regression

JGR, a Java-based GUI for R [Mac|Windows|Linux]

A Handbook of Statistical Analyses Using R (Brian S. Everitt and Torsten Hothorn)
