Simple linear regression analysis
Dr. Nguyen Ngoc Rang; Email: [email protected]; Website: bvag.com.vn
SIMPLE LINEAR REGRESSION ANALYSIS
17.1 The linear regression equation
Simple linear regression analysis looks for the relationship between two
continuous variables: the independent variable (the predictor) on the
horizontal axis x and the dependent variable (the outcome) on the vertical
axis y. We then draw a regression line, and from the equation of this line we
can predict y (for example, weight) when x (for example, age) is known.
Example 1: We have a sample of 6 children aged 1 to 6 years, with weights as
in the following table:
Age (years)   Weight (kg)
1             10
2             12
3             14
4             16
5             18
6             20
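Joining these (x, y) pairs, we see that they follow a first-degree equation:
y = 2x + 8 (where 2 is the slope and 8 is the point where the line crosses the
vertical axis y, at x = 0). In statistics this straight-line (first-degree)
equation is written as:
$y = \beta x + \alpha$   [1]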
This is the linear regression equation, in which $\beta$ is the slope and
$\alpha$ is the intercept, the point where the line crosses the vertical axis
at x = 0.
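As a small illustration, the slope and intercept of Example 1 can be recovered directly from the table. This is only a sketch: the variable names are chosen here for illustration, and the two-point slope calculation works only because the six points are exactly collinear.

```python
# Example 1 data: age in years (x) and weight in kg (y).
ages = [1, 2, 3, 4, 5, 6]
weights = [10, 12, 14, 16, 18, 20]

# Slope from the first and last points; any two points give the same value
# because the six points lie exactly on one straight line.
slope = (weights[-1] - weights[0]) / (ages[-1] - ages[0])

# Intercept: the value of y where the line crosses the vertical axis (x = 0).
intercept = weights[0] - slope * ages[0]

# Check that every observed point satisfies y = slope * x + intercept.
all_on_line = all(y == slope * x + intercept for x, y in zip(ages, weights))

print(slope, intercept, all_on_line)  # 2.0 8.0 True
```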
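In fact, this exact linear regression equation exists only in theory: it
means that the values $x_i$ (i = 1, 2, 3, 4, 5, 6) and the corresponding $y_i$
are 100% related (that is, the correlation coefficient R = 1).
In practice such a 100% relationship is rare; there is usually a deviation
between the observed value $y_i$ and the estimated value $\hat{y}_i$ that lies
on the regression line.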
17.1.1 The linear regression model
Example 2: We have a sample of 6 other children, with weights as in the
following table:
Age (years)   Weight (kg)
1             11
2             11
3             14
4             16
5             18
6             20
When we draw the regression line, we see that the observed values y3, y4, y5
and y6 lie on the line, whereas y1 and y2 do not. The relationship between
$x_i$ and $y_i$ is no longer 100% but only 97%, because of the deviations at
y1 and y2. In statistics this deviation is called the residual, or error.
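Let $y_1, y_2, y_3, y_4, y_5, y_6$ be the observed values,
$\hat{y}_1, \hat{y}_2, \hat{y}_3, \hat{y}_4, \hat{y}_5, \hat{y}_6$ the
estimated values lying on the regression line, and
$\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4, \varepsilon_5, \varepsilon_6$
the residuals. Then:
$\varepsilon_1 = y_1 - \hat{y}_1$
$\varepsilon_2 = y_2 - \hat{y}_2$
$\varepsilon_3 = y_3 - \hat{y}_3$
$\varepsilon_4 = y_4 - \hat{y}_4$
$\varepsilon_5 = y_5 - \hat{y}_5$
$\varepsilon_6 = y_6 - \hat{y}_6$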
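The linear regression equation is then written in the general form:
$y_i = \beta x_i + \alpha + \varepsilon_i$   [2]
Thus the smaller the residuals $\varepsilon_i$, the stronger the relationship
between x and y, and vice versa. The part explained by the relationship is
called the regression part. The linear regression model can be described as:
Data = Regression + Residual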
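The decomposition can be checked numerically on Example 2. The sketch below assumes that the reference line is y = 2x + 8 from Example 1, which is the line on which the text says y3 to y6 lie; the variable names are illustrative only.

```python
# Example 2 data: age in years (x) and observed weight in kg (y).
ages = [1, 2, 3, 4, 5, 6]
observed = [11, 11, 14, 16, 18, 20]

# Assumption: the reference line is y = 2x + 8 from Example 1.
predicted = [2 * x + 8 for x in ages]

# Residual = observed value minus the value predicted by the line.
residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]

# Data = Regression + Residual holds point by point.
reconstructed = [y_hat + e for y_hat, e in zip(predicted, residuals)]

print(residuals)                  # [1, -1, 0, 0, 0, 0]
print(reconstructed == observed)  # True
```

Only the first two children have non-zero residuals, which matches the description of y1 and y2 lying off the line.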
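17.1.2 Estimating the regression coefficient (slope) and the intercept
To draw the linear regression line we need to estimate the slope $\beta$ and
the intercept $\alpha$ on the vertical axis.
Example 3: Suppose we take a real sample of 30 children aged 1 to 6 years and
plot their corresponding weights in the following chart: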
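Here we cannot simply join the 30 points on the chart; instead we must draw
one straight line that passes as close as possible to all the points. Which
of the 3 lines on the chart should we choose? The principle is to choose the
line that lies closest to all 30 points, that is, the line that makes the
residuals as small as possible overall. The sum of the residuals is:
$\sum \varepsilon_i = \sum (y_i - \beta x_i - \alpha)$
Because positive and negative residuals cancel in this sum, the quantity that
is minimized is the sum of squared residuals:
$\sum \varepsilon_i^2 = \sum (y_i - \beta x_i - \alpha)^2$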
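This is a quadratic function of the two unknown parameters $\beta$ and
$\alpha$ (the $x_i$ and $y_i$ are known data). In mathematics, to find the
minimum of a quadratic function we take its derivative and set it to zero;
here we take the partial derivatives with respect to $\beta$ and $\alpha$.
Solving the resulting equations gives the two parameters $\beta$ and $\alpha$,
and from these two parameters we can draw the regression line. This is called
the least squares method.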
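The minimization step can be written out as follows. This is a sketch of the standard derivation; the symbol $S(\alpha, \beta)$ for the sum of squared residuals is introduced here only for illustration.

```latex
S(\alpha, \beta) = \sum_{i=1}^{n} (y_i - \beta x_i - \alpha)^2

\frac{\partial S}{\partial \alpha}
  = -2 \sum_{i=1}^{n} (y_i - \beta x_i - \alpha) = 0
  \quad\Rightarrow\quad
  \alpha = \bar{y} - \beta \bar{x}

\frac{\partial S}{\partial \beta}
  = -2 \sum_{i=1}^{n} x_i (y_i - \beta x_i - \alpha) = 0
  \quad\Rightarrow\quad
  \beta = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
               {\sum_{i=1}^{n} (x_i - \bar{x})^2}
```

Since $\sum (x_i - \bar{x})(y_i - \bar{y}) = (n-1)\, r\, S_x S_y$ and $\sum (x_i - \bar{x})^2 = (n-1)\, S_x^2$, the last ratio equals $r\, S_y / S_x$.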
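Solving these equations gives:
$\beta = r \, \frac{S_y}{S_x}$
(r is the correlation coefficient; $S_y$ is the standard deviation of y and
$S_x$ is the standard deviation of x)
$r = \frac{1}{n-1} \sum \left( \frac{x_i - \bar{x}}{S_x} \right) \left( \frac{y_i - \bar{y}}{S_y} \right)$
$\alpha = \bar{y} - \beta \bar{x}$
and the (least squares) linear regression equation of y on x is:
$\hat{y}_i = \beta x_i + \alpha$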
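These formulas can be applied to the six children of Example 2. The sketch below uses only the Python standard library; the variable names are illustrative, and the sample standard deviations use the divisor n - 1 so that they agree with the formula for r.

```python
from math import sqrt

# Example 2 data: age in years (x) and weight in kg (y).
ages = [1, 2, 3, 4, 5, 6]
weights = [11, 11, 14, 16, 18, 20]
n = len(ages)

mean_x = sum(ages) / n
mean_y = sum(weights) / n

# Sample standard deviations (divisor n - 1), matching the formula for r.
s_x = sqrt(sum((x - mean_x) ** 2 for x in ages) / (n - 1))
s_y = sqrt(sum((y - mean_y) ** 2 for y in weights) / (n - 1))

# Correlation coefficient: sum of products of standardized deviations,
# divided by n - 1.
r = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
        for x, y in zip(ages, weights)) / (n - 1)

# Least squares slope and intercept.
beta = r * s_y / s_x
alpha = mean_y - beta * mean_x

print(round(r, 3), round(r ** 2, 3), round(beta, 3), round(alpha, 3))
```

For these data r is about 0.986, so r squared is about 0.971, which appears to be the 97% quoted for Example 2. The least squares line has slope about 1.943 and intercept about 8.2, close to but not identical with y = 2x + 8.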
17.2 Linear regression analysis in SPSS
Enter the ages and measured weights of the 30 children aged 1 to 6 years into
SPSS:
Column 1: age; column 2: weight
Go to the menu: Analyze > Regression > Linear
Table 17.1 Model summary
Correlation coefficient R = 0.918 and $R^2$ = 0.843
Table 17.2 ANOVA with weight as the dependent variable
Regression sum of squares = 336.14
Residual sum of squares = 62.8
Regression mean square: 336.14 / 1 (degrees of freedom) = 336.14
Residual mean square: 62.8 / 28 (degrees of freedom = n - 2) = 2.24
F = 336.14 / 2.24 = 149.8 and p ...
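The arithmetic behind the ANOVA table and the model summary can be retraced from the two sums of squares. This sketch uses only the figures quoted above; the variable names are illustrative, and the p-value is not computed because it would need the F distribution.

```python
# Sums of squares quoted from the ANOVA table (n = 30 children).
ss_regression = 336.14
ss_residual = 62.8
n = 30

df_regression = 1        # one predictor (age)
df_residual = n - 2      # 28 degrees of freedom

# Mean square = sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic = regression mean square / residual mean square.
f_statistic = ms_regression / ms_residual

# R squared = share of the total sum of squares explained by the regression.
r_squared = ss_regression / (ss_regression + ss_residual)
r = r_squared ** 0.5

print(round(ms_residual, 2), round(f_statistic, 2),
      round(r_squared, 3), round(r, 3))
```

With the unrounded residual mean square the F statistic is about 149.87; the small difference from the quoted 149.8 presumably comes from rounding in the reported sums of squares. The same two sums of squares reproduce R squared = 0.843 and R = 0.918.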
From the fitted regression equation, weight = 7.77 + 1.96 × age, we can
estimate a child's weight from age, but only within a certain range, for
example from 1 to 12 years, because after that age weight rises sharply during
puberty and is no longer linearly related to age.
For example, to estimate the weight of children from this study population:
7 years: Weight = 7.77 + 1.96 × 7 = 21.49 kg
8 years: Weight = 7.77 + 1.96 × 8 = 23.45 kg
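The two predictions can be reproduced with a small helper. This is a minimal sketch: the function name and the range check are illustrative assumptions, with the 1 to 12 year limit taken from the range suggested in the text.

```python
# Coefficients of the fitted equation: weight = 7.77 + 1.96 * age.
INTERCEPT_KG = 7.77
SLOPE_KG_PER_YEAR = 1.96


def predict_weight(age_years, min_age=1, max_age=12):
    """Predict weight in kg from age, refusing ages outside the stated range."""
    if not min_age <= age_years <= max_age:
        raise ValueError("age is outside the range where the line is assumed valid")
    return INTERCEPT_KG + SLOPE_KG_PER_YEAR * age_years


print(round(predict_weight(7), 2))  # 21.49
print(round(predict_weight(8), 2))  # 23.45
```

Raising an error for ages outside the range is one possible design choice; it makes accidental extrapolation into puberty visible instead of silently returning a number.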
17.3 Assumptions in linear regression analysis
Linear regression analysis does not only describe the data observed in the
study sample; it must also generalize to the relationship in the population.
Therefore, before presenting and interpreting a linear regression model, we
need to check for violations of its assumptions. If the assumptions are
violated, the estimates are not reliable.
The assumptions required in linear regression:
1. $x_i$ is a fixed variable, measured without random error.
2. The residuals (observed value minus estimated value) follow a normal
distribution.
3. The residuals have a mean of 0 and a variance that does not change across
the values of $x_i$.
4. There is no correlation among the residuals.
Example: A study looks for the correlation between blood cholesterol and the
intima-media thickness (IMT) of the carotid artery measured by ultrasound,
with data recorded from 100 patients as follows:
A scatter plot is a good tool for assessing how well a straight line fits the
observed data.
Go to the menu: Analyze > Regression > Curve Estimation
In the Curve Estimation dialog, move BEDAYNTM (intima-media thickness) into
the Dependent(s) box and CHOLESTEROL into the Variable box. Tick Include
constant in equation, Plot models and Linear (to also estimate the
relationship between the two variables as a second-degree equation, tick
Quadratic as well). Click OK to obtain the following chart:
This is the linear regression equation y = 0.748 + 0.062x.
The first assumption is that x (blood cholesterol) is a fixed variable,
measured without error. This assumption is not a concern if the patients'
cholesterol is measured in a standard laboratory.
The remaining assumptions are checked in SPSS as follows. Go to the menu:
Analyze > Regression > Linear...
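In the Linear dialog, move BEDAYNTM into the Dependent box and CHOLESTEROL
into the Independent(s) box.
Click the Plots button to open the Plots dialog:
Move the residual *ZRESID into the X box (horizontal axis) and the predicted
value into the Y box (vertical axis) to see whether the residuals are randomly
distributed and whether the variance is constant across the values of $x_i$.
Tick Histogram and Normal probability plot to see whether the residuals are
normally distributed.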
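Click Continue, then OK, to obtain the following results:
The residuals have mean = 0 and standard deviation (SD) = 0.394.
The histogram of the standardized residuals is bell-shaped and symmetric on
both sides, with a mean close to zero and an SD close to 1. The assumption
that the residuals are normally distributed is therefore not violated.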
Alternatively, look at the P-P plot, which compares the cumulative
distribution of the observed residuals (Observed Cum Prob) on the horizontal
axis with the expected cumulative distribution (Expected Cum Prob) on the
vertical axis. If the points all lie close to the diagonal, the distribution
of the residuals can be regarded as approximately normal.
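The coordinates of a normal P-P plot can be sketched in a few lines. Everything here is an assumption made for illustration: the residual values are hypothetical, the function name is invented, and the plotting position (rank - 0.5) / n is one common convention that may differ from the one SPSS applies.

```python
from statistics import NormalDist, mean, stdev


def pp_plot_points(residuals):
    """Return (observed, expected) cumulative probabilities for a normal P-P plot."""
    n = len(residuals)
    centre = mean(residuals)
    spread = stdev(residuals)
    # Standardize the residuals and sort them from smallest to largest.
    standardized = sorted((e - centre) / spread for e in residuals)
    # Observed cumulative probability from the rank of each residual
    # (assumed plotting position: (rank - 0.5) / n).
    observed = [(rank - 0.5) / n for rank in range(1, n + 1)]
    # Expected cumulative probability under the standard normal distribution.
    expected = [NormalDist().cdf(z) for z in standardized]
    return observed, expected


# Hypothetical residuals, used only to show the mechanics.
residuals = [-0.61, -0.35, -0.22, -0.10, -0.03, 0.04, 0.12, 0.25, 0.38, 0.52]
observed, expected = pp_plot_points(residuals)

# Largest vertical distance of any point from the diagonal.
max_gap = max(abs(o - e) for o, e in zip(observed, expected))
print(round(max_gap, 3))
```

Plotting observed against expected gives the P-P plot; a small maximum gap corresponds to points lying close to the diagonal.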
Finally, check the assumption that the variance is constant for every value
of x (blood cholesterol), also called homoscedasticity. If the residuals are
scattered randomly around zero (the horizontal line), the variance can be
regarded as constant and the homoscedasticity assumption is not violated.
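As a rough numerical companion to reading the residual plot, one can compare the spread of the residuals at low and high values of x. This is an informal sketch, not a formal test: the function name and both data sets are hypothetical and chosen only to contrast a steady spread with a widening one.

```python
from statistics import variance


def variance_ratio(x_values, residuals):
    """Residual variance in the upper half of x divided by that in the lower half."""
    pairs = sorted(zip(x_values, residuals))  # order the residuals by x
    half = len(pairs) // 2
    lower = [e for _, e in pairs[:half]]
    upper = [e for _, e in pairs[-half:]]
    return variance(upper) / variance(lower)


# Hypothetical x values and residuals, for illustration only.
x = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5]
steady = [0.2, -0.2, 0.3, -0.3, 0.2, -0.2, 0.3, -0.3]    # similar spread throughout
fanning = [0.1, -0.1, 0.2, -0.2, 0.4, -0.4, 0.8, -0.8]   # spread grows with x

ratio_steady = variance_ratio(x, steady)
ratio_fanning = variance_ratio(x, fanning)
print(round(ratio_steady, 2), round(ratio_fanning, 2))
```

A ratio near 1 is consistent with a constant variance, while a ratio far from 1 (here about 16 for the widening residuals) suggests that the spread changes with x.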
If the variance changes (increasing or decreasing with the value of x), this
is called heteroscedasticity (the constant-variance assumption is violated),
as in the figure below:
In summary, in the example above all the assumptions of linear regression
analysis are satisfied, and we can conclude that carotid intima-media
thickness is linearly related to blood cholesterol concentration according to
the equation:
Y (intima-media thickness) = 0.062 × cholesterol + 0.748.
Thus, for every 1 mmol/L increase in cholesterol concentration, carotid
intima-media thickness increases by 0.062 mm.
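The interpretation of the slope can be verified by predicting at two cholesterol levels that differ by 1 mmol/L. This is a minimal sketch: the function name and the input values 5.0 and 6.0 mmol/L are hypothetical, and only the two coefficients come from the fitted equation.

```python
# Coefficients of the fitted equation: IMT = 0.748 + 0.062 * cholesterol.
INTERCEPT_MM = 0.748
SLOPE_MM_PER_MMOL_L = 0.062


def predict_imt(cholesterol_mmol_per_l):
    """Predicted carotid intima-media thickness in mm."""
    return INTERCEPT_MM + SLOPE_MM_PER_MMOL_L * cholesterol_mmol_per_l


# Two hypothetical cholesterol levels that differ by exactly 1 mmol/L.
increase = predict_imt(6.0) - predict_imt(5.0)
print(round(predict_imt(5.0), 3), round(increase, 3))  # 1.058 0.062
```

Whatever pair of levels is chosen, a difference of 1 mmol/L changes the prediction by the slope, 0.062 mm.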
References:
1. McClave J T and Sincich T. 2000. Simple linear regression in Statistics, 8th
edition, Prentice-Hall, USA, pp. 505-557.
2. Moore D. S. and McCabe G. P. 1999. Looking at Data-Relationships (Chapter
2), in Introduction to the Practice of Statistics, W.H. Freeman and Company,
New York, pp. 102-145.