Book II - Definitions
7/24/2019 BOOK II - Definitions
http://slidepdf.com/reader/full/book-ii-definitions 1/27
BOOK II DEFINITIONS AND CERTAIN FORMULAS
Time Value of Money
1. Present Value
2. Future Value
3. Discounting
4. Compounding
5. Cost of capital
6. Opportunity cost
7. Real risk-free rate
8. Discount rate
9. Annuity
10. Perpetuity
Ch. 2 Probabilities – Miller, Mathematics and Statistics for Risk Management
1. Random variable – an uncertain quantity or number.
2. Outcome – an observed value of a random variable from an experiment.
3. Event – a single outcome or a set of outcomes.
4. Mutually exclusive events – events that cannot happen at the same time.
5. Exhaustive events – those that include all possible outcomes.
6. Probability distribution – describes the probabilities of all possible outcomes for a random variable. Probabilities must sum to 1. Assigning probabilities to the possible outcomes of discrete and continuous random variables gives discrete probability distributions and continuous probability distributions, respectively.
7. Discrete random variable – the number of possible outcomes can be counted, and for each possible outcome there is a measurable and positive probability.
8. Probability (mass) function – denoted p(x), specifies the probability that a random variable equals a specific value: p(x) is the probability that the random variable X takes on the value x, i.e. p(x) = P(X = x).
9. Continuous random variable – one for which the number of possible outcomes is infinite, even if lower and upper bounds exist.
10. Discrete distribution – p(x) = 0 when x cannot occur, or p(x) > 0 if it can. p(x) is read "the probability that the random variable X equals x".
11. Continuous distribution – p(x) = 0 even though x can occur. We can only consider probabilities over ranges of X, such as P(x1 ≤ X ≤ x2), where x1 and x2 are actual numbers. P(X = 3) = 0, because a single point of a continuous random variable that can take an infinite range of values has zero probability.
12. Probability density function (pdf) – denoted f(x); used to generate the probability that outcomes of a continuous distribution lie within a particular range. The pdf is used to calculate the probability of an outcome that lies between two values.
13. Cumulative distribution function (cdf) – defines the probability that a random variable, X, takes on a value equal to or less than a specific value, x. It represents the sum, or cumulative value, of the probabilities for the outcomes up to and including the specified outcome.
14. Inverse cumulative distribution function – used to find the value that corresponds to a specific probability, e.g. in value at risk (VaR).
15. Discrete uniform random variable – the probabilities for all possible outcomes of a discrete random variable are equal.
16. Unconditional probability – (marginal probability) the probability of an event regardless of the past or future occurrence of other events.
17. Conditional probability – where the occurrence of one event affects the probability of the occurrence of another event, e.g. P(recession | monetary authority raises interest rates).
    a. P(A|B) = P(AB) / P(B)
18. Joint probability of two events – the probability that they will both occur together.
19. Multiplication rule of probability – the joint probability is the product of the conditional and unconditional probabilities.
    a. P(AB) = P(A|B) P(B)
20. Independent events – events for which the occurrence of one has no influence on the occurrence of the others.
    a. P(A|B) = P(A), or equivalently, P(B|A) = P(B)
21. Dependent events – if the independence condition is not satisfied, the events are dependent events (the occurrence of one depends on the occurrence of the other).
22. Addition rule for probabilities – used to determine the probability that at least one of two events will occur. E.g. given two events A and B, the addition rule gives the probability that either A or B will occur: P(A or B) = P(A) + P(B) − P(AB).
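As a quick sanity check, the conditional, multiplication, and addition rules can be verified numerically; the joint and marginal probabilities below are toy numbers invented for illustration:

```python
# Toy probabilities (assumed, not from the notes): A = recession, B = rate hike
p_A = 0.25          # unconditional P(A)
p_B = 0.40          # unconditional P(B)
p_AB = 0.15         # joint probability P(AB)

p_A_given_B = p_AB / p_B            # conditional probability: P(A|B) = P(AB)/P(B)
p_AB_check = p_A_given_B * p_B      # multiplication rule recovers the joint
p_A_or_B = p_A + p_B - p_AB         # addition rule: P(A or B)

print(p_A_given_B)   # 0.375
print(p_A_or_B)      # 0.5
```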
23. Mutually exclusive events – the joint probability P(AB) is zero, so the probability that either A or B will occur is simply the sum of the unconditional probabilities.
Ch. 3 Basic Statistics – Miller, Mathematics and Statistics for Risk Management
1. Statistics – a word used to refer both to data and to the methods used to analyze data.
2. Descriptive statistics – summarize the important characteristics of large data sets.
3. Inferential statistics – procedures used to make forecasts, estimates, or judgements about a large set of data on the basis of the statistical characteristics of a smaller set (a sample).
4. Population – the set of all possible members of a stated group.
5. Measures of central tendency – the center, or average, of a data set. Can be used to represent the typical or expected value of the data set.
6. Population mean – the expected value/average of the entire data set (population).
7. Sample mean – the mean of a sample of n observations drawn from the population. Used to make inferences about the population mean when it is infeasible or impossible to observe all members of the population.
8. Arithmetic mean – the only measure of central tendency for which the sum of the deviations from the mean is zero.
9. Median – the midpoint of a data set when ordered in ascending or descending order; half the observations lie above the median and half below.
10. Mode – the most frequent value observed in the data set.
11. Geometric mean – used when calculating investment returns over multiple periods or when measuring compound growth rates; the nth root of the product of all n observed values.
12. Expected value – the weighted average of all possible outcomes of a random variable.
13. Properties of expectation – see handwritten lecture notes.
14. Variance – a measure of the expected spread or dispersion of a random variable about its mean (squared). Defined for one variable at a time.
15. Sample variance – the variance of a sample, computed with n − 1 degrees of freedom so that it is an unbiased estimator of the population variance.
16. Standard deviation – the square root of the variance, in the same units as the mean/expected value; measures the spread or dispersion about the mean.
17. Properties of variance – see handwritten notes.
18. Covariance – the expected value of the product of the deviations of two random variables from their respective expected values. Describes the co-movement of two variables. Describes a linear relationship between two variables, but does not mean much until reduced to a correlation: it gives the direction of co-movement but not the size (level) of the relationship.
19. Properties of covariance – see handwritten notes.
20. Correlation – covariance is easier to interpret when scaled by the standard deviations of the two variables. Correlation measures the strength of the linear relationship between two random variables and ranges over −1 ≤ ρ ≤ 1.
21. Properties of correlation – see handwritten notes.
22. Scatter plot – a collection of points representing the values of two variables as (X, Y) pairs.
23. Interpretation of correlation coefficients – see handwritten notes.
24. Moments – describe the shape of a probability distribution. Raw moments are expected values of the random variable raised to an appropriate power; central moments are measured relative to the mean.
25. kth raw moment – the expected value of X^k: E(X^k) = Σ p_i · x_i^k
26. Central moments – the kth central moment is measured around the mean: E[(X − μ)^k] = Σ p_i · (x_i − μ)^k
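A minimal sketch of the raw- and central-moment formulas for a discrete distribution (the outcome probabilities below are toy values chosen for illustration):

```python
# Discrete distribution {outcome: probability} -- toy values for illustration
dist = {1: 0.2, 2: 0.5, 3: 0.3}

def raw_moment(dist, k):
    # kth raw moment: E(X^k) = sum of p_i * x_i^k
    return sum(p * x**k for x, p in dist.items())

def central_moment(dist, k):
    # kth central moment: E[(X - mu)^k], measured around the mean
    mu = raw_moment(dist, 1)
    return sum(p * (x - mu)**k for x, p in dist.items())

print(round(raw_moment(dist, 1), 2))      # 1st raw moment = mean: 2.1
print(round(central_moment(dist, 2), 2))  # 2nd central moment = variance: 0.49
```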
27. 1st raw moment – the mean (the 1st central moment is zero).
28. 2nd central moment – variance.
29. 3rd central moment – skewness (when standardized).
30. 4th central moment – kurtosis (when standardized).
31. Interpreting skewness, positive/negative skew – see handwritten notes.
32. Interpreting kurtosis – leptokurtic, platykurtic, mesokurtic – see handwritten notes.
33. Coskewness – see handwritten notes.
34. Cokurtosis – see handwritten notes.
35. Unbiased estimator – one for which the expected value of the estimator is equal to the parameter you are trying to estimate. An unbiased estimator is also efficient if the variance of its sampling distribution is smaller than that of all other unbiased estimators of that parameter. E.g. the sample mean is an unbiased estimator of the population mean.
36. Biased estimator – the expected value of the estimator is not equal to the parameter you are trying to estimate, and may be far off.
37. Consistent estimator – one for which the accuracy of the parameter estimate increases as the sample size increases.
38. Linear estimator / BLUE – a point estimate is a linear estimator when it can be expressed as a linear function of the sample data. If the estimator is the best available (has minimum variance), exhibits linearity, and is unbiased, it is said to be BLUE (best linear unbiased estimator).
Ch. 4 Distributions – Miller, Mathematics and Statistics for Risk Management
1. Parametric distribution – can be described using a mathematical function.
2. Nonparametric distribution – like a historical distribution, cannot be described with a mathematical function. It makes no assumptions about the data, so the distribution fits the data perfectly, but it is difficult to draw conclusions or infer much about the random variable.
3. Continuous uniform distribution – all outcomes between the bounds a and b are equally likely; everywhere else the probability is zero.
4. Cdf of a continuous uniform distribution – see handwritten notes.
5. The Bernoulli distribution – a trial that determines the success or failure of an experiment; a two-point distribution with probability p of success and 1 − p of failure.
6. Binomial random variable – a random variable defined by an experiment of n Bernoulli trials: the number of successes in n Bernoulli trials.
7. Binomial distribution – see above.
8. Expected value of a binomial random variable – see handwritten notes; simply the expected number of successes.
9. Poisson distribution (pdf) – a discrete distribution of outcomes of a random variable X that counts the number of successes per unit of an experiment. The parameter lambda, λ, is the average (expected value) of the number of successes per unit. E.g. the number of calls per hour arriving at a switchboard, the number of defects per batch in a production process, or the number of patients affected across a set of procedures with an average rate of λ = 2 persons per procedure.
10. Pdf of a normal distribution –
11. Cdf of a normal distribution –
12. Confidence interval – a range of values around the expected outcome within which we expect the actual outcome to fall some specified percentage of the time. A 95% confidence interval is a range that we expect the random variable to be in 95% of the time.
13. Confidence intervals for a normal distribution – the interval is based on the expected value (sometimes called a point estimate) of the random variable and on its variability, which we measure with the standard deviation.
    a. For any normally distributed random variable, about 68% of outcomes are within one standard deviation of the mean.
    b. Approximately 95% are within 2 standard deviations.
    c. 90% confidence interval for X: x̄ − 1.65σ to x̄ + 1.65σ
    d. 95% confidence interval for X: x̄ − 1.96σ to x̄ + 1.96σ
    e. 99% confidence interval for X: x̄ − 2.58σ to x̄ + 2.58σ
    f. These numbers are the z-scores of a standard normal.
14. Standard normal distribution – a normal distribution that has been standardized so that the mean = 0 and the standard deviation = 1.
15. Standardization – the process of converting an observed value of a random variable to its z-value:
    z = (observation − population mean) / standard deviation = (x − μ) / σ
16. Z-values – standardized random variables; a z-value gives the number of standard deviations an observation lies from the mean of a normal distribution. A standard normal r.v. has sd = 1 and mean = 0.
17. Lognormal distribution, pdf –
18. Lognormal distribution, cdf –
19. Normal vs. lognormal distribution –
20. Central limit theorem – states that for simple random samples of size n from a population with mean μ and finite variance σ², the sampling distribution of the sample mean x̄ approaches a normal probability distribution with mean μ and variance equal to σ²/n as the sample size becomes large. This is because, when the sample size is large, the sum of independent and identically distributed r.v.s will be normally distributed.
21. Properties of the central limit theorem – see handwritten notes.
22. Student's t distribution –
23. Properties of Student's t –
24. Pdf of Student's t –
25. Cdf of Student's t –
26. Normal vs. t distribution –
27. Chi-squared distribution –
28. Pdf of the chi-squared distribution –
29. F distribution –
30. Properties of the F-distribution –
31. Mixture distributions –
Ch. 6 Bayesian Analysis – Miller, Mathematics and Statistics for Financial Risk Management
1. Bayes' theorem –
2. Unconditional probability of events –
3. Joint probability of events –
4. Bayesian approach –
5. Frequentist approach –
6. Bayesian vs. frequentist –
7. Bayes' theorem with multiple states –
Ch. 7 Hypothesis Testing and Confidence Intervals – Miller, Mathematics and Statistics for Financial Risk Management
1. Simple random sampling – a method of selecting a sample in such a way that each item or person in the population being studied has the same likelihood of being included in the sample.
2. Sampling error – the difference between a sample statistic (mean, variance, std dev of the sample) and its corresponding population parameter (true mean, variance, std dev of the population).
    a. Sampling error of the mean = sample mean − population mean = x̄ − μ
3. Sampling distribution of a sample statistic – the sample statistic is itself a random variable and therefore has a probability distribution. This is the probability distribution of all possible sample statistics computed from equal-size samples randomly drawn from the same population.
4. Sampling distribution of the mean – the distribution obtained by repeatedly sampling the mean of n observations from the population.
5. Mean of the sample average – e.g. for two random variables X1 and X2:
    E((X1 + X2)/2) = 2μx/2 = μx. Recall E(X1 + X2) = μx + μx = 2μx; therefore we can say E(X̄) = μx.
6. Variance of the sample average – Var((X1 + X2)/2): with a = c = 0.5,
    Var(aX1 + cX2) = a²·Var(X1) + c²·Var(X2) = σ²/2
    a. In general terms, Var(X̄) = σ²/n.
7. Standard error – the std dev of the sample average is known as the standard error: σ_X̄ = σ/√n
8. Population mean – all observed values in the population summed and divided by the number of observations in the population.
9. Sample mean – the sum of all values in a sample of the population divided by the number of observations in the sample.
10. Dispersion – defined as the variability around the central tendency. The theme in finance is the tradeoff between reward and variability: central tendency is a measure of reward, and dispersion is a measure of risk.
11. Population variance – the average of the squared deviations from the mean. The population variance uses all members of the population.
12. Standard deviation of the population – σ = √( Σ_{i=1}^{N} (X_i − μ)² / N )
13. Sample variance – s² = Σ_{i=1}^{n} (X_i − x̄)² / (n − 1). Using n in the denominator will systematically underestimate the population variance, especially for small sample sizes (a biased estimator); using n − 1 improves the statistical properties of s² as an estimator of σ².
14. Sample standard deviation – the square root of the sample variance.
15. Standard error of the sample mean – the standard deviation of the distribution of the sample means: sample std dev / √n, i.e. s_X̄ = se_X̄ = s/√n
16. Covariance – between two random variables, a statistical measure of the degree to which the two variables move together; captures the linear relationship between one variable and another. Positive covariance: the variables move together in the same direction. Negative covariance: they move in opposite directions.
17. Population covariance – cov_XY = Σ_{i=1}^{N} (X_i − μ_X)(Y_i − μ_Y) / N
18. Sample covariance – s.cov_XY = Σ_{i=1}^{n} (X_i − x̄)(Y_i − ȳ) / (n − 1), using the sample means x̄ and ȳ.
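The population (divide by N) versus sample (divide by n − 1) formulas can be sketched in a few lines of Python; the data points below are made up:

```python
# Toy paired observations (assumed for illustration)
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 3.0, 2.0, 6.0]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

pop_var_x = sum((x - mx)**2 for x in xs) / n         # population variance (divide by N)
samp_var_x = sum((x - mx)**2 for x in xs) / (n - 1)  # sample variance (divide by n-1)
samp_cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

print(pop_var_x)             # 5.0
print(round(samp_var_x, 3))  # 6.667
print(round(samp_cov, 3))    # 4.667
```

Note that the n − 1 versions are always a bit larger than the divide-by-N versions, which is exactly the bias correction described above.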
19. Confidence intervals – estimates that result in a range of values within which the actual value of a parameter will lie with probability 1 − α.
20. Level of significance – α is called the level of significance for the confidence interval.
21. Degree of confidence – the probability 1 − α is the degree of confidence. E.g. an estimate of the population mean of a random variable might range from 15 to 25 with a 95% degree of confidence, or at the 5% level of significance.
22. Construction of a confidence interval – usually constructed by adding to and subtracting from the point estimate an appropriate value. In general, confidence intervals take the following form:
    a. Point estimate ± (reliability factor × standard error), e.g. x̄ ± z_{α/2} · σ_x̄, where σ_x̄ is the standard error of the point estimate.
23. Point estimate – the value of a sample statistic of the population parameter.
24. Reliability factor – a number that depends on the sampling distribution of the point estimate and the probability that the point estimate falls within the confidence interval (1 − α).
25. Standard error of the point estimate – the standard deviation of the point estimate: σ/√n
26. Confidence interval for the population mean (normal distribution) – x̄ ± z_{α/2} · σ/√n
27. Commonly used normal-distribution reliability factors – z_{α/2}, the standard normal value for which the probability in the right-hand tail of the distribution is α/2. In other words, this is the z-score that leaves α/2 of probability in the upper tail.
    a. z_{α/2} = 1.65 for 90% confidence intervals (significance level 10%, 5% in each tail)
    b. z_{α/2} = 1.96 for 95% confidence intervals (significance level 5%, 2.5% in each tail)
    c. z_{α/2} = 2.58 for 99% confidence intervals (significance level 1%, 0.5% in each tail)
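These reliability factors come straight from the inverse standard normal cdf; here is a sketch with made-up sample numbers (mean 100, σ = 15, n = 36):

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n = 100.0, 15.0, 36   # assumed sample values
se = sigma / sqrt(n)               # standard error = 2.5

for conf in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    print(conf, round(xbar - z * se, 2), round(xbar + z * se, 2))
    # e.g. the 0.95 line prints: 0.95 95.1 104.9
```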
28. Probabilistic interpretation – after repeated sampling of the population and constructing a confidence interval for each sample mean, 99% of the resulting confidence intervals will, in the long run, include the population mean.
29. Practical interpretation – we are 99% confident that the population mean score is between x1 and x2 for samples from this population.
30. Population normal with unknown variance: the t-distribution – when the population is normal but the variance is unknown, we use the t-distribution to construct a confidence interval: x̄ ± t_{α/2} · s/√n, where s/√n is the standard error of the sample mean and s is the sample std dev.
31. Reliability factors for the t-distribution – the t-reliability factor (t-statistic or critical t-value) corresponds to a t-distributed random variable with n − 1 degrees of freedom, where n is the sample size. The area under the tail of the t-distribution to the right of t_{α/2} is α/2. Reliability factors depend on the sample size and degrees of freedom (n − 1), so we cannot rely on a standard set of factors. Confidence intervals built with the t-statistic will be more conservative (wider) than those built with z-reliability factors.
32. Selecting the appropriate test statistic:
    a. The size of the sample influences whether or not we can construct the appropriate confidence interval for the sample mean.
    b. Distribution non-normal but population variance known – the z-statistic can be used as long as the sample size is large (n ≥ 30). We can use it because the central limit theorem assures us that the distribution of the sample mean is approximately normal when n is large.
    c. Distribution non-normal and population variance unknown – the t-statistic can be used as long as the sample size is large (n ≥ 30). The z-statistic can also be used, but the t-statistic is more conservative.
33. Hypothesis testing – the statistical assessment of a statement or idea regarding a population.
34. Hypothesis – a statement about the value of a population parameter developed for the purpose of testing a theory or belief. Hypotheses are stated in terms of the population parameter to be tested, like the population mean, μ.
35. Hypothesis testing procedure – based on sample statistics and probability theory; used to determine whether a hypothesis is a reasonable statement that should not be rejected, or an unreasonable statement that should be rejected:
    a. State the hypothesis.
    b. Select the appropriate test statistic.
    c. Specify the level of significance.
    d. State the decision rule regarding the hypothesis.
    e. Collect the sample and calculate the sample statistic.
    f. Make a decision regarding the hypothesis, based on the results of the test.
36. Null hypothesis – designated H0, the hypothesis the researcher wants to reject. It is the hypothesis that is actually tested and is the basis for the selection of the test statistic. The null is generally a simple statement about a population parameter.
    a. Typical statements: H0: μ = μ0, H0: μ ≤ μ0, and H0: μ ≥ μ0, where μ is the population mean and μ0 is the hypothesized value of the population mean.
37. Alternative hypothesis – designated HA, what is concluded if there is sufficient evidence to reject the null hypothesis. It is usually the hypothesis the researcher is really trying to assess. Why? Since you cannot really prove anything with statistics, when the null is rejected, the implication is that the alternative is valid.
38. Choice of null and alternative:
    a. The most common null is an "equal to" hypothesis. The alternative is often the hoped-for hypothesis. When the null is that a coefficient is equal to zero, we hope to reject it and show the significance of the relationship.
39. Mutually exclusive alternative – when the null is "less than or equal to", the mutually exclusive alternative is framed as "greater than".
40. Hypothesis testing parameters – include two statistics:
    a. Test statistic – calculated from the sample data.
    b. Critical value of the test statistic.
41. Test statistic – calculated by comparing the point estimate of the population parameter with the hypothesized value of the parameter (i.e. the value specified in the null hypothesis).
42. Test statistic = (sample statistic − hypothesized value) / (standard error of the sample statistic)
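A sketch of the test-statistic calculation with assumed numbers, testing H0: μ = 100 against HA: μ ≠ 100 at the 5% level:

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, s, n = 103.0, 100.0, 10.0, 64   # assumed sample values

test_stat = (xbar - mu0) / (s / sqrt(n))   # (sample stat - hypothesized) / std error
crit = NormalDist().inv_cdf(0.975)         # two-tailed 5% critical value (~1.96)

print(round(test_stat, 2))    # 2.4
print(abs(test_stat) > crit)  # True -> reject H0
```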
43. Standard error of the sample statistic – the adjusted standard deviation of the sample.
44. Critical value of the test statistic –
45. Alternative hypothesis – can be one-sided or two-sided. Whether the test is one-sided or two-sided depends on the proposition being tested.
46. One-tailed test – used if the question is only whether something is greater than, or less than, a value. If you do not know on which side it lies, a two-tailed test is preferred.
    a. Upper tail: H0: μ ≤ μ0 versus HA: μ > μ0.
    b. Upper tail: if the calculated test statistic is greater than 1.645 at the 5% significance level, we conclude that the sample statistic is sufficiently greater than the hypothesized value, and we reject the null hypothesis.
        i. If the calculated test statistic is less than 1.645, we conclude that the sample statistic is not sufficiently different from the hypothesized value, and we fail to reject the null.
    c. Lower tail: H0: μ ≥ μ0 versus HA: μ < μ0.
    d. The appropriate set of hypotheses depends on whether we believe the population mean, μ, to be greater than (upper tail) or less than (lower tail) the hypothesized value, μ0.
47. Two-tailed test – allows for deviations on both sides of the hypothesized value (in the general case, zero).
    a. A two-tailed test can be structured as H0: μ = μ0 versus HA: μ ≠ μ0.
    b. Since the alternative allows for values above and below the hypothesized parameter, a two-tailed test uses two critical values (or rejection points).
48. Decision/rejection rule for a two-tailed z-test:
    a. Reject H0 if: test statistic > upper critical value, or test statistic < lower critical value.
49. Type I error – the rejection of the null hypothesis when it is actually true.
50. Type II error – the failure to reject the null hypothesis when it is actually false. (Difficult to control in practice – it depends on the sample size and the critical value chosen: the alternative may not test as statistically significant because of the sample size and critical value, or because of collinearity when more than two variables are involved.)
51. Power of a test – the probability of correctly rejecting the null hypothesis when it is actually false: power = 1 − P(Type II error).
52. Type I and Type II errors in hypothesis testing:
    True condition:     H0 is true                          H0 is false
    Do not reject H0:   Correct decision                    Incorrect decision – Type II error
    Reject H0:          Incorrect decision – Type I error   Correct decision – power of the test,
                        (significance level α = P(Type I))  1 − P(Type II error)
53. Confidence intervals and hypothesis testing:
    a. A confidence interval is a range of values within which the researcher believes the true population parameter may lie.
    b. Sample statistic − (critical value)(standard error) ≤ population parameter ≤ sample statistic + (critical value)(standard error)
    c. Restated: −critical value ≤ test statistic ≤ +critical value,
        i. where test statistic = (sample statistic − population parameter under H0) / standard error.
    d. This is the range over which we fail to reject the null for a two-tailed hypothesis test at a given level of significance.
54. Statistical significance – does not imply economic significance. When something is statistically significant based on the data, the economic benefits may be diminished by the costs of executing and maintaining the strategy (transaction
costs, taxes, downside risk from short sales), which can reduce returns and make the strategy not economically viable in the long term, even if it is statistically significantly above zero.
55. Economic significance – see directly above.
56. P-value – the probability of obtaining a test statistic that would lead to a rejection of the null hypothesis, assuming the null hypothesis is true. It is the smallest level of significance for which the null hypothesis can be rejected.
57. One-tailed test p-value – the probability that lies above the computed test statistic for upper-tail tests, or below the computed test statistic for lower-tail tests.
58. Two-tailed test p-value – the probability that lies above the positive value of the computed test statistic plus the probability that lies below the negative value of the computed test statistic.
59. T-test – employs a t-statistic in a hypothesis test based on the t-distribution. Mostly used when the population variance is unknown and either the sample is large (n ≥ 30) or the distribution is normal with sample size n < 30.
60. Critical z-values:
    Level of significance   Two-tailed test   One-tailed test
    0.10 = 10%              ±1.65             +1.28 or −1.28
    0.05 = 5%               ±1.96             +1.65 or −1.65
    0.01 = 1%               ±2.58             +2.33 or −2.33
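The critical values in the table can be reproduced from the inverse normal cdf (note that 1.645 rounds to 1.64, where the notes quote the conventional 1.65):

```python
from statistics import NormalDist

for alpha in (0.10, 0.05, 0.01):
    two_tail = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    one_tail = NormalDist().inv_cdf(1 - alpha)      # z_alpha
    print(alpha, round(two_tail, 2), round(one_tail, 2))
# 0.1 1.64 1.28
# 0.05 1.96 1.64
# 0.01 2.58 2.33
```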
61. Chi-squared test – used for hypothesis tests concerning the variance of a normally distributed population. Let σ² represent the true population variance and σ0² the hypothesized variance.
62. Hypotheses for a two-tailed test of a single population variance: H0: σ² = σ0² versus HA: σ² ≠ σ0²
63. Hypotheses for a one-tailed chi-squared test are structured as:
    a. H0: σ² ≤ σ0² versus HA: σ² > σ0² (upper tail test)
    b. H0: σ² ≥ σ0² versus HA: σ² < σ0² (lower tail test)
64. Hypothesis testing of the population variance – requires a chi-squared distributed test statistic, denoted χ². The chi-squared distribution is asymmetrical and approaches the normal distribution in shape as the degrees of freedom increase.
65. Chi-squared test statistic: χ² = (n − 1)s² / σ0²
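A sketch of the statistic with toy inputs; the quoted 5% upper-tail critical value for 29 degrees of freedom (about 42.6, from standard chi-squared tables) is an assumption, not computed here:

```python
# Toy inputs: sample of n = 30 with sample variance 18, H0: sigma0^2 = 12
n, s2, sigma0_sq = 30, 18.0, 12.0

chi2_stat = (n - 1) * s2 / sigma0_sq
print(chi2_stat)   # 43.5

# Upper-tail 5% critical value for 29 df is ~42.6 (assumed, from tables)
print(chi2_stat > 42.6)   # True -> reject H0: sigma^2 <= 12
```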
66. F-test – hypotheses concerning the equality of the variances of two populations are tested with an F-distributed test statistic.
    a. Used under the assumption that the populations from which the samples are drawn are normally distributed, and that the samples are independent.
    b. σ1² and σ2² are the variances of normal populations 1 and 2.
    c. Two-tailed F-test: H0: σ1² = σ2² versus HA: σ1² ≠ σ2²
    d. One-tailed F-test: H0: σ1² ≤ σ2² versus HA: σ1² > σ2², OR H0: σ1² ≥ σ2² versus HA: σ1² < σ2²
    e. F = s1² / s2². Always put the larger variance in the numerator (s1²). Following this convention, we only need to consider the critical value for the right-hand tail. The F critical value also takes into account the degrees of freedom of the calculation: n1 − 1 and n2 − 1.
    f. s1² = variance of the sample of n1 observations from population 1
    g. s2² = variance of the sample of n2 observations from population 2
67. Chebyshev's inequality – states that for any set of observations, whether sample or population data, and regardless of the distribution, the percentage of the observations that lie within k standard deviations of the mean is at least 1 − 1/k², for all k > 1.
68. Relationships from Chebyshev's inequality:
    a. 36% lie within ±1.25 standard deviations of the mean
    b. 56% lie within ±1.5 standard deviations of the mean
    c. 75% lie within ±2 standard deviations of the mean
    d. 89% lie within ±3 standard deviations of the mean
    e. 94% lie within ±4 standard deviations of the mean
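The listed percentages follow directly from the 1 − 1/k² bound:

```python
# Chebyshev lower bounds for the k values quoted above
for k in (1.25, 1.5, 2, 3, 4):
    print(k, round(1 - 1 / k**2, 2))
# 1.25 0.36
# 1.5 0.56
# 2 0.75
# 3 0.89
# 4 0.94
```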
69. Importance of Chebyshev's inequality – it applies to any distribution. If we know the underlying distribution is normal, we can be even more precise about the percentage of observations that will fall within a given number of standard deviations of the mean.
    a. Events under non-normal distributions may not be so rare, occurring up to about 11% of the time for events beyond ±3 standard deviations.
70. Backtesting – involves comparing expected outcomes against actual data. It is common for risk managers to backtest VaR models to ensure the model is forecasting losses with the same frequency predicted by the confidence interval.
    a. When the VaR measure is exceeded during a given testing period, it is known as an EXCEPTION or an EXCEEDANCE. After backtesting, if the number of exceptions is greater than expected, the risk manager may be underestimating the actual risk and VaR may be understated. If the number of exceptions is less than expected, the risk manager may be overestimating the actual risk.
71. Limits of backtesting VaR:
    a. Backtesting VaR can cause issues because exceptions are often serially correlated: there is a high probability that an exception will occur in the period after a previous exception.
    b. The occurrence of exceptions also tends to be correlated with overall market volatility: many exceptions with high market volatility, few exceptions with low market volatility.
    c. A VaR model may fail to react quickly to changing risk levels.
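A minimal sketch of the exception count a backtest compares against; the observed count is a hypothetical number:

```python
# A 99% daily VaR should be exceeded on about 1% of days
confidence, days = 0.99, 250
expected_exceptions = (1 - confidence) * days
observed_exceptions = 6                      # hypothetical backtest result

print(round(expected_exceptions, 1))         # 2.5
# More exceptions than expected -> VaR may be understating risk
print(observed_exceptions > expected_exceptions)   # True
```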
Ch. 11 Correlations and Copulas – Hull, Risk Management and Financial Institutions
1. Correlation – measures the strength of the linear relationship between two variables over time.
2. Covariance – measures the direction of the co-movement between two variables over time.
3. Rho ranges between −1 and +1 – this standardized measure is more convenient in risk analysis applications than covariance, which can take values between −∞ and +∞.
    a. ρ_XY = cov(X, Y) / (σ_X σ_Y)
    b. cov_XY = ρ_XY σ_X σ_Y
    c. cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y); the first term is the expected value of the product of X and Y.
    d. Variables are defined as independent if knowledge of one variable does not impact the probability distribution of the other. In other words, the conditional probability of V2 given information about V1 is equal to the unconditional probability of V2, as expressed in the following equation:
    e. P(V2 | V1 = x) = P(V2)
    f. A correlation of zero does not imply that there is no dependence between the two variables; it implies there is no linear relationship between them. The value of one variable can still have a nonlinear relationship with the other variable.
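A quick numerical illustration of point f (the simulation setup is mine, not from the notes): Y = X² is completely determined by X, yet the linear covariance is essentially zero when X is symmetric about zero.

```python
import random

random.seed(3)
xs = [random.uniform(-1, 1) for _ in range(100000)]
ys = [x**2 for x in xs]            # perfect nonlinear dependence on X

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

print(round(abs(cov), 2))   # 0.0: no linear relationship, despite total dependence
```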
4. EWMA model:
    a. Conventional wisdom suggests that more recent observations should carry more weight because they more accurately reflect the current market environment. We can calculate a new covariance on day n using an exponentially weighted moving average (EWMA) model. The model is designed to vary the weight given to more recent observations by adjusting λ.
    b. cov_n = λ·cov_{n−1} + (1 − λ)·X_{n−1}·Y_{n−1}, where λ = weight on the most recent covariance estimate, X_{n−1} = percentage change of variable X on day n−1, and Y_{n−1} = percentage change of variable Y on day n−1.
    c. The same recursion can also be used to update variances.
    d. The correlation can then be computed from the EWMA covariance and EWMA variances.
5. GARCH(1,1) model – an alternative method for updating the covariance rate between two variables X and Y.
a. HARC8311 mo0el i-= co vn="+* X n−1 ) n−1+ 5co vn−1
+. * ('eig%t of most recent observation on covariance X n−1 ) n−1
/. 5 ('eig%t to most recent covariance estimate (co vn−1 ) .
0. " ('eig%t is given to long term average covariance rate
e. "2$A is a special case of ARCB)&(&* >i,( > : ' a : 31< λ ¿ an0 5= λ .
f. co vn=6 % 7+* X n−1 ) n−1+ 5co vn−1 ( '%ere 6 is 'eig%t to t%e long#term variance % 7 . >%is
euation reuires t%ree 'eig%ts to sum to &N or ya9-& and long term average covariance ratemust eual '?)&#a#9*
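A minimal sketch of the EWMA update recursion in Python; the percentage-change series and λ = 0.94 are made-up illustration values:

```python
# EWMA covariance update: cov_n = λ·cov_(n−1) + (1 − λ)·X_(n−1)·Y_(n−1).
def ewma_cov(x_changes, y_changes, lam=0.94, cov0=0.0):
    cov = cov0
    for x, y in zip(x_changes, y_changes):
        # Each day, the newest cross-product gets weight (1 − λ);
        # all older information decays geometrically through cov.
        cov = lam * cov + (1 - lam) * x * y
    return cov


# Made-up daily percentage changes for two variables.
x = [0.010, -0.020, 0.015]
y = [0.005, -0.010, 0.020]
print(ewma_cov(x, y))
```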
6. Variance-Covariance Matrix – can be constructed using the calculated estimates of variance and covariance rates for a set of variables. The diagonal of the matrix contains the variance rates (where i = j); the covariance rates are all other elements of the matrix, where i ≠ j.
7. Positive Semidefinite – a matrix is positive semidefinite if it is internally consistent. The following expression defines the necessary condition for an N×N variance-covariance matrix Ω to be internally consistent for all N×1 vectors w, where w^T is the transpose of the vector w:
a. w^T·Ω·w ≥ 0
b. When small changes are made to a small positive-semidefinite matrix, the matrix will most likely remain positive semidefinite. However, changes to a large matrix will most likely cause it to no longer be positive semidefinite.
8. Internally consistent –
a. For a 3×3 matrix: ρ(1,2)² + ρ(1,3)² + ρ(2,3)² − 2·ρ(1,2)·ρ(1,3)·ρ(2,3) ≤ 1
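The 3×3 consistency condition can be checked directly. A minimal sketch (the correlation values are made up):

```python
# Necessary condition for a 3x3 correlation matrix to be internally
# consistent (positive semidefinite):
# r12^2 + r13^2 + r23^2 - 2*r12*r13*r23 <= 1
def consistent_3x3(r12, r13, r23):
    return r12 ** 2 + r13 ** 2 + r23 ** 2 - 2 * r12 * r13 * r23 <= 1


# If X is highly correlated with both Y and Z, then Y and Z cannot be
# nearly uncorrelated with each other.
print(consistent_3x3(0.9, 0.9, 0.1))  # inconsistent combination
print(consistent_3x3(0.9, 0.9, 0.8))  # consistent combination
```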
9. Generating Samples from a Bivariate Normal –
a. Suppose we have a bivariate normal with two variables X and Y.
b. If variable X is known, the value of variable Y is conditional on the value of variable X.
c. The expected value of Y is normally distributed, with mean and standard deviation:
i. E(Y) = μ_Y + ρ_XY·σ_Y·(X − μ_X)/σ_X
ii. σ_Y' = σ_Y·√(1 − ρ_XY²)
d. Steps for generating two sample sets of variables from a bivariate normal distribution:
i. Independent samples z_x and z_y are obtained from a univariate standardized normal distribution (e.g., using inverse-normal functions in programming languages).
ii. Samples ε_x and ε_y are then generated. The first sample, for the X variable, is the same as the random sample from the univariate standard normal distribution: ε_x = z_x.
iii. The conditional sample for the Y variable is determined as follows:
iv. ε_y = ρ_XY·z_x + z_y·√(1 − ρ_XY²), where ρ_XY is the correlation between X and Y in the
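A minimal sketch of the two-step sampling recipe, with a made-up correlation of 0.7; the sample correlation of the generated pairs should land close to it:

```python
import math
import random


def bivariate_sample(rho, rng):
    """One (eps_x, eps_y) pair: eps_x = z_x, eps_y = rho*z_x + z_y*sqrt(1-rho^2)."""
    z_x, z_y = rng.gauss(0, 1), rng.gauss(0, 1)
    return z_x, rho * z_x + z_y * math.sqrt(1 - rho ** 2)


rng = random.Random(7)
rho = 0.7
pairs = [bivariate_sample(rho, rng) for _ in range(50_000)]

# Check: sample correlation of the generated pairs is close to rho.
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
n = len(pairs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
sample_rho = cov / (sx * sy)
print(sample_rho)
```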
10. Factor Models – a factor model can be used to define correlations between normally distributed variables. The following equation is a one-factor model:
a. U_i = a_i·F + √(1 − a_i²)·Z_i. One-factor models are structured as follows:
i. Every U_i has a standard normal distribution (mean = 0, stddev = 1).
ii. The constant a_i is between −1 and 1.
iii. F and Z_i have standard normal distributions and are uncorrelated with each other.
iv. Every Z_i is uncorrelated with each other.
v. All correlations between U_i and U_j result from their dependence on a common factor, F.
b. Advantages of one-factor models:
i. The covariance matrix for one-factor models is positive semidefinite.
ii. The number of correlations between variables is greatly reduced. With N variables, this would require N(N−1)/2 correlation estimates; a one-factor model only requires N estimates, where each of the N variables is correlated with the one factor, F.
iii. The most well-known one-factor model is the CAPM.
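The claim that all correlation flows through F can be checked by simulation. A minimal sketch with made-up loadings a1 = 0.6 and a2 = 0.8, under which corr(U1, U2) should be a1·a2 = 0.48:

```python
import math
import random

# One-factor model: U_i = a_i*F + sqrt(1 - a_i^2)*Z_i.
# F is shared; Z_1, Z_2 are independent idiosyncratic shocks.
random.seed(3)
a1, a2 = 0.6, 0.8
u1, u2 = [], []
for _ in range(50_000):
    f = random.gauss(0, 1)
    u1.append(a1 * f + math.sqrt(1 - a1 ** 2) * random.gauss(0, 1))
    u2.append(a2 * f + math.sqrt(1 - a2 ** 2) * random.gauss(0, 1))


def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


print(corr(u1, u2))  # close to a1*a2 = 0.48
```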
11. Copula – a copula creates a joint probability distribution between two or more variables while maintaining their individual marginal distributions. This is accomplished by mapping the marginal distributions to a new known distribution. The mapping of each variable to the new distribution is done based on percentiles.
a. Using a copula is a way to indirectly define a correlation structure between two variables when it is not possible to define the correlation directly. It is made by assuming the two univariate distributions have a joint bivariate normal distribution.
12. Marginal Distributions – the individual unconditional distribution of a random variable.
13. Key properties of copulas –
a. Preservation of the original marginal distributions while defining a correlation between them.
14. Correlation copula – created by converting two distributions that may be unusual or have unique shapes and mapping them to known distributions with well-defined properties, such as the normal distribution. This is done by mapping on a percentile-to-percentile basis.
a. E.g., the 5th percentile observation for the variable X marginal distribution is mapped to the 5th percentile point on U_x, a standard normal. This is done for every observation; likewise, the Y marginal distribution is mapped to U_y, a standard normal distribution. The correlation between U_x and U_y is referred to as the copula correlation.
b. The conditional mean of U_y is linearly dependent on U_x, and the conditional standard deviation of U_y is constant, because the two distributions are bivariate normal.
15. Types of Copulas –
a. Student's t-Copula – variables are mapped to distributions U_1 and U_2 that have a bivariate Student's t distribution rather than a normal distribution.
i. The following procedure is used to create a Student's t-copula, assuming a bivariate Student's t distribution with f degrees of freedom and correlation ρ:
1. Step 1: Obtain values of χ² by sampling from the inverse chi-square distribution with f degrees of freedom.
2. Step 2: Obtain values by sampling from a bivariate normal distribution with correlation ρ.
3. Step 3: Multiply √(f/χ²) by the normally distributed samples.
b. Gaussian (Normal) Copula – maps the marginal distribution of each variable to the standard normal distribution. The mapping of each variable to the new distribution is done based on percentiles.
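One simple way to do the percentile-to-percentile mapping behind a Gaussian copula is rank-based: each observation is replaced by the standard-normal quantile of its rank. A minimal sketch using the standard library's `NormalDist`; the sample values are made up, and ties are ignored for simplicity:

```python
from statistics import NormalDist


def map_to_standard_normal(sample):
    """Replace each observation by the standard-normal value at its
    empirical percentile, rank/(n+1)."""
    order = {v: r for r, v in enumerate(sorted(sample), start=1)}
    n = len(sample)
    nd = NormalDist()  # standard normal: mean 0, stddev 1
    return [nd.inv_cdf(order[v] / (n + 1)) for v in sample]


x = [2.0, 9.5, 0.3, 4.4, 7.1]  # an arbitrary, oddly shaped marginal
u_x = map_to_standard_normal(x)
print(u_x)
```

The ordering of the original observations is preserved, so the copula correlation can then be measured between the mapped series.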
c. Multivariate copula – used to define a correlation structure for more than two variables. Suppose the marginal distributions are known for N variables: V_1, V_2, V_3, …, V_N. The distribution V_i for each variable is mapped to a standard normal distribution, U_i. Thus the correlation structure for all variables is now based on a multivariate normal distribution.
d. One-factor copula model – often used to define the correlation structure in multivariate copula models. The nature of the dependence between the variables is impacted by the choice of the U_i distribution. The following equation defines a one-factor copula model, where F and Z_i are standard normal distributions:
i. U_i = a_i·F + √(1 − a_i²)·Z_i
ii. The U_i distribution has a multivariate Student's t-distribution if Z_i and F are assumed to have a normal distribution and a Student's t-distribution, respectively. The choice of U_i determines the dependency of the U variables, which also defines the covariance copula for the V variables.
16. Tail dependence –
a. There is greater tail dependence in a bivariate Student's t-distribution than in a bivariate normal distribution.
b. It is more common for two variables to have the same tail values at the same time using the bivariate Student's t-distribution.
c. During a financial crisis or some other extreme market condition, it is common for assets to be highly correlated and exhibit large losses at the same time. This suggests that the Student's t copula is better than a Gaussian copula at describing the correlation structure of assets that historically have extreme outliers in the distribution tails at the same time.
Ch.4 Linear Regression with One Variable (Watson, Introduction to Econometrics)
1. Dependent (explained) variable – the variable attempting to be explained by an independent (X) variable.
2. Parameters of an equation – indicate the relationship (change in the relationship) between two variables. (Linear in an OLS regression.)
3. Scatter plot – a visual representation of the relationship between the dependent variable and a given independent variable. It uses a standard two-dimensional graph where the values of the dependent variable, Y, are on the vertical axis.
a. Can indicate the nature of the relationship between the dependent and independent variable.
b. A closer inspection can indicate whether the relationship is linear or nonlinear.
4. Population regression coefficients, B_i – from the equation describing the relationship between the dependent and independent variables across the entire population. They can be described as the true parameters.
5. Regression coefficients – the parameters of the population regression equation.
a. E(Y_i | X_i) = B_0 + B_1·X_i
6. Error term (noise component) – the difference between each Y_i and its corresponding conditional expectation (the line that fits the data).
a. ε_i = Y_i − E(Y_i | X_i)
b. Deviation from the expected value is the result of factors other than the included X-variable(s).
c. This breaks the equation into a deterministic (systematic) component, E(Y_i | X_i), and a nonsystematic or random component, ε_i.
d. The error term represents effects from independent variables not included in the model.
7. Slope coefficient – the expected change in Y for a unit change in X.
8. Sample regression function – the equation that represents the relationship between the Y and X variable(s) based only on the information in a sample of the population. In almost all cases, the slope and intercept coefficients will differ from those of the population regression function.
9. Residual – the difference between the actual Y and its expected value (the sample regression estimate).
10. Key properties of regression –
a. The term linear has implications for both the independent variable and the coefficients.
b. One interpretation of the term linear relates to the independent variable(s) and specifies that the independent variable enters the equation without a transformation (such as a square root or logarithm), e.g., X = ln(amount consumed).
c. The second interpretation of the term linear applies to the parameters. It specifies that the dependent variable is a linear function of the parameters, but does not require linearity in the variables. Equations such as E(Y_i | X_i) = B_0 + B_1²·X_i or E(Y_i | X_i) = B_0 + (1/B_1)·X_i are not linear in the parameters.
d. It would not be appropriate to apply linear regression to estimate the parameters of these functions. The primary concern for linear models is that they display linearity in the parameters. When we refer to a linear regression model, we generally assume that the equation is linear in the parameters; it may or may not be linear in the variables.
11. Ordinary Least Squares (OLS) – a process that estimates the population parameters B_i with corresponding values b_i that minimize the squared residuals (i.e., error terms). Recall the expression e_i = Y_i − (b_0 + b_1·X_i); the OLS sample coefficients are those that:
a. Minimize Σ e_i² = Σ (Y_i − (b_0 + b_1·X_i))²
b. Slope coefficient b_1:
c. b_1 = Σ_(i=1..n) (X_i − X̄)·(Y_i − Ȳ) / Σ_(i=1..n) (X_i − X̄)²
12. Intercept term – the line's intersection with the Y-axis at X = 0. It can be positive, negative, or zero.
a. b_0 = Ȳ − b_1·X̄, where Ȳ = mean of Y and X̄ = mean of X.
13. Method of OLS – to minimize the sum of squared errors.
a. Most of the major assumptions pertain to the regression model's residual term.
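A minimal sketch of the closed-form OLS estimates, b_1 = Σ(X_i − X̄)(Y_i − Ȳ)/Σ(X_i − X̄)² and b_0 = Ȳ − b_1·X̄; the data points are made up and lie exactly on y = 2x:

```python
def ols_fit(x, y):
    """Closed-form OLS slope and intercept for one regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx  # intercept from the means and the slope
    return b0, b1


b0, b1 = ols_fit([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(b0, b1)  # 0.0 2.0
```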
14. Key assumptions of OLS –
a. Three key assumptions:
i. The expected value of the error term, conditional on the independent variable, is zero: E(ε_i | X_i) = 0.
ii. All (X, Y) observations are independent and identically distributed (i.i.d.).
iii. It is unlikely that large outliers will be observed in the data. Large outliers have the potential to create misleading regression results.
b. A linear relationship exists between the dependent and independent variable.
c. The model is correctly specified in that it includes the appropriate independent variable and does not omit variables.
d. The independent variable is uncorrelated with the error terms.
e. The variance of ε_i is constant for all X_i: Var(ε_i | X_i) = σ².
f. No serial correlation of the error term exists, i.e., Corr(ε_i, ε_j) = 0 for j = 1, 2, 3, …; the point being that knowing the value of the error for one observation does not reveal information concerning the value of the error for another observation.
g. The error term is normally distributed.
15. Benefits of OLS estimators –
a. Interpretation and analysis of regression outputs are easily understood across fields of study.
b. Unbiased, consistent, and under special conditions, efficient.
16. Properties of OLS estimators and their sampling distributions –
17. Unbiased estimator of the population mean – the mean of the sampling distribution used as an estimator of the population mean is said to be an unbiased estimator WHEN the expected value of the estimator is equal to the parameter you are trying to estimate: E(X̄) = μ.
18. Central Limit Theorem – with large sample sizes, it is reasonable to assume that the sampling distribution will approach the normal distribution. This means that the estimator is also a consistent estimator.
19. Consistent estimator – an unbiased estimator is one for which the expected value of the estimator is equal to the parameter you are trying to estimate. A consistent estimator is one for which the accuracy of the parameter estimate increases as the sample size increases.
20. Sum of Squares Error (SSE/SSR) – the sum of squares that results from placing a given intercept and slope coefficient into the equation, computing the residuals, squaring the residuals, and summing them.
a. SSE = Σ (Y_i − Ŷ_i)² = Σ e_i²
21. Total Sum of Squares (SST) – Σ (Y_i − Ȳ)²
22. Relationship between the three sums of squares: Total Sum of Squares = Explained Sum of Squares + Sum of Squared Residuals.
23. Coefficient of Determination, R² – a measure of the goodness of fit of the regression. It is interpreted as the percentage of variation in the dependent variable explained by the independent variable (the % explained by the regression parameters).
a. The underlying concept is that for the dependent variable there is a total sum of squares (SST or TSS) around the sample mean, and the regression explains some portion of that TSS.
b. Total Sum of Squares = Explained Sum of Squares + Sum of Squared Residuals
c. TSS = ESS + SSR
d. R² = 1 − SSE/SST
e. R² = 1 − SSR/SST = ESS/SST
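A minimal sketch of R² = 1 − SSR/SST; the series is made up. A perfect fit gives R² = 1, while predicting the sample mean everywhere gives R² = 0:

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSR/SST for observed y and fitted y_hat."""
    m = sum(y) / len(y)
    sst = sum((yi - m) ** 2 for yi in y)          # total variation around the mean
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    return 1 - ssr / sst


y = [1.0, 2.0, 4.0, 5.0]
print(r_squared(y, y))                     # 1.0: perfect fit
print(r_squared(y, [3.0, 3.0, 3.0, 3.0]))  # 0.0: no better than the mean
```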
24. Correlation coefficient from OLS regression, r = √R² – a standard measure of the strength of the linear relationship between two variables. It is NOT the same as the coefficient of determination.
a. The correlation coefficient indicates the sign of the relationship, whereas the coefficient of determination does not.
b. The coefficient of determination can apply to an equation with several independent variables, and it implies causation or explanatory power,
c. while the correlation coefficient only applies to two variables and does not imply causation between the variables.
25. Standard Error of the Regression (SER) – measures the degree of variability of the actual Y-values relative to the estimated Y-values from a regression equation.
a. SER gauges the "fit" of the regression line.
b. The smaller the standard error, the better the fit.
c. SER is the standard deviation of the error terms in the regression.
d. SER is also referred to as the standard error of the residual, or the standard error of estimate (SEE).
Ch.5 Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (Watson, Introduction to Econometrics)
1. Confidence intervals for regression coefficients –
a. b_1 ± t_c·s_b1, or [b_1 − (t_c·s_b1) < B_1 < b_1 + (t_c·s_b1)]
b. t_c is the critical two-tailed t-value for the selected confidence level with the appropriate number of degrees of freedom, which is equal to the number of sample observations minus 2 (i.e., n − 2).
c. s_b1 is the standard error of the regression coefficient. It is a function of the SER: as the SER rises, s_b1 also increases and the confidence interval widens.
d. This makes sense because the SER measures the variability of the data about the regression line, and the more variable the data, the less confidence there is in the regression model to estimate a coefficient.
2. Hypothesis tests of regression coefficients (t-test) – a t-test may be used to test the hypothesis that the true slope coefficient, B_1, is equal to some hypothesized value. Letting b_1 be the point estimate for B_1, the appropriate test statistic with n − 2 degrees of freedom is:
a. t = (b_1 − B_1) / s_b1
b. Reject H_0 if t > +t_critical or t < −t_critical; rejection of the null means that the slope coefficient is different from the hypothesized value of B_1.
c. The appropriate test of whether the slope is statistically significant is H_0: B_1 = 0 versus H_A: B_1 ≠ 0.
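A minimal sketch of the test statistic, t = (b_1 − B_1)/s_b1; the estimate and standard error are made-up illustration values:

```python
def slope_t_stat(b1, b1_null, se_b1):
    """t statistic for H0: B1 = b1_null; n - 2 df in simple regression."""
    return (b1 - b1_null) / se_b1


# Estimated slope 1.25 with standard error 0.5, testing H0: B1 = 0.
print(slope_t_stat(1.25, 0.0, 0.5))  # 2.5
```

With 2.5 above a typical two-tailed critical value near 2, the null of a zero slope would be rejected at the 5% level for a reasonably large sample.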
3. P-value – the smallest level of significance for which the null hypothesis can be rejected. A small p-value provides support for rejecting the null hypothesis.
a. For two-tailed tests, the p-value is the probability that lies above the positive value of the computed test statistic plus the probability that lies below the negative value of the computed test statistic.
b. The p-value gives a general idea of statistical significance without selecting a significance level.
4. Predicted values – values of the dependent variable based on the estimated regression coefficients and a prediction about the value of the independent variable. They are the values predicted by the regression equation, given an estimate of the independent variable.
a. Ŷ = b_0 + b_1·X_p, where X_p = forecasted value of the independent variable.
5. Confidence intervals for predicted values –
a. Ŷ ± (t_c·s_f), i.e., Ŷ − (t_c·s_f) < Y < Ŷ + (t_c·s_f)
6. Standard error of the forecast, s_f – challenging to calculate. The variance of the forecast is:
a. s_f² = SER²·[1 + 1/n + (X − X̄)² / ((n − 1)·s_x²)]
7. Dummy variable –
a. Often used to quantify the impact of qualitative events.
b. Dummy variables are assigned values of "0" or "1".
c. The reason for including a dummy variable is to see whether the variable has a significant effect on the dependent variable.
d. The estimated regression coefficient for a dummy variable indicates the difference in the dependent variable for the category represented by the dummy variable relative to the average value of the dependent variable for all classes except the dummy variable class.
8. Homoscedasticity – the variance of the residuals is constant across all observations in the sample.
9. Heteroscedasticity – the variance of the residuals is NOT constant across observations in the sample. This happens when there are subsamples that are more spread out than the rest of the sample.
10. Unconditional heteroscedasticity – occurs when the heteroscedasticity is not related to the level of the independent variables. It does not systematically increase or decrease with changes in the value of the independent variables.
a. While this is a violation of equal variance, it usually causes no major problems with the regression.
11. Conditional heteroscedasticity – heteroscedasticity that is related to the level of the independent variable. It exists if the variance of the residual term increases as the value of the independent variable increases.
a. Creates significant problems for statistical inference.
12. Effects of heteroscedasticity on regression analysis –
a. Standard errors are usually unreliable estimates.
b. Coefficient estimates (the b_j) aren't affected.
c. If the standard errors are too small while the coefficient estimates themselves are not affected, the t-statistics will be too large and the null hypothesis of no statistical significance will be rejected too often. The opposite is true when standard errors are too large.
13. Detecting heteroscedasticity –
a. A scatter plot of the residuals versus the independent variables can reveal patterns among observations.
14. Correcting heteroscedasticity – beyond the scope of FRM; some techniques are available, such as robust standard errors.
15. Robust standard errors – used to recalculate the t-statistics using the original regression coefficients.
16. Gauss-Markov Theorem – if the linear regression model assumptions are true and the regression errors display homoscedasticity, then the OLS estimators have the following properties:
a. The OLS estimated coefficients have the minimum variance compared to other methods of estimating the coefficients (i.e., they are the most precise). BEST
b. The OLS estimated coefficients are based on linear functions. LINEAR
c. The OLS estimated coefficients are unbiased, which means that in repeated sampling the averages of the coefficients from the samples will be distributed around the true population parameters, i.e., E(b_0) = B_0 and E(b_1) = B_1. UNBIASED
d. The OLS estimate of the variance of the errors is unbiased, i.e., E(σ̂²) = σ². UNBIASED
e. BLUE – Best Linear Unbiased Estimators.
17. Under heteroscedasticity, OLS can cause problems and there can be better estimators:
a. WLSE – weighted least squares estimator (produces an estimator with a smaller variance).
b. LADE – least absolute deviations estimator (less sensitive to extreme outliers).
18. The t-statistic when the sample size is small –
a. When the sample size is small, the distribution of a t-statistic becomes more complicated to interpret.
b. When the sample is small, we must rely on the assumptions underlying the linear regression model: in order to apply and interpret the t-statistic, the error terms must be homoscedastic (constant variance of the error terms) and the error terms must be normally distributed.
Ch.6 Linear Regression with Multiple Regressors (Watson, Introduction to Econometrics)
1. Omitted variable bias – when relevant variables are absent from a linear regression model, the results will likely lead to incorrect conclusions, as the OLS estimators may not accurately portray the actual data. Bias arises when:
a. The omitted variable is correlated with the movement of the independent variable in the model.
b. The omitted variable is a determinant of the dependent variable.
2. Methods for addressing omitted variable bias – (if an omitted variable is correlated with an included independent variable, then the error term will also be correlated with that independent variable):
a. Multiple regression.
b. If bias is found, it can be addressed by dividing the data into groups and examining one factor at a time while holding the other factors constant. We need to utilize multiple independent coefficients (multiple regression).
3. Multiple regression – linear regression with more than one independent variable.
a. Y_i = B_0 + B_1·X_1i + B_2·X_2i + … + B_k·X_ki + e_i
4. OLS estimators in multiple regression –
a. The intercept term is the value of the dependent variable when the independent variables are all equal to zero.
b. Slope coefficients – the estimated change in the dependent variable for a one-unit change in the independent variable, holding the other independent variables constant.
c. The slope coefficients (parameters) are sometimes called partial slope coefficients.
5. Homoscedasticity – refers to the condition that the variance of the error term is constant for all independent variables, X: Var(ε_i | X_i) = σ².
6. Heteroscedasticity – the dispersion of the error terms varies over the sample (the variance is a function of the independent variables).
7. Standard Error of the Regression (SER) – measures the uncertainty about the accuracy of the predicted values of the dependent variable, Ŷ_i = b_0 + b_i·X_i. Graphically, the fit is stronger when the actual (x, y) data points lie closer to the regression line (i.e., the e_i are smaller).
a. Formally, SER is the standard deviation of the predicted values for the dependent variable; equivalently, it is the standard deviation of the error terms in the regression. SER is sometimes specified as s_e.
i. SER = √( SSR / (n − k − 1) ) = √( Σ_(i=1..n) (Y_i − (b_0 + b_i·X_i))² / (n − k − 1) )
ii. The smaller the standard error, the better the fit.
8. Multiple Coefficient of Determination, R² – can be used to test the overall effectiveness of the entire set of independent variables in explaining the dependent variable.
a. Same calculation as in simple regression.
9. Adjusted R² – used because R² by itself is not a sufficient or reliable measure of the explanatory power of the multiple regression model.
a. R² almost always increases as independent variables are added to the model, even if the marginal contribution of adding a new variable is not statistically significant; this is referred to as overestimating the regression.
b. R²_a = 1 − ((n − 1) / (n − k − 1))·(1 − R²), where n = number of observations and k = number of independent variables.
c. R²_a is less than or equal to R². Adding a new variable may either increase or decrease the adjusted R².
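A minimal sketch of the adjustment, R²_a = 1 − ((n − 1)/(n − k − 1))·(1 − R²); the example values (R² = 0.50, n = 11) are made up:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 penalizing for the number of regressors k."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)


print(adjusted_r2(0.50, 11, 1))
print(adjusted_r2(0.50, 11, 5))  # same R^2, more regressors: bigger penalty
```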
10. Assumptions of multiple regression –
a. A linear relationship exists between the dependent and independent variables.
b. The independent variables are not random, and there is no exact linear relation between two or more independent variables.
c. The expected value of the error term, conditional on the independent variables, is zero: E(e_i | X_1, X_2, …, X_k) = 0.
d. The variance of the error terms is constant for all observations.
e. The error term for one observation is not correlated with that of another observation: E(e_i·e_j) = 0, j ≠ i.
f. The error term is normally distributed.
11. Multicollinearity –
a. The condition when two or more independent variables, or linear combinations of independent variables, are highly correlated with each other.
b. This condition distorts the standard error of the regression and the coefficient standard errors, leading to problems with t-tests for statistical significance.
c. The degree of correlation determines the difference between perfect and imperfect multicollinearity.
12. Perfect multicollinearity – one independent variable is a perfect linear combination of the other independent variables.
13. Imperfect multicollinearity – two or more independent variables are highly correlated, but less than perfectly correlated.
14. Effect of multicollinearity –
a. A greater probability that we will incorrectly conclude that a variable is not statistically significant (Type II error).
15. Detecting multicollinearity –
a. The situation where t-tests indicate that none of the individual coefficients is significantly different from zero, while the R² of the multiple regression model is high.
b. This indicates that the variables together explain much of the variation, but the individual independent variables do not.
c. The only way this happens is when the independent variables are highly correlated with each other.
d. If the absolute value of the sample correlation between two independent variables is greater than 0.7, multicollinearity may be a problem.
e. Even if individual variables are not highly correlated, linear combinations might lead to multicollinearity; so low pairwise correlation does not necessarily mean that multicollinearity is not a problem.
16. Correcting multicollinearity –
a. Usually requires omitting the variable with the highest correlation and lowest individual R².
b. Stepwise regression until multicollinearity is minimized.
Ch.7 Hypothesis Tests and Confidence Intervals in Multiple Regression (Watson, Introduction to Econometrics)
1. Hypothesis testing of regression coefficients (multiple regression) –
a. Needed to test estimated slope coefficients to determine whether the independent variables make a significant contribution to explaining the variation in the dependent variable.
2. Determining statistical significance – the t-statistic used to test the significance of the individual coefficients in a multiple regression is calculated using the same formula:
a. t = (b_j − B_j) / s_bj = (estimated regression coefficient − hypothesized value) / (coefficient standard error of b_j); the t-statistic has n − k − 1 degrees of freedom.
b. Testing statistical significance: H_0: b_j = 0 versus H_A: b_j ≠ 0.
3. Interpreting p-values – as before, p is the smallest level of significance for which the null hypothesis can be rejected.
a. If the p-value is less than the significance level, the null hypothesis can be rejected.
b. If the p-value is greater than the significance level, the null hypothesis cannot be rejected.
4. Other tests of the regression coefficients –
5. Confidence intervals for regression coefficients –
a. b_j ± t_c·s_bj
b. The critical t-value has n − k − 1 degrees of freedom and, e.g., a 5% significance level, where n is the number of observations and k is the number of independent variables.
6. Predicting the dependent variable –
a. We can make predictions about the dependent variable based on forecasted values of the independent variables.
b. But we need predicted values for more than one independent variable.
c. Ŷ = b_0 + b_1·X̂_1i + b_2·X̂_2i + … + b_k·X̂_ki
7. Joint hypothesis testing – tests two or more coefficients at the same time. We could develop a hypothesis for a linear regression model with three independent variables that sets two of these coefficients equal to zero, H_0: b_1 = 0 and b_2 = 0, versus the alternative hypothesis that at least one of them is not equal to zero. If just one of the equalities in the null does not hold, we can reject the entire null hypothesis.
a. Using a joint hypothesis is preferred in certain scenarios, since testing coefficients individually leads to a greater chance of rejecting the null hypothesis.
8. F-statistic –
a. The F-test assesses how well the set of independent variables, as a group, explains the variation in the dependent variable.
b. The F-statistic is used to test whether at least one of the independent variables explains a significant portion of the variation of the dependent variable.
c. E.g., with four independent variables, the hypotheses are structured as:
i. H_0: B_1 = B_2 = B_3 = B_4 = 0 versus H_A: at least one B_j ≠ 0
d. The F-statistic, which is always a one-tailed test, is:
F = (ESS / k) / (SSR / (n − k − 1))
where ESS = explained sum of squares and SSR = sum of squared residuals; ESS and SSR are found in the ANOVA table.
9. One-tailed critical F-value, F_c, at the appropriate level of significance. The degrees of freedom for the numerator and denominator are:
a. df_numerator = k
b. df_denominator = n − k − 1
c. Decision rule: reject H_0 if F (test statistic) > F_c (critical value).
d. Rejection of the null hypothesis at a stated level of significance indicates that at least one of the coefficients is significantly different from zero, which is interpreted to mean that at least one of the independent variables in the regression model makes a significant contribution to the explanation of the dependent variable.
e. When testing the hypothesis that all the regression coefficients are simultaneously equal to zero, the F-test is always a one-tailed test, despite the fact that it looks like it should be a two-tailed test because there is an equal sign in the null hypothesis.
f. It should be noted that rejecting the null hypothesis indicates that one or both of the coefficients are significant.
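A minimal sketch of the F statistic, F = (ESS/k)/(SSR/(n − k − 1)), compared against a one-tailed critical value; the ANOVA numbers are made up for illustration:

```python
def f_stat(ess, ssr, n, k):
    """F statistic: explained variation per regressor over
    unexplained variation per residual degree of freedom."""
    return (ess / k) / (ssr / (n - k - 1))


# Made-up ANOVA figures: ESS = 80, SSR = 20, n = 25 observations, k = 4 regressors.
print(f_stat(80.0, 20.0, 25, 4))  # 20.0
```

A value this far above typical critical F-values would reject the joint null that all four slope coefficients are zero.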
10. Specification bias – refers to how the slope coefficient and other statistics for a given independent variable are usually different in a simple regression when compared to those of the same variable when included in a multiple regression.
a. Indicated by the extent to which the coefficient for each independent variable differs when compared across equations.
11. Decision rule for the F-test –
12. Method of rejection –
13. Coefficient of multiple correlation – simply the square root of R-squared; always positive in multiple regression.
14. ANOVA – a table that reports all the data for a multiple linear regression: F-tests, t-stats, p-values, 95% confidence intervals, parameter coefficients, R², ESS, SSR, TSS.
15. R² and adjusted R² –
16. Single restrictions involving multiple coefficients –
17. Model misspecification –
Ch.5 Modelling and Forecasting Trend (Diebold, Elements of Forecasting)
1. Mean Squared Error – a statistical measure computed as the sum of squared residuals divided by the total number of observations in the sample.
a. MSE = Σ_(t=1..T) e_t² / T, where e_t = y_t − ŷ_t is the residual for observation t (the difference between the observed and expected observation).
b. ŷ_t = b̂_0 + b̂_1·TIME_t, i.e., a regression model with a time trend.
c. MSE is based on in-sample data.
d. We select the regression model with the smallest sum of squared residuals. The residuals are calculated as the difference between the actual value observed and the predicted value based on the regression model.
e. MSE is closely related to R², and thus the equation with the smallest MSE also has the highest R².
2. Model selection – one of the most important steps in forecasting.
a. Unfortunately, selecting a model based on the smallest MSE or highest R² is not effective in producing good out-of-sample models.
b. A better methodology to select the best forecasting model is to find the model with the smallest out-of-sample, one-step-ahead MSE.
3. Reducing MSE bias –
a. Using in-sample MSE to estimate out-of-sample MSE is not very effective, because in-sample MSE cannot increase when more variables are included in the forecasting model. MSE will have a DOWNWARD bias when predicting out-of-sample MSE (it is too small).
b. We must adjust for the MSE bias with the s² measure.
4. s² measure – an unbiased estimate of the MSE because it corrects for degrees of freedom, as follows:
    a. s² = Σ_{t=1..T} e_t² / (T − k)
5. Data mining – as more variables are included in a regression equation, the model is at greater risk of over-fitting the in-sample data.
    a. The problem with data mining is that the regression model does a very good job of explaining the sample data but a poor job of forecasting out-of-sample data.
    b. As more parameters are introduced to a regression model, it will explain the data better but may be worse at forecasting out-of-sample data. Increasing the number of parameters will not necessarily improve the forecasting model.
6. Model Selection Criteria – selection criteria are often compared based on a penalty factor.
    a. The unbiased estimator s² defined earlier can be rewritten by multiplying the numerator and denominator by T.
7. Penalty factor – multiplying s² through by T imposes a penalty for degrees of freedom:
    a. s² = (T / (T − k)) · Σ_{t=1..T} e_t² / T
    b. Takes the form of a penalty factor × MSE.
8. Akaike Information Criterion (AIC): AIC = e^(2k/T) · Σ_{t=1..T} e_t² / T
9. Schwarz Information Criterion (SIC): SIC = T^(k/T) · Σ_{t=1..T} e_t² / T
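The three criteria differ only in how hard they penalize extra parameters. A minimal sketch (the sample size and parameter count below are arbitrary illustrative values) confirms that the SIC penalty exceeds the AIC penalty, which exceeds the s² penalty:

```python
import math

# Each criterion = penalty factor * (sum of squared residuals / T)
def s2_penalty(T, k):
    return T / (T - k)           # s^2 = [T/(T-k)] * SSR/T

def aic_penalty(T, k):
    return math.exp(2 * k / T)   # AIC = e^(2k/T) * SSR/T

def sic_penalty(T, k):
    return T ** (k / T)          # SIC = T^(k/T) * SSR/T

T, k = 100, 5                    # illustrative sample size and parameter count
```

For T = 100 and k = 5 the penalties are roughly 1.05, 1.11, and 1.26, which is why the SIC is described below as imposing the greatest penalty.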
10. Evaluating Consistency –
11. Consistency – a key property used to compare different selection criteria. Two conditions are required for a model selection criterion to be consistent:
    a. When the TRUE model or data-generating process (DGP) is one of the defined regression models, the probability of selecting the true model approaches one as the sample size increases.
    b. When the TRUE model is not one of the defined regression models being considered, the probability of selecting the BEST APPROXIMATION MODEL approaches one as the sample size increases.
12. NOTE: because we live in a very complex world, almost all economic and financial models involve assumptions that simplify this complex environment.
    a. MSE does not penalize for degrees of freedom and is therefore not consistent.
    b. Unbiased MSE (s²) penalizes for degrees of freedom, but the adjustment is too small for consistency.
    c. AIC penalty > s²; however, with large sample sizes, the AIC tends to select models that have too many variables or parameters.
    d. The SIC has the greatest penalty factor and is the most consistent as the number of parameters increases relative to the sample size.
    e. BUT if we think the true model is very complex, then the AIC should be considered/examined.
13. Asymptotic Efficiency –
    a. The property of choosing the regression model whose one-step-ahead forecast error variance is closest to the variance of the true model. Interestingly, the AIC is asymptotically efficient and the SIC is not.
Ch.7 Characterizing Cycles (Diebold, Elements of Forecasting)
1. Time series – a set of observations for a variable over successive periods of time.
2. Trend – a consistent pattern that can be observed from the data when plotted, e.g. a seasonal trend during a certain time period. To FORECAST a time series, one needs to understand and characterize its structure.
3. Autoregression – refers to the process of regressing a variable on lagged or past values of itself.
4. Autoregressive (AR) model – when a dependent variable is regressed on one or more lagged values of itself (e.g. past values of sales used to predict the current value of the variable).
5. Covariance stationary – if the mean, variance, and covariance with lagged and leading values do not change over time. Covariance stationarity is a requirement for using AR models.
6. Autocovariance function – the tool used to quantify the stability of the covariance structure. Its importance lies in its ability to summarize cyclical dynamics in a series that is covariance stationary.
7. Autocorrelation function – refers to the degree of correlation and interdependency between data points in a time series. Used because correlations lend themselves to clearer interpretation than covariances.
8. Partial autocorrelation function – refers to the partial correlation and interdependency between data in a time series; it measures the association between data in a series after controlling for the effects of lagged observations.
9. Requirements for a series to be covariance stationary –
    a. Constant and finite expected value (the expected value of the time series is constant over time).
    b. Constant and finite variance (the time series' volatility around its mean does not change over time).
    c. Constant and finite covariance between values at any given lag (the covariance of the time series with leading or lagged values is constant).
10. Implications of working with models that ARE NOT covariance stationary –
    a. Covariance stationarity is achieved by working with models that give special treatment to the trend and seasonality components, which are not stationary; this allows the remaining residual (cyclical) components to be covariance stationary.
    b. Forecasting models whose "probabilistic nature" changes would not lend themselves well to predicting the future (they lack covariance stationarity).
    c. Such a process would make characterizing a cycle difficult, if not impossible. However, a nonstationary series can be transformed to appear covariance stationary by using transformed data, such as growth rates.
11. White Noise – a process with zero mean, constant variance, and no serial correlation is referred to as white noise. It is the SIMPLEST type of time series process, and it is used as a fundamental building block for more complex time series processes. EVEN THOUGH it is serially uncorrelated, it may not be serially independent and normally distributed.
12. Independent White Noise – a time series process that exhibits serial independence and lack of serial correlation (strong white noise).
13. Variants of White Noise – independent white noise and normal (Gaussian) white noise.
14. Normal (Gaussian) White Noise – a time series process that exhibits serial independence, is serially uncorrelated, and is normally distributed.
15. Dynamic structure of white noise –
    a. The unconditional mean and variance MUST be constant for any covariance stationary process.
    b. The lack of any correlation in white noise means that all autocovariances and autocorrelations are zero beyond displacement zero (displacement refers to the lag, i.e. the distance between observations). The same result holds for the partial autocorrelation function of white noise.
    c. Both conditional and unconditional means and variances are the same for an independent white noise process (it lacks forecastable dynamics).
    d. Events in a white noise process exhibit no correlation between past and present.
16. Lag Operators – quantify how a time series evolves by lagging the data series. A lag operator enables a model to express how past data links to the present and how present data links to the future.
    a. E.g. the lag operator L: L·y_t = y_{t−1}
    b. E.g. a common polynomial in the lag operator, the first-difference operator Δ: Δy_t = (1 − L)·y_t = y_t − y_{t−1}. It applies a polynomial in the lag operator.
17. Distributed lag – key component of an operator. It is a weighted sum of present and past values in a data series, achieved by lagging present values upon past values.
18. Wold's representation theorem – a model for the covariance stationary residual (i.e., a model that is constructed after making provisions for trend and seasonal components).
    a. The theorem enables the selection of the correct model to evaluate the evolution of covariance stationarity.
    b. Wold's theorem utilizes an infinite number of distributed lags, where the one-step-ahead forecast error terms are known as "innovations".
19. General linear process –
    a. A component in the creation of forecasting models for a covariance stationary time series. It uses Wold's Representation Theorem to express innovations that capture an evolving information set. These evolving information sets move the conditional mean over time (recall that a requirement of stationarity is a constant unconditional mean).
    b. Thus it can model the dynamics of a time series process that is outside of covariance stationarity (unstable).
20. Rational polynomials –
    a. Applying Wold's Theorem to infinite distributed lags is not practical; therefore, we restate this lag model as polynomials in the lag operator, because infinite polynomials do not necessarily contain an infinite number of parameters.
    b. Infinite polynomials that are a ratio of finite-order polynomials are known as rational polynomials.
21. Rational distributed lags – the distributed lags constructed from these rational polynomials. With these lags, we can approximate Wold's Representation Theorem. An Autoregressive Moving Average (ARMA) process is a practical approximation of Wold's Representation Theorem.
22. Sample mean and sample autocorrelation –
    a. Sample Mean – an approximation of the mean of the population, which can be used to estimate the autocorrelation function:
        i. ȳ = (1/T) Σ_{t=1..T} y_t
    b. Sample Autocorrelation – estimates the degree to which white noise characterizes a series of data. Recall that for a time series to be classified as a white noise process, all autocorrelations must be zero in the "population" data set. The sample autocorrelation, as a function of the displacement τ, is computed as follows:
        i. ρ̂(τ) = Σ_{t=τ+1..T} [(y_t − ȳ)(y_{t−τ} − ȳ)] / Σ_{t=1..T} (y_t − ȳ)²
    c. Sample partial autocorrelation – can also be used to determine whether a time series exhibits white noise. It differs from the sample autocorrelation in that it performs a linear regression on a finite or feasible data series.
        i. However, the outcome of sample partial autocorrelations is typically identical to that achieved through sample autocorrelation.
        ii. Sample partial autocorrelations usually plot within two-standard-error bands (i.e., a 95% confidence interval) when the time series is white noise.
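The sample autocorrelation formula above translates directly to code; a minimal stdlib-only sketch:

```python
def sample_autocorr(y, tau):
    # rho_hat(tau) = sum_{t=tau+1..T} (y_t - ybar)(y_{t-tau} - ybar)
    #              / sum_{t=1..T} (y_t - ybar)^2
    T = len(y)
    ybar = sum(y) / T
    num = sum((y[t] - ybar) * (y[t - tau] - ybar) for t in range(tau, T))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den
```

At displacement zero the estimate is 1 by construction; for a white noise series, the estimates at every other displacement should fall near zero, within the two-standard-error bands noted above.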
23. Q-statistic – can be used to measure the degree to which autocorrelations vary from zero and whether white noise is present in a dataset.
    a. This is done by evaluating the overall statistical significance of the autocorrelations.
    b. This statistical measure is approximately chi-squared with m degrees of freedom in large samples under the null hypothesis of no autocorrelation.
24. Box-Pierce Q-statistic –
    a. Reflects the absolute magnitudes of the correlations, because it sums the squared autocorrelations.
    b. Thus the signs do not cancel each other out, and large positive or negative autocorrelation coefficients will result in large Q-statistics.
25. Ljung-Box Q-statistic –
    a. Similar to the Box-Pierce Q-statistic EXCEPT that it replaces the sum of squared autocorrelations with a weighted sum of squared autocorrelations. For large sample sizes, the weights for both statistics are roughly equal. It is more often used on small data sets, but gives the same results on large data sets.
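A sketch of both statistics, using the sample autocorrelation estimator defined earlier; the comparison against the chi-squared critical value is left out:

```python
def _autocorr(y, tau):
    # sample autocorrelation at displacement tau
    T = len(y)
    ybar = sum(y) / T
    num = sum((y[t] - ybar) * (y[t - tau] - ybar) for t in range(tau, T))
    return num / sum((v - ybar) ** 2 for v in y)

def box_pierce_q(y, m):
    # Q_BP = T * sum_{tau=1..m} rho_hat(tau)^2
    T = len(y)
    return T * sum(_autocorr(y, tau) ** 2 for tau in range(1, m + 1))

def ljung_box_q(y, m):
    # Q_LB = T(T+2) * sum_{tau=1..m} rho_hat(tau)^2 / (T - tau)
    T = len(y)
    return T * (T + 2) * sum(_autocorr(y, tau) ** 2 / (T - tau)
                             for tau in range(1, m + 1))
```

Both are compared against a chi-squared distribution with m degrees of freedom; the Ljung-Box weights T(T+2)/(T−τ) exceed T, so in small samples the Ljung-Box statistic is slightly larger than the Box-Pierce statistic.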
Ch.8 Modelling Cycles – MA, AR, and ARMA Models (Diebold, Elements of Forecasting)
1. Moving Average Process –
    a. Used to capture the relationship between a time series variable and its current and lagged random shocks.
    b. A linear regression of the current value of a time series against both the current and previous unobserved white noise error terms, which are random shocks.
2. MA(1) Process – a first-order moving average process has a mean of zero and a constant variance and can be defined as:
    a. y_t = ε_t + θ ε_{t−1}
    b. ε_t – current random white noise shock
    c. ε_{t−1} – one-period lagged random white noise shock (unobservable shock)
    d. θ – coefficient on the lagged random shock
    e. An MA(1) process is considered first order because it has only one lagged error term.
    f. This yields very short-term memory, because it only incorporates what happened one period ago. If we ignore the lagged error term for a moment and assume ε_t > 0, then y_t > 0. This is equivalent to saying that a positive error term will yield a positive dependent variable (y_t).
    g. When adding back the lagged error term, we are now saying that the dependent variable is impacted not only by the current error term but also by the previous period's unobserved error term, which is amplified by a coefficient (θ).
3. Autocorrelation cutoff –
    a. ρ₁ = θ / (1 + θ²), where ρ_τ = 0 for τ > 1.
    b. For any lag beyond the first lagged error term, the autocorrelation will be zero in an MA(1) process. This is important because it is one condition of being covariance stationary (mean = 0 and constant variance),
which is a condition of this process being a useful estimator.
4. Moving average representation – has both a current random shock and an unobservable lagged shock on the independent side of the equation.
    a. This presents a problem for forecasting in the real world because it does not incorporate observable shocks. The solution to this problem is known as the autoregressive representation.
5. Autoregressive representation – the MA(1) process is inverted so that we have a lagged shock and a lagged value of the time series itself. The condition for inverting the MA(1) process is |θ| < 1. The autoregressive representation, which is an algebraic rearrangement of the MA(1) process, is expressed as:
    a. ε_t = y_t − θ ε_{t−1}
    b. The process of inversion enables the forecaster to express current observables in terms of past observables.
6. MA(q) process – forecasters can broaden their horizon to a finite-order MA process of order q, which essentially adds lag operators out to the q-th observation and potentially improves on the MA(1) process.
    a. y_t = ε_t + θ₁ε_{t−1} + … + θ_q ε_{t−q}
    b. ε_{t−q} – q-th-period lagged random white noise shock
    c. θ₁ … θ_q – coefficients on the lagged random shocks
    d. An MA(q) process theoretically captures complex patterns in greater detail, which can potentially provide for more robust forecasting. It also lengthens the memory from one period to q periods.
    e. An MA(q) process also exhibits autocorrelation cutoff, after the q-th lagged error term. Again, this is important because covariance stationarity is essential to the predictive ability of the model.
7. First-order autoregressive (AR(1)) process –
    a. More capable of capturing a robust relationship, and not in need of being inverted because it is already in a favorable arrangement compared with the uninverted moving average process.
    b. Must have mean = 0 and constant variance.
    c. Specified in the form of a variable regressed against itself in lagged form. This relationship is shown below:
        i. y_t = φ y_{t−1} + ε_t
        ii. φ – coefficient on the lagged observation of the variable being estimated
        iii. ε_t – current random white noise shock
    d. Predictive ability depends on the process being covariance stationary, i.e. |φ| < 1.
    e. The process allows us to use a past observed variable to predict a current observed variable.
8. What forecasters need –
    a. To accurately estimate the autoregressive parameters, we need to accurately estimate the autocovariance of the data series. Enter the Yule-Walker equation.
9. Yule-Walker Equation –
    a. The Yule-Walker concept is used to solve for the autocorrelations of an AR(1) process.
    b. ρ_τ = φ^τ for τ = 0, 1, 2, …
    c. Used to reinforce a very important distinction between autoregressive processes and moving average processes.
    d. Recall that moving average processes exhibit autocorrelation cutoff, which means autocorrelations are essentially zero beyond the order of the process.
    e. The significance of the Yule-Walker equation is that for autoregressive processes, the autocorrelation decays very gradually.
        i. E.g. φ = 0.65: 1st lag = 0.65; 2nd lag = 0.65² = 0.4225; and so on.
    f. If the coefficient were negative, we would observe an oscillating autocorrelation decay.
10. AR(p) Process (General p-th Order Autoregressive Process) –
    a. Expands the AR(1) process out to the p-th observation, as seen below:
        i. y_t = φ₁y_{t−1} + φ₂y_{t−2} + … + φ_p y_{t−p} + ε_t
    b. The AR(p) process is also covariance stationary if |φ| < 1, and it exhibits the same gradual decay in autocorrelations that was found in the AR(1) process.
    c. While an AR(1) process only evidences oscillation in its autocorrelations (switching from positive to negative) when its coefficient is negative, an AR(p) process can display richer autocorrelation patterns.
11. Autoregressive Moving Average (ARMA) process –
    a. It is possible for a time series to show signs of both MA and AR processes, and so theoretically capture a richer relationship.
    b. A combination of unobservable shocks and lagged (past-behavior) values.
    c. y_t = φ y_{t−1} + ε_t + θ ε_{t−1}
    d. Must be covariance stationary, i.e. |φ| < 1.
    e. Autocorrelations still decay gradually.
    f. Provides the richest possible set of combinations for time series forecasting of the three models discussed in this topic.
12. Applications of AR and ARMA processes –
    a. Must pay attention to the decay of the autocorrelation function.
    b. Determine if there are any spikes in the autocorrelation function, which may indicate using an AR or ARMA model.
    c. E.g. if every 12th autocorrelation jumps up, there may be a seasonality effect.
    d. Useful to test various models using regression results (it is easiest to see differences using data that follows some pattern of seasonality).
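The contrast between gradual AR decay and MA cutoff can be checked numerically with the theoretical autocorrelations from this reading (the Yule-Walker result for AR(1), the cutoff formula for MA(1)):

```python
def ar1_autocorr(phi, tau):
    # Yule-Walker result for AR(1): rho(tau) = phi^tau (gradual decay)
    return phi ** tau

def ma1_autocorr(theta, tau):
    # MA(1): rho(1) = theta / (1 + theta^2); cutoff to zero beyond lag 1
    if tau == 0:
        return 1.0
    if tau == 1:
        return theta / (1 + theta ** 2)
    return 0.0
```

With φ = 0.65 the AR(1) autocorrelations are 0.65, 0.4225, … at every lag, while an MA(1) series shows a single spike at lag 1 and zero afterwards; a negative φ produces the oscillating decay noted above.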
Ch.23 Estimating Volatilities and Correlations for Risk Management (Hull, Options, Futures, and Other Derivatives, 9th ed.)
1. Weighting Schemes to estimate volatility –
    a. Equally weighted (average)
    b. Autoregressive conditional heteroskedasticity model (ARCH)
    c. EWMA model
    d. Generalized ARCH(1,1) – GARCH(1,1)
    e. Maximum likelihood estimator
2. Equally Weighted (Average) –
    a. Traditional models first use the change in asset value from period to period. The continuously compounded return over successive days is represented as:
        i. u_i = ln(S_i / S_{i−1}), where S_i = asset price at time i.
        ii. ū = (1/m) Σ_{i=1..m} u_{n−i}, where n = the present period and m = the number of observations leading up to the current period.
        iii. The maximum likelihood estimator of variance, assuming ū = 0, is:
            σ² = (1/m) Σ_{i=1..m} u_{n−i}²
        iv. In simplest terms, historical data is used to generate returns in an asset pricing series; the historical returns are used to generate the volatility parameter.
        v. Can be used to infer expected realizations of risk.
        vi. Each period is equally weighted by 1/m: all periods have the same weight. If we want more weight on recent periods, we can use:
on earlier periods( 'e canTb. eneral 2eig%ting =c%eme –
i. σ 2=∑
i=1
m
* i μn−1
2"here * i="ei&ht on return i da!s a&o. Fei&hts must ∑
1.
¿
ii. condition∑i=1
m
* i=1
iii. Ob;ective is to generate greater in:uence on recent observations( t%en all alp%as 'ill decline invalue for later observations.
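A sketch of the return calculation and the weighting-scheme estimator; the prices are made up for illustration, and the equally weighted case just sets every α to 1/m:

```python
import math

def log_returns(prices):
    # u_i = ln(S_i / S_{i-1})
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def weighted_variance(returns, weights):
    # sigma_n^2 = sum_i alpha_i * u_{n-i}^2, with weights[0] on the most recent return
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to one"
    return sum(a * u * u for a, u in zip(weights, reversed(returns)))

prices = [100.0, 101.0, 99.5, 100.5]          # illustrative closing prices
u = log_returns(prices)
m = len(u)
equal = weighted_variance(u, [1.0 / m] * m)   # the 1/m equally weighted case
```

Passing declining weights such as [0.5, 0.3, 0.2] instead of [1/m]*m gives the "greater influence on recent observations" variant described in (iii).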
3. Autoregressive conditional heteroskedasticity model (ARCH(m)) –
    a. One extension is to incorporate a long-run variance through ARCH:
    b. σ_n² = γ·V_L + Σ_{i=1..m} α_i u_{n−i}², with γ + Σ α_i = 1
    c. V_L = long-run variance (unweighted)
    d. ω = γ·V_L = long-run variance weighted by the parameter γ
    e. σ_n² = ω + Σ_{i=1..m} α_i u_{n−i}²
4. EWMA –
    a. The model is a specific case of the general weighting model presented in the previous section.
    b. The main difference is that the weights are assumed to decline exponentially back through time.
    c. σ_n² = λ σ_{n−1}² + (1 − λ) u_{n−1}², where λ = the weight on the previous volatility estimate and u_{n−1} = the return (change) in the previous period.
    d. A benefit of EWMA is that it requires few data points.
    e. The current estimate of variance feeds into the next period's estimate, as does this period's squared return.
    f. Technically, the only "new" piece of information for the volatility calculation is the latest squared return.
5. Weights of the EWMA model – determined by λ.
6. GARCH(1,1) model – one of the most popular methods of estimating volatility. It not only incorporates the most recent estimates of variance and squared return, but also a variable that accounts for a long-run average level of variance.
7. GARCH(p,q) – p is the number of lagged terms on historical squared returns; q is the number of lagged terms on historical volatility.
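The EWMA recursion above is a one-liner; λ = 0.94 below is the RiskMetrics daily value, used here only as an illustrative default:

```python
def ewma_update(prev_var, prev_return, lam=0.94):
    # sigma_n^2 = lambda * sigma_{n-1}^2 + (1 - lambda) * u_{n-1}^2
    return lam * prev_var + (1 - lam) * prev_return ** 2
```

Iterating this over a return series needs only the previous variance estimate and the latest squared return, which is why EWMA requires so few data points.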
8. Weights of the GARCH(1,1) model –
    a. σ_n² = ω + α u_{n−1}² + β σ_{n−1}²
        i. α = weighting on the previous period's return
        ii. β = weighting on the previous volatility estimate
        iii. ω = weighted long-run variance = γ·V_L
        iv. V_L = long-run average variance = ω / (1 − α − β)
        v. α + β + γ = 1
        vi. α + β < 1 for stability, so that γ is not negative.
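A sketch of the GARCH(1,1) update and the long-run variance it implies; the parameter values in the usage below are illustrative, not calibrated:

```python
def garch_update(prev_var, prev_return, omega, alpha, beta):
    # sigma_n^2 = omega + alpha * u_{n-1}^2 + beta * sigma_{n-1}^2
    return omega + alpha * prev_return ** 2 + beta * prev_var

def long_run_variance(omega, alpha, beta):
    # V_L = omega / (1 - alpha - beta); requires alpha + beta < 1 (gamma > 0)
    assert alpha + beta < 1.0, "unstable parameters: alpha + beta must be < 1"
    return omega / (1.0 - alpha - beta)
```

V_L is a fixed point of the update: starting the recursion at the long-run variance with a shock of matching size leaves the estimate unchanged. Setting ω = 0, α = 1 − λ, β = λ collapses the update to the EWMA recursion, the special case noted below.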
9. Setting GARCH parameters to obtain an EWMA model –
    a. ω = 0, α = 1 − λ, β = λ.
    b. The GARCH model adds to the information generated by the EWMA model in that it also assigns a weight to the average long-run variance estimate.
    c. GARCH also has an additional characteristic: the implicit assumption that variance tends to revert to a long-term average level.
    d. Recognition of a mean-reverting characteristic in volatility is an important feature when pricing derivative securities such as options.
10. Mean Reversion –
    a. Experience indicates that volatility exhibits a mean-reverting characteristic. A GARCH model tends to have a better theoretical justification than the EWMA model.
    b. Methods for estimating GARCH parameters (or weights), however, often generate outcomes that are not consistent with the model's assumptions.
    c. The sum of the weights α and β is sometimes greater than 1, which causes instability in the volatility estimation.
    d. In that case, the analyst must resort to using the EWMA model.
    e. If the model is to be stationary over time, the persistence (the sum α + β) must be less than one (with respect to the reversion to the mean).
    f. The PERSISTENCE describes the rate at which volatility reverts to its long-term value following a large movement.
    g. The HIGHER the persistence (given that it is less than one), the longer it takes to revert to the mean following a shock or large movement.
    h. A persistence of 1 means that there is no reversion; with each change in volatility, a new level is attained.
11. Volatility using the Maximum Likelihood Estimator –
    a. MLE selects the values of the model parameters that maximize the likelihood that the observed data would occur in the sample.
    b. Requires formulating an expression or function for the underlying probability distribution of the data and then searching for the parameters that maximize the value generated by the expression.
12. GARCH Estimation –
    a. Estimated using maximum likelihood techniques.
    b. Estimation begins with a guess of the model's parameters.
    c. Then a calculation of the likelihood function based on those parameter estimates is made.
    d. The parameters are then slightly adjusted until the likelihood function fails to increase, at which point the estimation process assumes it has maximized the function and stops.
    e. The values of the parameters at the point of maximum value in the likelihood function are then used to estimate GARCH model volatility.
13. Volatility term structure –
    a. GARCH does a good job of modeling volatility clustering, where periods of high volatility tend to be followed by other periods of high volatility and periods of low volatility tend to be followed by subsequent periods of low volatility.
    b. Thus, there is autocorrelation in u_i².
    c. If the GARCH model does a good job of explaining volatility changes, there should be very little autocorrelation in u_i²/σ_i².
    d. GARCH does a good job of forecasting volatility from a term structure perspective (time to maturity). Even though the actual volatility term structure figures are somewhat different from those forecasted by GARCH models, GARCH-generated volatility data does an excellent job of predicting how the volatility term structure responds to changes in volatility.
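The estimation loop in item 12 needs a likelihood function to maximize. A minimal sketch of the Gaussian (negative) log-likelihood for a GARCH(1,1) model: maximizing the likelihood is equivalent to minimizing this value, and the optimizer that perturbs the parameters is omitted:

```python
import math

def garch_neg_log_likelihood(omega, alpha, beta, returns, init_var):
    # Gaussian likelihood: each observation contributes
    #   -0.5 * (ln(sigma_t^2) + u_t^2 / sigma_t^2)   (constants dropped);
    # we return the negated sum, so smaller is better.
    var = init_var
    nll = 0.0
    for u in returns:
        nll += 0.5 * (math.log(var) + u * u / var)
        var = omega + alpha * u * u + beta * var   # GARCH(1,1) variance update
    return nll
```

The estimation procedure described above repeatedly adjusts (ω, α, β) and keeps any change that lowers this value, stopping when no adjustment improves it.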
14. Impact of volatility changes –
15. Estimating covariances and correlations –
16. Consistency condition for covariances –
Ch.4 Simulation Modelling (Fabozzi, Simulation and Optimization in Finance)
1. Simulation models – generate random inputs that are assumed to follow a probability distribution.
    a. With these inputs, the simulation model then generates scenarios or trials based on the probabilities associated with the probability distributions.
    b. The last step is to analyze the characteristics of the probability-generated inputs (samples): mean, variance, skewness, confidence intervals.
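A minimal sketch of these steps (draw random inputs from an assumed distribution, generate trials, summarize the sample); the N(5%, 20%) return distribution and the trial count are illustrative assumptions:

```python
import random
import statistics

random.seed(7)                      # fixed seed for reproducibility
N = 10_000

# Steps 1-2: random inputs drawn from the assumed distribution form the trials
trials = [random.gauss(0.05, 0.20) for _ in range(N)]

# Step 3: characteristics of the generated sample
mean = statistics.fmean(trials)
stdev = statistics.stdev(trials)
half = 1.96 * stdev / N ** 0.5      # normal-approximation 95% confidence interval
ci = (mean - half, mean + half)
```

Higher moments such as skewness can be summarized from the same sample; the confidence interval narrows as the number of scenarios grows, which is the accuracy relationship listed below.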
2. Choosing Simulation Models –
    a. Bootstrapping Technique
    b. Parameter Estimate Technique
    c. Best-Fit Technique
    d. Subjective Guess Technique
3. Bootstrapping Technique –
4. Parameter Estimate Technique –
5. Best-Fit Technique –
6. Subjective Guess Technique –
7. Monte Carlo Simulation –
8. Advantages of Simulation Modeling –
9. Correlations can be incorporated into simulation modeling –
10. Relationship between Accuracy and Number of Scenarios –
11. Estimator Bias –
12. Discretization Error Bias –
13. Identifying the Most Efficient Estimator –
14. Inverse Transform Method –
15. Inverse Transform Method for Discrete Distributions –
16. Inverse Transform Method for Continuous Distributions –
17. Hazard Rate –
18. Pseudorandom Number Generators –
19. Seed random numbers –
20. Midsquare technique –
21. Congruential pseudorandom number generator –
22. Linear congruential pseudorandom generator –
23. Quasirandom Sequences / Low-discrepancy Sequences –
24. Stratified Sampling –
25. Latin Hypercube Sampling Method –
26. Generalized permutation matrix –