NYU WIRELESS
Unit 3: Multiple Linear Regression
EL-GY 6143/CS-GY 6923: Introduction to Machine Learning
Prof. Pei Liu


Page 1: Unit 3 Multiple Linear Regression


Page 2: Unit 3 Multiple Linear Regression

Learning Objectives
- Formulate a machine learning problem as a multiple linear regression model.
◦ Identify the predictor vector and target for the problem.
- Write the regression model in matrix form. Write the feature matrix.
- Compute the least-squares solution for the regression coefficients on training data.
- Derive the least-squares formula from minimization of the RSS.
- Manipulate 2D arrays in Python (indexing, stacking, computing shapes, ...).
- Compute the LS solution using Python linear algebra and machine learning packages.

Page 3: Unit 3 Multiple Linear Regression

Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: Simple linear regression
- Extensions

Page 4: Unit 3 Multiple Linear Regression

Example: Blood Glucose Level
- Diabetes patients must monitor their glucose level.
- What causes blood glucose levels to rise and fall?
- Many factors.
- We know the mechanisms qualitatively.
- But quantitative models are difficult to obtain:
◦ Hard to derive from first principles
◦ Difficult to model physiological processes precisely
- Can machine learning help?

Page 5: Unit 3 Multiple Linear Regression

Data from AIM 94 Experiment
- Data collected as a series of events:
◦ Eating
◦ Exercise
◦ Insulin dosage
- Target variable: glucose level, monitored over time.

Page 6: Unit 3 Multiple Linear Regression

Demo on GitHub
- All code is available on GitHub:
https://github.com/pliugithub/MachineLearning/blob/master/unit03_mult_lin_reg/demo_glucose.ipynb

Page 7: Unit 3 Multiple Linear Regression

Using Google Colaboratory
- Two options for running the demos:
- Option 1:
◦ Clone the GitHub repository to your local machine and run it using Jupyter notebook
◦ Need to install all the software correctly
- Option 2: Run in the cloud on Google Colaboratory
- For Option 2:
◦ Go to https://colab.research.google.com/
◦ File -> Open Notebook
◦ Select the GitHub tab
◦ Enter the GitHub URL: https://github.com/sdrangan/introml
◦ Select unit03_mult_lin_reg/demo1_glucose.ipynb

Page 8: Unit 3 Multiple Linear Regression

Demo on Google Colab

Page 9: Unit 3 Multiple Linear Regression

Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: Simple linear regression
- Extensions

Page 10: Unit 3 Multiple Linear Regression

Simple vs. Multiple Regression
- Simple linear regression: one predictor (feature)
◦ Scalar predictor x
◦ Linear model: ŷ = β_0 + β_1 x
◦ Can only account for one variable
- Multiple linear regression: multiple predictors (features)
◦ Vector predictor x = (x_1, ..., x_k)
◦ Linear model: ŷ = β_0 + β_1 x_1 + ... + β_k x_k
◦ Can account for multiple predictors
◦ Reduces to simple linear regression when k = 1

Page 11: Unit 3 Multiple Linear Regression

Comparison to Single-Variable Models
- We could compute a model for each variable separately:
y = a_1 + b_1 x_1
y = a_2 + b_2 x_2
⋮
- But this doesn't provide a way to account for joint effects.
- Example: Consider three linear models for predicting longevity:
◦ A: Longevity vs. some factor in diet (e.g., amount of fiber consumed)
◦ B: Longevity vs. exercise
◦ C: Longevity vs. diet AND exercise
◦ What does C tell you that A and B do not?

Page 12: Unit 3 Multiple Linear Regression

Special Case: Single Variable
- Suppose k = 1 predictor.
- Feature matrix and coefficient vector:
A =
[ 1  x_1 ]
[ ⋮   ⋮  ]
[ 1  x_n ]
, β = (β_0, β_1)
- LS solution: β = ((1/n) A^T A)^{-1} (1/n) A^T y = P^{-1} r, with
P = [ 1   x̄        ]      r = ( ȳ         )
    [ x̄  mean(x²) ],         ( mean(xy)  )
- Obtain the single-variable solutions for the coefficients (after some algebra):
β_1 = s_xy / s_x²,  β_0 = ȳ − β_1 x̄,  R² = s_xy² / (s_x² s_y²)
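The single-variable formulas above can be checked in a few lines of numpy. This is a sketch on made-up data (the arrays x and y are hypothetical, for illustration only):

```python
import numpy as np

# Toy data, roughly linear in x (hypothetical values)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Sample statistics
xbar, ybar = x.mean(), y.mean()
sxy = np.mean((x - xbar) * (y - ybar))   # sample cross-covariance s_xy
sx2 = np.mean((x - xbar) ** 2)           # sample variance s_x^2
sy2 = np.mean((y - ybar) ** 2)           # sample variance s_y^2

# Single-variable least-squares coefficients from the slide
beta1 = sxy / sx2
beta0 = ybar - beta1 * xbar
R2 = sxy ** 2 / (sx2 * sy2)
```

The same coefficients come out of any standard line-fitting routine, e.g. np.polyfit(x, y, 1).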

Page 13: Unit 3 Multiple Linear Regression

Loading the Data
- scikit-learn package:
◦ Many methods for machine learning
◦ Datasets
◦ Will use throughout this class
- The diabetes dataset is one example.
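As a sketch, the diabetes dataset can be loaded directly from scikit-learn:

```python
from sklearn.datasets import load_diabetes

# X: n x k feature matrix, y: target vector (the glucose-related measure)
X, y = load_diabetes(return_X_y=True)

print(X.shape, y.shape)   # (442, 10) (442,)
```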

Page 14: Unit 3 Multiple Linear Regression

Simple Linear Regression for Diabetes Data
- Try a fit of each variable individually.
- Compute the R² coefficient for each variable.
- Use the formula on the previous slide.
- The "best" individual variable is a poor fit:
◦ R² ≈ 0.34
(Figure: best individual variable)

Page 15: Unit 3 Multiple Linear Regression

Scatter Plot
- No one variable explains glucose well.
- Multiple linear regression could be much better.

Page 16: Unit 3 Multiple Linear Regression

Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: Simple linear regression
- Extensions

Page 17: Unit 3 Multiple Linear Regression

Finding a Mathematical Model
- Goal: Find a function to predict glucose level from the 10 attributes.
- Problem: Several attributes
◦ Need a multi-variable function
Attributes: x_1 = Age, x_2 = Sex, x_3 = BMI, x_4 = BP, x_5 = S1, ..., x_10 = S6
Target: y = glucose level
y ≈ ŷ = f(x_1, ..., x_10)

Page 18: Unit 3 Multiple Linear Regression

Matrix Representation of Data
- Data is a matrix and a target vector.
- n samples:
◦ One sample per row
- k features / attributes / predictors:
◦ One feature per column
- This example:
◦ y_i = blood glucose measurement of the i-th sample
◦ x_ij = j-th feature of the i-th sample
◦ x_i = [x_i1, x_i2, ..., x_ik]: feature or predictor vector
◦ The i-th sample is (x_i, y_i)

X =
[ x_11 ⋯ x_1k ]
[  ⋮   ⋱   ⋮  ]
[ x_n1 ⋯ x_nk ]
(rows: samples; columns: attributes)

y = (y_1, ..., y_n): target vector

Page 19: Unit 3 Multiple Linear Regression

In-Class Exercise

Page 20: Unit 3 Multiple Linear Regression

Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: Simple linear regression
- Extensions

Page 21: Unit 3 Multiple Linear Regression

Multivariable Linear Model for Glucose
- Goal: Find a function to predict glucose level from the 10 attributes.
- Linear Model: Assume glucose is a linear function of the predictors:
glucose ≈ prediction = β_0 + β_1 (Age) + ... + β_4 (BP) + β_5 (S1) + ... + β_10 (S6)
- General form:
y ≈ ŷ = β_0 + β_1 x_1 + ... + β_4 x_4 + β_5 x_5 + ... + β_10 x_10
◦ 10 features: x = [x_1, ..., x_10] = (Age, Sex, BMI, BP, S1, ..., S6)
◦ Target: y = glucose level
◦ β_0 is the intercept

Page 22: Unit 3 Multiple Linear Regression

Multiple Variable Linear Model
- Vector of features: x = [x_1, ..., x_k]
◦ k features (also known as predictors, independent variables, attributes, covariates, ...)
- Single target variable y
◦ What we want to predict
- Linear model: Make a prediction ŷ:
ŷ = β_0 + β_1 x_1 + ... + β_k x_k
- Data for training:
◦ Samples are (x_i, y_i), i = 1, 2, ..., n
◦ Each sample has a vector of features x_i = [x_i1, ..., x_ik] and a scalar target y_i
- Problem: Learn the best coefficients β = [β_0, β_1, ..., β_k] from the training data

Page 23: Unit 3 Multiple Linear Regression

Example: Heart Rate Increase
- Linear Model: HR increase ≈ β_0 + β_1 (mins exercise) + β_2 (exercise intensity)
- Data: measuring fitness of athletes

Subject number | HR before | HR after | Mins on treadmill | Speed (min/km) | Days exercise/week
123            | 60        | 90       | 1                 | 5.2            | 3
456            | 80        | 110      | 2                 | 4.1            | 1
789            | 70        | 130      | 5                 | 3.5            | 2
⋮              | ⋮         | ⋮        | ⋮                 | ⋮              | ⋮

https://www.mercurynews.com/2017/10/29/4851089/

Page 24: Unit 3 Multiple Linear Regression

Why Use a Linear Model?
- Many natural phenomena have a linear relationship.
- Predictor has small variation:
◦ Suppose y = f(x)
◦ If the variation of x is small around some value x_0, then (first-order Taylor expansion)
y ≈ f(x_0) + f'(x_0)(x − x_0) = β_0 + β_1 x,
β_0 = f(x_0) − f'(x_0) x_0,  β_1 = f'(x_0)
- Simple to compute.
- Easy to interpret the relation:
◦ Coefficient β_j indicates the importance of feature j for the target.
- Advanced: Gaussian random variables:
◦ If two variables are jointly Gaussian, the optimal predictor of one from the other is a linear predictor.

Page 25: Unit 3 Multiple Linear Regression

Matrix Review
- Consider
A =
[ 1 2 ]
[ 3 4 ]
[ 5 6 ]
, B =
[ 2 0 ]
[ 3 2 ]
, x = (2, 3)
- Compute (computations on the board):
◦ Matrix-vector multiply: A x
◦ Transpose: A^T
◦ Matrix multiply: A B
◦ Solution to linear equations: solve x = B u for u
◦ Matrix inverse: B^{-1}
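The same computations can be checked in numpy (a sketch mirroring the matrices on the slide):

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[2, 0], [3, 2]])
x = np.array([2, 3])

Ax = A @ x                  # matrix-vector multiply -> array([ 8, 18, 28])
At = A.T                    # transpose, shape (2, 3)
AB = A @ B                  # matrix-matrix multiply, shape (3, 2)
u = np.linalg.solve(B, x)   # solve B u = x  -> array([1., 0.])
Binv = np.linalg.inv(B)     # matrix inverse of B
```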

Page 26: Unit 3 Multiple Linear Regression

Slopes, Intercepts and Inner Products
- Model with coefficients β: ŷ = β_0 + β_1 x_1 + ... + β_k x_k
- Sometimes use the weight-bias version:
ŷ = b + w_1 x_1 + ... + w_k x_k
◦ b = β_0: bias or intercept
◦ w = β_{1:k} = [β_1, ..., β_k]: weights or slope vector
- Can write either with an inner product:
ŷ = β_0 + β_{1:k} · x  or  ŷ = b + w · x
- Inner product:
◦ w · x = Σ_{j=1}^k w_j x_j
◦ Will use the alternate notation: w^T x = ⟨w, x⟩

Page 27: Unit 3 Multiple Linear Regression

Matrix Form of Linear Regression
- Data: (x_i, y_i), i = 1, ..., n
- Predicted value for the i-th sample: ŷ_i = β_0 + β_1 x_i1 + ... + β_k x_ik
- Matrix form:
[ ŷ_1 ]   [ 1 x_11 ⋯ x_1k ] [ β_0 ]
[ ŷ_2 ] = [ 1 x_21 ⋯ x_2k ] [ β_1 ]
[  ⋮  ]   [ ⋮   ⋮  ⋯   ⋮  ] [  ⋮  ]
[ ŷ_n ]   [ 1 x_n1 ⋯ x_nk ] [ β_k ]
- Matrix equation: ŷ = A β
◦ β: coefficient vector with p = k + 1 entries
◦ A: n × (k+1) feature matrix
◦ ŷ: vector of n predicted values
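The matrix form above can be sketched in numpy. The data and coefficients here are hypothetical; the point is prepending the column of ones to form A:

```python
import numpy as np

# Hypothetical data: n = 4 samples, k = 2 features
X = np.array([[0.0, 1.0],
              [1.0, 0.5],
              [2.0, 2.0],
              [3.0, 1.5]])
beta = np.array([1.0, 2.0, -1.0])     # [beta_0, beta_1, beta_2]

# Feature matrix A: a column of ones (for the intercept) next to X
n = X.shape[0]
A = np.column_stack([np.ones(n), X])  # shape (n, k+1)

yhat = A @ beta                       # predicted values, shape (n,)
```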

Page 28: Unit 3 Multiple Linear Regression

In-Class Exercise

Page 29: Unit 3 Multiple Linear Regression

Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: Simple linear regression
- Extensions

Page 30: Unit 3 Multiple Linear Regression

Least Squares Model Fitting
- How do we select the parameters β = (β_0, ..., β_k)?
- Define ŷ_i = β_0 + β_1 x_i1 + ... + β_k x_ik
◦ Predicted value on sample i for parameters β = (β_0, ..., β_k)
- Define the residual sum of squares:
RSS(β) := Σ_{i=1}^n (y_i − ŷ_i)²
◦ Note that ŷ_i is implicitly a function of β = (β_0, ..., β_k)
◦ Also called the sum of squared residuals (SSR) and sum of squared errors (SSE)
- Least squares solution: Find β to minimize RSS.

Page 31: Unit 3 Multiple Linear Regression

Variants of RSS
- Often use some variant of RSS:
◦ Note: these names are not standard
- Residual sum of squares: RSS = Σ_{i=1}^n (y_i − ŷ_i)²
- RSS per sample, or mean squared error:
MSE = RSS / n = (1/n) Σ_{i=1}^n (y_i − ŷ_i)²
- Normalized RSS or normalized MSE:
(RSS/n) / s_y² = MSE / s_y² = Σ_{i=1}^n (y_i − ŷ_i)² / Σ_{i=1}^n (y_i − ȳ)²

Page 32: Unit 3 Multiple Linear Regression

Finding Parameters via Optimization
A general ML recipe:
- Pick a model with parameters
- Get data
- Pick a loss function
◦ Measures goodness of fit of the model to the data
◦ Function of the parameters
- Find the parameters that minimize the loss

For multiple linear regression:
- Model: ŷ = β_0 + β_1 x_1 + ... + β_k x_k
- Data: (x_i, y_i), i = 1, 2, ..., n
- Loss function: RSS(β_0, ..., β_k) := Σ_i (y_i − ŷ_i)²
- Minimize: select β = (β_0, ..., β_k) to minimize RSS(β)

Page 33: Unit 3 Multiple Linear Regression

RSS as a Vector Norm
- RSS is given by the sum: RSS = Σ_{i=1}^n (y_i − ŷ_i)²
- Define the norm of a vector:
◦ ‖x‖ = (x_1² + ... + x_n²)^{1/2}
◦ Standard Euclidean norm
◦ Sometimes called the ℓ2 norm (ℓ is for Lebesgue)
- Write RSS in vector form: RSS = ‖y − ŷ‖²

Page 34: Unit 3 Multiple Linear Regression

Least Squares Solution
- Consider the RSS cost function:
RSS(β) = Σ_{i=1}^n (y_i − ŷ_i)²,  ŷ_i = Σ_{j=0}^k A_ij β_j
◦ The vector β that minimizes the RSS is called the least-squares solution
- Least squares solution: The vector β that minimizes the RSS is
β̂ = (A^T A)^{-1} A^T y
◦ Can compute the best coefficient vector analytically
◦ Just solve a linear system of equations
◦ Will show the proof below
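The formula can be sanity-checked on synthetic data generated from known coefficients (a sketch; the data and coefficients are made up). Solving the normal equations A^T A β = A^T y recovers them:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from known coefficients (illustrative)
n, k = 100, 3
X = rng.normal(size=(n, k))
beta_true = np.array([1.0, 2.0, -1.0, 0.5])    # [beta_0, ..., beta_3]
A = np.column_stack([np.ones(n), X])
y = A @ beta_true + 0.01 * rng.normal(size=n)  # small noise

# Least squares solution via the normal equations A^T A beta = A^T y
beta_hat = np.linalg.solve(A.T @ A, A.T @ y)
```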

Page 35: Unit 3 Multiple Linear Regression

Proving the LS Formula
- Least squares formula: The vector β that minimizes the RSS is
β̂ = (A^T A)^{-1} A^T y
- To prove this formula, we will:
◦ Review gradients of multi-variable functions
◦ Compute the gradient ∇RSS(β)
◦ Solve ∇RSS(β) = 0

Page 36: Unit 3 Multiple Linear Regression

Gradients of Multi-Variable Functions
- Consider a scalar-valued function of a vector: f(β) = f(β_1, ..., β_n)
- The gradient is the column vector:
∇f(β) = [ ∂f(β)/∂β_1, ..., ∂f(β)/∂β_n ]^T
- Represents the direction of maximum increase
- At a local minimum or maximum: ∇f(β) = 0
◦ Solve n equations in n unknowns
- Ex: f(β_1, β_2) = β_1 sin(β_2) + β_1² β_2
◦ Compute ∇f(β). Solution on the board.
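For the example above, the gradient works out to ∇f = [sin β_2 + 2 β_1 β_2, β_1 cos β_2 + β_1²]. A quick way to check a hand-computed gradient is a centered finite-difference approximation (a sketch at an arbitrary test point):

```python
import numpy as np

def f(b):
    b1, b2 = b
    return b1 * np.sin(b2) + b1**2 * b2

def grad_f(b):
    # Gradient computed by hand from the example on the slide
    b1, b2 = b
    return np.array([np.sin(b2) + 2 * b1 * b2,
                     b1 * np.cos(b2) + b1**2])

# Numerical check with centered finite differences
b = np.array([0.7, 1.3])
eps = 1e-6
num = np.array([(f(b + eps * e) - f(b - eps * e)) / (2 * eps)
                for e in np.eye(2)])
```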

Page 37: Unit 3 Multiple Linear Regression

Proof of the LS Formula
- Consider the RSS cost function:
RSS = Σ_{i=1}^n (y_i − ŷ_i)²,  ŷ_i = Σ_{j=0}^k A_ij β_j
◦ The vector β that minimizes the RSS is called the least-squares solution
- Compute the partial derivatives via the chain rule:
∂RSS/∂β_j = −2 Σ_{i=1}^n (y_i − ŷ_i) A_ij,  j = 0, 1, ..., k
- Matrix form: RSS = ‖Aβ − y‖²,  ∇RSS = −2 A^T (y − Aβ)
- Solution: A^T (y − Aβ) = 0  →  β = (A^T A)^{-1} A^T y (least squares solution of the equation Aβ = y)
- Minimum RSS: RSS = y^T (I − A (A^T A)^{-1} A^T) y
◦ Proof on the board

Page 38: Unit 3 Multiple Linear Regression

LS Solution via Auto-Correlation Functions
- Each data sample has a linear feature vector:
A_i = (A_i0, ..., A_ik) = (1, x_i1, ..., x_ik)
- Define the sample auto-correlation matrix and cross-correlation vector:
◦ R_AA = (1/n) A^T A,  R_AA(ℓ, m) = (1/n) Σ_{i=1}^n A_iℓ A_im (correlation of feature ℓ and feature m)
◦ R_Ay = (1/n) A^T y,  R_Ay(ℓ) = (1/n) Σ_{i=1}^n A_iℓ y_i (correlation of feature ℓ and the target)
- Least squares solution: β = R_AA^{-1} R_Ay

Page 39: Unit 3 Multiple Linear Regression

Mean Removed Form of the LS Solution
- Often useful to remove the mean from the data before fitting.
- Sample means: ȳ = (1/n) Σ_{i=1}^n y_i,  x̄_j = (1/n) Σ_{i=1}^n x_ij,  x̄ = (x̄_1, ..., x̄_k)
- Define the mean-removed data: X̃_ij = x_ij − x̄_j,  ỹ_i = y_i − ȳ
- Sample covariance matrix and cross-covariance vector:
◦ S_xx(ℓ, m) = (1/n) Σ_{i=1}^n (x_iℓ − x̄_ℓ)(x_im − x̄_m),  S_xx = (1/n) X̃^T X̃
◦ S_xy(ℓ) = (1/n) Σ_{i=1}^n (x_iℓ − x̄_ℓ)(y_i − ȳ),  S_xy = (1/n) X̃^T ỹ
- Mean-removed form of the least squares solution:
ŷ = β_{1:k} · x + β_0,  β_{1:k} = S_xx^{-1} S_xy,  β_0 = ȳ − β_{1:k} · x̄
- Proof: on the board

Page 40: Unit 3 Multiple Linear Regression

R²: Goodness of Fit
- Multiple variable coefficient of determination:
R² = (s_y² − MSE) / s_y² = 1 − MSE / s_y²
◦ MSE = RSS/n = (1/n) Σ_{i=1}^n (y_i − ŷ_i)²
◦ Sample variance: s_y² = (1/n) Σ_{i=1}^n (y_i − ȳ)²,  ȳ = (1/n) Σ_{i=1}^n y_i
- Interpretation:
◦ MSE / s_y² = (error with the linear predictor) / (error predicting by the mean)
◦ R² = fraction of variance reduced or "explained" by the model
- On the training data (not necessarily on the test data):
◦ R² ∈ [0, 1] always
◦ R² ≈ 1 ⇒ the linear model provides a good fit
◦ R² ≈ 0 ⇒ the linear model provides a poor fit
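The definition translates directly into code. A minimal sketch, with the two boundary cases from the slide (perfect fit gives R² = 1, predicting the mean gives R² = 0):

```python
import numpy as np

def r_squared(y, yhat):
    """Coefficient of determination: R^2 = 1 - MSE / s_y^2."""
    mse = np.mean((y - yhat) ** 2)
    sy2 = np.mean((y - np.mean(y)) ** 2)
    return 1 - mse / sy2

y = np.array([1.0, 2.0, 3.0, 4.0])
r_perfect = r_squared(y, y)                      # perfect fit -> 1.0
r_mean = r_squared(y, np.full(4, y.mean()))      # mean predictor -> 0.0
```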

Page 41: Unit 3 Multiple Linear Regression

In-Class Exercise

Page 42: Unit 3 Multiple Linear Regression

Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: Simple linear regression
- Extensions

Page 43: Unit 3 Multiple Linear Regression

Arrays and Vectors in Python and MATLAB
- There are some key differences between MATLAB and Python that you need to get used to.
- MATLAB:
◦ All arrays have at least 2 dimensions
◦ Vectors are 1×N (row vectors) or N×1 (column) vectors
◦ Matrix-vector multiplication syntax depends on whether the vector is on the left or right: x'*A or A*x
- Python:
◦ Arrays can have 1, 2, 3, ... dimensions
◦ Vectors can be 1D arrays; matrices are generally 2D arrays
◦ Vectors that are 1D arrays are neither row nor column vectors
◦ If x is 1D and A is 2D, then left and right multiplication use the same syntax: x.dot(A) and A.dot(x)
- Lecture notes: We will generally treat x and x^T the same.
◦ Can write x = (x_1, ..., x_n) and still multiply by a matrix on the left or right
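A short numpy sketch of the 1D-vs-2D distinction above:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])      # 1D array: neither row nor column
A = np.arange(9.0).reshape(3, 3)   # 2D array (matrix)

left = x @ A     # left multiplication, result is 1D with shape (3,)
right = A @ x    # right multiplication, also 1D with shape (3,)

# A true row or column vector is a 2D array:
row = x.reshape(1, 3)   # shape (1, 3)
col = x.reshape(3, 1)   # shape (3, 1)
```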

Page 44: Unit 3 Multiple Linear Regression

Fitting Using sklearn
- Return to the diabetes data example.
- All code is in the demo.
- Divide the data into two portions:
◦ Training data: first 300 samples
◦ Test data: remaining 142 samples
- Train the model on the training data.
- Test the model (i.e., measure the RSS) on the test data.
- The reason for splitting the data is discussed in the next lecture.

Page 45: Unit 3 Multiple Linear Regression

Manually Computing the Solution
- Use numpy linear algebra routines to solve
β = (A^T A)^{-1} A^T y
- Common mistake:
◦ Compute the matrix inverse P = (A^T A)^{-1}
◦ Then compute β = P A^T y
◦ A full matrix inverse is VERY slow and not needed
◦ Can directly solve the linear system A β = y in the least-squares sense
◦ numpy has routines to solve this directly
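A sketch contrasting the two approaches on random data: np.linalg.lstsq solves the least-squares problem directly, and agrees with the explicit-inverse formula (which should be avoided in practice):

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
y = rng.normal(size=50)

# Preferred: direct least-squares solver (no explicit inverse)
beta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

# Equivalent but slower and less numerically stable:
beta_inv = np.linalg.inv(A.T @ A) @ (A.T @ y)
```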

Page 46: Unit 3 Multiple Linear Regression

Calling the sklearn Linear Regression Method
- Construct a linear regression object.
- Fit it on the training data.
- Predict values on the test data.
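The three steps above, as a sketch on the diabetes data with the first-300 / remaining-142 split described earlier:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# Split: first 300 samples for training, the remaining 142 for test
Xtr, ytr = X[:300], y[:300]
Xts, yts = X[300:], y[300:]

regr = LinearRegression()    # construct the linear regression object
regr.fit(Xtr, ytr)           # fit on the training data
yhat = regr.predict(Xts)     # predict on the test data

rss_test = np.sum((yts - yhat) ** 2)   # measure RSS on the test data
```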

Page 47: Unit 3 Multiple Linear Regression

In-Class Exercise

Page 48: Unit 3 Multiple Linear Regression

Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing in Python
- Extensions

Page 49: Unit 3 Multiple Linear Regression

Transformed Linear Models
- Standard linear model: ŷ = β_0 + β_1 x_1 + ... + β_k x_k
- A linear model may be too restrictive:
◦ The relation between x and y can be nonlinear
- Useful to look at models in transformed form:
ŷ = β_1 φ_1(x) + ... + β_p φ_p(x)
◦ Each function φ_j(x) = φ_j(x_1, ..., x_d) is called a basis function
◦ Each basis function may be nonlinear and a function of multiple variables
- Can write in vector form: ŷ = φ(x) · β
◦ φ(x) = [φ_1(x), ..., φ_p(x)],  β = [β_1, ..., β_p]

Page 50: Unit 3 Multiple Linear Regression

Fitting Transformed Linear Models
- Consider the transformed linear model:
ŷ = β_1 φ_1(x) + ... + β_p φ_p(x)
- We can fit this model exactly as before:
◦ Given data (x_i, y_i), i = 1, ..., n
◦ Want to fit the model from the transformed variables φ_j(x) to the target y
◦ Define the transformed matrix:
A =
[ φ_1(x_1) ⋯ φ_p(x_1) ]
[    ⋮     ⋮     ⋮    ]
[ φ_1(x_n) ⋯ φ_p(x_n) ]
◦ Predictions: ŷ = A β
◦ Least squares fit: β̂ = (A^T A)^{-1} A^T y

Page 51: Unit 3 Multiple Linear Regression

Example: Polynomial Fitting
- Suppose y depends only on a single variable x.
- Want to fit a polynomial model:
◦ y ≈ β_0 + β_1 x + ... + β_d x^d
- Given data (x_i, y_i), i = 1, ..., n
- Take basis functions φ_j(x) = x^j, j = 0, ..., d
- Transformed model: ŷ = β_0 φ_0(x) + ... + β_d φ_d(x)
- The transformed matrix is:
A =
[ 1 x_1 ⋯ x_1^d ]
[ ⋮  ⋮  ⋯   ⋮   ]
[ 1 x_n ⋯ x_n^d ]
, β = (β_0, ..., β_d)
◦ p = d + 1 transformed features from 1 original feature
- Will discuss how to select d in the next lecture.
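A sketch of the polynomial fit: np.vander builds exactly the transformed matrix above, and a least-squares fit on noise-free data from a known cubic recovers its coefficients:

```python
import numpy as np

# Hypothetical data generated from a known cubic (illustrative)
x = np.linspace(-1, 1, 20)
beta_true = np.array([1.0, -2.0, 0.5, 3.0])   # beta_0, ..., beta_3
d = 3

# Transformed matrix A with columns 1, x, x^2, ..., x^d
A = np.vander(x, d + 1, increasing=True)
y = A @ beta_true

beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
```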

Page 52: Unit 3 Multiple Linear Regression

Other Nonlinear Examples
- Multinomial model: ŷ = a + b_1 x_1 + b_2 x_2 + c_1 x_1² + c_2 x_1 x_2 + c_3 x_2²
◦ Contains all second-order terms
◦ Define the parameter vector β = (a, b_1, b_2, c_1, c_2, c_3)
◦ Transformed vector φ(x_1, x_2) = [1, x_1, x_2, x_1², x_1 x_2, x_2²]
◦ Note that the features are nonlinear functions of x = [x_1, x_2]
- Exponential model: ŷ = a_1 e^{−b_1 x} + ... + a_d e^{−b_d x}
◦ If the parameters b_1, ..., b_d are fixed, the model is linear in the parameters a_1, ..., a_d
◦ Parameter vector β = (a_1, ..., a_d)
◦ Transformed vector φ(x) = [e^{−b_1 x}, ..., e^{−b_d x}]
◦ But if the parameters b_1, ..., b_d are not fixed, the model is nonlinear in b_1, ..., b_d

Page 53: Unit 3 Multiple Linear Regression

Linear Models via Re-Parametrization
- Sometimes models can be made linear via re-parametrization.
- Example: Consider the model ŷ = A x_1 (1 + B e^{−x_2})
◦ Parameters (A, B)
- This is nonlinear in (A, B) due to the product AB: ŷ = A x_1 + A B x_1 e^{−x_2}
- But we can define a new set of parameters:
◦ β_1 = A and β_2 = AB
- Then ŷ = β_1 x_1 + β_2 x_1 e^{−x_2}
- Basis functions: φ(x_1, x_2) = [x_1, x_1 e^{−x_2}]
- After we solve for β_1, β_2 we can recover A, B by inverting the equations:
A = β_1,  B = β_2 / A

Page 54: Unit 3 Multiple Linear Regression

Example: Learning Linear Systems
- Linear system: y_t = a_1 y_{t−1} + ... + a_p y_{t−p} + b_0 x_t + ... + b_q x_{t−q} + w_t
- Transfer function: H(z) = (b_0 + ... + b_q z^{−q}) / (1 − a_1 z^{−1} − ... − a_p z^{−p})
- Given the input sequence and output sequence for T samples, how do we determine β = (a_1, ..., a_p, b_0, ..., b_q)^T?
- Can be solved using linear regression!
- Write y = Aβ + w and define A, y
◦ See homework problem
- Many applications:
◦ Learning dynamics in robots / mechanical systems
◦ Modeling responses in neural systems
◦ Stock market time series
◦ Speech modeling: fit a model every 25 ms
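The regression matrix is built from lagged outputs and inputs. A sketch with hypothetical orders p = 2, q = 1 and made-up coefficients (noise-free, so the fit recovers them exactly):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate a stable linear system with known coefficients (illustrative)
a = np.array([0.5, -0.2])    # a_1, a_2: autoregressive coefficients
b = np.array([1.0, 0.3])     # b_0, b_1: input coefficients
T = 500
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = a[0]*y[t-1] + a[1]*y[t-2] + b[0]*x[t] + b[1]*x[t-1]

# Regression matrix: row t = [y_{t-1}, y_{t-2}, x_t, x_{t-1}] for t = 2, ..., T-1
A = np.column_stack([y[1:T-1], y[0:T-2], x[2:T], x[1:T-1]])
target = y[2:T]

beta_hat, *_ = np.linalg.lstsq(A, target, rcond=None)   # [a_1, a_2, b_0, b_1]
```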

Page 55: Unit 3 Multiple Linear Regression

One-Hot Encoding
- Suppose that one feature x_j is a categorical variable.
- Ex: Predict the price of a car, y, given the model x_1 and interior space x_2:
◦ Suppose there are 3 different models of car (Ford, BMW, GM)
◦ Bad idea: arbitrarily assign an index to each possible car model
◦ Can give unreasonable relations
- One-hot encoding example:
◦ With 3 possible categories, represent x_1 using 3 binary features (φ_1, φ_2, φ_3):

Model | φ_1 | φ_2 | φ_3
Ford  | 1   | 0   | 0
BMW   | 0   | 1   | 0
GM    | 0   | 0   | 1

◦ Model: y = β_0 + β_1 φ_1 + β_2 φ_2 + β_3 φ_3 + β_4 x_2
◦ Essentially obtain 3 different models:
◦ Ford: y = β_0 + β_1 + β_4 x_2
◦ BMW: y = β_0 + β_2 + β_4 x_2
◦ GM: y = β_0 + β_3 + β_4 x_2
◦ Allows different intercepts (or mean values) for different categories!
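The encoding above can be sketched in plain numpy. The car list and the numeric x_2 values here are hypothetical, just to show the shape of the resulting feature matrix:

```python
import numpy as np

models = ["Ford", "BMW", "GM", "Ford", "GM"]    # categorical feature x_1
space = np.array([4.1, 4.8, 4.5, 3.9, 4.4])     # numeric feature x_2 (hypothetical)

# One-hot encode: one binary column per category
categories = ["Ford", "BMW", "GM"]
phi = np.array([[1.0 if m == c else 0.0 for c in categories]
                for m in models])

# Feature matrix for y = beta_0 + beta_1 phi_1 + beta_2 phi_2 + beta_3 phi_3 + beta_4 x_2
A = np.column_stack([np.ones(len(models)), phi, space])
```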

Page 56: Unit 3 Multiple Linear Regression

Lab: Robot Calibration
- Predict the current draw:
◦ Needed to predict power consumption
- Predictors:
◦ Joint angles, velocities and accelerations
◦ Strain gauge readings (a measure of load)
- Full website at TU Dortmund, Germany:
◦ http://www.rst.e-technik.tu-dortmund.de/cms/en/research/robotics/TUDOR_engl/index.html
◦ http://www.rst.e-technik.tu-dortmund.de/forschung/robot-toolbox/MERIt/MERIt_Documentation.pdf