on sharp identi cation regions for regression under ... · situation situation: standard simple...

On Sharp Identi�cation Regions for Regression

Under Interval Data

Georg Schollmeyer Thomas Augustin

Department of Statistics LMU Munich

Situation

Situation: Standard simple linear model Y = β0 + β1X + ε with interval-censored

outcomes.

Y unobserved, only Y and Y observed and all we know is Y ∈ [Y,Y].

model partially identi�ed, there are many parameters (β0, β1) leading to the

same distribution of the observable variables.

●

● ●

●

●●

●

●●

●

2 4 6 8 10

2040

6080

100

x

y

−−

−

−−

− −

−−−

−−

−

− −

−

− −

−

−−

−

− −

−

2 4 6 8 10

020

4060

8010

012

0

x

y

"What is the entity to estimate?"

Situation

Situation: Standard simple linear model Y = β0 + β1X + ε with interval-censored

outcomes.

Y unobserved, only Y and Y observed and all we know is Y ∈ [Y,Y].

model partially identi�ed, there are many parameters (β0, β1) leading to the

same distribution of the observable variables.

●

● ●

●

●●

●

●●

●

2 4 6 8 10

2040

6080

100

x

y

−−

−

−−

− −

−−−

−−

−

− −

−

− −

−

−−

−

− −

−

2 4 6 8 10

020

4060

8010

012

0

x

y

"What is the entity to estimate?"

Page 4: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

Identi�cation regions for the simple linear model

a) Sharp Marrow Region:

SMR(Y,Y) = {β | E(Y | X ) ≤ β0 + β1X ≤ E(Y | X )}

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 5: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 6: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 7: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

b) Sharp Collection Region:

SCR(Y,Y) = {argminE((β0 + β1X − Y )2) | Y ∈ [Y,Y]}

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 8: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 9: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 10: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 11: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 12: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

Sharp Setloss Region

c) Sharp Setloss Region:

LS(Y,Y, Γ) =

∫ [E(Y | x)− sup

β∈Γ(β0 + β1x)

]2

+

[E(Y | x)− inf

β∈Γ(β0 + β1x)

]2

dP(x)

SSR(Y,Y) =⋃

argminΓ⊆R2

LS(Y,Y, Γ)

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 13: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

LS(Y,Y, Γ) =

∫ [E(Y | x)− sup

β∈Γ(β0 + β1x)

]2

+

[E(Y | x)− inf

β∈Γ(β0 + β1x)

]2

dP(x)

SSR(Y,Y) =⋃

argminΓ⊆R2

LS(Y,Y, Γ)

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 14: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

LS(Y,Y, Γ) =

∫ [E(Y | x)− sup

β∈Γ(β0 + β1x)

]2

+

[E(Y | x)− inf

β∈Γ(β0 + β1x)

]2

dP(x)

SSR(Y,Y) =⋃

argminΓ⊆R2

LS(Y,Y, Γ)

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 15: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

LS(Y,Y, Γ) =

∫ [E(Y | x)− sup

β∈Γ(β0 + β1x)

]2

+

[E(Y | x)− inf

β∈Γ(β0 + β1x)

]2

dP(x)

SSR(Y,Y) =⋃

argminΓ⊆R2

LS(Y,Y, Γ)

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 16: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

LS(Y,Y, Γ) =

∫ [E(Y | x)− sup

β∈Γ(β0 + β1x)

]2

+

[E(Y | x)− inf

β∈Γ(β0 + β1x)

]2

dP(x)

SSR(Y,Y) =⋃

argminΓ⊆R2

LS(Y,Y, Γ)

−20 −10 0 10 20

−30

0−

200

−10

00

100

x

E(

|x)

Page 17: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

Relations between SMR, SCR and SSR

SSR ⊇ SMR ⊆ SCR

SSR ⊆ SCR? "often approximately"

Page 18: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

SSR ⊇ SMR ⊆ SCR

SSR ⊆ SCR?

"often approximately"

Page 19: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

SSR ⊇ SMR ⊆ SCR

SSR ⊆ SCR? "often approximately"

Page 20: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

region well-speci�ed case misspeci�ed case

SMR true parameters (β0, β1) ∈ SMRnot "well-estimable" ?

SSR true parameters (β0, β1) ∈ SSR

"well-estimable" (e.g. replace

expectations by means)

good description of the bounds

Y,Y and thus good description

of Y

SCR true parameters (β0, β1) ∈ SCR

"well-estimable" (e.g. replace

expectations by means)

set of all possible good linear

descriptions for every

Y ∈ [Y,Y]

Page 21: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

0 1 2 3 4 5

05

1015

2025

x

E(

|x)

0 1 2 3 4 5

−10

−5

05

1015

2025

xE

( |x

)

Figure: The conditional expectations E(Y | x) (black), E(Y | x) (dashed) and E(Y | x)

(dashed) for a well-speci�ed (left) and a misspeci�ed (right) situation.

Page 22: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

−2 −1 0 1 2 3 4 5

−2

02

46

810

12

ß1

ß0

●

−3 −2 −1 0 1 2 3 4

−6

−4

−2

02

46

8

ß1

ß0

●

Figure: The identi�cation regions SMR (red), SCR (black) and SSR (blue) for a well-

speci�ed (left) and a misspeci�ed (right) situation. The true Parameter is dotted green

(for the misspeci�ed situation the green point is the best linear predictor for y). In the

misspeci�ed case SMR is empty.

Page 23: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

Beresteanu, A., & Molinari, F. (2008): Asymptotic properties for a class of

partially identi�ed models, Econometrica, 76, 763�814.

�erný, M. & Rada M. (2011): On the possibilistic approach to linear

regression with rounded or interval-censored data. Measurement Science

Review, 11, 34�40.

Ponomareva, M., & Tamer, E. (2011): Misspeci�cation in moment inequality

models: back to moment equalities?. Econometrics Journal, 14, 186-203.

Stoye, J. (2007): Bounds on generalized linear predictors with incomplete

outcome data. Reliable Computing, 13, 293�302.

Page 24: On Sharp Identi cation Regions for Regression Under ... · Situation Situation: Standard simple linear model Y = 0 + 1 X +"with interval-censored outcomes. Y unobserved, only Y and

region well-speci�ed case misspeci�ed case

SMR truth ∈ SMRnot "well-estimable"

monotone

idempotent

?

SSR truth ∈ SSR

"well-estimable"

not monotone

idempotent

good description of the bounds

Y,Y and thus good description

of Y

SCR truth ∈ SCR

"well-estimable"

monotone

not idempotent and thus

maybe too rough

set of all possible good linear

descriptions for every

Y ∈ [Y,Y]

on sharp identi cation regions for regression under ... · situation situation: standard simple...

Documents