Which transient, when? A utility function for transient follow-up scheduling Tim Staley (4 Pi Sky group) Southampton Wednesday Seminar, May 2015 WWW: 4pisky.org , timstaley.co.uk/talks


Automate all the things Forecasting transients Decision theory Future work

Outline

Automate all the things

Forecasting transients

Decision theory

Future work

LSST predicted transient rates

(LSST science book, 2009)

Orphan GRBs: ∼1000 per year.

“The reader should be cautioned that many of these rates are very rough.”

Guesstimates: 10^4 to 10^6 alerts per night (depending on your definition).

Transient rates today

Notices / events in a 30 day period around April 2015:

- GCN: 106 (perhaps 10 new events)
- ATEL: 146 (probably similar ratio of new / follow-up)
- GAIA: 17 (public)
- PTF: 299 (internal)
- CRTS: 81 (CSS) + 105 (MLS) (automatic detections)

Just counting initial events: ∼500 / month, or ∼17 / day
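
The tally behind the ∼500 / month figure can be reproduced in a few lines. This is a minimal sketch, assuming that only the roughly 10 new GCN events count as initial events and that ATELs share the same new-to-total ratio (the slide only says "probably similar"); the other streams are counted in full.

```python
# Notice counts for a 30-day window around April 2015 (from the list above).
gcn_total, gcn_new = 106, 10      # "perhaps 10 new events"
atel_total = 146
gaia, ptf = 17, 299
crts = 81 + 105                   # CSS + MLS

# Assumption: ATELs have the same new/total ratio as GCN notices.
atel_new = atel_total * gcn_new / gcn_total

initial_per_month = gcn_new + atel_new + gaia + ptf + crts
initial_per_day = initial_per_month / 30

print(f"{initial_per_month:.0f} per month, {initial_per_day:.1f} per day")
```

Under these assumptions the total comes out a little above 500 per month, consistent with the rounded figures quoted on the slide.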

Robotic follow-up facilities

Slow rise of automated sites / networks, e.g. LCOGT: 2 × 2m + 9 × 1m telescopes, over six sites.

(http://lcogt.net/observatory)

The (wider) problem

How do we begin to ‘close the loop’ of observe ⇒ analyse ⇒ observe?

[Figure: flowchart with elements labelled External alerts, Survey data, Transient discovery, Classification estimates, Interesting? (Yes), Observation prioritization, Schedule optimization, Telescope agents, Follow-up data and Science!]

Diversion on DG-CVn superflare

Fender 2014, http://adsabs.harvard.edu/abs/2014arXiv1410.1545F

Osten et al (in prep)

Missing pieces

[Figure: flowchart with elements labelled External alerts, Survey data, Transient discovery, Classification estimates, Interesting? (Yes), Observation prioritization, Schedule optimization, Telescope agents and Follow-up data]

- We’ve found a potentially interesting new transient.
- Looks like it could be one of class A, B, or C.
- What now?

We found a transient! What now?

Two implicit goals of follow-up observation:

- Improving classification (Let’s see if it’s really class A.)
- Further observation (Tell me more! / Boring!)

... I’ll mainly be talking about the former.

Forecasting transients

Working with tiny data

We’ve found a transient. But, very few datapoints:

[Figure: flux vs. time, showing the data and the detection threshold]

The set-up

How do we predict the possible futures for a given transient?

For now, make some simplifying assumptions:

- Parametric lightcurve models for each class of transient.
- Each transient class has a known prior distribution over the morphological parameters, and this is multivariate Normal.

Assumption: Parametric models

Deterministic, finite number of parameters, e.g.

y = f(t, t0, a, τ_rise, τ_decay)

[Figure: example model lightcurve, flux vs. time]
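
To make the idea of a parametric model concrete, here is a minimal sketch of one possible f. The functional form (exponential rise to a peak of amplitude a at t0, followed by exponential decay) is an assumption chosen for illustration; the slides do not specify the model, and the name `lightcurve` is hypothetical.

```python
import math

def lightcurve(t, t0, a, rise_tau, decay_tau):
    """Hypothetical lightcurve: exponential rise to peak `a` at `t0`, then exponential decay."""
    if t < t0:
        return a * math.exp((t - t0) / rise_tau)
    return a * math.exp(-(t - t0) / decay_tau)

# Evaluate on a coarse time grid (parameter values are illustrative only).
params = dict(t0=0.0, a=10.0, rise_tau=3.0, decay_tau=15.0)
for t in range(-20, 61, 20):
    print(t, round(lightcurve(t, **params), 3))
```

Any deterministic function with a small, fixed set of parameters would serve the same role; the point is that a handful of numbers fully determines the curve.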

Result: Line of best fit

(Maximum likelihood)

[Figure: flux vs. time, showing the data, the detection threshold and the ML fit]

Assumption: Multivar-Normal priors

Known priors, normally distributed, covariant

[Figure: corner plot of the prior distribution over a, rise_tau and decay_tau]

Construct: Model lightcurve ensembles

[Figure: ensemble of model lightcurves, flux vs. time]
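
Constructing such an ensemble amounts to drawing parameter vectors from the prior and evaluating the model for each draw. The sketch below assumes a two-parameter multivariate Normal prior over (rise_tau, decay_tau) with made-up mean and covariance, and a hypothetical exponential rise/decay lightcurve; it uses a hand-written 2×2 Cholesky factor so that it needs only the standard library.

```python
import math
import random

def lightcurve(t, t0, a, rise_tau, decay_tau):
    """Hypothetical lightcurve: exponential rise to peak `a` at `t0`, then exponential decay."""
    if t < t0:
        return a * math.exp((t - t0) / rise_tau)
    return a * math.exp(-(t - t0) / decay_tau)

def sample_bivariate_normal(rng, mean, cov):
    """Draw one sample from a 2-D Normal using the Cholesky factor of `cov`."""
    (c00, c01), (_, c11) = cov
    l00 = math.sqrt(c00)
    l10 = c01 / l00
    l11 = math.sqrt(c11 - l10 ** 2)
    z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
    return mean[0] + l00 * z0, mean[1] + l10 * z0 + l11 * z1

rng = random.Random(42)
prior_mean = (3.0, 15.0)                  # illustrative (rise_tau, decay_tau)
prior_cov = ((0.25, 0.5), (0.5, 9.0))     # illustrative, positively correlated

times = list(range(-20, 81, 5))
ensemble = []
for _ in range(200):
    rise_tau, decay_tau = sample_bivariate_normal(rng, prior_mean, prior_cov)
    if rise_tau <= 0 or decay_tau <= 0:
        continue  # discard unphysical draws (a crude truncation of the prior)
    ensemble.append([lightcurve(t, 0.0, 10.0, rise_tau, decay_tau) for t in times])

print(len(ensemble), "lightcurves drawn from the prior")
```

Plotting every curve in `ensemble` against `times` gives a picture of the futures the prior considers plausible before any data arrive.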

Result: MAP fit

[Figure: flux vs. time, showing the data, the detection threshold, the ML fit and the MAP fit]
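
The difference between the ML and MAP fits is easiest to see in a stripped-down case. The sketch below assumes that only the amplitude a is free (the lightcurve shape is held fixed), that the measurement noise is Gaussian with known sigma, and that the prior on a is a univariate Normal; under those assumptions both estimates have closed forms. All numbers are made up for illustration.

```python
# Two noisy flux measurements and the model shape at those epochs.
shape = [0.4, 0.7]        # model flux at the two epochs for unit amplitude
flux = [5.5, 8.0]         # observed flux
sigma = 1.0               # Gaussian measurement noise (assumed known)
prior_mean, prior_sd = 8.0, 2.0   # Normal prior on the amplitude

sy = sum(s * y for s, y in zip(shape, flux))
ss = sum(s * s for s in shape)

# Maximum likelihood: least squares for a single linear parameter.
a_ml = sy / ss

# MAP: precision-weighted combination of the data and the prior.
a_map = (sy / sigma**2 + prior_mean / prior_sd**2) / (ss / sigma**2 + 1 / prior_sd**2)

print(f"ML amplitude:  {a_ml:.2f}")
print(f"MAP amplitude: {a_map:.2f}")
```

With so few datapoints the prior pulls the MAP estimate well away from the ML value, toward the prior mean.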

But...

[Figure: flux vs. time, showing the data, the detection threshold, the ML fit, the MAP fit and the true lightcurve]

Constrained parameter distributions

Take our two datapoints, run some MCMC fitting...

[Figure: corner plot of the constrained distribution over a, rise_tau, decay_tau and t0]
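
The talk lists emcee for the MCMC step. As a dependency-free stand-in, the sketch below runs a plain random-walk Metropolis sampler on a one-parameter toy problem (amplitude only, Gaussian noise, Normal prior). It is not the emcee algorithm, and all numbers are illustrative.

```python
import math
import random

shape = [0.4, 0.7]        # model flux at the two epochs for unit amplitude
flux = [5.5, 8.0]         # the two observed datapoints
sigma = 1.0               # Gaussian measurement noise (assumed known)
prior_mean, prior_sd = 8.0, 2.0

def log_posterior(a):
    """Unnormalised log posterior: Gaussian log-likelihood plus Normal log-prior."""
    log_like = -0.5 * sum(((y - a * s) / sigma) ** 2 for s, y in zip(shape, flux))
    log_prior = -0.5 * ((a - prior_mean) / prior_sd) ** 2
    return log_like + log_prior

rng = random.Random(0)
a = prior_mean
logp = log_posterior(a)
samples = []
for step in range(20000):
    proposal = a + rng.gauss(0.0, 1.5)
    logp_new = log_posterior(proposal)
    # Accept with probability min(1, p_new / p_old).
    if rng.random() < math.exp(min(0.0, logp_new - logp)):
        a, logp = proposal, logp_new
    if step >= 2000:            # discard burn-in
        samples.append(a)

mean = sum(samples) / len(samples)
sd = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
print(f"posterior amplitude: {mean:.2f} +/- {sd:.2f}")
```

The retained samples play the same role as the constrained parameter distributions: each one can be fed back into the lightcurve model to produce one member of a constrained ensemble.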

Constrained lightcurve ensembles

[Figure series: lightcurve ensembles constrained by the observations, plotted as flux vs. time with the true lightcurve, the observations and a forecast epoch marked. A side panel shows the probability distribution of flux at the forecast epoch. Successive panels: comparison with 2 datapoints, then with 1, 2, 3, 4 and 6 datapoints.]

Comparing model ensembles

[Figure series, flux vs. time: prior ensemble, Type 1; prior ensemble, Type 2; comparison with 0 datapoints; comparison with 2 datapoints, with the true lightcurve, the observations and a forecast epoch marked, plus a side panel showing the probability distribution of flux at the forecast epoch.]

Decision theory

The crux of the problem

(Couldn’t get past the crux)

- Computers are good at optimisation.
- e.g., picking out a good observing schedule.
- But need a way to assign value.

[Figure: Intrinsic lightcurves. Relative flux vs. epoch for three classes: stable, logistic and null.]

[Figure: Intrinsic LCs with noise estimates. Relative flux vs. epoch for the stable, logistic and null classes.]

[Figure: Sampling with noise. Relative flux vs. epoch.]

[Figure: Class PDF at each epoch. PDF value vs. relative flux for the stable, logistic and null classes, one panel per epoch from T=-5.0 to T=2.0.]

[Figure: Evaluating each epoch. Top: class PDFs (stable, logistic, null) at epochs T=-5.0 to T=2.0. Bottom: figure of merit (FoM) vs. epoch, for information content.]
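
The slides do not spell out how the information-content figure of merit is computed. One plausible reading, sketched below, is the expected negative Shannon entropy of the class posterior after a hypothetical measurement at the candidate epoch. Everything here is an assumption made for illustration: the three class models (stable fixed at 1, logistic as a falling logistic curve, null fixed at 0), the Gaussian noise level, the equal class priors and the function names.

```python
import math

SIGMA = 0.2  # assumed Gaussian noise on relative flux

def class_means(epoch):
    """Hypothetical intrinsic relative flux of each class at a given epoch."""
    return {
        "stable": 1.0,
        "logistic": 1.0 / (1.0 + math.exp(epoch)),
        "null": 0.0,
    }

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def information_content(epoch, n_grid=2001, lo=-1.5, hi=2.5):
    """Expected sum_k P(k|x) ln P(k|x) over fluxes x, with equal class priors."""
    means = list(class_means(epoch).values())
    dx = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * dx
        pdfs = [gauss_pdf(x, mu, SIGMA) for mu in means]
        evidence = sum(pdfs) / len(pdfs)       # p(x) under equal priors
        if evidence == 0.0:
            continue
        neg_entropy = 0.0
        for p in pdfs:
            post = p / (len(pdfs) * evidence)  # P(class | x)
            if post > 0.0:
                neg_entropy += post * math.log(post)
        total += evidence * neg_entropy * dx
    return total

for epoch in range(-5, 3):
    print(epoch, round(information_content(epoch), 3))
```

A score of 0 would mean the measurement is expected to identify the class with certainty; the most negative possible value, -ln 3, would mean it is expected to tell us nothing about three equally likely classes.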

The problem with information content

[Figure: Evaluating each epoch. Top: class PDFs (stable, logistic, null) at epochs T=-5.0 to T=2.0. Bottom: figure of merit (FoM) vs. epoch, for information content.]

Information content weights all classes equally. What if we’re more interested in identifying one particular class?

Introducing: confusion matrices

Common or garden empirical confusion matrix:

(Automatic classification of time-variable X-ray sources; K. Lo et al., 2014)

Confusion matrices

            True class
Label   A           B           C
A       P( A | A )  P( A | B )  P( A | C )
B       P( B | A )  P( B | B )  P( B | C )
C       P( C | A )  P( C | B )  P( C | C )
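
An empirical confusion matrix is built by counting classifier outcomes; a probabilistic one can instead be computed directly from the class PDFs. The sketch below assumes that entry P(label i | true j) is the integral over flux of P(class i | x) p(x | class j), with equal class priors and Gaussian class PDFs. The means and noise level are placeholders, not the values used in the talk, and the function names are hypothetical.

```python
import math

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def confusion_matrix(means, sigma, n_grid=4001, lo=-2.0, hi=3.0):
    """Return M[i][j] = P(label i | true class j) by numerical integration."""
    n = len(means)
    dx = (hi - lo) / (n_grid - 1)
    matrix = [[0.0] * n for _ in range(n)]
    for k in range(n_grid):
        x = lo + k * dx
        pdfs = [gauss_pdf(x, mu, sigma) for mu in means]
        norm = sum(pdfs)
        if norm == 0.0:
            continue
        for i in range(n):
            posterior_i = pdfs[i] / norm          # P(class i | x), equal priors
            for j in range(n):
                matrix[i][j] += posterior_i * pdfs[j] * dx
    return matrix

classes = ["stable", "logistic", "null"]
means = [1.0, 0.88, 0.0]   # placeholder relative fluxes at one epoch
matrix = confusion_matrix(means, sigma=0.2)
for name, row in zip(classes, matrix):
    print(f"{name:9s}", " ".join(f"{value:.3f}" for value in row))
```

Each column sums to one, because a transient of a given true class must receive some label, and with equal priors the matrix is symmetric.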

Probabilistic confusion matrices

Example

[Figure: class PDFs at T=-2. PDF value vs. relative flux for the stable, logistic and null classes.]

            True class
Label       stable   logistic   null
stable      0.549    0.449      0.002
logistic    0.449    0.541      0.010
null        0.002    0.010      0.988

Diagonal entries represent recall for each class.
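
Reading recall off the example matrix takes one line per class. The sketch below uses the T=-2 values from the slide; combining the per-class values into a single score by averaging the diagonal is an assumption, since the slides do not define a combined recall.

```python
classes = ["stable", "logistic", "null"]
# Rows are assigned labels, columns are true classes (values from the T=-2 example).
matrix = [
    [0.549, 0.449, 0.002],
    [0.449, 0.541, 0.010],
    [0.002, 0.010, 0.988],
]

recall = {name: matrix[i][i] for i, name in enumerate(classes)}
total_recall = sum(recall.values()) / len(classes)   # assumed definition: mean of the diagonal

print(recall)
print(f"total recall: {total_recall:.3f}")
```

A utility function that cares about only one class, such as the null class, would use that class's diagonal entry alone instead of the average.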

[Figure: Evaluating each epoch. Top: class PDFs (stable, logistic, null) at epochs T=-5.0 to T=2.0. Bottom: FoM vs. epoch, comparing Information content (shifted) with Total Recall.]

[Figure: Evaluating each epoch. Top: class PDFs (stable, logistic, null) at epochs T=-5.0 to T=2.0. Bottom: FoM vs. epoch, comparing Total Recall with Null Recall.]

As applied to the previous example . . .

[Figure: Comparison, 2 datapoints. Top: flux vs. time with the true lightcurve, the observations and the forecast epoch marked, plus a side panel showing the probability distribution of flux at the forecast epoch. Bottom: IC score vs. time.]

Summary

- Data + models + Bayesian analysis ⇒ Ensemble forecasts
- Ensemble forecasts + utility function ⇒ Figure-of-merit for evaluating possible actions
- Using the figure-of-merit as basis for a decision = Applied Bayesian decision theory (AKA active machine learning)

Future work

Scheduler, simulation, testing

- Still need to implement a (basic) scheduler using figure-of-merit as input.
- Then: simulate (easy, already have models!)
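
As a starting point, a basic scheduler could simply rank candidate observations by figure of merit and fill the available time greedily. The sketch below is hypothetical throughout: the candidate names, scores, durations and the `greedy_schedule` function are invented for illustration, and this myopic strategy is only a baseline, not the scheduler the talk has in mind.

```python
def greedy_schedule(candidates, time_available):
    """Pick observations in descending figure-of-merit order until time runs out.

    `candidates` is a list of (name, figure_of_merit, duration_hours) tuples.
    """
    schedule = []
    remaining = time_available
    for name, fom, duration in sorted(candidates, key=lambda c: c[1], reverse=True):
        if duration <= remaining:
            schedule.append(name)
            remaining -= duration
    return schedule

candidates = [
    ("transient_A", 0.85, 2.0),
    ("transient_B", 0.60, 1.0),
    ("transient_C", 0.75, 3.0),
]
print(greedy_schedule(candidates, time_available=4.0))
```

Here the second-ranked candidate is skipped because it no longer fits in the remaining time, which illustrates why a greedy choice is not always the best use of a night.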

Optimization / Automated Planning

- Optimizing for ASAP classification (Future-discounted weighting schemes?)
- Non-myopic (better-than-greedy) scheduling.
- Multi-armed bandit problem.

Optimization / Automated Planning: Multi-armed bandit problem

Image credit: Wikipedia/Yamaguchi (CC BY-SA 3.0)

Model refinement (and basic validity)

- Multivariate normal prior: valid?
- Sum of multiple multivariate normals?
- Non-parametric modelling (Gaussian processes)?

Culture

Pretty sure we can make this work ... but there’s a culture / chicken-and-egg problem.

Code packages used

- astropy.modeling: for the models interface.
- NumPy: efficient lightcurve-model calculations.
- pandas (Python ANalysis of DAta Series): for handling time-series data.
- statsmodels: kernel density estimates.
- emcee: for MCMC.
- (Py)MultiNest: for model-evidence calculations.
- Seaborn: neat plotting tools, aesthetically pleasing defaults for Matplotlib.

Fin

Plenty more to do ... watch this space.