deep shallow transits learning - exoplanets ii · shallow transits: inter-temporal correlations...
Shallow Transits, Deep Learning
Shay Zucker, Raja Giryes
Elad Dvash
(Tel Aviv University)
Yam Peleg
(Deep Trading)
Red Noise and Transit Detection
• Deep transits: traditional methods (BLS) work well
• Shallow transits: inter-temporal correlations might mask the signal
Pont, Zucker & Queloz 2006, MNRAS, 373, 231
BLS: Kovács, Zucker & Mazeh 2002, A&A, 391, 369
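To make the "traditional method" concrete, here is a minimal sketch of a brute-force box search in the spirit of BLS. It is not the Kovács, Zucker & Mazeh implementation: the function name `box_search`, the coarse phase grid, and the simple depth-times-root-N score are illustrative assumptions, and the toy light curve contains white noise only.

```python
import numpy as np

def box_search(t, flux, periods, duration, n_phase=50):
    """Brute-force box search; returns (best_period, best_phase, best_score)."""
    best = (None, None, -np.inf)
    for period in periods:
        phase = (t % period) / period
        width = duration / period            # box width in phase units
        for phase0 in np.linspace(0.0, 1.0, n_phase, endpoint=False):
            in_box = ((phase - phase0) % 1.0) < width
            n_in = in_box.sum()
            if n_in == 0 or n_in == len(t):
                continue
            # Depth of the box relative to the out-of-box level
            depth = flux[~in_box].mean() - flux[in_box].mean()
            # Deeper and better-sampled boxes score higher (white-noise assumption)
            score = depth * np.sqrt(n_in)
            if score > best[2]:
                best = (period, phase0, score)
    return best

# Toy light curve: 1% deep transit, period 3 d, duration 0.2 d, white noise only
rng = np.random.default_rng(1)
t = np.arange(0.0, 30.0, 0.02)
flux = 1.0 + rng.normal(0.0, 1e-3, t.size)
flux[(t % 3.0) < 0.2] -= 0.01
best_period, best_phase, best_score = box_search(
    t, flux, periods=[2.0, 2.5, 3.0, 3.5, 4.0], duration=0.2)
```

The score treats the points as independent, which is exactly the assumption that correlated (red) noise violates for shallow transits.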
Gaussian Processes
• An elegant way to model inter-temporal correlations
• Use a kernel function to model the correlation
• A kernel is parameterized by a few hyperparameters
• Fitting is very hard (involves inversion of huge matrices)
• Simultaneous GP fitting and transit search even harder…
• Rasmussen & Williams 2006 (textbook)
• Aigrain et al. 2016 (application to K2 light curves)
• Foreman-Mackey et al. 2017 (approximate fast fitting)
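\[
k(t_i - t_j) = A_s^2 \exp\left[-\left(\frac{t_i - t_j}{\lambda_s}\right)^2\right]
+ A_q^2 \exp\left[-\frac{\sin^2\left(\pi (t_i - t_j)/T_q\right)}{2} - \left(\frac{t_i - t_j}{\lambda_q}\right)^2\right]
+ A_w^2 \, \delta(t_i - t_j)
\]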
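As an illustration of how the kernel on this slide turns into correlated noise, the sketch below evaluates its three terms (squared exponential, quasi-periodic, white) and draws one noise realization through a Cholesky factor. The function names and the hyperparameter values are assumptions chosen for the example, not values from the talk.

```python
import numpy as np

def kernel(dt, A_s, lam_s, A_q, lam_q, T_q, A_w):
    """Covariance as a function of the time lag dt = t_i - t_j."""
    red = A_s**2 * np.exp(-(dt / lam_s) ** 2)
    quasi = A_q**2 * np.exp(-0.5 * np.sin(np.pi * dt / T_q) ** 2
                            - (dt / lam_q) ** 2)
    white = A_w**2 * (dt == 0)               # delta term: only at zero lag
    return red + quasi + white

def sample_noise(t, rng, **hyper):
    """Draw one zero-mean noise realization with covariance k(t_i - t_j)."""
    dt = t[:, None] - t[None, :]
    K = kernel(dt, **hyper)
    L = np.linalg.cholesky(K)                # K = L L^T; the white term keeps K positive definite
    return L @ rng.standard_normal(t.size)

# Illustrative hyperparameters (assumed, in relative flux and days)
hyper = dict(A_s=1e-3, lam_s=1.0, A_q=5e-4, lam_q=5.0, T_q=2.0, A_w=1e-3)
t = np.linspace(0.0, 10.0, 200)
noise = sample_noise(t, np.random.default_rng(0), **hyper)
```

The Cholesky factorization costs O(N^3) in the number of samples, which is the "huge matrices" difficulty listed above.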
Deep Learning
Neural Networks
“a set of computational heuristics to train highly nonlinear parametric functions structured in a layered form to perform a certain task”
Biological Neuron
McCulloch-Pitts Neuron
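A McCulloch-Pitts style neuron can be written in a few lines. The sketch below uses the common textbook form (weighted sum followed by a hard threshold); the function name and the AND-gate weights are illustrative choices.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# With unit weights and a threshold of 2, the neuron acts as a logical AND
and_gate = [mcculloch_pitts((a, b), (1, 1), 2) for a in (0, 1) for b in (0, 1)]
```

The hard threshold is the nonlinearity; modern networks replace it with differentiable functions so that derivatives can be propagated through the layers.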
Deep Learning in a Nutshell
• Supervised learning: given examples with ground truth (the ‘training set’)
• Loss function (error quantification)
• Loss function depends analytically on the synaptic weights
• Backpropagation of derivatives (chain rule) through layers
• Slowly update the synaptic weights (e.g. gradient descent,
Metropolis-Hastings, etc.) to minimize loss function
• Essential ingredients: nonlinearity and layered structure
• A growing multitude of neural network architectures
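The ingredients listed above (training set, loss function, backpropagation via the chain rule, slow weight updates) fit in a short NumPy sketch. The network size, learning rate, and the XOR toy problem are assumptions made for illustration and have nothing to do with the architecture used for transit detection.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # toy training set (XOR)
y = np.array([[0.0], [1.0], [1.0], [0.0]])                      # ground truth

# Two layers: 2 inputs -> 8 hidden units (tanh) -> 1 output (sigmoid)
W1, b1 = rng.normal(0.0, 1.0, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0.0, 1.0, (8, 1)), np.zeros(1)
learning_rate = 0.5
losses = []

for step in range(2000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    losses.append(np.mean((out - y) ** 2))    # mean squared error loss

    # Backward pass: chain rule, layer by layer
    d_out = 2.0 * (out - y) / y.size
    d_z2 = d_out * out * (1.0 - out)          # through the sigmoid
    d_W2, d_b2 = h.T @ d_z2, d_z2.sum(axis=0)
    d_z1 = (d_z2 @ W2.T) * (1.0 - h ** 2)     # through the tanh
    d_W1, d_b1 = X.T @ d_z1, d_z1.sum(axis=0)

    # Gradient-descent update of the weights
    W1 -= learning_rate * d_W1
    b1 -= learning_rate * d_b1
    W2 -= learning_rate * d_W2
    b2 -= learning_rate * d_b2
```

Plotting `losses` against the step number shows how the updates gradually reduce the loss.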
Feasibility Study
• Zucker & Giryes 2018, AJ, 155, 4
• Fictitious planet-hunting space telescope
• Noise simulated by GP
– White noise
– Red noise (squared exponential)
– Quasi periodic noise
– Hyperparameters drawn randomly
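A labelled training set of this kind could be simulated roughly as follows. This is a sketch under stated assumptions: the hyperparameter ranges, the box-shaped transit, and the function names are invented for illustration, and the quasi-periodic term is left out for brevity.

```python
import numpy as np

def box_transit(t, period, t0, duration, depth):
    """Flux change of a periodic box transit centred on t0 (0 outside, -depth inside)."""
    phase = (t - t0 + 0.5 * duration) % period
    return np.where(phase < duration, -depth, 0.0)

def simulate_light_curve(t, rng, with_planet):
    """One light curve: GP noise (red + white) with an optional injected transit."""
    # Hyperparameters drawn at random; the ranges are illustrative guesses
    A_w = rng.uniform(5e-4, 2e-3)
    A_s = rng.uniform(5e-4, 2e-3)
    lam_s = rng.uniform(0.5, 2.0)
    dt = t[:, None] - t[None, :]
    K = A_s**2 * np.exp(-(dt / lam_s) ** 2) + A_w**2 * np.eye(t.size)
    flux = 1.0 + np.linalg.cholesky(K) @ rng.standard_normal(t.size)
    if with_planet:
        flux += box_transit(t, period=rng.uniform(1.0, 5.0),
                            t0=rng.uniform(0.0, 1.0),
                            duration=0.1, depth=rng.uniform(1e-4, 1e-3))
    return flux

rng = np.random.default_rng(42)
t = np.linspace(0.0, 20.0, 400)
labels = rng.integers(0, 2, size=16)          # ground truth: planet or no planet
curves = np.stack([simulate_light_curve(t, rng, bool(y)) for y in labels])
```

Each row of `curves` with its entry in `labels` is one supervised training example.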
[Figure: Receiver Operating Characteristic (ROC) curves, Deep Learning vs. HPF+BLS]
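For readers unfamiliar with ROC curves, the sketch below shows how one is built from detection scores and ground-truth labels by sweeping the threshold from high to low. The function names and the six example scores are made up for illustration, and tied scores are not treated specially.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists obtained by lowering the detection threshold step by step."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    fpr, tpr = [0.0], [0.0]
    for i in order:
        if labels[i]:
            tp += 1                           # a true planet crosses the threshold
        else:
            fp += 1                           # a false alarm crosses the threshold
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return sum((fpr[k + 1] - fpr[k]) * (tpr[k + 1] + tpr[k]) / 2.0
               for k in range(len(fpr) - 1))

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]       # hypothetical detector outputs
labels = [1, 1, 0, 1, 0, 0]                   # 1 = light curve contains a transit
fpr, tpr = roc_curve(scores, labels)
auc = area_under_curve(fpr, tpr)
```

A detector whose curve lies closer to the top-left corner finds more true transits at the same false-positive rate.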
Adding outliers and discontinuities
[Figure: Deep Learning vs. outlier removal + HPF + BLS]
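The preprocessing applied before BLS in this comparison (outlier removal followed by a high-pass filter) might look like the sketch below. The talk does not specify the actual filter, so the MAD-based clipping, the running-median high-pass, and all parameter values are assumptions.

```python
import numpy as np

def clip_outliers(flux, n_sigma=5.0):
    """Boolean mask of points within n_sigma robust standard deviations of the median."""
    med = np.median(flux)
    sigma = 1.4826 * np.median(np.abs(flux - med))   # MAD scaled to a Gaussian sigma
    return np.abs(flux - med) <= n_sigma * sigma

def running_median_highpass(flux, window):
    """Subtract a running median (odd window length) to remove slow trends."""
    half = window // 2
    padded = np.pad(flux, half, mode="edge")
    trend = np.array([np.median(padded[i:i + window]) for i in range(flux.size)])
    return flux - trend

# Toy data: white noise with one strong outlier
rng = np.random.default_rng(3)
flux = rng.normal(0.0, 1e-3, 200)
flux[50] += 0.05
keep = clip_outliers(flux)
filtered = running_median_highpass(flux[keep], window=11)
```

A filter window that is too short would also suppress the transit itself, which is one reason such a pipeline needs tuning.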
Sample detections (FPR=0.01)
Sample false detections (FPR=0.01)
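The samples are quoted at a fixed false-positive rate. One way to pick the matching detection threshold, sketched here with a hypothetical helper and made-up scores, is to rank the scores of light curves known to contain no transit and allow only the target fraction of them above the threshold.

```python
def threshold_at_fpr(negative_scores, target_fpr):
    """Threshold such that at most target_fpr of the negatives score strictly above it."""
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))   # number of false positives tolerated
    return ranked[allowed] if allowed < len(ranked) else float("-inf")

# Hypothetical scores of 1000 transit-free light curves
negatives = [((i * 37) % 1000) / 1000.0 for i in range(1000)]
threshold = threshold_at_fpr(negatives, target_fpr=0.01)
false_positives = sum(score > threshold for score in negatives)
measured_fpr = false_positives / len(negatives)
```

A light curve is then flagged as a detection when its score exceeds `threshold`.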
TESS ETE-6-based test
• Time sampling provided in ETE-6 (with gaps)
• White noise more dominant
• Red noise, same as in previous study
• DL still outperforms BLS, but less convincingly
• First attempts at estimating the period and detrending
[Figure: DL vs. BLS+HPF on the ETE-6-based test]
[Figures: first attempts at estimating the period]
What next?
• Work in progress: use DL to:
- Detrend light curves
- Characterize transit signals
- Identify individual transits
• Introduce complications (gaps, TTVs, multi-planet systems, etc.)
• Mine old data for hidden planets (Kepler, CoRoT)
• Use DL to fit GPs
• Apply Deep Learning to RV (to overcome activity)
• Prepare for PLATO
Related Works
• Shallue & Vanderburg 2018, AJ, 155, 94
- ‘Identifying’ – not ‘detecting’…
- Traditional approach to detect TCEs in resonant systems
- Deep learning for vetting, not detecting
• Pearson, Palafox & Griffith 2018, MNRAS, 474, 478
- Discrete grid of transit parameters (not distributions)
- Quasi-periodicity+white noise, not GP
Summary
• Deep learning neural networks are the future!
• May achieve unprecedented performance,
specifically for small planets,
with long periods,
around G-type stars
• A fundamentally different approach (nonlinear)
• Zucker & Giryes 2018, AJ, 155, 4