Page 1:

E-Science talk Edinburgh

AUTONOMOUS PIPELINES

David Brett, Leicester University

Page 2:

Talk Map

Project: Why work on an autonomous classification program?

WASP: Wide Angle Search for Planets telescope

The Universities of Leicester, St Andrews and Cambridge, Queen's University Belfast and the Open University.

Variable Identification: Period searching

Classification System: Artificial Neural Networks

Methods and Results.

Methods and the Future.

Page 3:

Why do any of this?

Tera-scale computing age:

• Volume of collected data

• Repetitive nature of the data reduction

• “Brute force” approach

• Creates one more layer of abstraction

Page 4:

WASP (Wide-Angle Search for Planets)

[Figure: field of view, labelled 9.5° on each side]

Four 2048² CCD chips (funding recently obtained for five)

For comparison: the INT “wide-field camera” images roughly the size of the full moon

1% photometry down to 13th magnitude and detections down to 17th (30 s exposure)

5 TB per year (raw)

But what do we do with all those bits?
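
To give a feel for the scale, the quoted figures can be turned into rough per-exposure numbers. This is only back-of-envelope arithmetic: the 2-byte pixel depth is an assumption (the slide does not state it), and 1 TB is taken as 10^12 bytes.

```python
# Rough data-volume arithmetic for the figures quoted on the slide.
# ASSUMPTION: 16-bit (2-byte) raw pixels; the slide does not give the bit depth.
CHIPS = 4
PIXELS_PER_CHIP = 2048 ** 2
BYTES_PER_PIXEL = 2           # assumed, not from the talk
RAW_BYTES_PER_YEAR = 5e12     # "5 TB per year (raw)", with 1 TB = 10^12 bytes

bytes_per_frame = PIXELS_PER_CHIP * BYTES_PER_PIXEL
bytes_per_exposure_set = CHIPS * bytes_per_frame
exposure_sets_per_year = RAW_BYTES_PER_YEAR / bytes_per_exposure_set

print(f"one chip frame: {bytes_per_frame / 2**20:.0f} MiB")
print(f"four-chip exposure: {bytes_per_exposure_set / 2**20:.0f} MiB")
print(f"four-chip exposures per year: {exposure_sets_per_year:,.0f}")
```

Under these assumptions the raw volume corresponds to well over a hundred thousand four-chip exposures a year, which is the motivation for reducing them without human intervention.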

Page 5:

Source Extraction and Data Reduction

Stages:

• Home-grown programs for “cleaning” the raw data.

• Use of conventional packages such as SExtractor for source extraction

• Variability checking programs

• Periodic variability locating programs

• Phased lightcurve recognition software

• Results database
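
The stages above chain naturally, each one taking the output of the previous one. A minimal sketch of that structure follows; the stage functions, the record layout and the threshold are hypothetical placeholders for illustration, not the WASP programs themselves.

```python
import statistics
from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]

def run_pipeline(stages: List[Stage], record: Dict) -> Dict:
    """Pass one record through every stage in order, noting which stages ran."""
    for stage in stages:
        record = stage(record)
        record.setdefault("history", []).append(stage.__name__)
    return record

def clean_raw(record: Dict) -> Dict:
    # Placeholder "cleaning": drop missing measurements.
    record["mags"] = [m for m in record["mags"] if m is not None]
    return record

def check_variability(record: Dict, threshold: float = 0.05) -> Dict:
    # Placeholder test: flag the source if its scatter exceeds a fixed threshold.
    record["variable"] = statistics.pstdev(record["mags"]) > threshold
    return record

result = run_pipeline(
    [clean_raw, check_variability],
    {"mags": [12.0, None, 12.4, 12.1]},
)
print(result["variable"], result["history"])
```

Keeping every stage to the same record-in, record-out shape means that period searching, lightcurve recognition and the database write can be appended to the list without changing the runner.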

Page 6:

Periodic Variables

Two Main Methods

Phase-folding:

• Fast to execute

• Easy to implement

• Simple to understand

• e.g. χ² or the L-Statistic

Frequency Analysis:

• Slower to execute

• Trickier to code

• More reliable

• e.g. Lomb-Scargle or Schwarzenberg-Czerny

Page 7:

Periodic Variables

Phase-Folding: χ²

• Maximum deviation from a constant line.

• Binned data, uses bin mean.

• Intra-bin deviation not taken into account.

• Very quick to implement and compute.

• Remember: we are looking for a maximum, not a minimum.
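
The points above can be sketched in a few lines. This is an illustrative version only: the number of bins, the use of the overall sample variance as the error estimate, and the synthetic test signal are all assumptions, not details from the talk.

```python
import numpy as np

def chi2_fold_statistic(t, y, period, n_bins=10):
    """Chi-squared of the phase-bin means about a constant line (the overall mean)."""
    phase = (t / period) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    overall_mean = y.mean()
    variance = y.var()
    stat = 0.0
    for b in range(n_bins):
        in_bin = y[bins == b]
        if in_bin.size == 0:
            continue
        # The variance of a bin mean is variance / n for n points in the bin.
        stat += in_bin.size * (in_bin.mean() - overall_mean) ** 2 / variance
    return stat

def best_period(t, y, trial_periods, n_bins=10):
    """Return the trial period with the MAXIMUM statistic, plus all statistics."""
    stats = np.array([chi2_fold_statistic(t, y, p, n_bins) for p in trial_periods])
    return trial_periods[np.argmax(stats)], stats

# Synthetic, irregularly sampled lightcurve with a 2.5 day period (made-up data).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 30.0, 400))
y = 12.0 + 0.3 * np.sin(2 * np.pi * t / 2.5) + 0.02 * rng.normal(size=t.size)
trial_periods = np.linspace(1.0, 4.0, 601)
found, stats = best_period(t, y, trial_periods)
print(f"best trial period: {found:.3f} d")
```

Folding at the wrong period scatters the signal across all phases, so the bin means sit close to the constant line and the statistic stays small; only near the true period do the bin means depart strongly from it.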

Page 8:

Periodic Variables

Phase-Folding: L-Statistic

• Also uses binned data.

• Additionally considers intra-bin deviation from bin-mean.

• Divide the χ² value by the intra-bin dispersion, enhancing low-deviation trial periods.

• Quick and accurate with medium to low-noise data.

• Created by S. Davies, 1990.
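
A sketch of the ratio described in these bullets is given below. It has the form of a one-way analysis-of-variance ratio (between-bin scatter over within-bin scatter); whether the degrees-of-freedom normalisation used here matches Davies's published definition is an assumption, as are the bin count and the synthetic data.

```python
import numpy as np

def l_like_statistic(t, y, period, n_bins=10):
    """Between-bin deviation divided by intra-bin dispersion for one trial period."""
    phase = (t / period) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    overall_mean = y.mean()
    between = 0.0   # scatter of the bin means about the overall mean
    within = 0.0    # scatter of the points about their own bin mean
    used_bins = 0
    for b in range(n_bins):
        in_bin = y[bins == b]
        if in_bin.size == 0:
            continue
        used_bins += 1
        between += in_bin.size * (in_bin.mean() - overall_mean) ** 2
        within += ((in_bin - in_bin.mean()) ** 2).sum()
    # Normalise each sum by its degrees of freedom (assumed normalisation).
    return (between / (used_bins - 1)) / (within / (y.size - used_bins))

# Synthetic lightcurve with a 2.5 day period (made-up data).
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 30.0, 400))
y = 12.0 + 0.3 * np.sin(2 * np.pi * t / 2.5) + 0.02 * rng.normal(size=t.size)

l_true = l_like_statistic(t, y, 2.5)
l_wrong = l_like_statistic(t, y, 1.7)
print(f"true period: {l_true:.1f}, wrong period: {l_wrong:.2f}")
```

At the true period the points within each bin agree with one another, so the denominator shrinks at the same time as the numerator grows, which is what sharpens the peak relative to the plain χ² statistic.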

Page 9:

Periodic Variables

Frequency Analysis: Lomb-Scargle

• Uses the whole unbinned data time series (DTS).

• Created by Lomb 1976, refined by Scargle 1982. Code adapted from Numerical Recipes in C.

[Figure: periodogram, statistic against period (days)]
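
For readers who want to see the method in code, here is a minimal NumPy sketch of the normalised Lomb-Scargle periodogram. It is a direct, unoptimised evaluation written for illustration; it is not the Numerical Recipes routine, and the function name and synthetic data are assumptions.

```python
import numpy as np

def lomb_scargle_power(t, y, periods):
    """Normalised Lomb-Scargle power at each trial period, using unbinned data."""
    y0 = y - y.mean()
    var = y.var(ddof=1)
    power = np.empty(len(periods))
    for k, period in enumerate(periods):
        omega = 2.0 * np.pi / period
        # The offset tau makes the sine and cosine terms mutually orthogonal.
        tau = np.arctan2(np.sin(2 * omega * t).sum(),
                         np.cos(2 * omega * t).sum()) / (2 * omega)
        c = np.cos(omega * (t - tau))
        s = np.sin(omega * (t - tau))
        power[k] = ((y0 @ c) ** 2 / (c @ c) + (y0 @ s) ** 2 / (s @ s)) / (2.0 * var)
    return power

# Synthetic, irregularly sampled lightcurve with a 2.5 day period (made-up data).
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 30.0, 400))
y = 12.0 + 0.3 * np.sin(2 * np.pi * t / 2.5) + 0.02 * rng.normal(size=t.size)
periods = np.linspace(1.0, 4.0, 601)
power = lomb_scargle_power(t, y, periods)
ls_best = periods[np.argmax(power)]
print(f"highest peak at {ls_best:.3f} d")
```

Because every point enters the sums individually, no phase binning is needed and uneven sampling is handled directly, at the cost of evaluating trigonometric functions for every point at every trial period.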

Page 10:

Periodic Variables

Frequency Analysis: Schwarzenberg-Czerny

• Uses the whole unbinned data time series (DTS).

• Created by A. Schwarzenberg-Czerny 1996. Code adapted from Schwarzenberg-Czerny's own code.

[Figure: periodogram, statistic against period (days)]

Page 11:

Periodic Variables

Choice of Trial Periods:

• Linear difference in period, dP.

• Linear difference in phase, dφ.

• Too small a dP and we may search too fine a parameter space and waste CPU time.

• Too large a dP and we will not search finely enough.

[Figure: annotated with the labels “OK” and “7%”]
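
One way to relate the two step choices: over a data baseline T, changing the trial period from P to P + dP shifts the phase of the last point by about T·dP/P². Holding that shift at a fixed dφ gives dP = dφ·P²/T, which is the same as stepping uniformly in frequency by dφ/T. The sketch below builds such a grid; the function name, the default dφ and the example numbers are assumptions for illustration.

```python
import numpy as np

def trial_periods_constant_phase_step(p_min, p_max, baseline, dphi=0.05):
    """Trial periods whose spacing keeps the end-of-baseline phase shift at dphi."""
    periods = []
    p = p_min
    while p <= p_max:
        periods.append(p)
        p += dphi * p * p / baseline   # dP = dphi * P^2 / T
    return np.array(periods)

grid = trial_periods_constant_phase_step(1.0, 4.0, baseline=30.0, dphi=0.05)

# A linear-in-period grid must use the finest step everywhere to be safe.
finest_step = 0.05 * 1.0 ** 2 / 30.0
n_linear = int(round((4.0 - 1.0) / finest_step)) + 1

print(f"constant-phase grid: {grid.size} periods")
print(f"linear grid at the finest step: {n_linear} periods")
```

The constant-phase grid is fine at short periods, where a small dP matters, and coarse at long periods, where it does not, so it avoids both problems listed above with far fewer trial periods than a uniformly fine linear grid.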

Page 12:

Periodic Variables

Page 13:

Periodic Variables

Conclusions:

• Phase-folding methods are swiftest

• Frequency Analysis methods are generally more reliable

• Autonomous pipelines require reliability over speed

• Schwarzenberg-Czerny would be the method of choice

• In which case a better period choice method is needed

Page 14:

Autonomous Classification

2 Main Stages:

• Memory Pattern Matching

• Modification of the Artificial Neural Network (ANN)

[Figure: the network in its INITIAL and FINAL states]

Page 15:

Autonomous Classification

Memory Pattern Matching:

Why?

Page 16:

Autonomous Classification

Memory Pattern Matching:

• It allows us to begin grouping similar shapes together

• This grouping encourages self-organisation

• To pattern-match is the underlying goal!

• Finding a sensible position on the network for a pattern allows us to change the network

How?

Page 17:

Autonomous Classification

Memory Pattern Matching:

[Figure: the lightcurve pattern compared against the weight patterns of node 0 and node 1]

Node 0 has the smallest weight-difference vector, so node 0 wins.
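
The matching step can be sketched as a nearest-neighbour search over the node weights. The example assumes the phased lightcurve has been binned to the same length as each weight vector and that "lowest difference" means smallest Euclidean distance; the talk names neither, and the numbers are made up.

```python
import numpy as np

def winning_node(lightcurve, weights):
    """Index of the node whose weight vector is closest to the lightcurve."""
    distances = np.linalg.norm(weights - lightcurve, axis=1)
    return int(np.argmin(distances)), distances

# Made-up five-bin phased lightcurve and two node weight patterns.
lightcurve = np.array([1.0, 0.6, 0.2, 0.6, 1.0])
weights = np.array([
    [0.9, 0.7, 0.3, 0.5, 1.0],   # node 0: similar dip at mid phase
    [0.2, 0.8, 1.0, 0.8, 0.2],   # node 1: peak at mid phase
])

winner, distances = winning_node(lightcurve, weights)
print(f"node {winner} wins, distances = {np.round(distances, 3)}")
```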

Page 18:

Autonomous Classification

Modification of the ANN:

• Modification affects an area

• Lessens as geometrical distance increases

• Area mixing encourages grouping

• The network can self-organise

• Hotspots occur
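
A neighbour function with these properties can be sketched as follows. The Gaussian fall-off, the two-dimensional grid and the width sigma are assumptions chosen for illustration; the talk only says that the modification lessens with geometrical distance.

```python
import numpy as np

def neighbour_power(grid_shape, winner, sigma):
    """Influence of the winning node on every node of a 2D grid (Gaussian, assumed)."""
    rows, cols = np.indices(grid_shape)
    dist_sq = (rows - winner[0]) ** 2 + (cols - winner[1]) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))

power = neighbour_power((5, 5), winner=(2, 2), sigma=1.0)
print(np.round(power, 2))
```

The winning node receives the full modification, its immediate neighbours a reduced one, and distant nodes almost none, which is what lets similar patterns accumulate in the same region of the network.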

Page 19:

Autonomous Classification

Modification of the ANN:

• Adjust the weights on the network nodes so that they better represent the lightcurve.

• $\alpha$ is the learning parameter. It decreases towards 0 on each learning iteration of the network.

• $P$ is the power (from the neighbour function) of the current node.

$\mathrm{d}w_i = \alpha P (L_i - w_i)$
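
A minimal sketch of this adjustment, assuming the standard self-organising-map form in which each weight moves towards the lightcurve by a fraction set by the learning parameter (written `alpha` here, an assumed symbol) and the neighbour power `P`. The array shapes and numbers are illustrative.

```python
import numpy as np

def update_weights(weights, lightcurve, power, alpha):
    """Move every node's weights towards the lightcurve: dw = alpha * P * (L - w)."""
    return weights + alpha * power[:, None] * (lightcurve - weights)

lightcurve = np.array([1.0, 0.6, 0.2, 0.6, 1.0])
weights = np.array([
    [0.9, 0.7, 0.3, 0.5, 1.0],   # node 0, the winner
    [0.2, 0.8, 1.0, 0.8, 0.2],   # node 1, a neighbour
])
power = np.array([1.0, 0.5])     # neighbour power P for each node (made-up values)

before = np.linalg.norm(weights - lightcurve, axis=1)
updated = update_weights(weights, lightcurve, power, alpha=0.5)
after = np.linalg.norm(updated - lightcurve, axis=1)
print(np.round(before, 3), "->", np.round(after, 3))
```

Each node's distance to the lightcurve shrinks by the factor 1 - alpha·P, so the winner moves furthest and, as alpha decreases over the iterations, the network settles.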

Page 20:

The Future

• Enhanced clustering mechanism.

• More precise shape-similarity evaluating methods.

• More dynamically adaptive choice of trial periods for period searching.

• Refinement of these ideas and trying other methods.

• Investigate whether networks with more than two dimensions are worthwhile in the current format.

Page 21:

Questions?