Post on 12-Jan-2016

Pattern Matching in DAME using AURA technology

Jim Austin, Robert Davis, Bojian Liang, Andy Pasley

University of York

Distributed Aircraft Maintenance Environment - DAME

Overview

• Context
• AURA technology
• DAME pattern matching problem
• AURA solution
• Search performance
• Next steps

Context

• Vibration data from all engines in flight
• Detection of unusual vibration patterns
  – Novelties, anomalies
  – Automatic or manual
• Search for similar vibration behaviour
  – Need to search large volumes of historical vibration data
• Investigate search results and associated data
  – Service data records
  – CBR tools: Sheffield

AURA technology

• AURA
  – Proven technology for searching large data sets
  – Ability to scale and maintain performance
  – Easily parallelised
• Examples
  – Address matcher
  – Molecular matcher
• Operation
  – Vectors compared to stored examples
  – Uses bit level comparison methods
  – Correlation Matrix Memory operations
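
The operation described above can be illustrated with a minimal sketch of a generic binary correlation matrix memory. This is a textbook-style illustration under stated assumptions, not the AURA library or its API: the class name `BinaryCMM`, the methods `store` and `recall`, and the fixed recall threshold are all hypothetical. Storing ORs the outer product of a binary input and output pattern into a bit matrix; recall sums the rows selected by the set input bits and thresholds the result.

```python
from typing import List


class BinaryCMM:
    """Illustrative binary correlation matrix memory (not the AURA API)."""

    def __init__(self, n_in: int, n_out: int) -> None:
        self.n_out = n_out
        # weights[i][j] is 1 if input bit i and output bit j were ever
        # set together in a stored pattern pair.
        self.weights = [[0] * n_out for _ in range(n_in)]

    def store(self, inp: List[int], out: List[int]) -> None:
        """OR the outer product of the two binary patterns into the matrix."""
        for i, a in enumerate(inp):
            if a:
                row = self.weights[i]
                for j, b in enumerate(out):
                    if b:
                        row[j] = 1

    def recall(self, inp: List[int], threshold: int) -> List[int]:
        """Sum the rows selected by set input bits, then threshold."""
        sums = [0] * self.n_out
        for i, a in enumerate(inp):
            if a:
                for j, w in enumerate(self.weights[i]):
                    sums[j] += w
        return [1 if s >= threshold else 0 for s in sums]


cmm = BinaryCMM(n_in=4, n_out=3)
cmm.store([1, 0, 1, 0], [1, 0, 0])
cmm.store([0, 1, 0, 1], [0, 1, 0])

exact = cmm.recall([1, 0, 1, 0], threshold=2)   # stored input
noisy = cmm.recall([1, 0, 1, 1], threshold=2)   # one extra bit set
print(exact, noisy)  # [1, 0, 0] [1, 0, 0]
```

The noisy query still recalls the first stored output because only that output column reaches the threshold, which is the sense in which matching is tolerant of imprecise input.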

AURA architecture

[Figure: AURA architecture. Labelled components: Data Adaptor, Input pattern, Indexer, AURA Search Engine (binary Store & Search), Output pattern, Candidate Selector, Candidate Engine (Back check), Indexes or Data, Result Store, Results.]

AURA storage & recall

[Figure: AURA storage and recall. Labels: Input pattern, AURA Search Engine (binary Correlation Matrix Memories), Output pattern.]

AURA software

• AURA re-designed
  – To improve performance of the AURA library in terms of both memory usage and search times
    • 3-fold reduction in memory
    • 3-fold reduction in search time
  – To make the library easy to use
    • Simple API
    • Typically only 4 or 5 API calls used
    • Enable implementation as an OGSI GT3 service
  – To engineer the library to commercial software standards
    • Comprehensive user guide and reference manual

Pattern matching problem

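• Vibration data from sensors forms Z-mod data.
• Tracked orders extracted from Z-mod data

[Figure: Z-mod data plotted as frequency against time, with a tracked order marked; the extracted tracked order plotted as amplitude against time.]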

Pattern matching problem

• Novelty or anomaly identified in tracked order data by feature detectors
  – Forms the query sub-sequence

Pattern matching problem

• Search for sub-sequences similar to the query in a large volume of tracked order data.
  – Need to investigate all possible alignments
  – Benchmark method is sequential scan
  – Noisy data: imprecise matching required
  – Various possible similarity measures
    • Euclidean distance
    • Correlation

AURA solution

[Figure: AURA solution. Labelled components: Stored Time Series, Encoded Time Series, Query Time Series, Encoded Query, AURA Search Engine, Candidate Matches, AURA Backcheck, Results.]

AURA solution

• Encoding: reduction in dimensionality
  – e.g. from 100 pts to 10 values.
• Approximate search
  – From ~1,000,000s of alignments down to ~1000s of candidate matches
• Backcheck
  – From ~1000s of candidate matches to 100 or fewer results
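
The encode, filter, backcheck structure above can be sketched in a few lines. Note the substitution: the approximate search stage here is not AURA. It is a simple stand-in that compares segment means and uses the standard lower bound sqrt(n/w) × (distance between segment means) ≤ (true Euclidean distance), which holds for equal-length segments. The names `two_stage_search`, `segment_means`, and the `radius` parameter are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment_means(window: Sequence[float], n_segments: int) -> List[float]:
    """Reduce a window to n_segments mean values (equal-length segments)."""
    seg = len(window) // n_segments
    return [sum(window[i * seg:(i + 1) * seg]) / seg for i in range(n_segments)]


def two_stage_search(series: Sequence[float], query: Sequence[float],
                     n_segments: int, radius: float
                     ) -> Tuple[List[int], List[Tuple[float, int]]]:
    """Return (candidate offsets, confirmed (distance, offset) results)."""
    m = len(query)
    if m % n_segments != 0:
        raise ValueError("query length must be a multiple of n_segments")
    scale = math.sqrt(m / n_segments)
    q_enc = segment_means(query, n_segments)

    # Stage 1 (stand-in for the approximate search): cheap test on the
    # reduced representation. A deployed system would precompute the
    # encodings of stored data rather than encode on the fly as done here.
    candidates = [s for s in range(len(series) - m + 1)
                  if scale * euclidean(segment_means(series[s:s + m], n_segments),
                                       q_enc) <= radius]

    # Stage 2 (backcheck): exact distance on the surviving candidates only.
    results = []
    for s in candidates:
        d = euclidean(series[s:s + m], query)
        if d <= radius:
            results.append((d, s))
    return candidates, sorted(results)


series = [0, 0, 0, 0, 2, 4, 4, 2, 0, 0, 0, 0]
query = [2, 4, 4, 2]
candidates, results = two_stage_search(series, query, n_segments=2, radius=3.3)
print(candidates, results)  # [3, 4, 5] [(0.0, 4)]
```

In this toy run the filter keeps 3 of 9 alignments and the backcheck rejects two of them, mirroring on a small scale the reduction from alignments to candidates to results.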

Encoding technique

• Piecewise Aggregate Approximation
• Values encoded using integer bins

[Figure: encoding example. Axes: X-Axis, Y-Axis.]
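
As a rough illustration of this encoding, the sketch below reduces 100 points to 10 segment means (Piecewise Aggregate Approximation) and then maps each mean to an integer bin. The slides do not give the bin boundaries or bin count, so the uniform bins over an assumed range `[lo, hi)` and the names `paa` and `to_bins` are assumptions made for the example.

```python
from typing import List, Sequence


def paa(values: Sequence[float], n_segments: int) -> List[float]:
    """Piecewise Aggregate Approximation: mean of each equal-length segment."""
    if len(values) % n_segments != 0:
        raise ValueError("length must be a multiple of n_segments")
    seg = len(values) // n_segments
    return [sum(values[i * seg:(i + 1) * seg]) / seg for i in range(n_segments)]


def to_bins(means: Sequence[float], lo: float, hi: float, n_bins: int) -> List[int]:
    """Map each mean to an integer bin index (assumed: uniform bins, clamped)."""
    width = (hi - lo) / n_bins
    bins = []
    for v in means:
        idx = int((v - lo) / width)
        bins.append(min(max(idx, 0), n_bins - 1))  # clamp out-of-range values
    return bins


signal = [float(i) for i in range(100)]        # 100 data points
means = paa(signal, n_segments=10)             # 10 values
codes = to_bins(means, lo=0.0, hi=100.0, n_bins=10)
print(means[:3], codes)
# [4.5, 14.5, 24.5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```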

Search efficiency

• Approximate search using AURA
  – Fast method of discarding poor matches
  – AURA search typically an order of magnitude or more faster than sequential scan.
  – Candidate matches typically <1% of total.
  – Back check stage very efficient due to reduction in volume of data
    • typically 1% or less of processing time for full sequential scan.

Data size

• Assume
  – Fleet of 100 aircraft, 4 engines each
  – Flying 10 hours per day
  – 5 data points per tracked order per second
  – 4 bytes per data point
• Totals
  – approx. 100 GigaBytes per year per tracked order
  – Roughly 10 tracked orders of interest so…
    • Total approx. 1 TeraByte per year
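
The totals follow directly from the stated assumptions, as the short calculation below shows. One figure is not on the slide and is assumed here: that aircraft fly every day of a 365-day year.

```python
# Figures taken from the slide.
AIRCRAFT = 100
ENGINES_PER_AIRCRAFT = 4
FLIGHT_HOURS_PER_DAY = 10
POINTS_PER_SECOND = 5          # per tracked order
BYTES_PER_POINT = 4
TRACKED_ORDERS = 10
# Assumption (not on the slide): flying every day of the year.
DAYS_PER_YEAR = 365


def points_per_year() -> int:
    """Data points per tracked order per year, across the whole fleet."""
    seconds_per_day = FLIGHT_HOURS_PER_DAY * 3600
    return (AIRCRAFT * ENGINES_PER_AIRCRAFT * seconds_per_day
            * POINTS_PER_SECOND * DAYS_PER_YEAR)


bytes_per_order = points_per_year() * BYTES_PER_POINT
bytes_total = bytes_per_order * TRACKED_ORDERS

print(f"{points_per_year():,} points per tracked order per year")
print(f"{bytes_per_order / 1e9:.1f} GB per tracked order per year")
print(f"{bytes_total / 1e12:.2f} TB per year for all tracked orders")
# 26,280,000,000 points per tracked order per year
# 105.1 GB per tracked order per year
# 1.05 TB per year for all tracked orders
```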

Search performance

• Deployed system assumptions
  – 100 CPUs, 2 GHz each, with 1 GByte RAM.
    • One per aircraft
  – Each search needs to check 25,000,000,000 alignments of the query per year of tracked order data.
• Sequential scan
  – Measured at approx. 2 seconds for 5,000,000 alignments of a 100 data point query (one CPU).
  – Extrapolates to approx. 500 seconds to search 5 years of data, assuming 1 CPU per aircraft
  – This is too slow! Need to support multiple searches and searches on more than one tracked order.
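
The 500 second figure can be reproduced from the numbers above. The sketch assumes the work divides evenly across the CPUs and that scan time scales linearly with the number of alignments; the function name `scan_seconds` is illustrative only.

```python
ALIGNMENTS_PER_YEAR = 25_000_000_000   # from the slide, whole fleet
YEARS = 5
CPUS = 100                             # one per aircraft
MEASURED_ALIGNMENTS = 5_000_000        # benchmark: alignments scanned ...
MEASURED_SECONDS = 2.0                 # ... in this many seconds on one CPU


def scan_seconds(total_alignments: int, cpus: int) -> float:
    """Wall-clock estimate, assuming an even split and linear scaling."""
    per_cpu = total_alignments / cpus
    return per_cpu / MEASURED_ALIGNMENTS * MEASURED_SECONDS


estimate = scan_seconds(ALIGNMENTS_PER_YEAR * YEARS, CPUS)
print(estimate)  # 500.0
```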

Search performance

• Using AURA and PAA based approach
  – Search time reduced by approx. an order of magnitude.
  – Can search 5 years of data for 100 aircraft in approx. 50 seconds
  – Believe this to be a workable solution
  – But response times potentially slower than this
    • Need to handle a number of searches in parallel
    • Communications and other overheads

Next steps

• Technology
  – Refine similarity measures and encoding methods.
• Architecture
  – Develop additional services to distribute and organise the search
  – Support multiple searches in parallel
• Measurement
  – Perform scaling trials on engine data
  – Obtain better estimates of overall performance
    • Multiple searches
    • Overheads
