

Unsupervised Approaches for Post-Processing in Computationally Efficient Waveform-Similarity-Based Earthquake Detection

Karianne Bergen¹, Clara Yoon², Ossian O’Reilly², Gregory Beroza²

¹ Institute for Computational and Mathematical Engineering, Stanford University; ² Department of Geophysics, Stanford University. Email: [email protected]

Introduction

Fingerprint and Similarity Thresholding (FAST) promises to allow large-scale blind search for similar waveforms in long-duration continuous seismic data [1].

- Waveform similarity search applied to months to years of continuous data will identify significantly more low-magnitude events than traditional methods for earthquake detection.
- New approaches for processing the output of similarity-based detection are required: manual inspection is infeasible for large data volumes.
- We explore data mining techniques for improved detection post-processing.

FAST: Method Overview

FAST is inspired by the Waveprint [2] algorithm for identifying audio clips, adapted to continuous seismic waveform data.

[Figure: FAST algorithmic pipeline. Data (continuous time series data) → Preprocessing (spectrogram, after bandpass filtering) → Feature Extraction (spectral image → Haar transform → top coefficients, most discriminative → binary fingerprint) → Database Generation & Search (fast approximate similarity search using MinHash and Locality Sensitive Hashing) → Detection Results → Post-Processing (identifying events, combining over network, removing false positives, clustering waveforms). Example panels for window #1267: log10(|spectral image|) (time in s vs. frequency in Hz), log10(|Haar transform|), sign of top wavelet coefficients, and binary fingerprints.]

- Database search returns a list of “candidate pairs”; post-processing is necessary to eliminate non-earthquakes (false positives, correlated noise).
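The database search step can be illustrated with a minimal sketch of MinHash signatures combined with Locality Sensitive Hashing (banding). This is not the FAST implementation: it assumes each binary fingerprint is given as a non-empty set of active bit indices, and the function names, the affine hash family, and the `n_hashes` and `n_bands` parameters are all illustrative choices.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # modulus for the affine hash family (illustrative choice)

def make_hashes(n_hashes, seed=0):
    """Draw n_hashes random affine hash functions x -> (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(n_hashes)]

def minhash_signature(active_bits, hashes):
    """MinHash signature of a fingerprint given as a set of active bit indices."""
    return tuple(min((a * x + b) % PRIME for x in active_bits)
                 for a, b in hashes)

def candidate_pairs(fingerprints, n_hashes=24, n_bands=8, seed=0):
    """Return index pairs whose signatures agree on at least one full band."""
    hashes = make_hashes(n_hashes, seed)
    rows = n_hashes // n_bands
    buckets = defaultdict(list)
    for idx, fp in enumerate(fingerprints):
        sig = minhash_signature(fp, hashes)
        for band in range(n_bands):
            # Fingerprints sharing a bucket key become a candidate pair.
            key = (band, sig[band * rows:(band + 1) * rows])
            buckets[key].append(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(members, 2))
    return pairs
```

Because only fingerprints that collide in some bucket are compared, the search avoids evaluating every pair of windows; the price is that unrelated windows can occasionally collide, which is why the candidate pairs still need post-processing.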

Event Identification and Network Detection

How do we identify earthquakes from waveform pairs returned by FAST?

[Figure: example event pair (event 1, event 2) with similarity values 0.988, 0.975, and 0.970.]

- Output of FAST (single channel): a sparse matrix of (candidate) pairs of similar waveforms.
- A single event pair often results in multiple detections because time-adjacent windows overlap.
- Multiple (sequential) detections of a single event pair appear along a diagonal line (fixed inter-event time Δt) in the similarity matrix.
- Link all detections for each event pair for improved thresholding.
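The linking step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the poster's implementation: detections are assumed to be `(t1, t2, similarity)` tuples in window-index units, and the `max_gap` parameter, the function names, and the choice to report both the summed and peak similarity are hypothetical.

```python
from collections import defaultdict

def _summarize(dt, group):
    """Collapse one run of adjacent detections into a single event pair."""
    return {
        "dt": dt,
        "t1_start": group[0][0],
        "t1_end": group[-1][0],
        "n_detections": len(group),
        "total_similarity": sum(sim for _, sim in group),
        "peak_similarity": max(sim for _, sim in group),
    }

def link_diagonal_detections(detections, max_gap=2):
    """Merge detections that lie on the same diagonal of the similarity matrix.

    Detections with equal inter-event time dt = t2 - t1 whose t1 values are at
    most `max_gap` windows apart are treated as one event pair.
    """
    by_dt = defaultdict(list)
    for t1, t2, sim in detections:
        by_dt[t2 - t1].append((t1, sim))

    linked = []
    for dt, hits in by_dt.items():
        hits.sort()
        group = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - group[-1][0] <= max_gap:
                group.append(hit)
            else:
                linked.append(_summarize(dt, group))
                group = [hit]
        linked.append(_summarize(dt, group))
    return sorted(linked, key=lambda e: (e["t1_start"], e["dt"]))
```

Thresholding on the summed similarity of a linked group, rather than on each window separately, rewards event pairs that are detected consistently across several overlapping windows.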

How do we combine single-station detection results from FAST over a network of seismic stations?

- Network detection can improve detection sensitivity.
- Limited move-out (multiple channels at a single station, or nearby stations): sum single-channel similarity matrices → network similarity matrix.
- Challenge: move-out varies between stations and is unknown a priori in blind search.
- Inter-event time is uniform across the network for a given event pair.
- Pseudo-association: group detections by inter-event time (diagonal) across multiple stations.
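Pseudo-association can be sketched as grouping single-station event pairs by their shared inter-event time. The sketch below is a simplified illustration, not the authors' code: it assumes each station contributes `(t1, dt)` tuples in common time units, requires an exact match on `dt` (real data would likely need a tolerance), and uses hypothetical `max_moveout` and `min_stations` parameters.

```python
from collections import defaultdict

def pseudo_associate(station_events, max_moveout=30, min_stations=2):
    """Group event pairs from several stations that share an inter-event time.

    station_events maps a station name to a list of (t1, dt) tuples.
    Pairs with the same dt whose t1 values fall within `max_moveout` of the
    earliest one are treated as the same event pair seen across the network.
    """
    by_dt = defaultdict(list)
    for station, events in station_events.items():
        for t1, dt in events:
            by_dt[dt].append((t1, station))

    groups = []
    for dt, hits in by_dt.items():
        hits.sort()
        group = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - group[0][0] <= max_moveout:
                group.append(hit)
            else:
                groups.append((dt, group))
                group = [hit]
        groups.append((dt, group))

    # Keep only groups confirmed by enough distinct stations.
    return [
        {"dt": dt, "t1": group[0][0], "stations": sorted({s for _, s in group})}
        for dt, group in groups
        if len({s for _, s in group}) >= min_stations
    ]
```

Because the grouping key is the inter-event time rather than the arrival time, the unknown move-out between stations only has to be bounded, not estimated.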

Data set: Iquique foreshocks, 2014-03-21

[Figure: Waveforms of an event pair recorded across multiple stations (PSGCX, PB11, PB08, PB01, PATCX), plotted as time (s) from 83158 and time (s) from 84075. Panel labels: CC = 0.627, CC = 0.792, CC = 0.814, CC = 0.775, CC = 0.829.]

[Figure: Summed network similarity, time index 1 (83160 to 83220) vs. time index 2 (84080 to 84140), color scale 0 to >0.7, with detections annotated by station (PB01, PB08, PATCX, PSGCX, PB11; one annotation reads "2 stations"). Similarity matrix: an event pair detected across multiple stations appears along the same diagonal, but with minimal temporal overlap.]

Clustering Waveforms

Clustering is a set of techniques for identifying groups of similar waveforms within the full set of detections returned by FAST. It can be used to:
- Organize detection results for easier interpretation (i.e., find interesting structure or patterns in the data),
- Identify new template waveforms for template matching or subspace detection, and
- Remove additional false alarms (e.g., outliers, non-earthquake clusters).

Application: Guy-Greenbrier Fault, central Arkansas

- FAST detects 746 new earthquakes that were not identified by template matching in one month of data (July 2010) at station WHAR [3].
- The similarity matrix for the new detections has a block-like structure; applying spectral clustering identifies 8 broad waveform clusters.

[Figure: 3-channel event similarities (normalized CC), event index 1 vs. event index 2, shown in original order and reordered by cluster (clusters 1 to 8).]

Representative waveforms (three-component) from each cluster

[Figure: representative waveforms for clusters 1 to 8 on channels WHAR.HHE, WHAR.HHN, and WHAR.HHZ; time axis 0.0 to 4.0 s.]

- Reclustering within large clusters (e.g., cluster 8) can identify representative waveforms or small clusters.
- For example, hierarchical clustering (complete-linkage) identifies representative waveforms within clusters.
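Complete-linkage reclustering can be sketched on a small similarity matrix. This is an illustrative sketch only: it assumes a symmetric matrix of normalized CC values given as nested lists, and the `min_similarity` stopping threshold and the rule for choosing a representative (highest summed similarity to the other members) are assumptions, not details given in the poster.

```python
def complete_linkage(similarity, min_similarity=0.5):
    """Agglomerative clustering with complete linkage on a similarity matrix.

    Two clusters are merged only while their least similar cross-cluster
    pair still has similarity of at least `min_similarity`.
    """
    clusters = [[i] for i in range(len(similarity))]

    def linkage(a, b):
        # Complete linkage: the weakest pair defines the cluster similarity.
        return min(similarity[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best, pair = float("-inf"), None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < min_similarity:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

def representative(cluster, similarity):
    """Member with the highest summed similarity to the rest of its cluster."""
    return max(cluster,
               key=lambda i: sum(similarity[i][j] for j in cluster if j != i))
```

Complete linkage tends to produce compact clusters in which every member resembles every other member, which suits the goal of picking one waveform to stand in for the group. The pairwise search here is quadratic per merge and is meant for small clusters only.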

(Right) Clustering can aid in visualization and interpretation of a large number of new detections: cluster membership of new FAST detections plotted over time. Injection began at well #1, closest to the Guy-Greenbrier Fault, on 7 July 2010 (at 518400 s in the figure).

[Figure: similarity (maximum normalized CC, 0 to 1.0) vs. time (s) from 2010-07-01 00:00:00.00, spanning 0 to 2.5×10^6 s.]

Feature Extraction

“Good” feature extraction can reduce false detections
- Binary fingerprints act as proxies for waveforms in efficient similarity search.
- Fingerprints must be discriminative: (dis)similar waveforms should have (dis)similar fingerprints.
- False detections are preferred to missed detections, but too many hurt performance.

How are “most discriminative” Haar coefficients selected?

- Top magnitude coefficients (often used for efficient compression)
- Most atypical coefficients, as measured by:
  - Z-score (mean, standard deviation), or
  - Median Absolute Deviation (MAD) across data set
- MAD-based Haar coefficient selection demonstrates the best performance in low-SNR settings and is the most efficient.
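The difference between top-magnitude and MAD-based selection can be sketched in a few lines. This is a simplified illustration, not the FAST code: it assumes each window's Haar coefficients are given as a flat list, standardizes each coefficient by its median and MAD across the data set, and keeps the sign of the `k` most atypical coefficients. The function names and the handling of a zero MAD are assumptions.

```python
from statistics import median

def mad_standardize(coeff_rows):
    """Standardize each coefficient by its median and MAD across the data set."""
    n_coeff = len(coeff_rows[0])
    columns = [[row[j] for row in coeff_rows] for j in range(n_coeff)]
    medians = [median(col) for col in columns]
    mads = [median([abs(v - m) for v in col])
            for col, m in zip(columns, medians)]
    # A coefficient that never varies (MAD of 0) is mapped to 0 (assumption).
    return [
        [(v - m) / d if d > 0 else 0.0 for v, m, d in zip(row, medians, mads)]
        for row in coeff_rows
    ]

def top_k_signs(values, k):
    """Keep the sign (-1 or +1) of the k largest-magnitude values; zero the rest."""
    order = sorted(range(len(values)), key=lambda j: abs(values[j]), reverse=True)
    keep = set(order[:k])
    return [
        (1 if v > 0 else -1 if v < 0 else 0) if j in keep else 0
        for j, v in enumerate(values)
    ]
```

Applying `top_k_signs` to the raw coefficients gives top-magnitude selection, while applying it to the output of `mad_standardize` gives MAD-based selection. A coefficient that is large in every window carries little information about any one window, so standardizing first favors coefficients that are unusual for the window at hand.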

[Figure: two noise samples (noise sample 1, noise sample 2) and the corresponding panels under each coefficient-selection method: Top Magnitude, Top Z-score, Top MAD.]

Synthetic Test

Comparison of the performance of Haar coefficient-selection methods on a synthetic test. The MAD-based coefficient selection best separates the repeated waveforms from the noise.

(Right) Test data (a): 12 pairs of repeated waveforms (SNR 1.25-5) planted at known times in 3 hrs of noise (bandpass 1-10 Hz). Detection results from FAST are shown for (b) top magnitude, (c) top Z-score, and (d) top MAD Haar coefficients. Locations of true repeated events are indicated by orange vertical lines, and the detection statistic (similarity value) is plotted in blue. The top 400 coefficients are selected in the results pictured, but the results hold for the top 100-800 coefficients.

[Figure (left): frequency of coefficient activation (normalized) vs. % bits in binary fingerprint (cumulative), for the top 400 Haar coefficients in magnitude, the top 400 standardized Haar coefficients (Z-score), and the top 400 standardized Haar coefficients (MAD), with the ideal line for a perfectly efficient representation.]

[Figure (right): panels (a) to (d) vs. time (s), 0 to 10000. Panel (a): test data, amplitude axis -80 to 80. Panels (b) to (d): similarity value, 0 to 1.0.]

(Left) Efficiency of binary representations (ordered from least to most efficient): top magnitude (blue), top Z-score (orange), and top MAD (purple), with Gini indices of 0.73, 0.28, and 0.11, respectively.
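The poster does not spell out how the Gini index is computed. The sketch below assumes the standard Gini coefficient applied to how often each fingerprint bit is active across the data set, so that 0 means every bit is used equally often (the ideal line) and values near 1 mean a few bits dominate. The function names and the list-of-0/1 fingerprint layout are illustrative.

```python
def activation_counts(fingerprints):
    """Count how often each bit is set across equal-length 0/1 fingerprints."""
    return [sum(bits) for bits in zip(*fingerprints)]

def gini_index(counts):
    """Standard Gini coefficient of non-negative counts (0 = perfectly even)."""
    ordered = sorted(counts)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted counts.
    weighted = sum(rank * c for rank, c in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n
```

Under this reading, a lower Gini index means the fingerprint bits are used more evenly, so each bit contributes more to distinguishing one waveform from another.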

Alternate Feature Extraction Approaches (on-going work)

- Time-domain features: bag-of-waveforms, wavelets, random projections
- Data-driven features: spectral hashing, shift-invariant sparse coding, nonnegative matrix factorization (NMF)-based features

References

[1] Yoon, C., et al. (2015). “Earthquake detection through computationally efficient similarity search.” Science Advances, 1(11).

[2] Baluja, S., and Covell, M. (2008). “Waveprint: Efficient wavelet-based audio fingerprinting.” Pattern Recognition, 41(11).

[3] Yoon, C., et al. (2015). AGU Fall Meeting Abstract S13B-2850.

Read more about FAST (doi:10.1126/sciadv.1501057)