

654 The Leading Edge September 2020 Special Section: Smart city geophysics

Footstep detection in urban seismic data with a convolutional neural network

Srikanth Jakkampudi1, Junzhu Shen2, Weichen Li1,3, Ayush Dev1, Tieyuan Zhu2, and Eileen R. Martin1

Abstract

Seismic data for studying the near surface have historically been extremely sparse in cities, limiting our ability to understand small-scale processes, locate small-scale geohazards, and develop earthquake hazard microzonation at the scale of buildings. In recent years, distributed acoustic sensing (DAS) technology has enabled the use of existing underground telecommunications fibers as dense seismic arrays, requiring little manual labor or energy to maintain. At the Fiber-Optic foR Environmental SEnsEing array under Pennsylvania State University, we detected weak slow-moving signals in pedestrian-only areas of campus. These signals were clear in the 1 to 5 Hz range. We verified that they were caused by footsteps. As part of a broader scheme to remove and obscure these footsteps in the data, we developed a convolutional neural network to detect them automatically. We created a data set of more than 4000 windows of data labeled with or without footsteps for this development process. We describe improvements to the data input and architecture, leading to approximately 84% accuracy on the test data. Performance of the network was better for individual walkers and worse when there were multiple walkers. We believe the privacy concerns of individual walkers are likely to be highest priority. Community buy-in will be required for these technologies to be deployed at a larger scale. Hence, we should continue to proactively develop the tools to ensure city residents are comfortable with all geophysical data that may be acquired.

Introduction

Near-surface geophysics in cities is not well understood. This is due to high levels of seismic noise and the physical complexities of our underground patchwork of man-made and natural materials. It is also due to a lack of data. Previous work on urban seismology with individual seismometers and nodal arrays has revealed a wide variety of seismic vibration sources in cities, some of which can be carefully used for near-surface imaging (Meremonte et al., 1996; Vidale, 2011; Lin et al., 2013; Nakata et al., 2015; Díaz et al., 2017). However, the manual labor required to install and maintain a large array in a city and the permitting required to gain access to installation locations have hampered wide-scale implementation of urban seismic node arrays. In recent years, we have seen multiple experiments utilizing distributed acoustic sensing (DAS) to record dense seismic data in the near surface beneath populated areas by using existing telecommunications fiber-optic cables. The aim has been to cover distances of one to tens of kilometers with a seismic array with thousands of channels spaced meters to tens of meters apart using instrumentation that requires little manual labor to maintain. Prior fiber-optic telecommunications experiments have focused on earthquake monitoring, imaging for earthquake hazard analysis, and understanding subsurface water levels (Lindsey et al., 2017; Martin et al., 2017; Ajo-Franklin et al., 2019; Zhu and Stensrud, 2019; Zhan, 2020).

A long-term goal shared by many studies is to determine whether DAS may be a path forward for scalable geophysical systems that can be deployed more widely in populated areas and around critical infrastructure. Many U.S. cities include large swaths of land with a mix of medium-density housing and commercial development. While DAS fibers would only record data in public spaces, along roads, and under sidewalks, the expansion of DAS technology in cities brings increased privacy concerns in mixed commercial-residential areas. In these areas, scientists and engineers utilizing DAS data must be proactive to develop methods that ensure that the seismic data acquired do not contain enough information to infer people’s movements. In this way, DAS data distribution and analysis can integrate the privacy protection expected of other data acquisition technologies deployed in cities (Elmaghraby and Losavio, 2014).

When developing their cell phone app for early earthquake warning, creators of the MyShake application noted the potential privacy concerns of vibrations that could be tied to people and added a level of privacy through kilometer-scale binning (Rochford et al., 2018). This is possible for a yes/no decision detection method for early earthquake warning. However, without array geometry and fine-scale acquisition, we would be unable to study near-surface variability at the scale of meters to tens of meters. Studies of near-surface hydrology, permafrost thaw, and other near-surface processes often require meter-scale data resolution. Urban DAS studies have already revealed variability at the scale of tens of meters in surface-wave dispersion measurements critical to earthquake ground-motion prediction (Martin, 2018; Spica et al., 2020). Our ability to study near-surface geophysics at scales relevant to urban planning would be hampered greatly by aggressive binning, so we seek alternative routes to ensure that residents’ personal privacy is respected.

Unsupervised learning for data exploration at the Stanford Fiber Optic Seismic Observatory revealed that strong signals from nearby cars could be identified automatically, and after identification, these car signals could be filtered out in the wavelet domain (Martin et al., 2018; Huot et al., 2019). However, cars are not the only potential signals of concern. Some cities have higher rates of pedestrian and bicycle commuters as well as public transportation. The detection of pedestrian signals differs from the detection of vehicle signals because pedestrian signals have significantly lower amplitude and may have greater variability in speed and in the waveforms themselves. The complexity and variability of footsteps yield additional potential for identifiability. Accelerometer recordings of footsteps in buildings have been used to successfully identify the gender of pedestrians (Bales et al., 2016) and to differentiate between a set of people in a controlled experiment (Pan et al., 2015).

1Virginia Tech, Blacksburg, Virginia, USA.
2Pennsylvania State University, State College, Pennsylvania, USA.
3Northeastern University, Boston, Massachusetts, USA.

https://doi.org/10.1190/tle39090654.1

As we perform a variety of geophysical analyses on more than 100 TB of data at the Pennsylvania State Fiber-Optic foR Environmental SEnsEing (FORESEE) array, we are simultaneously developing tools to ensure that community members’ privacy on campus is respected. This array, located in State College, Pennsylvania, has 2 m channel spacing with a 10 m gauge length along 5 km of existing unused fiber in underground telecommunications conduits (Figure 1). The primary goals are to study near-surface environmental effects including hydrology and structural stability of soils and to search for geohazards such as karst caverns common in the Appalachian Mountain region. While the surrounding environment is rural, the campus is host to more than 50,000 students and employees. The array has revealed a wide variety of seismic sources as potential alternative sources of energy for crustal and near-surface imaging (Zhu and Stensrud, 2019).

In this paper, we show how we discovered and verified observations of footsteps in some parts of the data. We also show how we developed an automatic technique to detect the footsteps. Prior experiments utilizing DAS with telecommunications fiber have not reported the observation of footsteps. We discuss in detail how we curated the data and designed the neural network to more accurately detect footsteps. Detecting the footsteps will enable us to remove them before making the data public or to potentially remove them as data are acquired.

Discovery and verification of footsteps

One of our goals at the FORESEE array is to use passive seismic imaging methods to understand near-surface hydrology and potential geohazards. The first step prior to imaging is exploring and characterizing the ambient seismic noise field created by a variety of vibration sources, both natural and man-made. Understanding the distribution of seismic noise is critical to successful imaging using seismic events or the ambient noise field. Cars within approximately 10 m of the fiber produce a significant low-frequency signal (Huot et al., 2018; Huot et al., 2019), which has been accurately described by models of static loading on the wheels (Jousset et al., 2018; Wang et al., 2020). Typically, the response due to cars farther away is in the 5–25 Hz frequency range. We were surprised to notice strong signals in a pedestrian-only area of campus when data were band passed to 1–5 Hz (Shen et al., 2019; Zhu et al., 2019). These signals appear to move much slower than vehicles, generally between 1 and 2 m/s. The specific pedestrian-only area of campus is along a sidewalk, where scooters, bicycles, etc. are prohibited. Figure 2 shows that while the amplitude of these slow signals was relatively weak in the raw data, a roughly 1.4 m/s signal is clearly visible in the 1–5 Hz band-passed data.
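The apparent speed quoted above can be read off a time-versus-channel plot as the slope of the moving signal. The following is a minimal sketch of that calculation, assuming a set of hand-picked (channel, time) points along one signal. The function name and the picks are hypothetical; only the 2 m channel spacing comes from the array description.

```python
import numpy as np

def apparent_speed(channels, times_s, channel_spacing_m=2.0):
    """Least-squares apparent speed (m/s) along the fiber from picked points.

    channels: channel indices where the signal was picked (hypothetical picks)
    times_s: pick times in seconds
    channel_spacing_m: 2 m for the array described in the text
    """
    positions_m = np.asarray(channels, dtype=float) * channel_spacing_m
    times = np.asarray(times_s, dtype=float)
    # Fit position = slope * time + intercept; the slope is the speed.
    slope, _intercept = np.polyfit(times, positions_m, 1)
    # The sign only encodes walking direction along the fiber.
    return abs(slope)

# Synthetic picks for a walker moving at 1.4 m/s, starting near channel 1375.
pick_times = np.arange(0.0, 60.0, 5.0)
pick_channels = 1375 + 1.4 * pick_times / 2.0
speed = apparent_speed(pick_channels, pick_times)
```

A speed between roughly 1 and 2 m/s from such a fit is consistent with walking, whereas vehicles would produce much steeper moveout.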

Determining the true cause of various seismic signals in passive data in populated areas is a challenge due to the wide variety of noise sources present and a typical lack of comparison or baseline event data. We needed to verify the cause of these slow-moving signals by conducting experiments to record the specific timing of walkers. One such experiment, shown in Figure 3, was done on the morning of 15 July 2019 in the central area of campus. It involved one researcher writing down the starting location of multiple pedestrians leaving a bus stop, his own starting location and time, and the starting time of a pedestrian walking in the opposite direction. We observed no strong anthropogenic noise other than people walking at that time.

Timing of known pedestrians and comparison to other similar signals hypothesized to be caused by pedestrians verified that pedestrians were indeed the source. Prior dark fiber experiments in populated areas did not report the footsteps of individual pedestrians. Individual footstep detection has been studied previously in fiber-optic security installations (Allwood et al., 2016). We hypothesized that the relatively weak coupling of a dark fiber array (compared to a directly buried fiber) would make footstep detection challenging. We expected further difficulties in detecting footsteps due to the low amplitude of a single pedestrian’s footsteps relative to other urban seismic sources and due to the spatial blurring of localized signals inherent in the DAS measurement process. Possible relevant differences include subsurface geologic materials, fiber-optic conduit installation techniques, and less seismic noise from surrounding developments. Observations of marching bands with many people with synchronous steps were reported on a dark fiber array by Wang et al. (2020), although it is not clear whether individual footsteps at lower amplitude could be detected. Further studies will be needed to determine the factors that enable us to observe individual footsteps at the FORESEE array in comparison with other DAS arrays using telecommunications fiber.

Figure 1. (a) The FORESEE array location is marked by the star in central Pennsylvania in the eastern United States. (b) The array follows a 5 km path of preinstalled underground fiber-optic cables underneath the Pennsylvania State University campus. This route includes sections along roads as well as pedestrian-only areas. Map overlay courtesy of Google Maps.

Automating the detection of footsteps

As shown in Figure 3, we observe distinct signals (proportional to strain rate) when and where there are pedestrians within primarily pedestrian regions of campus in the filtered data set. The speed at which we see people moving is consistent with a general walking pace of 1.4 m/s. We see more of these footstep patterns at times such as university passing periods or lunch breaks. We rarely observe these patterns in the middle of the night, which further supports that they are footsteps. While we could make these qualitative observations by visually investigating plots of a small portion of the data band passed to 1–5 Hz, it is impossible to manually comb through the full data set (more than 100 TB) in real time or after the fact. Thus, we needed to reliably automate footstep detection in order to automate removal.

A simple K-means clustering approach, used in Martin et al. (2018) to detect cars, did not yield a clear footstep cluster. It is possible that the lower amplitude of footsteps and the variability of the human gait as a seismic source limited the applicability of this strategy. In previous work on detecting car signals on a dark fiber array, accuracy was improved through the use of an augmented data set and a convolutional neural network (CNN) approach (Huot et al., 2018).

Figure 2. (a) Raw seismic data in a pedestrian-only area show faint slow-moving signals and higher-amplitude impulses near a road between channels 1375 and 1390. Data are proportional to strain rate. (b) A band pass of these data from 1 to 5 Hz shows clear signals moving at approximately 1.4 m/s along the fiber.

Figure 3. We verified that the signals hypothesized as footsteps were indeed caused by pedestrians by observing the timing of multiple people walking away from a bus stop, one of us walking after them, and another person walking in the opposite direction. These pedestrians followed the route along the fiber shown by (a) the diagram and (b) generated strain rate data that were similar to previously noted footstep waveforms. Data are proportional to strain rate. Map overlay courtesy of Google Maps.

We can turn this footstep detection problem into an image classification problem, simply by using the image representations of filtered seismic data as input. Numerous researchers have applied the CNN approach to extract subtle spatiotemporal patterns (Rawat and Wang, 2017). To develop the CNN, we used a subset of the data spanning from 6 to 12 April 2019. To improve detection accuracy, we applied a 1–5 Hz band-pass filter (Figure 3) to enhance the contrast between footsteps and other background noise. We then subsampled the filtered data by a factor of 10 (down to a 50 Hz Nyquist frequency). We subdivided the data into windows of 150 channels and 60 s so each window would be categorized as containing footsteps or not containing footsteps. This preprocessing strategy reduced the dimensionality of the input, which lowered the dimensionality of the model, reduced data input/output requirements, and improved the potential for applicability of this method to other experiments acquiring data at lower sample rates. Though there was noticeable nonfootstep noise after filtering the data, the general linear moveout patterns that our network would presumably detect were still apparent.
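The preprocessing steps above (band pass, subsample, window) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' code: the band pass is a simple zero-phase FFT mask standing in for whatever filter was actually used, the windows are taken without overlap, and the function names and the sample rate passed in are hypothetical.

```python
import numpy as np

def bandpass_fft(data, fs, f_lo=1.0, f_hi=5.0):
    """Zero Fourier coefficients outside [f_lo, f_hi] Hz along time (axis 1).

    data: array of shape (n_channels, n_samples); fs: sample rate in Hz.
    An FFT mask is an assumption here, chosen to keep the sketch dependency-free.
    """
    n = data.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spec = np.fft.rfft(data, axis=1)
    spec[:, (freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spec, n=n, axis=1)

def make_windows(data, fs, n_chan=150, win_s=60.0, decim=10):
    """Band pass, subsample by `decim`, and cut non-overlapping windows.

    Returns (windows, fs_sub) where windows has shape
    (n_windows, n_chan, samples_per_window). Requires at least one full window.
    """
    filtered = bandpass_fft(data, fs)
    # Plain subsampling is safe only because the band pass already removed
    # energy above 5 Hz; fs / decim / 2 should stay above that limit.
    sub = filtered[:, ::decim]
    fs_sub = fs / decim
    n_t = int(round(win_s * fs_sub))
    windows = [
        sub[c0:c0 + n_chan, t0:t0 + n_t]
        for c0 in range(0, sub.shape[0] - n_chan + 1, n_chan)
        for t0 in range(0, sub.shape[1] - n_t + 1, n_t)
    ]
    return np.stack(windows), fs_sub
```

For example, two minutes of data on 150 channels yields two 60 s windows. Each window would then be rendered as an image and labeled as containing footsteps or not.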

This CNN approach requires labeled data, but passively acquired seismic data have no such labels available. Hence, we manually labeled a subset of images as either containing footsteps or not containing footsteps. After restructuring our data set to avoid class imbalance, we had 2414 training samples, 448 validation samples, and 1294 test samples.

Our initial CNN had a simple architecture consisting of one convolutional layer, but it performed only slightly better than guessing, at 57% accuracy. This was likely due to the complex patterns in the nonfootstep noise surrounding the linear trends that we were looking for, as shown in Figures 2 and 3. Additional convolutional layers can provide the model with more flexibility, enabling the intermediate feature maps to better home in on the underlying waveforms and source movements that characterize footstep signals. Following a variety of hyperparameter and architecture testing, we experimentally arrived at the network architecture shown in Table 1. This model was able to achieve 84.77% accuracy on the validation set and 83.69% accuracy on the test set.

A variety of design choices in the architecture were made based on experimentation; however, the choice of several of the later layers was guided by theory. Conventionally, convolutional layers use small kernel sizes such as 3 × 3 or 5 × 5, but we found through experimentation that a kernel size of 9 × 9 works best for the first convolutional layer in our problem. The intuition is that image classification problems usually aim to extract features capturing fine details. For our purposes, we want to ignore the fine details (ambient noise) and focus on identifying the overall patterns that characterize footsteps. After this first convolutional layer, the next five layers follow a traditional CNN architecture, alternating max-pooling layers and convolutional layers with standard kernel sizes of 2 × 2 and 3 × 3, respectively. The model then flattens the data from a matrix to a vector and further follows the traditional approach of connecting the convolutional layers to two fully connected layers, with a dropout layer between them to help reduce the potential for overfitting. In the final layer, the softmax activation function performed better than the sigmoid activation function.
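To make the layer stack concrete, the sketch below traces how the 432 × 848 input of Table 1 shrinks through the convolution and pooling layers. It assumes unpadded ("valid") convolutions with stride 1 and non-overlapping pooling. The text does not state these settings or the number of filters per layer, so the resulting sizes are illustrative only.

```python
def conv_out(size, kernel):
    """Output length of an unpadded, stride-1 convolution (assumed settings)."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output length of non-overlapping max pooling (assumed settings)."""
    return size // pool

# Convolution and pooling layers 2-7 of Table 1, in order: (kind, kernel size).
LAYERS = [("conv", 9), ("pool", 2), ("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2)]

def trace_shapes(height=432, width=848, layers=LAYERS):
    """Return the (height, width) of the feature map after each layer."""
    shapes = [(height, width)]
    for kind, k in layers:
        step = conv_out if kind == "conv" else pool_out
        height, width = step(height, k), step(width, k)
        shapes.append((height, width))
    return shapes

shapes = trace_shapes()
final_height, final_width = shapes[-1]
```

Under these assumptions the flattened vector fed to the 64-unit dense layer has `final_height * final_width` entries per filter, which indicates where most of the trainable weights would sit.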

Table 1. Each layer of the final CNN architecture is listed in order. For layers requiring a kernel size or dimension choice, the selected size is noted. All layers requiring an activation function note the choice of function. Definitions of layers and activation functions are reviewed in Goodfellow et al. (2016).

Layer                 Kernel/Dimension    Activation
1. Input (image)      432 × 848
2. Convolutional      9 × 9               ReLU
3. Max pooling        2 × 2
4. Convolutional      3 × 3               ReLU
5. Max pooling        2 × 2
6. Convolutional      3 × 3               ReLU
7. Max pooling        2 × 2
8. Flatten
9. Dense              64                  ReLU
10. Dropout (0.5)
11. Dense             2                   Softmax

Figure 4. (a) Several windows of test data, which were correctly identified by the CNN as footsteps, are shown. In each example, the horizontal axis is time, the vertical axis is channel number, and the amplitudes are clipped at the 97th percentile within each window. These show a variety of walking speeds. Note that some of these contain multiple pedestrians walking in different directions, so it is sometimes possible to detect these difficult scenarios correctly. (b) Several windows of test data identified as not containing footsteps in agreement with nonfootstep labels.

As the true positives in Figure 4 show, our model performs well on isolated footstep signals. These are the high-priority signals that we want to classify correctly since they can potentially be personally identifiable information. Additionally, the test set confusion matrix in Table 2 shows that we are misclassifying nonfootsteps more than footsteps. A variety of false positives and false negatives are shown in Figure 5. The method was less successful at identifying pedestrians when there were multiple pedestrians crossing paths in different directions or moving at different speeds. This is not surprising, as there is greater variability in these scenarios and the interference between vibrations generated by footsteps of nearby walkers can lead to more complex waveforms. However, the typical scenario of someone coming or going from their home or office (situations most likely to reveal a person’s routine) would often be a single pedestrian or a few pedestrians walking at the same speed.
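The counts in the Table 2 confusion matrix can be turned into summary metrics with a few lines of NumPy. The sketch below assumes the rows of Table 2 are predicted classes and the columns are actual classes. That reading is an assumption, chosen because it agrees with the statement that nonfootsteps are misclassified more often than footsteps.

```python
import numpy as np

# Table 2 counts; rows = predicted, columns = actual (assumed orientation).
# Class order: footstep, nonfootstep.
cm = np.array([[572, 136],
               [75, 511]])

# Fraction of all test windows on the diagonal (correctly classified).
accuracy = np.trace(cm) / cm.sum()

# Of the windows that actually contain footsteps, the fraction detected.
footstep_recall = cm[0, 0] / cm[:, 0].sum()

# Of the windows reported as footsteps, the fraction that truly contain them.
footstep_precision = cm[0, 0] / cm[0, :].sum()
```

With these counts the accuracy is about 0.837, matching the reported 83.69%. Under the assumed orientation, footstep recall is about 0.88 and precision about 0.81. For a privacy application, recall is arguably the more important of the two, since a missed footstep is a signal left in the data.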

Many false negatives occur when the majority of the footstep signal is outside the window. We were able to properly label these windows due to the contextual information we had about the data in adjacent time steps. A potential solution to this issue would be to implement a sliding window algorithm that repeatedly queries our model. A sliding window approach would likely improve performance by providing the model with some redundant information and multiple opportunities to classify different subsets of the same footsteps. This could potentially be combined with a voting approach considering what proportion of the sliding windows overlapping a given time were identified as containing footsteps.
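One way the proposed sliding window and voting idea could look is sketched below. This scheme was not implemented in the study, so the function name, the `classify` callback, and the window and step lengths are all hypothetical; the callback stands in for a query to the trained CNN.

```python
import numpy as np

def sliding_window_votes(n_samples, win_len, step, classify):
    """Fraction of overlapping windows that report footsteps at each time sample.

    classify(start, stop) -> bool is a hypothetical stand-in for querying the
    trained model on time samples [start, stop).
    """
    votes = np.zeros(n_samples)
    counts = np.zeros(n_samples)
    for start in range(0, n_samples - win_len + 1, step):
        stop = start + win_len
        counts[start:stop] += 1
        if classify(start, stop):
            votes[start:stop] += 1
    # Samples covered by no window keep a vote fraction of zero.
    return np.divide(votes, counts, out=np.zeros(n_samples), where=counts > 0)

# Toy stand-in classifier: footsteps occupy samples 40-59.
def toy_classify(start, stop):
    return start < 60 and stop > 40

fraction = sliding_window_votes(100, win_len=20, step=10, classify=toy_classify)
detected = fraction >= 0.5  # the voting threshold is a free design choice
```

A lower threshold favors catching partial footstep signals at window edges at the cost of more false positives, which may be the preferable trade-off when the goal is privacy protection.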

Future steps and generalizations

In our study, model performance was improved by experimenting with model hyperparameters and network architecture. Further tuning of the model may improve performance, but modifying the format of the training data could allow us to do so more efficiently. The band pass, window size, and sample rates used in this study were selected based on our understanding of footstep signals and experimental design at this site. Hence, it is possible that future improvements could be made by adjusting the range of the band-pass filter, further subsampling the data, and changing the time step represented in each input. All of these approaches have the potential to facilitate a model’s performance by simplifying the data to make relevant features easily identifiable.

Reliably detecting footsteps is the first step toward protecting privacy. We then want to remove these footsteps from the data. The simplest solution, which requires no additional research, is to mute the footsteps of individuals and to randomly add muted regions (so a mute does not necessarily imply footsteps occurred at that location and time). Ideally, we aim to seamlessly replace the footsteps with false noise that mimics the typical distribution of ambient noise without footsteps. This allows for the data to be publicly released without worry of compromising people’s sense of privacy in public spaces. It would also allow us to remove less of the original data. One simple solution may be the application of a 2D prediction error filter, which has been successfully applied to a variety of seismic data fill-in problems (Crawley, 2000).
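The simplest option described above, muting detected windows and adding random decoy mutes, can be sketched as follows. The function name, the array layout (one entry per window), and the number of decoys are assumptions made for illustration.

```python
import numpy as np

def mute_with_decoys(windows, detected, n_decoys, rng):
    """Zero out detected windows plus randomly chosen decoy windows.

    windows: array of shape (n_windows, n_channels, n_samples)
    detected: boolean array of length n_windows (True = footsteps reported)
    n_decoys: number of extra non-detected windows to mute at random
    rng: numpy.random.Generator
    Returns (muted copy of windows, boolean mask of muted windows).
    """
    detected = np.asarray(detected, dtype=bool)
    candidates = np.flatnonzero(~detected)
    n_decoys = min(n_decoys, candidates.size)
    decoys = rng.choice(candidates, size=n_decoys, replace=False)
    mask = detected.copy()
    mask[decoys] = True
    muted = windows.copy()
    muted[mask] = 0.0
    return muted, mask

# Ten toy windows, two of which were flagged as footsteps.
flags = np.zeros(10, dtype=bool)
flags[[2, 5]] = True
muted, mask = mute_with_decoys(np.ones((10, 3, 4)), flags, 3, np.random.default_rng(0))
```

Because the decoys are indistinguishable from real mutes in the released data, a muted window alone does not reveal that someone walked there. The cost is the extra data discarded, which motivates the fill-in approaches discussed in the text.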

However, prediction error filters sometimes make the data that are filled in similar enough to neighboring data that it may still be possible to detect the filled-in sections. An alternative tool we have been testing for data fill in is recurrent neural networks. Recurrent neural networks have shown promise in other application domains in predicting and imputing sequence and time series data (Che et al., 2018).

This study has been performed at one experimental site. Hence, further generalization is needed to ensure our workflow is effective in detecting and removing footsteps at other locations. A robust model for use in other experiments must adapt to different spatiotemporal scalings and incorporate different cable orientations and walker directions, different cable installation techniques, and different near-surface conditions. By proactively developing tools to ensure that community members are comfortable with seismic data acquisition, geophysicists can more easily continue to contribute and develop seismic methods to improve infrastructure and the urban environment.

Figure 5. (a) Several test data windows are shown. They are not believed to contain footsteps but were reported by the CNN as footstep detections. In each example, the horizontal axis is time, the vertical axis is channel number, and the amplitudes are clipped at the 97th percentile within each window. (b) Several test data windows are shown. They are believed to contain footsteps but were not reported as such by the CNN.

Table 2. The confusion matrix of the test data windows shows an 83.69% accuracy rate, with accurate detections noted in blue cells and inaccurate counts noted in pink cells.

Predicted/Actual    Footstep    Nonfootstep
Footstep            572         136
Nonfootstep         75          511

Conclusions

We established that footsteps are indeed visible in data at the FORESEE array under the Pennsylvania State University campus, particularly in the 1–5 Hz range. We developed a CNN to detect pedestrians and improved on the detection accuracy through data preprocessing, optimally adding layers along with hyperparameter experimentation. Currently, we can mute the signals identified as walkers and add additional mutes randomly to ensure that a mute does not imply a walker. However, we continue to work toward a strategy that allows us to preserve more of the raw data by filling in the muted regions with data indistinguishable from other background noise.

Moving forward, we are excited about the possibility of improving our understanding of the urban subsurface, both to plan for the impact of the subsurface on our infrastructure and buildings and to better understand our impact on the environment. Increasingly more researchers are employing dense urban arrays to tackle near-surface challenges at large scale. We urge all groups deploying such arrays to use and improve these techniques to ensure that the privacy of community members is respected as we achieve the benefits of seismic monitoring.
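The 83.69% accuracy quoted in Table 2 can be reproduced from the four counts in the confusion matrix. The short check below uses only those counts; the variable names are illustrative. Accuracy does not depend on whether rows are read as predicted or actual labels, so no assumption about the table orientation is needed.

```python
# Counts copied from Table 2 (rows and columns: footstep, nonfootstep).
confusion = [[572, 136],
             [75, 511]]

correct = confusion[0][0] + confusion[1][1]       # diagonal entries
total = sum(sum(row) for row in confusion)        # all test windows
accuracy = correct / total

print(f"{correct}/{total} = {accuracy:.2%}")      # prints "1083/1294 = 83.69%"
```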

Acknowledgments

This work was supported by a seed grant from the Penn State Institutes of Energy and the Environment. We thank Todd Myers and Ken Miller at Pennsylvania State University and Thomas Coleman at Silixa who assisted in deploying the fiber-optic DAS array at the campus. We also want to thank collaborators Patrick Fox, Andy Nyblade, and Dave Stensrud who contributed to the FORESEE array experiment. The research on automated footstep detection was initiated as part of an undergraduate experiential learning course in the Virginia Tech Computational Modeling and Data Analytics Program. It continued with financial support from the Hamlett Undergraduate Research Program. Eileen Martin was supported in part by the U.S. Department of Energy grant DE-FE0091786. We would also like to thank the Virginia Tech Advanced Research Computing facility for providing computing resources.

Data and materials availability

Data used in this study will be released publicly through Penn State Data Commons within three years of publication.

Corresponding author: [email protected]

References

Ajo-Franklin, J. B., S. Dou, N. J. Lindsey, I. Monga, C. Tracy, M. Robertson, V. Rodriguez Tribaldos, et al., 2019, Distributed acoustic sensing using dark fiber for near-surface characterization and broadband seismic event detection: Scientific Reports, 9, https://doi.org/10.1038/s41598-018-36675-8.

Allwood, G., G. Wild, and S. Hinckley, 2016, Optical fiber sensors in physical intrusion detection systems: A review: IEEE Sensors Journal, 16, no. 14, 5497–5509, https://doi.org/10.1109/JSEN.2016.2535465.

Bales, D., P. Tarazaga, M. Kasarda, D. Batra, A. Woolard, J. D. Poston, and V. V. Malladi, 2016, Gender classification of walkers via underfloor accelerometer measurements: IEEE Internet of Things Journal, 3, no. 6, 1259–1266, https://doi.org/10.1109/JIOT.2016.2582723.

Che, Z., S. Purushotham, K. Cho, D. Sontag, and Y. Liu, 2018, Recurrent neural networks for multivariate time series with missing values: Scientific Reports, 8, https://doi.org/10.1038/s41598-018-24271-9.

Crawley, S., 2000, Seismic trace interpolation with nonstationary prediction-error filters: Ph.D. dissertation, Stanford University.

Díaz, J., M. Ruiz, P. S. Sánchez-Pastor, and P. Romero, 2017, Urban seismology: On the origin of earth vibrations within a city: Scientific Reports, 7, https://doi.org/10.1038/s41598-017-15499-y.

Elmaghraby, A. S., and M. M. Losavio, 2014, Cyber security challenges in smart cities: Safety, security and privacy: Journal of Advanced Research, 5, no. 4, 491–497, https://doi.org/10.1016/j.jare.2014.02.006.

Goodfellow, I., Y. Bengio, and A. Courville, 2016, Deep learning: MIT Press.

Huot, F., E. R. Martin, and B. L. Biondi, 2018, Automated ambient-noise processing applied to fiber-optic seismic acquisition (DAS): 88th Annual International Meeting, SEG, Expanded Abstracts, 4688–4692, https://doi.org/10.1190/segam2018-2997880.1.

Huot, F., B. L. Biondi, A. Lichnewsky, and C. Boneti, 2019, Automatic denoising by 2-D continuous wavelet transform: 89th Annual International Meeting, SEG, Expanded Abstracts, 3944–3948, https://doi.org/10.1190/segam2019-3213958.1.

Jousset, P., T. Reinsch, T. Ryberg, H. Blanck, A. Clarke, R. Aghayev, G. P. Hersir, et al., 2018, Dynamic strain determination using fibre-optic cables allows imaging of seismological and structural features: Nature Communications, 9, https://doi.org/10.1038/s41467-018-04860-y.

Lin, F.-C., D. Li, R. W. Clayton, and D. Hollis, 2013, High-resolution 3D shallow crustal structure in Long Beach, California: Application of ambient noise tomography on a dense seismic array: Geophysics, 78, no. 4, Q45–Q56, https://doi.org/10.1190/geo2012-0453.1.

Lindsey, N. J., E. R. Martin, D. S. Dreger, B. Freifeld, S. Cole, S. R. James, B. L. Biondi, et al., 2017, Fiber-optic network observations of earthquake wavefields: Geophysical Research Letters, 44, no. 23, https://doi.org/10.1002/2017GL075722.

Martin, E. R., 2018, Passive imaging and characterization of the subsurface with distributed acoustic sensing: Ph.D. dissertation, Stanford University.

Martin, E. R., C. M. Castillo, S. Cole, P. S. Sawasdee, S. Yuan, R. Clapp, M. Karrenbach, et al., 2017, Seismic monitoring leveraging existing telecom infrastructure at the SDASA: Active, passive, and ambient-noise analysis: The Leading Edge, 36, no. 12, 1025–1031, https://doi.org/10.1190/tle36121025.1.

Martin, E. R., F. Huot, Y. Ma, R. Cieplicki, S. Cole, M. Karrenbach, and B. L. Biondi, 2018, A seismic shift in scalable acquisition demands new processing: Fiber-optic seismic signal retrieval in urban areas with unsupervised learning for coherent noise removal: IEEE Signal Processing Magazine, 35, no. 2, 31–40, https://doi.org/10.1109/MSP.2017.2783381.

Meremonte, M., A. Frankel, E. Cranswick, D. Carvery, and D. Worley, 1996, Urban seismology — Northridge aftershocks recorded by multi-scale arrays of portable digital seismographs: Bulletin of the Seismological Society of America, 86, no. 5, 1350–1363.

Nakata, N., J. P. Chang, J. F. Lawrence, and P. Boué, 2015, Body wave extraction and tomography at Long Beach, California, with ambient-noise interferometry: Journal of Geophysical Research: Solid Earth, 120, no. 2, https://doi.org/10.1002/2015JB011870.

Pan, S., N. Wang, Y. Qian, I. Velibeyoglu, H. Noh, and P. Zhang, 2015, Indoor person identification through footstep induced structural vibration: Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications, 81–86, https://doi.org/10.1145/2699343.2699364.

Rawat, W., and Z. Wang, 2017, Deep convolutional neural networks for image classification: A comprehensive review: Neural Computation, 29, no. 9, 2352–2449, https://doi.org/10.1162/neco_a_00990.

Rochford, K., J. A. Strauss, Q. Kong, and R. M. Allen, 2018, MyShake: Using human-centered design methods to promote engagement in a smartphone-based global seismic network: Frontiers in Earth Science, 6, https://doi.org/10.3389/feart.2018.00237.

Shen, J., T. Zhu, and E. R. Martin, 2019, Noise characterization of distributed acoustic sensing monitoring data in urban areas: Preliminary results: Presented at AGU Fall Meeting.

Spica, Z. J., M. Perton, E. R. Martin, G. Beroza, and B. Biondi, 2020, Urban seismic site characterization by fiber-optic seismology: Journal of Geophysical Research: Solid Earth, 125, no. 3, https://doi.org/10.1029/2019JB018656.

Vidale, J. E., 2011, Seattle “12th man* earthquake” goes viral: Seismological Research Letters, 82, no. 3, 449–450, https://doi.org/10.1785/gssrl.82.3.449.

Wang, X., E. F. Williams, M. Karrenbach, M. González-Herráez, H. Fidalgo Martins, and Z. Zhan, 2020, Rose parade seismology: Signatures of floats and bands on optical fiber: Seismological Research Letters, 91, no. 4, 2395–2398.

Zhan, Z., 2020, Distributed acoustic sensing turns fiber-optic cables into sensitive seismic antennas: Seismological Research Letters, 91, no. 1, 1–15, https://doi.org/10.1785/0220190112.

Zhu, T., and D. J. Stensrud, 2019, Characterizing thunder-induced ground motions using fiber-optic distributed acoustic sensing array: Journal of Geophysical Research: Atmospheres, 124, no. 23, https://doi.org/10.1029/2019JD031453.

Zhu, T., E. R. Martin, and J. Shen, 2019, New signals in massive data acquired by fiber optic seismic monitoring under Pennsylvania State University: Presented at the Smart City Geophysics Workshop, SEG/EAGE.
