Analysis of human footsteps utilizing multi-axial seismic fusion

This paper introduces a method for enhancing an unattended ground sensor (UGS) system's ability to classify humans from their seismic signatures while discriminating these events from a range of other sources of seismic activity. Previous studies have used cadence analysis to discriminate between human and animal signatures. The studies performed herein expand upon this methodology by improving both the success rate of such methods and the effective range of classification. This is accomplished by fusing multiple seismic axes in real time to separate impulsive events from environmental noise. Additionally, features can be extracted from the fused axes to gather more advanced information about the source of a seismic event. Compared to more basic cadence determination algorithms, the proposed method substantially improves the detection range and correct classification of humans and significantly decreases false classifications due to animals and ambient conditions.


INTRODUCTION
There is a growing need for perimeter detection technologies that can be deployed in remote locations over long distances. This need is growing not only in military environments, but also in homeland defense applications where persistent border security is of increasingly high priority. UGS systems are a suitable option due to their scalability and their ability to monitor activities remotely. A primary target of interest for perimeter security is human foot traffic, which can be classified by observing seismic activity. Geophones for detecting seismic signatures in the ground are low-cost and require little power, making them sustainable over extended periods of time. Additionally, geophones are most effective when buried, which makes them covert. As a result, humans are less likely to be aware of being monitored when they come into the field of detection of a UGS system.
Studies in the past have been successful in discriminating humans from other targets by analyzing the cadence of the target of interest [3][4][5][6][7][8][9]. As an animal or human traverses a UGS field, its impulses are detected over time. Through various methods, the rate and consistency with which the target creates impulses can be monitored. A human has a cadence that is largely separable from that of animals because humans are bipedal and most potential animal interferers are quadrupedal [1][2]. However, there is generally some crossover between animal and human cadences. As such, other features may need to be extracted to increase classification confidence. Some detection methods increase effectiveness by using a network-level solution that compares data across multiple nodes. While this method may be effective, it is not necessarily desirable, as nodes may have dependencies on other nodes. Additionally, large amounts of data would have to be wirelessly transmitted between nodes, which increases the amount of power required to run the UGS system.
In UGS systems that implement multiple geophones, the multichannel seismic data can be used to create an enhanced signal. This construct can increase correct classification rates and effective detection range while decreasing false-classification rates. This is accomplished by using a modified cross-correlation function that is applied between two geophone channels. This method is used in conjunction with a cadence analysis algorithm that auto-correlates a time-averaged energy signal to determine the rate and consistency of impulses.
The ability to detect targets may also be compromised by environmental noise such as wind. It will be demonstrated how the modified cross-correlation function can also be used under such circumstances to cancel wind noise. Because many ambient events have a low level of correlation between geophone axes, noisy elements can be easily reduced. The resulting effect is an increase in the signal-to-noise ratio (SNR) of impulsive targets as well as a decrease in false classification due to other targets. This paper details the aforementioned algorithms, their effectiveness when tested against real data, and further considerations. The data sets explored consist of humans walking alone and in groups, horses at various gaits, various small animal types, and ambient signatures.

A. Autocorrelation Cadence Determination
The cadence of a target is determined by comparing the time difference between impulsive events created by that target. In humans and animals, the variance in the time difference between impulses is usually small over short periods of time, so it is important to analyze cadence within a limited time window. It is unreliable to determine the cadence strictly by viewing the time-domain signature, because targets often travel in a group or other substantial seismic interferers are present, either of which may mask potential cadence patterns. Autocorrelating the energy of the seismic signal can reveal repetitive impulses even when there is a low SNR or when other impulsive events would otherwise mask a target's cadence.
A sum-average method for approximating the energy signal is sufficient for the cadence detection method being applied (Fig 1.b). Geophones have a natural roll-off below a specified frequency, and if that natural frequency is properly selected, footsteps will have their highest energy presence just above it. For these experiments, a geophone with a 28Hz natural frequency was selected for an optimum response. When the energy signal is computed, it is downsampled by a factor of the natural frequency, which creates an envelope shape and smooths the impulses created by a target. The energy equation can thus be represented as (1), whose parameters are the natural frequency of the geophone and the sampling rate of the system. The natural frequency of the geophone acts as a high-pass filter, so it can be viewed as a lower cutoff frequency. An autocorrelation is then performed on the converted energy signal to determine the time difference between impulses. An optimal window size must be selected to maximize classification capabilities.
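The sum-average energy step can be sketched as follows. This is a minimal illustration rather than the exact form of (1): it assumes the energy is the mean of the squared samples over consecutive blocks of roughly fs / fn samples, so that the output rate is close to the geophone's natural frequency. The function name `energy_signal` and the parameter names `fs` and `fn` are hypothetical.

```python
def energy_signal(samples, fs, fn):
    """Approximate the energy envelope with a sum-average over blocks.

    Assumption: each block spans about fs / fn input samples, so the
    output is downsampled to roughly the natural frequency fn.
    """
    block = max(1, round(fs / fn))
    energy = []
    # Step through the signal one whole block at a time (no overlap).
    for start in range(0, len(samples) - block + 1, block):
        chunk = samples[start:start + block]
        energy.append(sum(x * x for x in chunk) / block)
    return energy


# Example: 560 Hz sampling and a 28 Hz geophone give 20-sample blocks.
envelope = energy_signal([1.0] * 40 + [0.0] * 20, fs=560, fn=28)
```

Averaging squared samples over a block both rectifies and smooths the signal, which is what produces the envelope shape described above.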
It is not necessarily desirable to use a large window size. Because the variance in the time displacement between impulses increases as a function of time, a large window will provide little improvement to classification abilities while significantly increasing the number of computations required to perform the operation. On the other hand, the window must contain multiple impulses for the autocorrelation to yield a pattern from repetitive impulsive events.
Because human targets have a natural gait frequency as low as 1.3Hz, it is essential to have a window size of at least 4s so that multiple impulses from the target are contained within the window. The periodicity within the autocorrelation is of greatest interest, so the scale of the autocorrelation is not relevant. As such, the autocorrelation can retain the necessary information while being simplified to (2), which is computed from the previously determined energy signal and the mean value of the energy over the window.
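As a rough check on the window size, a 1.3Hz gait produces about 4 × 1.3 ≈ 5 impulses in a 4s window. The simplified autocorrelation can be sketched as below, assuming it is the unnormalized autocorrelation of the mean-removed energy window. The exact form of (2) is not reproduced here, and the function name `autocorrelation` is hypothetical.

```python
def autocorrelation(energy):
    """Unnormalized autocorrelation of a mean-removed energy window.

    The result is left unscaled because only its periodicity matters
    for cadence determination.
    """
    n = len(energy)
    mean = sum(energy) / n
    centered = [e - mean for e in energy]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]


# An alternating envelope yields an autocorrelation that alternates in sign.
r = autocorrelation([1.0, 0.0, 1.0, 0.0])
```

Removing the mean keeps the constant energy offset from dominating the result, so the repetitive structure stands out.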
In the case of humans and animals, the autocorrelation will produce a sinusoidal pattern corresponding to the equally time-displaced impulses produced by such targets (Fig 1.c). As a result, the cadence can be found by taking a fast Fourier transform (FFT) of the autocorrelated signal and analyzing its frequency content: if a peak is present, it corresponds to the target's cadence (Fig 1.d). An FFT window size that is approximately the size of the autocorrelation window is most effective, as it maximizes the resolution in the frequency domain.
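The peak-picking step can be illustrated with the sketch below. To stay dependency-free it evaluates a direct discrete Fourier transform rather than an FFT, which gives the same spectrum at higher cost. The function name `cadence_hz` and the choice to ignore the DC bin are assumptions made for illustration.

```python
import cmath
import math


def cadence_hz(autocorr, rate_hz):
    """Return the frequency in Hz of the strongest non-DC spectral bin.

    autocorr: autocorrelated energy window
    rate_hz:  sample rate of that window (the downsampled energy rate)
    """
    n = len(autocorr)
    best_bin, best_mag = 0, 0.0
    # Only bins 1..n/2 are searched; bin 0 is the DC component.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            autocorr[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * rate_hz / n


# Synthetic 4 s window at 28 Hz containing a 1.75 Hz sinusoid.
window = [math.cos(2 * math.pi * 1.75 * t / 28) for t in range(112)]
estimate = cadence_hz(window, rate_hz=28)
```

With a 4s window the bin spacing is 0.25Hz, which is why a transform length close to the autocorrelation window length gives the finest usable resolution.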
To increase the classification confidence of a target, it is useful to determine the target's identity over the whole time it spends within a UGS field. Taking the mean of the FFT data sets while there is an impulsive presence above the noise floor is a computationally simple yet useful way to achieve this.
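A minimal sketch of this averaging follows. It assumes each analysis window has already been reduced to a magnitude spectrum and a single peak amplitude, and that "impulsive presence" means this peak exceeds the noise floor. The names `mean_spectrum`, `window_peaks`, and `noise_floor` are hypothetical.

```python
def mean_spectrum(spectra, window_peaks, noise_floor):
    """Average the spectra of windows whose peak exceeds the noise floor.

    spectra:      list of equal-length magnitude spectra, one per window
    window_peaks: peak amplitude observed in each corresponding window
    Returns None when no window shows an impulsive presence.
    """
    active = [s for s, p in zip(spectra, window_peaks) if p > noise_floor]
    if not active:
        return None
    bins = len(active[0])
    return [sum(s[k] for s in active) / len(active) for k in range(bins)]


# The third window is below the floor and is excluded from the mean.
averaged = mean_spectrum(
    [[0.0, 4.0, 0.0], [0.0, 2.0, 2.0], [9.0, 9.0, 9.0]],
    window_peaks=[5.0, 6.0, 0.1],
    noise_floor=1.0,
)
```

Excluding quiet windows keeps ambient-only spectra from diluting the cadence peak accumulated while the target is present.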

B. Multiple Axis Fusion
Multiple time-domain data streams from two or more collocated geophones of differing orientations can be used to create a unified data stream for analysis.
The modified signal at time n is a cross-correlation of the two seismic data streams over a short window beginning at time n. The cross-correlation variation that is used is represented as (3). This will increase the SNR of impulsive events that are correlated between the two channels, while poorly correlated information will be diminished. This cross-correlation is exhibited in Fig 2. The autocorrelation cadence determination algorithm can be applied to the cross-correlated signal in the same way it is applied to a single raw data stream. The signal will provide an increase in SNR, which leads to higher confidence when determining a target's cadence. Since the SNR is higher at a given impulsive energy, the effective range of a system implementing this method also increases. Since two maxima may be multiplied with each other, some impulse amplitudes may become substantially larger than others in the modified signal. This will create a large bias towards those larger impulses. The signal at point n may be scaled by a normalizing factor to mitigate these effects.
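The fusion step can be sketched as a sliding sum of products of the two channels. This is an illustration under stated assumptions, not the exact form of (3): the window is taken as half a period of the natural frequency, and the optional normalization is a signed square root, chosen here only as one plausible way to compress products of two large amplitudes. The function name `fuse_axes` and its parameters are hypothetical.

```python
import math


def fuse_axes(x, y, fs, fn, normalize=True):
    """Fuse two geophone channels with a short zero-lag cross-correlation.

    Assumed form: fused[n] = sum over k in [0, W) of x[n + k] * y[n + k],
    where W is half a period of the natural frequency fn in samples.
    """
    w = max(1, round(fs / (2 * fn)))
    fused = []
    for n in range(len(x) - w + 1):
        value = sum(a * b for a, b in zip(x[n:n + w], y[n:n + w]))
        if normalize:
            # Assumed scaling: signed square root limits the bias toward
            # impulses where two large amplitudes are multiplied together.
            value = math.copysign(math.sqrt(abs(value)), value)
        fused.append(value)
    return fused


# In-phase samples give positive output, out-of-phase samples negative.
raw = fuse_axes([1, 1, 1, 1], [1, 1, -1, -1], fs=112, fn=28, normalize=False)
```

Uncorrelated noise on the two axes tends to produce products of mixed sign that cancel within the window, whereas a footstep seen on both axes produces products of consistent sign that add.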
The cross-correlation window size is half of a period at the geophone's natural frequency, which allows the phase relationship of an impulsive event to be represented in the fused data stream. By using a short window, the cross-correlation stream also represents an approximate phase relation between the two geophones. If the window is too large, the phase may not be accurately determined. A period of values greater than the noise floor signifies an in-phase relationship, while a negative value signifies an out-of-phase relationship (Fig. 3).
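For a 28Hz geophone, half a period is 1 / (2 × 28) ≈ 17.9 ms. The phase reading can be sketched as a simple three-way labeling of the fused stream. The symmetric threshold used for negative values is an assumption, and the function name `phase_relation` is hypothetical.

```python
def phase_relation(fused, noise_floor):
    """Label fused samples: +1 in phase, -1 out of phase, 0 undetermined.

    Assumption: the same noise floor is applied symmetrically, so values
    between -noise_floor and +noise_floor carry no phase information.
    """
    labels = []
    for value in fused:
        if value > noise_floor:
            labels.append(1)
        elif value < -noise_floor:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


labels = phase_relation([0.5, -0.7, 0.05, -0.02], noise_floor=0.1)
```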
This phase relationship allows localization information about the target to be determined. When cross-correlating the x-axis and y-axis geophones, both evidence of crossing an axis and the consistency of presence on one side of an axis can be easily determined. If a target moves linearly through the field of detection, the phase relationship of the impulsive signals between the x-axis and y-axis can change at most twice: once for each time the target crosses an axis. If an anomalous impulsive event occurs, it can be removed if its axial phase relations are not the same as those of the other impulses that occur around that event. This can be verified during the execution of the algorithm by comparing the autocorrelation with and without the perceived anomalous impulsive event. A substantial improvement will be observed in the autocorrelated results if the event was not a part of the perceived target.
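The localization check can be sketched as follows, assuming each detected impulse has already been reduced to a phase label of +1 (in phase) or -1 (out of phase). Flagging an impulse whose label differs from both neighbours is one simple reading of "not the same as other impulses that occur around that event"; the function names are hypothetical.

```python
def count_phase_changes(impulse_phases):
    """Count sign flips between consecutive impulse phase labels."""
    return sum(
        1 for a, b in zip(impulse_phases, impulse_phases[1:]) if a != b
    )


def flag_anomalies(impulse_phases):
    """Return indices of impulses whose phase differs from both neighbours."""
    return [
        i
        for i in range(1, len(impulse_phases) - 1)
        if impulse_phases[i] != impulse_phases[i - 1]
        and impulse_phases[i] != impulse_phases[i + 1]
    ]


phases = [1, 1, -1, 1, 1, -1, -1]
suspects = flag_anomalies(phases)
cleaned = [p for i, p in enumerate(phases) if i not in suspects]
# A linear path allows at most two changes; the cleaned track satisfies this.
plausible = count_phase_changes(cleaned) <= 2
```

In this example the raw sequence shows three phase changes, which a straight-line path cannot produce, while the sequence without the flagged impulse shows one.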

EXPERIMENTAL RESULTS
Data were collected from human, animal, vehicular, and ambient sources. The data were collected in multiple locations with varying soil densities to test the algorithms for environmental independence. The algorithms were tested on 150 human walking data sets. 75 sets were recorded in Douglas, AZ. Of those sets, 25 feature two humans walking in the field at once and 25 feature three humans walking at once. 50 human data sets were recorded at Picatinny, NJ and 25 were recorded in Lynchburg, VA. 35 horse signatures were tested at various gaits: 25 sets were recorded in Frankford Township, NJ and 10 were recorded in Oxford, MS. 40 sets of various wild animal signatures were collected in Bisbee, AZ. The algorithms were also tested against over 20 hours of ambient data collected in Douglas, Picatinny, and Lynchburg.
Classification effectiveness varied with the closest point of approach (CPA) of the target as well as with the number of targets within the field of detection at one time. Results are reported as the probability of classification for a random CPA of up to 35m for humans. Despite a larger range of detection, animal rejection rates were also determined assuming a random CPA of up to 35m based on assumed node displacement.

A. Single Geophone Results
Data sets with only a single human walking were correctly classified at a 98.3% rate. Two-human data sets were correctly classified at a 94.1% rate. Three-human data sets were correctly classified at an 89.4% rate.
The false classification of animals as humans varies depending on the animal type as well as on the cadence range accepted for human targets. Widening the range of cadences that classify as human to include running consequently increases the false classification of animals. Of the 35 horse signatures tested, one data set was falsely classified as human when only walking humans were being classified. When the range was expanded to include running humans, the number of false classifications increased to six signatures. For the small animal data sets, two were falsely classified as human when only the human walking cadence range was used. Increasing the cadence range to include human running did not increase the number of false classifications.

B. Multiple Axis Fusion Results
Implementing the cross-correlation algorithm between multiple axes yielded superior results to techniques utilizing a single geophone.
The cross-correlation algorithm resulted in an improved SNR. As a result, the ability to classify targets within the 35m range improved. For a random entry CPA, a single human walker will classify correctly at an estimated 99.3% rate. Two human walkers improved to an estimated 95.3% rate of successful classification. Three human walkers improved to an approximate 93.1% rate.
By extracting the localization feature of the cross-correlated signal, false classifications caused by animals are significantly reduced. The set of six horse signatures that were falsely classified using a single axis was reduced to a single false classification. Both of the small animal data sets that were previously falsely classified as human were correctly rejected.

CONCLUSION
A method of combining multiple seismic geophone signals to reduce noise and improve SNR was demonstrated in this paper. It has been shown that applying this method will lower false alarm rates and improve classification capabilities. Additionally, localization characteristics can be determined on a multi-axis fused signal to model expected behavior of a human target when traversing a seismic sensor's field of detection. This method may have benefits when detecting other targets and has the potential to be expanded into multiple modalities.