This paper describes a method for non-coplanar microphone arrays that temporally isolates and cleans unknown broadband acoustic impulses for detection, classification, and scene analysis. Possible events are initially identified using a sliding statistical time window. The authors then posit that most false triggers due to environmental noise can be filtered out by using generalized cross-correlation to phase-align the microphone channels and reject implausible velocities. Finally, the phase-aligned signals are calibrated and averaged across the microphones. With appropriate hyperparameter tuning, this method appears robust to ambient noise, wind noise, and physical interaction. Performance is measured using a simulation and a real historic dataset of over 2 hours of curated acoustic recordings containing 559 gunshots, 120 blasts, and 747 other weather and non-impulsive events recorded with no prior information under normal operating conditions. Events were found and validated using human listeners with a tool to visualize the waveform and the spectrogram. For this dataset, the model accurately found over 95% of the gunshots with 92% temporal separation and 100% of the blasts identified by the listeners. These results show the method to be a viable solution for impulsive outdoor broadband acoustic signal detection.
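The align, reject, and average steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes PHAT weighting for the generalized cross-correlation, a plane-wave least-squares fit of a slowness vector to estimate propagation speed, hypothetical speed bounds of 300 to 380 m/s, and integer-sample circular shifts for alignment. The initial sliding-window trigger and the calibration step are omitted, and all function names and parameters are illustrative.

```python
import numpy as np


def gcc_phat_delay(ref, sig, fs, max_delay_s):
    """Delay of `sig` relative to `ref` in seconds (positive: sig arrives later)."""
    n = len(ref) + len(sig)                      # zero-pad to avoid circular wrap
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12               # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = max(1, min(int(round(max_delay_s * fs)), n // 2 - 1))
    # Reorder so that index i corresponds to lag (i - max_shift).
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (int(np.argmax(cc)) - max_shift) / fs


def propagation_speed(positions, delays):
    """Speed implied by a least-squares plane-wave fit (delays relative to mic 0).

    A non-coplanar array makes the 3-D slowness vector fully determined.
    """
    baselines = positions[1:] - positions[0]     # shape (M-1, 3)
    slowness, *_ = np.linalg.lstsq(baselines, delays, rcond=None)
    norm = np.linalg.norm(slowness)
    return np.inf if norm == 0 else 1.0 / norm


def align_and_average(channels, fs, positions, max_delay_s,
                      speed_bounds=(300.0, 380.0)):
    """Return (averaged signal or None if rejected, estimated speed in m/s)."""
    ref = channels[0]
    delays = np.array([gcc_phat_delay(ref, ch, fs, max_delay_s)
                       for ch in channels[1:]])
    speed = propagation_speed(positions, delays)
    if not (speed_bounds[0] <= speed <= speed_bounds[1]):
        return None, speed                       # implausible velocity: reject
    lags = np.concatenate(([0], np.round(delays * fs).astype(int)))
    # Integer-sample circular shift; a simplification for illustration.
    aligned = [np.roll(ch, -lag) for ch, lag in zip(channels, lags)]
    return np.mean(aligned, axis=0), speed


def synthetic_example(fs=48000, c=343.0, seed=0):
    """Simulated plane-wave burst crossing a hypothetical 4-microphone array."""
    rng = np.random.default_rng(seed)
    positions = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    direction = np.array([1.0, 2.0, 2.0]) / 3.0  # unit propagation direction
    base = np.zeros(4096)
    base[1000:1200] = rng.standard_normal(200) * np.hanning(200)
    true_lags = np.round(positions @ direction / c * fs).astype(int)
    channels = np.array([np.roll(base, lag) + 0.01 * rng.standard_normal(base.size)
                         for lag in true_lags])
    cleaned, speed = align_and_average(channels, fs, positions,
                                       max_delay_s=1.0 / 300.0)
    return base, cleaned, speed
```

In this sketch a candidate event is kept only if the fitted speed falls near the nominal speed of sound. Uncorrelated disturbances such as wind noise tend to yield inconsistent delays and therefore implausible speeds. Averaging the aligned channels then attenuates noise that is independent across microphones.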