TY - GEN
T1 - ConvTasNet-based anomalous noise separation for intelligent noise monitoring
AU - Li, Han
AU - Chen, Kean
AU - Seeber, Bernhard U.
N1 - Publisher Copyright:
© INTER-NOISE 2021. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Noise pollution has become a growing concern in public health. The availability of low-cost wireless acoustic sensor networks permits continuous monitoring of noise. However, real acoustic scenes contain irrelevant sources (anomalous noise) that overlap with the monitored noise, causing biased evaluation and controversy. One classical scene is selected in our study: for road traffic noise assessment, non-traffic noise (e.g., speech, thunder) should be excluded to obtain a reliable evaluation. Because anomalous noise is diverse, occasional, and unpredictable in real-life scenes, removing it from the mixture is a challenge. We explore a fully convolutional time-domain audio separation network (ConvTasNet) for arbitrary sound separation. ConvTasNet is trained on a large dataset of over 150 hours, including environmental sounds, speech, and music. After training, the scale-invariant signal-to-distortion ratio (SI-SDR) is improved by 11.70 dB on average on an independent test dataset. ConvTasNet is then applied to anomalous noise separation in traffic noise scenes. We mix traffic noise and anomalous noise at a random SNR between -10 dB and 0 dB. Separation is especially effective for salient and long-term anomalous noise, which smooths the overall sound pressure level curve over time. Results emphasize the importance of anomalous noise separation for reliable noise assessment.
AB - Noise pollution has become a growing concern in public health. The availability of low-cost wireless acoustic sensor networks permits continuous monitoring of noise. However, real acoustic scenes contain irrelevant sources (anomalous noise) that overlap with the monitored noise, causing biased evaluation and controversy. One classical scene is selected in our study: for road traffic noise assessment, non-traffic noise (e.g., speech, thunder) should be excluded to obtain a reliable evaluation. Because anomalous noise is diverse, occasional, and unpredictable in real-life scenes, removing it from the mixture is a challenge. We explore a fully convolutional time-domain audio separation network (ConvTasNet) for arbitrary sound separation. ConvTasNet is trained on a large dataset of over 150 hours, including environmental sounds, speech, and music. After training, the scale-invariant signal-to-distortion ratio (SI-SDR) is improved by 11.70 dB on average on an independent test dataset. ConvTasNet is then applied to anomalous noise separation in traffic noise scenes. We mix traffic noise and anomalous noise at a random SNR between -10 dB and 0 dB. Separation is especially effective for salient and long-term anomalous noise, which smooths the overall sound pressure level curve over time. Results emphasize the importance of anomalous noise separation for reliable noise assessment.
UR - http://www.scopus.com/inward/record.url?scp=85117385157&partnerID=8YFLogxK
U2 - 10.3397/IN-2021-2035
DO - 10.3397/IN-2021-2035
M3 - Conference contribution
AN - SCOPUS:85117385157
T3 - Proceedings of INTER-NOISE 2021 - 2021 International Congress and Exposition of Noise Control Engineering
BT - Proceedings of INTER-NOISE 2021 - 2021 International Congress and Exposition of Noise Control Engineering
A2 - Dare, Tyler
A2 - Bolton, Stuart
A2 - Davies, Patricia
A2 - Xue, Yutong
A2 - Ebbitt, Gordon
PB - The Institute of Noise Control Engineering of the USA, Inc.
T2 - 50th International Congress and Exposition of Noise Control Engineering, INTER-NOISE 2021
Y2 - 1 August 2021 through 5 August 2021
ER -