TY - GEN
T1 - Snore Sound Classification with Mel-Spectrogram and a Fine-Tuned CNN
AU - Sharan, Roneel V.
AU - Schuller, Björn W.
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Snoring occurs when airflow through the mouth and nose is partially obstructed during sleep, causing the surrounding tissues to vibrate. This obstruction can be due to factors such as relaxed throat muscles, excess tissue, nasal congestion, or structural abnormalities. While snoring is common and varies in intensity, it can sometimes signal a more serious condition like sleep apnea. Identifying the excitation location of snore sound is important for pinpointing the site of airway obstruction, leading to more targeted and effective treatments tailored to individual anatomical challenges. In this work, we propose a method for detecting the excitation location of snoring by frame-based classification on a dataset of 828 snore sounds from 219 subjects, with expert annotations into four distinct excitation locations. Each segmented snore sound is divided into frames and converted into a Mel-spectrogram, a time-frequency representation that serves as input to a pretrained convolutional neural network designed for audio classification. We fine-tune the network with a modified classification layer with inverse class weights to account for the class imbalance. Our method achieves an improvement of 6.60% in average classification accuracy over the baseline method, demonstrating its effectiveness in distinguishing snoring excitation locations based on acoustic characteristics.
KW - Convolutional neural network
KW - fine-tuning
KW - Mel-spectrogram
KW - sleep apnea
KW - snore sound classification
UR - https://www.scopus.com/pages/publications/105007888562
U2 - 10.1109/IECBES61011.2024.10991306
DO - 10.1109/IECBES61011.2024.10991306
M3 - Conference contribution
AN - SCOPUS:105007888562
T3 - Proceedings - 8th IEEE-EMBS Conference on Biomedical Engineering and Sciences: Healthcare Evolution through Technology and Artificial Intelligence, IECBES 2024
SP - 479
EP - 482
BT - Proceedings - 8th IEEE-EMBS Conference on Biomedical Engineering and Sciences
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th IEEE-EMBS Conference on Biomedical Engineering and Sciences, IECBES 2024
Y2 - 11 December 2024 through 13 December 2024
ER -