TY - JOUR
T1 - Identifying languages in a novel dataset
T2 - ASMR-whispered speech
AU - Song, Meishu
AU - Yang, Zijiang
AU - Parada-Cabaleiro, Emilia
AU - Jing, Xin
AU - Yamamoto, Yoshiharu
AU - Schuller, Björn
N1 - Publisher Copyright:
Copyright © 2023 Song, Yang, Parada-Cabaleiro, Jing, Yamamoto and Schuller.
PY - 2023
Y1 - 2023
N2 - Introduction: The Autonomous Sensory Meridian Response (ASMR) is a combination of sensory phenomena involving electrostatic-like tingling sensations, which emerge in response to certain stimuli. Despite the overwhelming popularity of ASMR in the social media, no open source databases on ASMR related stimuli are yet available, which makes this phenomenon mostly inaccessible to the research community; thus, almost completely unexplored. In this regard, we present the ASMR Whispered-Speech (ASMR-WS) database. Methods: ASWR-WS is a novel database on whispered speech, specifically tailored to promote the development of ASMR-like unvoiced Language Identification (unvoiced-LID) systems. The ASMR-WS database encompasses 38 videos-for a total duration of 10 h and 36 min-and includes seven target languages (Chinese, English, French, Italian, Japanese, Korean, and Spanish). Along with the database, we present baseline results for unvoiced-LID on the ASMR-WS database. Results: Our best results on the seven-class problem, based on segments of 2s length, and on a CNN classifier and MFCC acoustic features, achieved 85.74% of unweighted average recall and 90.83% of accuracy. Discussion: For future work, we would like to focus more deeply on the duration of speech samples, as we see varied results with the combinations applied herein. To enable further research in this area, the ASMR-WS database, as well as the partitioning considered in the presented baseline, is made accessible to the research community.
AB - Introduction: The Autonomous Sensory Meridian Response (ASMR) is a combination of sensory phenomena involving electrostatic-like tingling sensations, which emerge in response to certain stimuli. Despite the overwhelming popularity of ASMR in the social media, no open source databases on ASMR related stimuli are yet available, which makes this phenomenon mostly inaccessible to the research community; thus, almost completely unexplored. In this regard, we present the ASMR Whispered-Speech (ASMR-WS) database. Methods: ASWR-WS is a novel database on whispered speech, specifically tailored to promote the development of ASMR-like unvoiced Language Identification (unvoiced-LID) systems. The ASMR-WS database encompasses 38 videos-for a total duration of 10 h and 36 min-and includes seven target languages (Chinese, English, French, Italian, Japanese, Korean, and Spanish). Along with the database, we present baseline results for unvoiced-LID on the ASMR-WS database. Results: Our best results on the seven-class problem, based on segments of 2s length, and on a CNN classifier and MFCC acoustic features, achieved 85.74% of unweighted average recall and 90.83% of accuracy. Discussion: For future work, we would like to focus more deeply on the duration of speech samples, as we see varied results with the combinations applied herein. To enable further research in this area, the ASMR-WS database, as well as the partitioning considered in the presented baseline, is made accessible to the research community.
KW - ASMR
KW - CNN
KW - LSTM
KW - dataset
KW - whispered speech
UR - http://www.scopus.com/inward/record.url?scp=85164268674&partnerID=8YFLogxK
U2 - 10.3389/fnins.2023.1120311
DO - 10.3389/fnins.2023.1120311
M3 - Article
AN - SCOPUS:85164268674
SN - 1662-4548
VL - 17
JO - Frontiers in Neuroscience
JF - Frontiers in Neuroscience
M1 - 1120311
ER -