TY - JOUR
T1 - DB3V
T2 - 25th Interspeech Conference 2024
AU - Jing, Xin
AU - Zhang, Luyang
AU - Xie, Jiangjian
AU - Gebhard, Alexander
AU - Baird, Alice
AU - Schuller, Björn
N1 - Publisher Copyright:
© 2024 International Speech Communication Association. All rights reserved.
PY - 2024
Y1 - 2024
N2 - In ornithology, it's widely acknowledged that bird species display diverse dialects in their calls across different regions. Consequently, computational methods to identify bird species solely through their calls face significant challenges. There is growing interest in understanding the impact of species-specific dialects on the effectiveness of bird species recognition methods. Despite potential mitigation through the expansion of dialect datasets, the absence of publicly available testing data currently impedes robust benchmarking efforts. This paper presents the Dialect Dominated Dataset of Bird Vocalisation (D3BV), the first cross-corpus dataset that focuses on dialects in bird vocalisations. The D3BV comprises more than 25 hours of audio recordings from 10 bird species distributed across three distinct regions in the contiguous United States (CONUS). In addition to presenting the dataset, we conduct analyses and establish baseline models for cross-corpus bird recognition. The data and code are publicly available online: https://zenodo.org/records/11544734.
AB - In ornithology, it's widely acknowledged that bird species display diverse dialects in their calls across different regions. Consequently, computational methods to identify bird species solely through their calls face significant challenges. There is growing interest in understanding the impact of species-specific dialects on the effectiveness of bird species recognition methods. Despite potential mitigation through the expansion of dialect datasets, the absence of publicly available testing data currently impedes robust benchmarking efforts. This paper presents the Dialect Dominated Dataset of Bird Vocalisation (D3BV), the first cross-corpus dataset that focuses on dialects in bird vocalisations. The D3BV comprises more than 25 hours of audio recordings from 10 bird species distributed across three distinct regions in the contiguous United States (CONUS). In addition to presenting the dataset, we conduct analyses and establish baseline models for cross-corpus bird recognition. The data and code are publicly available online: https://zenodo.org/records/11544734.
KW - audio dataset
KW - bioacoustic
KW - computer audition
KW - species identification
UR - http://www.scopus.com/inward/record.url?scp=85214799410&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2024-143
DO - 10.21437/Interspeech.2024-143
M3 - Conference article
AN - SCOPUS:85214799410
SN - 2308-457X
SP - 127
EP - 131
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Y2 - 1 September 2024 through 5 September 2024
ER -