Cross-corpus open set bird species recognition by vocalization

Jiangjian Xie, Luyang Zhang, Junguo Zhang, Yanyun Zhang, Björn W. Schuller

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

In the wild, vocalizations of the same bird species may differ across populations (e.g., so-called dialects). In addition, the number of species is not known in advance. These two facts make bird species recognition based on vocalization a challenging task. This study treats the task as an open set recognition (OSR) problem in a cross-corpus scenario. We propose Instance Frequency Normalization (IFN) to remove instance-specific differences across corpora. Furthermore, an x-vector feature extraction model that integrates a Time Delay Neural Network (TDNN) and Long Short-Term Memory (LSTM) is designed to better capture sequence information. Finally, threshold-based Probabilistic Linear Discriminant Analysis (PLDA) is introduced to score the extracted x-vector features and detect unknown classes. Compared with the best results of the existing method, the average accuracy (ACC) improves in both the single-corpus and cross-corpus experiments, suggesting that our method offers a potential solution for cross-corpus bird species recognition by vocalization under open-set conditions.
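The abstract does not spell out how IFN or the threshold-based decision are computed, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes IFN standardizes each frequency bin of a single recording over time, and it uses cosine similarity to per-class centroids as a simple stand-in for PLDA scoring. The function names, the `eps` value, and the centroid representation are all hypothetical.

```python
import math
from statistics import fmean, pstdev


def instance_frequency_normalize(spec, eps=1e-5):
    """Standardize each frequency bin of one recording over time.

    spec: list of frequency bins, each a list of values over time frames.
    Assumption: statistics are computed per instance and per frequency bin.
    """
    normalized = []
    for bin_values in spec:
        mean = fmean(bin_values)
        std = pstdev(bin_values)
        normalized.append([(v - mean) / (std + eps) for v in bin_values])
    return normalized


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def open_set_predict(embedding, class_centroids, threshold):
    """Return the best-scoring known class, or "unknown" below the threshold.

    Assumption: cosine similarity replaces the PLDA score used in the paper.
    """
    scores = {name: cosine(embedding, c) for name, c in class_centroids.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"
```

In this sketch, an embedding that is not close enough to any known class centroid is rejected as "unknown" rather than forced into a known species, which is the core of a threshold-based open-set decision.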

Original language: English
Article number: 110826
Journal: Ecological Indicators
Volume: 154
State: Published - Oct 2023
Externally published: Yes

Keywords

  • Bird species recognition
  • Cross-corpus recognition
  • Instance frequency normalization
  • Open set
  • Vocalization
