A two-dimensional framework of multiple kernel subspace learning for recognizing emotion in speech

Xinzhou Xu, Jun Deng, Nicholas Cummins, Zixing Zhang, Chen Wu, Li Zhao, Bjorn Schuller

Research output: Contribution to journal › Article › peer-review

25 Scopus citations


As a highly active topic in computational paralinguistics, speech emotion recognition (SER) aims to find ideal representations of the emotional factors in speech. To improve SER performance, multiple kernel learning (MKL) dimensionality reduction has been used to extract effective information for recognizing emotions. However, the solution of MKL usually provides only one nonnegative mapping direction for multiple kernels, which may lead to a loss of valuable information. To address this issue, we propose a two-dimensional framework for multiple kernel subspace learning. This framework provides more linear combinations on the basis of MKL, without nonnegative constraints, and thereby preserves more information in the learning procedure. It also leverages both MKL and two-dimensional subspace learning, combining them into a unified structure. To apply the framework to SER, we also propose an algorithm, namely generalised multiple kernel discriminant analysis (GMKDA), which employs discriminant embedding graphs in this framework. GMKDA takes advantage of the additional mapping directions for multiple kernels in the proposed framework. To evaluate the performance of the proposed algorithm, a wide range of experiments is carried out on several key emotional corpora. The experimental results demonstrate that the proposed methods achieve better performance than some conventional and subspace learning methods on SER.
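The contrast the abstract draws, between a single nonnegative mapping direction over the kernels and several unconstrained ones, can be illustrated with a minimal NumPy sketch. This is not the authors' GMKDA: the RBF kernels, the function names, and the randomly drawn matrices A, beta, and B are all assumptions made for illustration, and nothing here is learned from discriminant embedding graphs. It only shows the shape of the two projections, assuming the single-direction form y_i = A^T K_i beta and a two-dimensional form Y_i = A^T K_i B.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_stack(X, gammas):
    # Shape (N, N, M): entry [i, n, m] is k_m(x_i, x_n) for the m-th base kernel.
    return np.stack([rbf_kernel(X, X, g) for g in gammas], axis=-1)

def project_single_direction(K, A, beta):
    # One weight vector beta (M,) over the kernels, nonnegative in the MKL setting.
    # Returns (N, P): y_i = A^T K_i beta, with K_i = K[i] of shape (N, M).
    return np.einsum('np,inm,m->ip', A, K, beta)

def project_two_dimensional(K, A, B):
    # Q weight vectors stacked in B (M, Q), with no sign constraint (assumed form).
    # Returns (N, P, Q): Y_i = A^T K_i B, a small matrix per sample.
    return np.einsum('np,inm,mq->ipq', A, K, B)

# Toy data: 6 samples, 4 features, 3 base kernels (all values are placeholders).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
K = kernel_stack(X, gammas=[0.1, 1.0, 10.0])

A = rng.normal(size=(6, 2))          # sample-side projection, P = 2 (not learned here)
beta = np.array([0.5, 0.3, 0.2])     # single nonnegative kernel weighting
B = rng.normal(size=(3, 2))          # Q = 2 unconstrained kernel weightings

y1 = project_single_direction(K, A, beta)   # one P-vector per sample
Y2 = project_two_dimensional(K, A, B)       # one P-by-Q matrix per sample
```

In this sketch, setting one column of B equal to beta reproduces the single-direction projection in the corresponding slice of the output, so under these assumptions the two-dimensional form contains the one-dimensional form as a special case.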

Original language: English
Pages (from-to): 1436-1449
Number of pages: 14
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Issue number: 7
State: Published - Jul 2017
Externally published: Yes


  • Dimensionality reduction
  • discriminant analysis
  • multiple kernel learning (MKL)
  • speech emotion recognition (SER)
  • two-dimensional framework

