Hearttoheart: The Arts of Infant Versus Adult-Directed Speech Classification

Najla D. Al Futaisi, Alejandrina Cristia, Bjorn W. Schuller

Research output: Contribution to journal › Conference article › peer-review

Abstract

Psycholinguistics researchers investigate child language exposure by studying children's language environment. A main factor is whether, in humanistic heart-to-heart dialogue, speech is directed to the infant (infant-directed speech) or to another adult (adult-directed speech). The former has been found to better predict children's lexicon, and therefore constitutes a more relevant part of children's language environment. Listening to, segmenting, and annotating naturalistic long-form recordings collected through infant-worn devices is highly costly and time-consuming, and can be prone to misclassification errors. We aim to overcome these challenges by automatically classifying speech as infant-directed or adult-directed. In this research, we combine multiple datasets into a larger corpus for training. We employ four methods: multi-task learning, adversarial training, autoencoder multi-task learning, and adversarial multi-task learning, the last of which yielded the best results on all datasets.

Keywords

  • adult directed speech
  • automatic speech classification
  • computational paralinguistics
  • infant directed speech
