Audiovisual recognition of spontaneous interest within conversations

Björn Schuller, Anja Höthker, Ronald Müller, Hitoshi Konosu, Benedikt Hörnler, Gerhard Rigoll

Publication: Chapter in book/report/conference proceedings, conference contribution, peer-reviewed

72 citations (Scopus)

Abstract

In this work we present an audiovisual approach to the recognition of spontaneous interest in human conversations. For a robust estimate, information from four sources is combined by a synergistic fusion that tolerates the failure of individual sources. First, speech is analyzed with respect to its acoustic properties, based on a high-dimensional prosodic, articulatory, and voice quality feature space, and with respect to its spoken content, by linguistic analysis using large-vocabulary continuous speech recognition (LVCSR) and bag-of-words vector space modeling that includes non-verbals. Second, visual analysis provides patterns of facial expression by active appearance models (AAMs) and of movement activity by eye tracking. Experiments are based on a database of 10.5 h of spontaneous human-to-human conversation from 20 subjects, balanced in gender and age class. Recordings were made with a room microphone, a camera, and close-talk headsets to cover diverse comfort and noise conditions. Three levels of interest were annotated within a rich transcription. We describe each information stream and an early-level fusion in detail. Our experiments aim at a person-independent system for real-life use and show the high potential of such a multimodal approach. Benchmark results based on transcription versus automatic processing are also provided.

Original language: English
Title of host publication: Proceedings of the 9th International Conference on Multimodal Interfaces, ICMI'07
Pages: 30-37
Number of pages: 8
DOIs
Publication status: Published - 2007
Event: 9th International Conference on Multimodal Interfaces, ICMI 2007 - Nagoya, Japan
Duration: 12 Nov 2007 to 15 Nov 2007

Publication series

Name: Proceedings of the 9th International Conference on Multimodal Interfaces, ICMI'07

Conference

Conference: 9th International Conference on Multimodal Interfaces, ICMI 2007
Country/Territory: Japan
City: Nagoya
Period: 12/11/07 to 15/11/07
