Multimodal integration for meeting group action segmentation and recognition

Marc Al-Hames, Alfred Dielmann, Daniel Gatica-Perez, Stephan Reiter, Steve Renals, Gerhard Rigoll, Dong Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We address the problem of segmentation and recognition of sequences of multimodal human interactions in meetings. These interactions can be seen as a rough structure of a meeting, and can be used either as input for a meeting browser or as a first step towards a higher semantic analysis of the meeting. A common lexicon of multimodal group meeting actions, a shared meeting data set, and a common evaluation procedure enable us to compare the different approaches. We compare three different multimodal feature sets and four modelling infrastructures: a higher semantic feature approach, multi-layer HMMs, a multi-stream DBN, as well as a multi-stream mixed-state DBN for disturbed data.
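
The multi-stream models listed in the abstract fuse per-modality evidence inside a single decoder. As a rough sketch (not the authors' system), the Python below combines synthetic audio and visual Gaussian log-likelihoods with hypothetical stream weights and Viterbi-decodes one meeting action per frame; the action lexicon, Gaussian parameters, stream weights, and transition matrix are all illustrative assumptions.

```python
import numpy as np

# Minimal multi-stream HMM decoding sketch for meeting-action
# segmentation. NOT the paper's implementation: the action set,
# stream weights, and Gaussian parameters are placeholders.

ACTIONS = ["discussion", "monologue", "presentation", "whiteboard"]  # example lexicon
N = len(ACTIONS)
rng = np.random.default_rng(0)

def gaussian_loglik(x, means, var=1.0):
    """Log-likelihood of 1-D observations x under one Gaussian per state."""
    d = x[:, None] - means[None, :]                # shape (T, N)
    return -0.5 * (d ** 2 / var + np.log(2 * np.pi * var))

# Synthetic 1-D observations per modality (stand-ins for real
# audio/visual meeting features).
T = 200
true_states = np.repeat([0, 1, 2, 3], T // 4)
audio = rng.normal(true_states * 2.0, 0.8)
video = rng.normal(true_states * 1.5, 1.0)

# Per-stream state log-likelihoods, fused with exponent weights
# (w_audio + w_video = 1) -- a standard multi-stream HMM scheme.
w_audio, w_video = 0.6, 0.4                        # hypothetical stream weights
loglik = (w_audio * gaussian_loglik(audio, np.arange(N) * 2.0)
          + w_video * gaussian_loglik(video, np.arange(N) * 1.5))

# "Sticky" transitions favour contiguous action segments.
stay = 0.98
A = np.full((N, N), (1 - stay) / (N - 1))
np.fill_diagonal(A, stay)
log_A = np.log(A)
log_pi = np.log(np.full(N, 1.0 / N))

# Viterbi decoding: one meeting action per frame.
delta = log_pi + loglik[0]
back = np.zeros((T, N), dtype=int)
for t in range(1, T):
    scores = delta[:, None] + log_A                # (from, to)
    back[t] = scores.argmax(axis=0)
    delta = scores.max(axis=0) + loglik[t]

path = np.zeros(T, dtype=int)
path[-1] = delta.argmax()
for t in range(T - 2, -1, -1):
    path[t] = back[t + 1, path[t + 1]]

print(f"frame accuracy on synthetic data: {(path == true_states).mean():.2f}")
```

Exponent-weighting per-stream log-likelihoods is the standard multi-stream HMM fusion scheme; the paper's DBN variants instead factor the hidden state, which this sketch does not attempt.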

Original language: English
Title of host publication: Machine Learning for Multimodal Interaction - Second International Workshop, MLMI 2005, Revised Selected Papers
Pages: 52-63
Number of pages: 12
DOIs
State: Published - 2006
Event: 2nd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2005 - Edinburgh, United Kingdom
Duration: 11 Jul 2005 → 13 Jul 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3869 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2005
Country/Territory: United Kingdom
City: Edinburgh
Period: 11/07/05 → 13/07/05
