Audio-visual processing in meetings: Seven questions and current AMI answers

Marc Al-Hames, Thomas Hain, Jan Cernocky, Sascha Schreiber, Mannes Poel, Ronald Müller, Sebastien Marcel, David Van Leeuwen, Jean Marc Odobez, Sileye Ba, Herve Bourlard, Fabien Cardinaux, Daniel Gatica-Perez, Adam Janin, Petr Motlicek, Stephan Reiter, Steve Renals, Jeroen Van Rest, Rutger Rienks, Gerhard Rigoll, Kevin Smith, Andrew Thean, Pavel Zemcik

Publication: Contribution to book/report › Conference contribution › Peer-reviewed

3 citations (Scopus)

Abstract

The project Augmented Multi-party Interaction (AMI) is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms, and with the required component technologies. Its R&D themes are group dynamics; audio, visual, and multimodal processing; content abstraction; and human-computer interaction. The audio-visual processing work package within AMI addresses automatic recognition from audio, video, and combined audio-video streams recorded during meetings. In this article we describe the progress made in the first two years of the project. We show how the large problem of audio-visual processing in meetings can be split into seven questions, such as "Who is acting during the meeting?". We then show which algorithms and methods have been developed and evaluated to answer these questions automatically.

Original language: English
Title: Machine Learning for Multimodal Interaction - Third International Workshop, MLMI 2006, Revised Selected Papers
Pages: 24-35
Number of pages: 12
DOIs
Publication status: Published - 2006
Event: 3rd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2006 - Bethesda, MD, United States
Duration: 1 May 2006 to 4 May 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4299 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2006
Country/Territory: United States
City: Bethesda, MD
Period: 1/05/06 to 4/05/06
